1
Kleinova R, Rajendra V, Leuchtenberger AF, Lo Giudice C, Vesely C, Kapoor U, Tanzer A, Derdak S, Picardi E, Jantsch MF. The ADAR1 editome reveals drivers of editing-specificity for ADAR1-isoforms. Nucleic Acids Res 2023; 51:4191-4207. [PMID: 37026479 PMCID: PMC10201426 DOI: 10.1093/nar/gkad265] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 05/27/2022] [Revised: 03/10/2023] [Accepted: 04/06/2023] [Indexed: 04/08/2023]
Abstract
Adenosine deaminase acting on RNA 1 (ADAR1) promotes A-to-I conversion in double-stranded and structured RNAs. ADAR1 has two isoforms transcribed from different promoters: cytoplasmic ADAR1p150 is interferon-inducible, while ADAR1p110 is constitutively expressed and primarily localized in the nucleus. Mutations in ADAR1 cause Aicardi-Goutières syndrome (AGS), a severe autoinflammatory disease associated with aberrant IFN production. In mice, deletion of ADAR1 or the p150 isoform leads to embryonic lethality driven by overexpression of interferon-stimulated genes. This phenotype is rescued by deletion of the cytoplasmic dsRNA-sensor MDA5, indicating that the p150 isoform is indispensable and cannot be rescued by ADAR1p110. Nevertheless, editing sites uniquely targeted by ADAR1p150 remain elusive. Here, by transfection of ADAR1 isoforms into ADAR-less mouse cells, we detect isoform-specific editing patterns. Using mutated ADAR variants, we test how intracellular localization and the presence of a Z-DNA binding domain-α (ZBDα) affect editing preferences. These data show that ZBDα contributes only minimally to p150 editing-specificity, while isoform-specific editing is primarily directed by the intracellular localization of ADAR1 isoforms. Our study is complemented by RIP-seq on human cells ectopically expressing tagged-ADAR1 isoforms. Both datasets reveal enrichment of intronic editing and binding by ADAR1p110, while ADAR1p150 preferentially binds and edits 3'UTRs.
Affiliation(s)
- Renata Kleinova
  - Center for Anatomy and Cell Biology, Division of Cell and Developmental Biology, Medical University of Vienna, Schwarzspanierstrasse 17, A-1090 Vienna, Austria
- Vinod Rajendra
  - Center for Anatomy and Cell Biology, Division of Cell and Developmental Biology, Medical University of Vienna, Schwarzspanierstrasse 17, A-1090 Vienna, Austria
- Alina F Leuchtenberger
  - Center for Integrative Bioinformatics Vienna (CIBIV) Max Perutz Labs, University of Vienna and Medical University of Vienna, Campus Vienna Biocenter 5, A-1030 Vienna, Austria
- Claudio Lo Giudice
  - Department of Bioscience, Biotechnology and Biopharmaceutics, University of Bari Aldo Moro, University Campus “Ernesto Quagliariello”, Via Orabona 4, Bari, Italy
- Cornelia Vesely
  - Center for Anatomy and Cell Biology, Division of Cell and Developmental Biology, Medical University of Vienna, Schwarzspanierstrasse 17, A-1090 Vienna, Austria
- Utkarsh Kapoor
  - Center for Anatomy and Cell Biology, Division of Cell and Developmental Biology, Medical University of Vienna, Schwarzspanierstrasse 17, A-1090 Vienna, Austria
- Andrea Tanzer
  - Center for Anatomy and Cell Biology, Division of Cell and Developmental Biology, Medical University of Vienna, Schwarzspanierstrasse 17, A-1090 Vienna, Austria
- Sophia Derdak
  - Core Facilities Medical University of Vienna, Spitalgasse 23, A-1090 Vienna, Austria
- Ernesto Picardi
  - Department of Bioscience, Biotechnology and Biopharmaceutics, University of Bari Aldo Moro, University Campus “Ernesto Quagliariello”, Via Orabona 4, Bari, Italy
  - Institute of Biomembranes and Bioenergetics (IBBE), National Research Council (CNR), Via Amendola 122, Bari, Italy
- Michael F Jantsch
  - Center for Anatomy and Cell Biology, Division of Cell and Developmental Biology, Medical University of Vienna, Schwarzspanierstrasse 17, A-1090 Vienna, Austria
2
Hess L, Moos V, Lauber AA, Reiter W, Schuster M, Hartl N, Lackner D, Boenke T, Koren A, Guzzardo PM, Gundacker B, Riegler A, Vician P, Miccolo C, Leiter S, Chandrasekharan MB, Vcelkova T, Tanzer A, Jun JQ, Bradner J, Brosch G, Hartl M, Bock C, Bürckstümmer T, Kubicek S, Chiocca S, Bhaskara S, Seiser C. A toolbox for class I HDACs reveals isoform specific roles in gene regulation and protein acetylation. PLoS Genet 2022; 18:e1010376. [PMID: 35994477 PMCID: PMC9436093 DOI: 10.1371/journal.pgen.1010376] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/22/2021] [Revised: 09/01/2022] [Accepted: 08/06/2022] [Indexed: 02/07/2023]
Abstract
The class I histone deacetylases are essential regulators of cell fate decisions in health and disease. While pan- and class-specific HDAC inhibitors are available, these drugs do not allow a comprehensive understanding of individual HDAC function, or the therapeutic potential of isoform-specific targeting. To systematically compare the impact of individual catalytic functions of HDAC1, HDAC2 and HDAC3, we generated human HAP1 cell lines expressing catalytically inactive HDAC enzymes. Using this genetic toolbox we compare the effect of individual HDAC inhibition with the effects of class I specific inhibitors on cell viability, protein acetylation and gene expression. Individual inactivation of HDAC1 or HDAC2 has only mild effects on cell viability, while HDAC3 inactivation or loss results in DNA damage and apoptosis. Inactivation of HDAC1/HDAC2 led to increased acetylation of components of the COREST co-repressor complex, reduced deacetylase activity associated with this complex and derepression of neuronal genes. HDAC3 controls the acetylation of nuclear hormone receptor associated proteins and the expression of nuclear hormone receptor regulated genes. Acetylation of specific histone acetyltransferases and HDACs is sensitive to inactivation of HDAC1/HDAC2. Over a wide range of assays, we determined that in particular HDAC1 or HDAC2 catalytic inactivation mimics class I specific HDAC inhibitors. Importantly, we further demonstrate that catalytic inactivation of HDAC1 or HDAC2 sensitizes cells to specific cancer drugs. In summary, our systematic study revealed isoform-specific roles of HDAC1/2/3 catalytic functions. We suggest that targeted genetic inactivation of particular isoforms effectively mimics pharmacological HDAC inhibition allowing the identification of relevant HDACs as targets for therapeutic intervention.
Affiliation(s)
- Lena Hess
  - Center for Anatomy and Cell Biology, Medical University of Vienna, Vienna, Austria
- Verena Moos
  - Center for Anatomy and Cell Biology, Medical University of Vienna, Vienna, Austria
- Arnel A. Lauber
  - Center for Anatomy and Cell Biology, Medical University of Vienna, Vienna, Austria
- Wolfgang Reiter
  - Mass Spectrometry Core Facility, Max Perutz Labs, Vienna BioCenter, Vienna, Austria
  - Department of Biochemistry and Cell Biology, Max Perutz Labs, University of Vienna, Vienna BioCenter, Vienna, Austria
- Michael Schuster
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Natascha Hartl
  - Mass Spectrometry Core Facility, Max Perutz Labs, Vienna BioCenter, Vienna, Austria
- Thorina Boenke
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Anna Koren
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Brigitte Gundacker
  - Center for Anatomy and Cell Biology, Medical University of Vienna, Vienna, Austria
- Anna Riegler
  - Center for Anatomy and Cell Biology, Medical University of Vienna, Vienna, Austria
- Petra Vician
  - Center for Anatomy and Cell Biology, Medical University of Vienna, Vienna, Austria
- Claudia Miccolo
  - Department of Experimental Oncology, IEO, European Institute of Oncology IRCCS, Milan, Italy
- Susanna Leiter
  - Center for Anatomy and Cell Biology, Medical University of Vienna, Vienna, Austria
- Mahesh B. Chandrasekharan
  - Department of Radiation Oncology and Huntsman Cancer Institute, University of Utah School of Medicine, Salt Lake City, Utah, United States of America
- Terezia Vcelkova
  - Center for Anatomy and Cell Biology, Medical University of Vienna, Vienna, Austria
- Andrea Tanzer
  - Center for Anatomy and Cell Biology, Medical University of Vienna, Vienna, Austria
- Jun Qi Jun
  - Dana-Farber Cancer Institute, Boston, Massachusetts, United States of America
- James Bradner
  - Dana-Farber Cancer Institute, Boston, Massachusetts, United States of America
- Gerald Brosch
  - Institute of Molecular Biology, Innsbruck Medical University, Innsbruck, Austria
- Markus Hartl
  - Mass Spectrometry Core Facility, Max Perutz Labs, Vienna BioCenter, Vienna, Austria
  - Department of Biochemistry and Cell Biology, Max Perutz Labs, University of Vienna, Vienna BioCenter, Vienna, Austria
- Christoph Bock
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
  - Institute of Artificial Intelligence, Center for Medical Statistics, Informatics, and Intelligent Systems, Medical University of Vienna, Vienna, Austria
- Stefan Kubicek
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Susanna Chiocca
  - Department of Experimental Oncology, IEO, European Institute of Oncology IRCCS, Milan, Italy
- Srividya Bhaskara
  - Department of Radiation Oncology and Huntsman Cancer Institute, University of Utah School of Medicine, Salt Lake City, Utah, United States of America
- Christian Seiser
  - Center for Anatomy and Cell Biology, Medical University of Vienna, Vienna, Austria
3
Entzian G, Hofacker I, Ponty Y, Lorenz R, Tanzer A. RNAxplorer: Harnessing the Power of Guiding Potentials to Sample RNA Landscapes. Bioinformatics 2021; 37:2126-2133. [PMID: 33538792 PMCID: PMC8352504 DOI: 10.1093/bioinformatics/btab066] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/03/2020] [Revised: 12/16/2020] [Accepted: 02/02/2021] [Indexed: 11/13/2022]
Abstract
Motivation: Predicting the folding dynamics of RNAs is a computationally difficult problem, first and foremost due to the combinatorial explosion of alternative structures in the folding space. Abstractions are therefore needed to simplify downstream analyses, and thus make them computationally tractable. This can be achieved by various structure sampling algorithms. However, current sampling methods are still time consuming and frequently fail to represent key elements of the folding space.
Method: We introduce RNAxplorer, a novel adaptive sampling method to efficiently explore the structure space of RNAs. RNAxplorer uses dynamic programming to perform an efficient Boltzmann sampling in the presence of guiding potentials, which are accumulated into pseudo-energy terms and reflect similarity to already well-sampled structures. This way, we effectively steer sampling toward underrepresented or unexplored regions of the structure space.
Results: We developed and applied different measures to benchmark our sampling method against its competitors. Most of the measures show that RNAxplorer produces more diverse structure samples, yields rare conformations that may be inaccessible to other sampling methods and is better at finding the most relevant kinetic traps in the landscape. Thus, it produces a more representative coarse graining of the landscape, which is well suited to subsequently compute better approximations of RNA folding kinetics.
Availability and implementation: https://github.com/ViennaRNA/RNAxplorer/.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Gregor Entzian
  - Faculty of Chemistry, Department of Theoretical Chemistry, University of Vienna, Vienna, Austria
- Ivo Hofacker
  - Faculty of Chemistry, Department of Theoretical Chemistry, University of Vienna, Vienna, Austria
  - Faculty of Computer Science, Bioinformatics and Computational Biology, University of Vienna, Vienna, Austria
- Yann Ponty
  - LIX, CNRS UMR 7161, Ecole Polytechnique, Institut Polytechnique de Paris, France
- Ronny Lorenz
  - Faculty of Chemistry, Department of Theoretical Chemistry, University of Vienna, Vienna, Austria
- Andrea Tanzer
  - Faculty of Chemistry, Department of Theoretical Chemistry, University of Vienna, Vienna, Austria
  - Center for Anatomy and Cell Biology, Division of Cell and Developmental Biology, Medical University of Vienna, Vienna, Austria
4
Deforges J, Reis RS, Jacquet P, Sheppard S, Gadekar VP, Hart-Smith G, Tanzer A, Hofacker IL, Iseli C, Xenarios I, Poirier Y. Control of Cognate Sense mRNA Translation by cis-Natural Antisense RNAs. Plant Physiol 2019; 180:305-322. [PMID: 30760640 PMCID: PMC6501089 DOI: 10.1104/pp.19.00043] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 01/15/2019] [Accepted: 02/03/2019] [Indexed: 05/06/2023]
Abstract
Cis-Natural Antisense Transcripts (cis-NATs), which overlap protein coding genes and are transcribed from the opposite DNA strand, constitute an important group of noncoding RNAs. Whereas several examples of cis-NATs regulating the expression of their cognate sense gene are known, most cis-NATs function by altering the steady-state level or structure of mRNA via changes in transcription, mRNA stability, or splicing, and very few cases involve the regulation of sense mRNA translation. This study was designed to systematically search for cis-NATs influencing cognate sense mRNA translation in Arabidopsis (Arabidopsis thaliana). Establishment of a pipeline relying on sequencing of total polyA+ and polysomal RNA from Arabidopsis grown under various conditions (i.e. nutrient deprivation and phytohormone treatments) allowed the identification of 14 cis-NATs whose expression correlated either positively or negatively with cognate sense mRNA translation. With use of a combination of cis-NAT stable over-expression in transgenic plants and transient expression in protoplasts, the impact of cis-NAT expression on mRNA translation was confirmed for 4 out of 5 tested cis-NAT:sense mRNA pairs. These results expand the number of cis-NATs known to regulate cognate sense mRNA translation and provide a foundation for future studies of their mode of action. Moreover, this study highlights the role of this class of noncoding RNAs in translation regulation.
Affiliation(s)
- Jules Deforges
  - Department of Plant Molecular Biology, University of Lausanne, Biophore Building, CH-1015 Lausanne, Switzerland
- Rodrigo S Reis
  - Department of Plant Molecular Biology, University of Lausanne, Biophore Building, CH-1015 Lausanne, Switzerland
- Philippe Jacquet
  - Department of Plant Molecular Biology, University of Lausanne, Biophore Building, CH-1015 Lausanne, Switzerland
- Shaoline Sheppard
  - Department of Plant Molecular Biology, University of Lausanne, Biophore Building, CH-1015 Lausanne, Switzerland
- Veerendra P Gadekar
  - Institute of Theoretical Chemistry, University of Vienna, Wahringer Str 17, A-1090 Vienna, Austria
- Gene Hart-Smith
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney NSW 2052, Australia
- Andrea Tanzer
  - Institute of Theoretical Chemistry, University of Vienna, Wahringer Str 17, A-1090 Vienna, Austria
- Ivo L Hofacker
  - Institute of Theoretical Chemistry, University of Vienna, Wahringer Str 17, A-1090 Vienna, Austria
- Christian Iseli
  - Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland
- Ioannis Xenarios
  - Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland
- Yves Poirier
  - Department of Plant Molecular Biology, University of Lausanne, Biophore Building, CH-1015 Lausanne, Switzerland
5
Tanzer A, Hofacker IL, Lorenz R. RNA modifications in structure prediction - Status quo and future challenges. Methods 2018; 156:32-39. [PMID: 30385321 DOI: 10.1016/j.ymeth.2018.10.019] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 08/12/2018] [Revised: 10/12/2018] [Accepted: 10/26/2018] [Indexed: 01/01/2023]
Abstract
Chemical modifications of RNA nucleotides change their identity and characteristics and thus alter genetic and structural information encoded in the genomic DNA. tRNA and rRNA are probably the most heavily modified genes, and often depend on derivatization or isomerization of their nucleobases in order to correctly fold into their functional structures. Recent RNomics studies, however, report transcriptome-wide RNA modification and suggest a more general regulation of structuredness of RNAs by this so-called epitranscriptome. Modification seems to require specific substrate structures, which in turn are stabilized or destabilized and thus promote or inhibit refolding events of regulatory RNA structures. In this review, we revisit RNA modifications and the related structures from a computational point of view. We discuss known substrate structures, their properties such as sub-motifs, as well as consequences of modifications on base pairing patterns and possible refolding events. Given that efficient RNA structure prediction methods for canonical base pairs were established several decades ago, we review to what extent these methods allow the inclusion of modified nucleotides to model and study epitranscriptomic effects on RNA structures.
Affiliation(s)
- Andrea Tanzer
  - Department of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Waehringerstrasse 17, 1090 Vienna, Austria
- Ivo L Hofacker
  - Department of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Waehringerstrasse 17, 1090 Vienna, Austria
  - Research Group Bioinformatics and Computational Biology, Faculty of Computer Science, University of Vienna, Waehringerstrasse 29, 1090 Vienna, Austria
- Ronny Lorenz
  - Department of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Waehringerstrasse 17, 1090 Vienna, Austria
6
Thiel BC, Ochsenreiter R, Gadekar VP, Tanzer A, Hofacker IL. RNA Structure Elements Conserved between Mouse and 59 Other Vertebrates. Genes (Basel) 2018; 9:E392. [PMID: 30071678 PMCID: PMC6116022 DOI: 10.3390/genes9080392] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 05/15/2018] [Revised: 07/25/2018] [Accepted: 07/27/2018] [Indexed: 12/24/2022]
Abstract
In this work, we present a computational screen conducted for functional RNA structures, resulting in over 100,000 conserved RNA structure elements found in alignments of mouse (mm10) against 59 other vertebrates. We explicitly included masked repeat regions to explore the potential of transposable elements and low-complexity regions to give rise to regulatory RNA elements. In our analysis pipeline, we implemented a four-step procedure: (i) we screened genome-wide alignments for potential structure elements using RNAz-2, (ii) realigned and refined candidate loci with LocARNA-P, (iii) scored candidates again with RNAz-2 in structure alignment mode, and (iv) searched for additional homologous loci in the mouse genome that were not covered by genome alignments. The 3'-untranslated regions (3'-UTRs) of protein-coding genes and small noncoding RNAs are enriched for structures, while coding sequences are depleted. Repeat-associated loci make up about 95% of the homologous loci identified and are, as expected, predominantly found in intronic and intergenic regions. Nevertheless, we report the structure elements enriched in specific genome elements, such as 3'-UTRs and long noncoding RNAs (lncRNAs). We provide full access to our results via a custom UCSC genome browser trackhub freely available on our website (http://rna.tbi.univie.ac.at/trackhubs/#RNAz).
Affiliation(s)
- Bernhard C Thiel
  - Department of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Währingerstraße 17, 1090 Wien, Austria.
- Roman Ochsenreiter
  - Department of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Währingerstraße 17, 1090 Wien, Austria.
- Veerendra P Gadekar
  - Department of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Währingerstraße 17, 1090 Wien, Austria.
- Andrea Tanzer
  - Department of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Währingerstraße 17, 1090 Wien, Austria.
- Ivo L Hofacker
  - Department of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Währingerstraße 17, 1090 Wien, Austria.
  - Research Group Bioinformatics and Computational Biology, Faculty of Computer Science, University of Vienna, Währingerstraße 29, 1090 Wien, Austria.
7
Tajaddod M, Tanzer A, Licht K, Wolfinger MT, Badelt S, Huber F, Pusch O, Schopoff S, Janisiw M, Hofacker I, Jantsch MF. Transcriptome-wide effects of inverted SINEs on gene expression and their impact on RNA polymerase II activity. Genome Biol 2016; 17:220. [PMID: 27782844 PMCID: PMC5080714 DOI: 10.1186/s13059-016-1083-0] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 03/21/2016] [Accepted: 10/10/2016] [Indexed: 01/23/2023]
Abstract
Background: Short interspersed elements (SINEs) represent the most abundant group of non-long-terminal repeat transposable elements in mammalian genomes. In primates, Alu elements are the most prominent and homogeneous representatives of SINEs. Due to their frequent insertion within or close to coding regions, SINEs have been suggested to play a crucial role during genome evolution. Moreover, Alu elements within mRNAs have also been reported to control gene expression at different levels.
Results: Here, we undertake a genome-wide analysis of insertion patterns of human Alus within transcribed portions of the genome. Multiple, nearby insertions of SINEs within one transcript are more abundant in tandem orientation than in inverted orientation. Indeed, analysis of transcriptome-wide expression levels of 15 ENCODE cell lines suggests a cis-repressive effect of inverted Alu elements on gene expression. Using reporter assays, we show that the negative effect of inverted SINEs on gene expression is independent of known sensors of double-stranded RNAs. Instead, transcriptional elongation seems impaired, leading to reduced mRNA levels.
Conclusions: Our study suggests that there is a bias against multiple SINE insertions that can promote intramolecular base pairing within a transcript. Moreover, at a genome-wide level, mRNAs harboring inverted SINEs are less expressed than mRNAs harboring single or tandemly arranged SINEs. Finally, we demonstrate a novel mechanism by which inverted SINEs can impact gene expression by interfering with RNA polymerase II.
Affiliation(s)
- Mansoureh Tajaddod
  - Department of Chromosome Biology, Max F. Perutz Laboratories, University of Vienna, Dr. Bohr Gasse 9/5, Vienna, A-1030, Austria
- Andrea Tanzer
  - Institute for Theoretical Chemistry, University of Vienna, Währinger Strasse 17, Vienna, A-1090, Austria
- Konstantin Licht
  - Department of Cell and Developmental Biology, Medical University of Vienna, Schwarzspanierstrasse 17, Vienna, A-1090, Austria
- Michael T Wolfinger
  - Department of Cell and Developmental Biology, Medical University of Vienna, Schwarzspanierstrasse 17, Vienna, A-1090, Austria
  - Institute for Theoretical Chemistry, University of Vienna, Währinger Strasse 17, Vienna, A-1090, Austria
- Stefan Badelt
  - Institute for Theoretical Chemistry, University of Vienna, Währinger Strasse 17, Vienna, A-1090, Austria
- Florian Huber
  - Department of Chromosome Biology, Max F. Perutz Laboratories, University of Vienna, Dr. Bohr Gasse 9/5, Vienna, A-1030, Austria
  - Present address: Center for molecular biology of the University Heidelberg, Im Neuenheimer Feld 282, Heidelberg, D-69120, Germany
- Oliver Pusch
  - Department of Cell and Developmental Biology, Medical University of Vienna, Schwarzspanierstrasse 17, Vienna, A-1090, Austria
- Sandy Schopoff
  - Department of Chromosome Biology, Max F. Perutz Laboratories, University of Vienna, Dr. Bohr Gasse 9/5, Vienna, A-1030, Austria
- Michael Janisiw
  - Department of Cell and Developmental Biology, Medical University of Vienna, Schwarzspanierstrasse 17, Vienna, A-1090, Austria
- Ivo Hofacker
  - Institute for Theoretical Chemistry, University of Vienna, Währinger Strasse 17, Vienna, A-1090, Austria
- Michael F Jantsch
  - Department of Cell and Developmental Biology, Medical University of Vienna, Schwarzspanierstrasse 17, Vienna, A-1090, Austria.
  - Department of Cell and Developmental Biology, Medical University of Vienna, Center of Anatomy and Cell Biology, Schwarzspanierstrasse 17, Vienna, A-1090, Austria.
8
Lagarde J, Uszczynska-Ratajczak B, Santoyo-Lopez J, Gonzalez JM, Tapanari E, Mudge JM, Steward CA, Wilming L, Tanzer A, Howald C, Chrast J, Vela-Boza A, Rueda A, Lopez-Domingo FJ, Dopazo J, Reymond A, Guigó R, Harrow J. Extension of human lncRNA transcripts by RACE coupled with long-read high-throughput sequencing (RACE-Seq). Nat Commun 2016; 7:12339. [PMID: 27531712 PMCID: PMC4992054 DOI: 10.1038/ncomms12339] [Citation(s) in RCA: 53] [Impact Index Per Article: 6.6] [Received: 04/20/2016] [Accepted: 06/23/2016] [Indexed: 12/22/2022]
Abstract
Long non-coding RNAs (lncRNAs) constitute a large, yet mostly uncharacterized fraction of the mammalian transcriptome. Such characterization requires a comprehensive, high-quality annotation of their gene structure and boundaries, which is currently lacking. Here we describe RACE-Seq, an experimental workflow designed to address this based on RACE (rapid amplification of cDNA ends) and long-read RNA sequencing. We apply RACE-Seq to 398 human lncRNA genes in seven tissues, leading to the discovery of 2,556 on-target, novel transcripts. About 60% of the targeted loci are extended in either 5′ or 3′, often reaching genomic hallmarks of gene boundaries. Analysis of the novel transcripts suggests that lncRNAs are as long, have as many exons and undergo as much alternative splicing as protein-coding genes, contrary to current assumptions. Overall, we show that RACE-Seq is an effective tool to annotate an organism's deep transcriptome, and that it compares favourably to other targeted sequencing techniques.
Long non-coding RNAs are increasingly recognised to be important factors in regulating cellular processes and comprise a large fraction of the transcriptome; however, most are uncharacterised. Here the authors present RACE-Seq, a tool to improve and extend the annotation of low-expression transcripts.
Affiliation(s)
- Julien Lagarde
  - Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Dr Aiguader 88, 08003 Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Barbara Uszczynska-Ratajczak
  - Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Dr Aiguader 88, 08003 Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Electra Tapanari
  - Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire CB10 1HH, UK
- Jonathan M Mudge
  - Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire CB10 1HH, UK
- Charles A Steward
  - Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire CB10 1HH, UK
- Laurens Wilming
  - Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire CB10 1HH, UK
- Andrea Tanzer
  - Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Dr Aiguader 88, 08003 Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Cédric Howald
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
- Jacqueline Chrast
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
- Alicia Vela-Boza
  - Genomics and Bioinformatics Platform of Andalusia (GBPA), 41092 Seville, Spain
  - Roche Diagnostics, 08174 Sant Cugat Del Vallès, Barcelona, Spain
- Antonio Rueda
  - Genomics and Bioinformatics Platform of Andalusia (GBPA), 41092 Seville, Spain
- Joaquin Dopazo
  - Genomics and Bioinformatics Platform of Andalusia (GBPA), 41092 Seville, Spain
  - Computational Genomics Department, Centro de Investigación Príncipe Felipe, 46012 Valencia, Spain
  - Functional Genomics Node (INB), Centro de Investigación Príncipe Felipe, 46012 Valencia, Spain
- Alexandre Reymond
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
- Roderic Guigó
  - Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Dr Aiguader 88, 08003 Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Jennifer Harrow
  - Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire CB10 1HH, UK
9
Sedlyarov V, Fallmann J, Ebner F, Huemer J, Sneezum L, Ivin M, Kreiner K, Tanzer A, Vogl C, Hofacker I, Kovarik P. Tristetraprolin binding site atlas in the macrophage transcriptome reveals a switch for inflammation resolution. Mol Syst Biol 2016; 12:868. [PMID: 27178967 PMCID: PMC4988506 DOI: 10.15252/msb.20156628] [Citation(s) in RCA: 56] [Impact Index Per Article: 7.0] [Indexed: 11/09/2022]
Abstract
Precise regulation of mRNA decay is fundamental for robust yet not exaggerated inflammatory responses to pathogens. However, a global model integrating regulation and functional consequences of inflammation‐associated mRNA decay remains to be established. Using time‐resolved high‐resolution RNA binding analysis of the mRNA‐destabilizing protein tristetraprolin (TTP), an inflammation‐limiting factor, we qualitatively and quantitatively characterize TTP binding positions in the transcriptome of immunostimulated macrophages. We identify pervasive destabilizing and non‐destabilizing TTP binding, including a robust intronic binding, showing that TTP binding is not sufficient for mRNA destabilization. A low degree of flanking RNA structuredness distinguishes occupied from silent binding motifs. By functionally relating TTP binding sites to mRNA stability and levels, we identify a TTP‐controlled switch for the transition from inflammatory into the resolution phase of the macrophage immune response. Mapping of binding positions of the mRNA‐stabilizing protein HuR reveals little target and functional overlap with TTP, implying a limited co‐regulation of inflammatory mRNA decay by these proteins. Our study establishes a functionally annotated and navigable transcriptome‐wide atlas (http://ttp-atlas.univie.ac.at) of cis‐acting elements controlling mRNA decay in inflammation.
Affiliation(s)
- Vitaly Sedlyarov
- Max F. Perutz Laboratories, University of Vienna, Vienna, Austria
| | - Jörg Fallmann
- Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
| | - Florian Ebner
- Max F. Perutz Laboratories, University of Vienna, Vienna, Austria
| | - Jakob Huemer
- Max F. Perutz Laboratories, University of Vienna, Vienna, Austria
| | - Lucy Sneezum
- Max F. Perutz Laboratories, University of Vienna, Vienna, Austria
| | - Masa Ivin
- Max F. Perutz Laboratories, University of Vienna, Vienna, Austria
| | - Kristina Kreiner
- Max F. Perutz Laboratories, University of Vienna, Vienna, Austria
| | - Andrea Tanzer
- Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
| | - Claus Vogl
- Institute of Animal Breeding and Genetics, University of Veterinary Medicine Vienna, Vienna, Austria
| | - Ivo Hofacker
- Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria; Research Group Bioinformatics and Computational Biology, Faculty of Computer Science, University of Vienna, Vienna, Austria; Center for non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg C, Denmark
| | - Pavel Kovarik
- Max F. Perutz Laboratories, University of Vienna, Vienna, Austria
|
10
|
Lorenz R, Wolfinger MT, Tanzer A, Hofacker IL. Predicting RNA secondary structures from sequence and probing data. Methods 2016; 103:86-98. [PMID: 27064083 DOI: 10.1016/j.ymeth.2016.04.004] [Citation(s) in RCA: 66] [Impact Index Per Article: 8.3] [Received: 12/07/2015] [Revised: 03/29/2016] [Accepted: 04/04/2016] [Indexed: 01/08/2023] Open
Abstract
RNA secondary structures have proven essential for understanding the regulatory functions performed by RNA such as microRNAs, bacterial small RNAs, or riboswitches. This success is in part due to the availability of efficient computational methods for predicting RNA secondary structures. Recent advances focus on dealing with the inherent uncertainty of prediction by considering the ensemble of possible structures rather than the single most stable one. Moreover, the advent of high-throughput structural probing has spurred the development of computational methods that incorporate such experimental data as auxiliary information.
Affiliation(s)
- Ronny Lorenz
- University of Vienna, Faculty of Chemistry, Department of Theoretical Chemistry, Währingerstrasse 17, 1090 Vienna, Austria.
| | - Michael T Wolfinger
- University of Vienna, Faculty of Chemistry, Department of Theoretical Chemistry, Währingerstrasse 17, 1090 Vienna, Austria; Medical University of Vienna, Center for Anatomy and Cell Biology, Währingerstraße 13, 1090 Vienna, Austria.
| | - Andrea Tanzer
- University of Vienna, Faculty of Chemistry, Department of Theoretical Chemistry, Währingerstrasse 17, 1090 Vienna, Austria.
| | - Ivo L Hofacker
- University of Vienna, Faculty of Chemistry, Department of Theoretical Chemistry, Währingerstrasse 17, 1090 Vienna, Austria; University of Vienna, Faculty of Computer Science, Research Group Bioinformatics and Computational Biology, Währingerstr. 29, 1090 Vienna, Austria.
|
11
|
Fallmann J, Sedlyarov V, Tanzer A, Kovarik P, Hofacker IL. AREsite2: an enhanced database for the comprehensive investigation of AU/GU/U-rich elements. Nucleic Acids Res 2015; 44:D90-5. [PMID: 26602692 PMCID: PMC4702876 DOI: 10.1093/nar/gkv1238] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.7] [Received: 09/15/2015] [Accepted: 11/02/2015] [Indexed: 12/31/2022] Open
Abstract
AREsite2 represents an update for AREsite, an on-line resource for the investigation of AU-rich elements (ARE) in human and mouse mRNA 3′UTR sequences. The new updated and enhanced version allows detailed investigation of AU, GU and U-rich elements (ARE, GRE, URE) in the transcriptome of Homo sapiens, Mus musculus, Danio rerio, Caenorhabditis elegans and Drosophila melanogaster. It contains information on genomic location, genic context, RNA secondary structure context and conservation of annotated motifs. Improvements include annotation of motifs not only in 3′UTRs but in the whole gene body including introns, additional genomes, and locally stable secondary structures from genome wide scans. Furthermore, we include data from CLIP-Seq experiments in order to highlight motifs with validated protein interaction. Additionally, we provide a REST interface for experienced users to interact with the database in a semi-automated manner. The database is publicly available at: http://rna.tbi.univie.ac.at/AREsite
Affiliation(s)
- Jörg Fallmann
- Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17/3, A-1090 Vienna, Austria
| | - Vitaly Sedlyarov
- Max F. Perutz Laboratories, University of Vienna, Dr. Bohr-Gasse 9, A-1030 Vienna, Austria
| | - Andrea Tanzer
- Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17/3, A-1090 Vienna, Austria
| | - Pavel Kovarik
- Max F. Perutz Laboratories, University of Vienna, Dr. Bohr-Gasse 9, A-1030 Vienna, Austria
| | - Ivo L Hofacker
- Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17/3, A-1090 Vienna, Austria; Research Group Bioinformatics and Computational Biology, Faculty of Computer Science, University of Vienna, Währingerstraße 29, A-1090 Vienna, Austria; Center for non-coding RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark
|
12
|
Yue F, Cheng Y, Breschi A, Vierstra J, Wu W, Ryba T, Sandstrom R, Ma Z, Davis C, Pope BD, Shen Y, Pervouchine DD, Djebali S, Thurman RE, Kaul R, Rynes E, Kirilusha A, Marinov GK, Williams BA, Trout D, Amrhein H, Fisher-Aylor K, Antoshechkin I, DeSalvo G, See LH, Fastuca M, Drenkow J, Zaleski C, Dobin A, Prieto P, Lagarde J, Bussotti G, Tanzer A, Denas O, Li K, Bender MA, Zhang M, Byron R, Groudine MT, McCleary D, Pham L, Ye Z, Kuan S, Edsall L, Wu YC, Rasmussen MD, Bansal MS, Kellis M, Keller CA, Morrissey CS, Mishra T, Jain D, Dogan N, Harris RS, Cayting P, Kawli T, Boyle AP, Euskirchen G, Kundaje A, Lin S, Lin Y, Jansen C, Malladi VS, Cline MS, Erickson DT, Kirkup VM, Learned K, Sloan CA, Rosenbloom KR, Lacerda de Sousa B, Beal K, Pignatelli M, Flicek P, Lian J, Kahveci T, Lee D, Kent WJ, Ramalho Santos M, Herrero J, Notredame C, Johnson A, Vong S, Lee K, Bates D, Neri F, Diegel M, Canfield T, Sabo PJ, Wilken MS, Reh TA, Giste E, Shafer A, Kutyavin T, Haugen E, Dunn D, Reynolds AP, Neph S, Humbert R, Hansen RS, De Bruijn M, Selleri L, Rudensky A, Josefowicz S, Samstein R, Eichler EE, Orkin SH, Levasseur D, Papayannopoulou T, Chang KH, Skoultchi A, Gosh S, Disteche C, Treuting P, Wang Y, Weiss MJ, Blobel GA, Cao X, Zhong S, Wang T, Good PJ, Lowdon RF, Adams LB, Zhou XQ, Pazin MJ, Feingold EA, Wold B, Taylor J, Mortazavi A, Weissman SM, Stamatoyannopoulos JA, Snyder MP, Guigo R, Gingeras TR, Gilbert DM, Hardison RC, Beer MA, Ren B. A comparative encyclopedia of DNA elements in the mouse genome. Nature 2014; 515:355-64. [PMID: 25409824 PMCID: PMC4266106 DOI: 10.1038/nature13992] [Citation(s) in RCA: 1135] [Impact Index Per Article: 126.1] [Received: 02/03/2014] [Accepted: 10/24/2014] [Indexed: 12/11/2022]
Abstract
The laboratory mouse shares the majority of its protein-coding genes with humans, making it the premier model organism in biomedical research, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases.
Affiliation(s)
- Feng Yue
- [1] Ludwig Institute for Cancer Research and University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA. [2] Department of Biochemistry and Molecular Biology, College of Medicine, The Pennsylvania State University, Hershey, Pennsylvania 17033, USA
| | - Yong Cheng
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Alessandra Breschi
- Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain
| | - Jeff Vierstra
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Weisheng Wu
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Tyrone Ryba
- Department of Biological Science, 319 Stadium Drive, Florida State University, Tallahassee, Florida 32306-4295, USA
| | - Richard Sandstrom
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Zhihai Ma
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Carrie Davis
- Functional Genomics, Cold Spring Harbor Laboratory, Bungtown Road, Cold Spring Harbor, New York 11724, USA
| | - Benjamin D Pope
- Department of Biological Science, 319 Stadium Drive, Florida State University, Tallahassee, Florida 32306-4295, USA
| | - Yin Shen
- Ludwig Institute for Cancer Research and University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA
| | - Dmitri D Pervouchine
- Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain
| | - Sarah Djebali
- Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain
| | - Robert E Thurman
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Rajinder Kaul
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Eric Rynes
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Anthony Kirilusha
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - Georgi K Marinov
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - Brian A Williams
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - Diane Trout
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - Henry Amrhein
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - Katherine Fisher-Aylor
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - Igor Antoshechkin
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - Gilberto DeSalvo
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - Lei-Hoon See
- Functional Genomics, Cold Spring Harbor Laboratory, Bungtown Road, Cold Spring Harbor, New York 11724, USA
| | - Meagan Fastuca
- Functional Genomics, Cold Spring Harbor Laboratory, Bungtown Road, Cold Spring Harbor, New York 11724, USA
| | - Jorg Drenkow
- Functional Genomics, Cold Spring Harbor Laboratory, Bungtown Road, Cold Spring Harbor, New York 11724, USA
| | - Chris Zaleski
- Functional Genomics, Cold Spring Harbor Laboratory, Bungtown Road, Cold Spring Harbor, New York 11724, USA
| | - Alex Dobin
- Functional Genomics, Cold Spring Harbor Laboratory, Bungtown Road, Cold Spring Harbor, New York 11724, USA
| | - Pablo Prieto
- Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain
| | - Julien Lagarde
- Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain
| | - Giovanni Bussotti
- Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain
| | - Andrea Tanzer
- [1] Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain. [2] Department of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Waehringerstrasse 17/3/303, A-1090 Vienna, Austria
| | - Olgert Denas
- Departments of Biology and Mathematics and Computer Science, Emory University, O. Wayne Rollins Research Center, 1510 Clifton Road NE, Atlanta, Georgia 30322, USA
| | - Kanwei Li
- Departments of Biology and Mathematics and Computer Science, Emory University, O. Wayne Rollins Research Center, 1510 Clifton Road NE, Atlanta, Georgia 30322, USA
| | - M A Bender
- [1] Department of Pediatrics, University of Washington, Seattle, Washington 98195, USA. [2] Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA
| | - Miaohua Zhang
- Basic Science Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA
| | - Rachel Byron
- Basic Science Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA
| | - Mark T Groudine
- [1] Basic Science Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA. [2] Department of Radiation Oncology, University of Washington, Seattle, Washington 98195, USA
| | - David McCleary
- Ludwig Institute for Cancer Research and University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA
| | - Long Pham
- Ludwig Institute for Cancer Research and University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA
| | - Zhen Ye
- Ludwig Institute for Cancer Research and University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA
| | - Samantha Kuan
- Ludwig Institute for Cancer Research and University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA
| | - Lee Edsall
- Ludwig Institute for Cancer Research and University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA
| | - Yi-Chieh Wu
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts 02139, USA
| | - Matthew D Rasmussen
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts 02139, USA
| | - Mukul S Bansal
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts 02139, USA
| | - Manolis Kellis
- [1] Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts 02139, USA. [2] Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
| | - Cheryl A Keller
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Christapher S Morrissey
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Tejaswini Mishra
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Deepti Jain
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Nergiz Dogan
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Robert S Harris
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Philip Cayting
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Trupti Kawli
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Alan P Boyle
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Ghia Euskirchen
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Anshul Kundaje
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Shin Lin
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Yiing Lin
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Camden Jansen
- Department of Developmental and Cell Biology, University of California, Irvine, Irvine, California 92697, USA
| | - Venkat S Malladi
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Melissa S Cline
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, California 95064, USA
| | - Drew T Erickson
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Vanessa M Kirkup
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, California 95064, USA
| | - Katrina Learned
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, California 95064, USA
| | - Cricket A Sloan
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Kate R Rosenbloom
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, California 95064, USA
| | - Beatriz Lacerda de Sousa
- Departments of Obstetrics/Gynecology and Pathology, and Center for Reproductive Sciences, University of California San Francisco, San Francisco, California 94143, USA
| | - Kathryn Beal
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Miguel Pignatelli
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Jin Lian
- Yale University, Department of Genetics, PO Box 208005, 333 Cedar Street, New Haven, Connecticut 06520-8005, USA
| | - Tamer Kahveci
- Computer & Information Sciences & Engineering, University of Florida, Gainesville, Florida 32611, USA
| | - Dongwon Lee
- McKusick-Nathans Institute of Genetic Medicine and Department of Biomedical Engineering, Johns Hopkins University, 733 N. Broadway, BRB 573 Baltimore, Maryland 21205, USA
| | - W James Kent
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, California 95064, USA
| | - Miguel Ramalho Santos
- Departments of Obstetrics/Gynecology and Pathology, and Center for Reproductive Sciences, University of California San Francisco, San Francisco, California 94143, USA
| | - Javier Herrero
- [1] European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK. [2] Bill Lyons Informatics Centre, UCL Cancer Institute, University College London, London WC1E 6DD, UK
| | - Cedric Notredame
- Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain
| | - Audra Johnson
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Shinny Vong
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Kristen Lee
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Daniel Bates
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Fidencio Neri
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Morgan Diegel
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Theresa Canfield
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Peter J Sabo
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Matthew S Wilken
- Department of Biological Structure, University of Washington, HSB I-516, 1959 NE Pacific Street, Seattle, Washington 98195, USA
| | - Thomas A Reh
- Department of Biological Structure, University of Washington, HSB I-516, 1959 NE Pacific Street, Seattle, Washington 98195, USA
| | - Erika Giste
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Anthony Shafer
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Tanya Kutyavin
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Eric Haugen
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Douglas Dunn
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Alex P Reynolds
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Shane Neph
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Richard Humbert
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - R Scott Hansen
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Marella De Bruijn
- MRC Molecular Haematology Unit, University of Oxford, Oxford OX3 9DS, UK
| | - Licia Selleri
- Department of Cell and Developmental Biology, Weill Cornell Medical College, New York, New York 10065, USA
| | - Alexander Rudensky
- HHMI and Ludwig Center at Memorial Sloan Kettering Cancer Center, Immunology Program, Memorial Sloan Kettering Cancer Center, New York, New York 10065, USA
| | - Steven Josefowicz
- HHMI and Ludwig Center at Memorial Sloan Kettering Cancer Center, Immunology Program, Memorial Sloan Kettering Cancer Center, New York, New York 10065, USA
| | - Robert Samstein
- HHMI and Ludwig Center at Memorial Sloan Kettering Cancer Center, Immunology Program, Memorial Sloan Kettering Cancer Center, New York, New York 10065, USA
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Stuart H Orkin
- Dana Farber Cancer Institute, Harvard Medical School, Cambridge, Massachusetts 02138, USA
| | - Dana Levasseur
- University of Iowa Carver College of Medicine, Department of Internal Medicine, Iowa City, Iowa 52242, USA
| | - Thalia Papayannopoulou
- Division of Hematology, Department of Medicine, University of Washington, Seattle, Washington 98195, USA
| | - Kai-Hsin Chang
- University of Iowa Carver College of Medicine, Department of Internal Medicine, Iowa City, Iowa 52242, USA
| | - Arthur Skoultchi
- Department of Cell Biology, Albert Einstein College of Medicine, Bronx, New York 10461, USA
| | - Srikanta Gosh
- Department of Cell Biology, Albert Einstein College of Medicine, Bronx, New York 10461, USA
| | - Christine Disteche
- Department of Pathology, University of Washington, Seattle, Washington 98195, USA
| | - Piper Treuting
- Department of Comparative Medicine, University of Washington, Seattle, Washington 98195, USA
| | - Yanli Wang
- Bioinformatics and Genomics program, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Mitchell J Weiss
- Department of Hematology, St Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
| | - Gerd A Blobel
- [1] Division of Hematology, The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA. [2] Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | - Xiaoyi Cao
- Department of Bioengineering, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093, USA
| | - Sheng Zhong
- Department of Bioengineering, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093, USA
| | - Ting Wang
- Department of Genetics, Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63108, USA
| | - Peter J Good
- NHGRI, National Institutes of Health, 5635 Fishers Lane, Bethesda, Maryland 20892-9307, USA
| | - Rebecca F Lowdon
- NHGRI, National Institutes of Health, 5635 Fishers Lane, Bethesda, Maryland 20892-9307, USA
| | - Leslie B Adams
- NHGRI, National Institutes of Health, 5635 Fishers Lane, Bethesda, Maryland 20892-9307, USA
| | - Xiao-Qiao Zhou
- NHGRI, National Institutes of Health, 5635 Fishers Lane, Bethesda, Maryland 20892-9307, USA
| | - Michael J Pazin
- NHGRI, National Institutes of Health, 5635 Fishers Lane, Bethesda, Maryland 20892-9307, USA
| | - Elise A Feingold
- NHGRI, National Institutes of Health, 5635 Fishers Lane, Bethesda, Maryland 20892-9307, USA
| | - Barbara Wold
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - James Taylor
- Departments of Biology and Mathematics and Computer Science, Emory University, O. Wayne Rollins Research Center, 1510 Clifton Road NE, Atlanta, Georgia 30322, USA
| | - Ali Mortazavi
- Department of Developmental and Cell Biology, University of California, Irvine, Irvine, California 92697, USA
| | - Sherman M Weissman
- Yale University, Department of Genetics, PO Box 208005, 333 Cedar Street, New Haven, Connecticut 06520-8005, USA
| | - Michael P Snyder
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Roderic Guigo
- Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain
| | - Thomas R Gingeras
- Functional Genomics, Cold Spring Harbor Laboratory, Bungtown Road, Cold Spring Harbor, New York 11724, USA
| | - David M Gilbert
- Department of Biological Science, 319 Stadium Drive, Florida State University, Tallahassee, Florida 32306-4295, USA
| | - Ross C Hardison
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Michael A Beer
- McKusick-Nathans Institute of Genetic Medicine and Department of Biomedical Engineering, Johns Hopkins University, 733 N. Broadway, BRB 573 Baltimore, Maryland 21205, USA
| | - Bing Ren
- Ludwig Institute for Cancer Research and University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA
|
13
|
Gerstein MB, Rozowsky J, Yan KK, Wang D, Cheng C, Brown JB, Davis CA, Hillier L, Sisu C, Li JJ, Pei B, Harmanci AO, Duff MO, Djebali S, Alexander RP, Alver BH, Auerbach R, Bell K, Bickel PJ, Boeck ME, Boley NP, Booth BW, Cherbas L, Cherbas P, Di C, Dobin A, Drenkow J, Ewing B, Fang G, Fastuca M, Feingold EA, Frankish A, Gao G, Good PJ, Guigó R, Hammonds A, Harrow J, Hoskins RA, Howald C, Hu L, Huang H, Hubbard TJP, Huynh C, Jha S, Kasper D, Kato M, Kaufman TC, Kitchen RR, Ladewig E, Lagarde J, Lai E, Leng J, Lu Z, MacCoss M, May G, McWhirter R, Merrihew G, Miller DM, Mortazavi A, Murad R, Oliver B, Olson S, Park PJ, Pazin MJ, Perrimon N, Pervouchine D, Reinke V, Reymond A, Robinson G, Samsonova A, Saunders GI, Schlesinger F, Sethi A, Slack FJ, Spencer WC, Stoiber MH, Strasbourger P, Tanzer A, Thompson OA, Wan KH, Wang G, Wang H, Watkins KL, Wen J, Wen K, Xue C, Yang L, Yip K, Zaleski C, Zhang Y, Zheng H, Brenner SE, Graveley BR, Celniker SE, Gingeras TR, Waterston R. Comparative analysis of the transcriptome across distant species. Nature 2014; 512:445-8. [PMID: 25164755 PMCID: PMC4155737 DOI: 10.1038/nature13424] [Citation(s) in RCA: 239] [Impact Index Per Article: 23.9] [Received: 04/10/2013] [Accepted: 04/30/2014] [Indexed: 12/30/2022]
Abstract
The transcriptome is the readout of the genome. Identifying common features in it across distant species can reveal fundamental principles. To this end, the ENCODE and modENCODE consortia have generated large amounts of matched RNA-sequencing data for human, worm and fly. Uniform processing and comprehensive annotation of these data allow comparison across metazoan phyla, extending beyond earlier within-phylum transcriptome comparisons and revealing ancient, conserved features. Specifically, we discover co-expression modules shared across animals, many of which are enriched in developmental genes. Moreover, we use expression patterns to align the stages in worm and fly development and find a novel pairing between worm embryo and fly pupae, in addition to the embryo-to-embryo and larvae-to-larvae pairings. Furthermore, we find that the extent of non-canonical, non-coding transcription is similar in each organism, per base pair. Finally, we find in all three organisms that the gene-expression levels, both coding and non-coding, can be quantitatively predicted from chromatin features at the promoter using a 'universal model' based on a single set of organism-independent parameters.
Affiliation(s)
- Mark B Gerstein
  - [1] Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA [2] Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA [3] Department of Computer Science, Yale University, 51 Prospect Street, New Haven, Connecticut 06511, USA
- Joel Rozowsky
  - [1] Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA [2] Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Koon-Kiu Yan
  - [1] Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA [2] Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Daifeng Wang
  - [1] Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA [2] Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Chao Cheng
  - [1] Department of Genetics, Geisel School of Medicine at Dartmouth, Hanover, New Hampshire 03755, USA [2] Institute for Quantitative Biomedical Sciences, Norris Cotton Cancer Center, Geisel School of Medicine at Dartmouth, Lebanon, New Hampshire 03766, USA
- James B Brown
  - [1] Department of Genome Dynamics, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA [2] Department of Statistics, University of California, Berkeley, 367 Evans Hall, Berkeley, California 94720-3860, USA
- Carrie A Davis
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- LaDeana Hillier
  - Department of Genome Sciences and University of Washington School of Medicine, William H. Foege Building S350D, 1705 Northeast Pacific Street, Box 355065 Seattle, Washington 98195-5065, USA
- Cristina Sisu
  - [1] Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA [2] Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Jingyi Jessica Li
  - [1] Department of Statistics, University of California, Berkeley, 367 Evans Hall, Berkeley, California 94720-3860, USA [2] Department of Statistics, University of California, Los Angeles, California 90095-1554, USA [3] Department of Human Genetics, University of California, Los Angeles, California 90095-7088, USA
- Baikang Pei
  - [1] Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA [2] Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Arif O Harmanci
  - [1] Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA [2] Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Michael O Duff
  - Department of Genetics and Developmental Biology, Institute for Systems Genomics, University of Connecticut Health Center, 400 Farmington Avenue, Farmington, Connecticut 06030, USA
- Sarah Djebali
  - [1] Centre for Genomic Regulation, Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain [2] Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
- Roger P Alexander
  - [1] Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA [2] Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Burak H Alver
  - Center for Biomedical Informatics, Harvard Medical School, 10 Shattuck Street, Boston, Massachusetts 02115, USA
- Raymond Auerbach
  - [1] Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA [2] Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Kimberly Bell
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Peter J Bickel
  - Department of Statistics, University of California, Berkeley, 367 Evans Hall, Berkeley, California 94720-3860, USA
- Max E Boeck
  - Department of Genome Sciences and University of Washington School of Medicine, William H. Foege Building S350D, 1705 Northeast Pacific Street, Box 355065 Seattle, Washington 98195-5065, USA
- Nathan P Boley
  - [1] Department of Genome Dynamics, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA [2] Department of Biostatistics, University of California, Berkeley, 367 Evans Hall, Berkeley, California 94720-3860, USA
- Benjamin W Booth
  - Department of Genome Dynamics, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- Lucy Cherbas
  - [1] Department of Biology, Indiana University, 1001 East 3rd Street, Bloomington, Indiana 47405-7005, USA [2] Center for Genomics and Bioinformatics, Indiana University, 1001 East 3rd Street, Bloomington, Indiana 47405-7005, USA
- Peter Cherbas
  - [1] Department of Biology, Indiana University, 1001 East 3rd Street, Bloomington, Indiana 47405-7005, USA [2] Center for Genomics and Bioinformatics, Indiana University, 1001 East 3rd Street, Bloomington, Indiana 47405-7005, USA
- Chao Di
  - MOE Key Lab of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Alex Dobin
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Jorg Drenkow
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Brent Ewing
  - Department of Genome Sciences and University of Washington School of Medicine, William H. Foege Building S350D, 1705 Northeast Pacific Street, Box 355065 Seattle, Washington 98195-5065, USA
- Gang Fang
  - [1] Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA [2] Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Megan Fastuca
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Elise A Feingold
  - National Human Genome Research Institute, National Institutes of Health, 5635 Fishers Lane, Bethesda, Maryland 20892-9307, USA
- Adam Frankish
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Guanjun Gao
  - MOE Key Lab of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Peter J Good
  - National Human Genome Research Institute, National Institutes of Health, 5635 Fishers Lane, Bethesda, Maryland 20892-9307, USA
- Roderic Guigó
  - [1] Centre for Genomic Regulation, Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain [2] Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
- Ann Hammonds
  - Department of Genome Dynamics, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- Jen Harrow
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Roger A Hoskins
  - Department of Genome Dynamics, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- Cédric Howald
  - [1] Center for Integrative Genomics, University of Lausanne, Genopode building, Lausanne 1015, Switzerland [2] Swiss Institute of Bioinformatics, Genopode building, Lausanne 1015, Switzerland
- Long Hu
  - MOE Key Lab of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Haiyan Huang
  - Department of Statistics, University of California, Berkeley, 367 Evans Hall, Berkeley, California 94720-3860, USA
- Tim J P Hubbard
  - [1] Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK [2] Medical and Molecular Genetics, King's College London, London WC2R 2LS, UK
- Chau Huynh
  - Department of Genome Sciences and University of Washington School of Medicine, William H. Foege Building S350D, 1705 Northeast Pacific Street, Box 355065 Seattle, Washington 98195-5065, USA
- Sonali Jha
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Dionna Kasper
  - Department of Genetics, Yale University School of Medicine, New Haven, Connecticut 06520-8005, USA
- Masaomi Kato
  - Department of Molecular, Cellular and Developmental Biology, PO Box 208103, Yale University, New Haven, Connecticut 06520, USA
- Thomas C Kaufman
  - Department of Biology, Indiana University, 1001 East 3rd Street, Bloomington, Indiana 47405-7005, USA
- Robert R Kitchen
  - [1] Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA [2] Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Erik Ladewig
  - Sloan-Kettering Institute, 1275 York Avenue, Box 252, New York, New York 10065, USA
- Julien Lagarde
  - [1] Centre for Genomic Regulation, Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain [2] Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
- Eric Lai
  - Sloan-Kettering Institute, 1275 York Avenue, Box 252, New York, New York 10065, USA
- Jing Leng
  - [1] Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA [2] Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Zhi Lu
  - MOE Key Lab of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Michael MacCoss
  - Department of Genome Sciences and University of Washington School of Medicine, William H. Foege Building S350D, 1705 Northeast Pacific Street, Box 355065 Seattle, Washington 98195-5065, USA
- Gemma May
  - [1] Department of Genetics and Developmental Biology, Institute for Systems Genomics, University of Connecticut Health Center, 400 Farmington Avenue, Farmington, Connecticut 06030, USA [2] Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
- Rebecca McWhirter
  - Department of Cell and Developmental Biology, Vanderbilt University, 465 21st Avenue South, Nashville, Tennessee 37232-8240, USA
- Gennifer Merrihew
  - Department of Genome Sciences and University of Washington School of Medicine, William H. Foege Building S350D, 1705 Northeast Pacific Street, Box 355065 Seattle, Washington 98195-5065, USA
- David M Miller
  - Department of Cell and Developmental Biology, Vanderbilt University, 465 21st Avenue South, Nashville, Tennessee 37232-8240, USA
- Ali Mortazavi
  - [1] Developmental and Cell Biology, University of California, Irvine, California 92697, USA [2] Center for Complex Biological Systems, University of California, Irvine, California 92697, USA
- Rabi Murad
  - [1] Developmental and Cell Biology, University of California, Irvine, California 92697, USA [2] Center for Complex Biological Systems, University of California, Irvine, California 92697, USA
- Brian Oliver
  - Section of Developmental Genomics, Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
- Sara Olson
  - Department of Genetics and Developmental Biology, Institute for Systems Genomics, University of Connecticut Health Center, 400 Farmington Avenue, Farmington, Connecticut 06030, USA
- Peter J Park
  - Center for Biomedical Informatics, Harvard Medical School, 10 Shattuck Street, Boston, Massachusetts 02115, USA
- Michael J Pazin
  - National Human Genome Research Institute, National Institutes of Health, 5635 Fishers Lane, Bethesda, Maryland 20892-9307, USA
- Norbert Perrimon
  - [1] Department of Genetics and Drosophila RNAi Screening Center, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, Massachusetts 02115, USA [2] Howard Hughes Medical Institute, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, Massachusetts 02115, USA
- Dmitri Pervouchine
  - [1] Centre for Genomic Regulation, Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain [2] Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
- Valerie Reinke
  - Department of Genetics, Yale University School of Medicine, New Haven, Connecticut 06520-8005, USA
- Alexandre Reymond
  - Center for Integrative Genomics, University of Lausanne, Genopode building, Lausanne 1015, Switzerland
- Garrett Robinson
  - Department of Statistics, University of California, Berkeley, 367 Evans Hall, Berkeley, California 94720-3860, USA
- Anastasia Samsonova
  - [1] Department of Genetics and Drosophila RNAi Screening Center, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, Massachusetts 02115, USA [2] Howard Hughes Medical Institute, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, Massachusetts 02115, USA
- Gary I Saunders
  - [1] Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK [2] European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
- Felix Schlesinger
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Anurag Sethi
  - [1] Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA [2] Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Frank J Slack
  - Department of Molecular, Cellular and Developmental Biology, PO Box 208103, Yale University, New Haven, Connecticut 06520, USA
- William C Spencer
  - Department of Cell and Developmental Biology, Vanderbilt University, 465 21st Avenue South, Nashville, Tennessee 37232-8240, USA
- Marcus H Stoiber
  - [1] Department of Genome Dynamics, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA [2] Department of Biostatistics, University of California, Berkeley, 367 Evans Hall, Berkeley, California 94720-3860, USA
- Pnina Strasbourger
  - Department of Genome Sciences and University of Washington School of Medicine, William H. Foege Building S350D, 1705 Northeast Pacific Street, Box 355065 Seattle, Washington 98195-5065, USA
- Andrea Tanzer
  - [1] Bioinformatics and Genomics Programme, Center for Genomic Regulation, Universitat Pompeu Fabra (CRG-UPF), 08003 Barcelona, Catalonia, Spain [2] Institute for Theoretical Chemistry, Theoretical Biochemistry Group (TBI), University of Vienna, Währingerstrasse 17/3/303, A-1090 Vienna, Austria
- Owen A Thompson
  - Department of Genome Sciences and University of Washington School of Medicine, William H. Foege Building S350D, 1705 Northeast Pacific Street, Box 355065 Seattle, Washington 98195-5065, USA
- Kenneth H Wan
  - Department of Genome Dynamics, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- Guilin Wang
  - Department of Genetics, Yale University School of Medicine, New Haven, Connecticut 06520-8005, USA
- Huaien Wang
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Kathie L Watkins
  - Department of Cell and Developmental Biology, Vanderbilt University, 465 21st Avenue South, Nashville, Tennessee 37232-8240, USA
- Jiayu Wen
  - Sloan-Kettering Institute, 1275 York Avenue, Box 252, New York, New York 10065, USA
- Kejia Wen
  - MOE Key Lab of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Chenghai Xue
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Li Yang
  - [1] Department of Genetics and Developmental Biology, Institute for Systems Genomics, University of Connecticut Health Center, 400 Farmington Avenue, Farmington, Connecticut 06030, USA [2] Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Kevin Yip
  - [1] Hong Kong Bioinformatics Centre, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong [2] CUHK-BGI Innovation Institute of Trans-omics, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
- Chris Zaleski
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Yan Zhang
  - [1] Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA [2] Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Henry Zheng
  - [1] Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA [2] Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Steven E Brenner
  - [1] Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA [2] Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
- Brenton R Graveley
  - Department of Genetics and Developmental Biology, Institute for Systems Genomics, University of Connecticut Health Center, 400 Farmington Avenue, Farmington, Connecticut 06030, USA
- Susan E Celniker
  - Department of Genome Dynamics, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- Thomas R Gingeras
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Robert Waterston
  - Department of Genome Sciences and University of Washington School of Medicine, William H. Foege Building S350D, 1705 Northeast Pacific Street, Box 355065 Seattle, Washington 98195-5065, USA
|
14
|
Hoffmann S, Otto C, Doose G, Tanzer A, Langenberger D, Christ S, Kunz M, Holdt LM, Teupser D, Hackermüller J, Stadler PF. A multi-split mapping algorithm for circular RNA, splicing, trans-splicing and fusion detection. Genome Biol 2014; 15:R34. [PMID: 24512684 PMCID: PMC4056463 DOI: 10.1186/gb-2014-15-2-r34] [Citation(s) in RCA: 190] [Impact Index Per Article: 19.0] [Received: 04/22/2013] [Accepted: 02/10/2014] [Indexed: 11/25/2022] Open
Abstract
Numerous high-throughput sequencing studies have focused on detecting conventionally spliced mRNAs in RNA-seq data. However, non-standard RNAs arising through gene fusion, circularization or trans-splicing are often neglected. We introduce a novel, unbiased algorithm to detect splice junctions from single-end cDNA sequences. In contrast to other methods, our approach accommodates multi-junction structures. Our method compares favorably with competing tools for conventionally spliced mRNAs and, with a gain of up to 40% of recall, systematically outperforms them on reads with multiple splits, trans-splicing and circular products. The algorithm is integrated into our mapping tool segemehl (http://www.bioinf.uni-leipzig.de/Software/segemehl/).
Affiliation(s)
- Steve Hoffmann
  - Junior Research Group Transcriptome Bioinformatics, Leipzig University, Haertelstrasse 16-18, Leipzig, Germany
  - Interdisciplinary Center for Bioinformatics and Bioinformatics Group, University Leipzig, Haertelstrasse 16-18, Leipzig, Germany
  - LIFE Research Center for Civilization Diseases, Leipzig University
- Christian Otto
  - Junior Research Group Transcriptome Bioinformatics, Leipzig University, Haertelstrasse 16-18, Leipzig, Germany
  - Interdisciplinary Center for Bioinformatics and Bioinformatics Group, University Leipzig, Haertelstrasse 16-18, Leipzig, Germany
  - LIFE Research Center for Civilization Diseases, Leipzig University
- Gero Doose
  - Junior Research Group Transcriptome Bioinformatics, Leipzig University, Haertelstrasse 16-18, Leipzig, Germany
  - Interdisciplinary Center for Bioinformatics and Bioinformatics Group, University Leipzig, Haertelstrasse 16-18, Leipzig, Germany
  - LIFE Research Center for Civilization Diseases, Leipzig University
- Andrea Tanzer
  - Department of Theoretical Chemistry, University of Vienna, Währinger Strasse 17, Vienna, Austria
- David Langenberger
  - Junior Research Group Transcriptome Bioinformatics, Leipzig University, Haertelstrasse 16-18, Leipzig, Germany
  - Interdisciplinary Center for Bioinformatics and Bioinformatics Group, University Leipzig, Haertelstrasse 16-18, Leipzig, Germany
  - LIFE Research Center for Civilization Diseases, Leipzig University
- Sabina Christ
  - RNomics Group, Fraunhofer Institute for Cell Therapy and Immunology – IZI, Perlickstrasse 1, Leipzig, Germany
- Manfred Kunz
  - Department of Dermatology, Venerology and Allergology, Leipzig University, Philipp-Rosenthal-Strasse 23, Leipzig, Germany
- Lesca M Holdt
  - LIFE Research Center for Civilization Diseases, Leipzig University
  - Institute of Laboratory Medicine, Ludwig Maximilian University, Marchioninistrasse 15, Munich, Germany
- Daniel Teupser
  - LIFE Research Center for Civilization Diseases, Leipzig University
  - Institute of Laboratory Medicine, Ludwig Maximilian University, Marchioninistrasse 15, Munich, Germany
- Jörg Hackermüller
  - Interdisciplinary Center for Bioinformatics and Bioinformatics Group, University Leipzig, Haertelstrasse 16-18, Leipzig, Germany
  - RNomics Group, Fraunhofer Institute for Cell Therapy and Immunology – IZI, Perlickstrasse 1, Leipzig, Germany
  - Young Investigators Group Bioinformatics and Transcriptomics, Department of Proteomics, Helmholtz Centre for Environmental Research – UFZ, Permoserstrasse 15, Leipzig, Germany
- Peter F Stadler
  - Junior Research Group Transcriptome Bioinformatics, Leipzig University, Haertelstrasse 16-18, Leipzig, Germany
  - Interdisciplinary Center for Bioinformatics and Bioinformatics Group, University Leipzig, Haertelstrasse 16-18, Leipzig, Germany
  - LIFE Research Center for Civilization Diseases, Leipzig University
  - Department of Theoretical Chemistry, University of Vienna, Währinger Strasse 17, Vienna, Austria
  - Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, Leipzig, Germany
  - Center for non-coding RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, Frederiksberg, Denmark
  - Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM, USA
|
15
|
Lorenz R, Bernhart SH, Qin J, Höner zu Siederdissen C, Tanzer A, Amman F, Hofacker IL, Stadler PF. 2D meets 4G: G-quadruplexes in RNA secondary structure prediction. IEEE/ACM Trans Comput Biol Bioinform 2013; 10:832-844. [PMID: 24334379 DOI: 10.1109/tcbb.2013.7] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Indexed: 06/03/2023]
Abstract
G-quadruplexes are abundant locally stable structural elements in nucleic acids. The combinatorial theory of RNA structures and the dynamic programming algorithms for RNA secondary structure prediction are extended here to incorporate G-quadruplexes using a simple but plausible energy model. With preliminary energy parameters, we find that the overwhelming majority of putative quadruplex-forming sequences in the human genome are likely to fold into canonical secondary structures instead. Stable G-quadruplexes are strongly enriched, however, in the 5'UTR of protein coding mRNAs.
Affiliation(s)
- Jing Qin
  - Max Planck Institute for Mathematics in the Sciences, Leipzig and University of Leipzig, Leipzig
- Andrea Tanzer
  - University of Vienna, Vienna and Center for Genomic Regulation (CRG), Barcelona
- Ivo L Hofacker
  - University of Vienna, Vienna and University of Copenhagen
- Peter F Stadler
  - University of Leipzig, Leipzig, Max Planck Institute for Mathematics in the Sciences, Leipzig, Fraunhofer Institute for Cell Therapy and Immunology and University of Copenhagen
|
16
|
Derrien T, Johnson R, Bussotti G, Tanzer A, Djebali S, Tilgner H, Guernec G, Martin D, Merkel A, Knowles DG, Lagarde J, Veeravalli L, Ruan X, Ruan Y, Lassmann T, Carninci P, Brown JB, Lipovich L, Gonzalez JM, Thomas M, Davis CA, Shiekhattar R, Gingeras TR, Hubbard TJ, Notredame C, Harrow J, Guigó R. The GENCODE v7 catalog of human long noncoding RNAs: analysis of their gene structure, evolution, and expression. Genome Res 2012; 22:1775-89. [PMID: 22955988 PMCID: PMC3431493 DOI: 10.1101/gr.132159.111] [Citation(s) in RCA: 3733] [Impact Index Per Article: 339.4] [Indexed: 11/24/2022]
Abstract
The human genome contains many thousands of long noncoding RNAs (lncRNAs). While several studies have demonstrated compelling biological and disease roles for individual examples, analytical and experimental approaches to investigate these genes have been hampered by the lack of comprehensive lncRNA annotation. Here, we present and analyze the most complete human lncRNA annotation to date, produced by the GENCODE consortium within the framework of the ENCODE project and comprising 9277 manually annotated genes producing 14,880 transcripts. Our analyses indicate that lncRNAs are generated through pathways similar to that of protein-coding genes, with similar histone-modification profiles, splicing signals, and exon/intron lengths. In contrast to protein-coding genes, however, lncRNAs display a striking bias toward two-exon transcripts, they are predominantly localized in the chromatin and nucleus, and a fraction appear to be preferentially processed into small RNAs. They are under stronger selective pressure than neutrally evolving sequences—particularly in their promoter regions, which display levels of selection comparable to protein-coding genes. Importantly, about one-third seem to have arisen within the primate lineage. Comprehensive analysis of their expression in multiple human organs and brain regions shows that lncRNAs are generally lower expressed than protein-coding genes, and display more tissue-specific expression patterns, with a large fraction of tissue-specific lncRNAs expressed in the brain. Expression correlation analysis indicates that lncRNAs show particularly striking positive correlation with the expression of antisense coding genes. This GENCODE annotation represents a valuable resource for future studies of lncRNAs.
Affiliation(s)
- Thomas Derrien
- Bioinformatics and Genomics, Centre for Genomic Regulation and UPF, 08003 Barcelona, Catalonia, Spain
|
17
|
Harrow J, Frankish A, Gonzalez JM, Tapanari E, Diekhans M, Kokocinski F, Aken BL, Barrell D, Zadissa A, Searle S, Barnes I, Bignell A, Boychenko V, Hunt T, Kay M, Mukherjee G, Rajan J, Despacio-Reyes G, Saunders G, Steward C, Harte R, Lin M, Howald C, Tanzer A, Derrien T, Chrast J, Walters N, Balasubramanian S, Pei B, Tress M, Rodriguez JM, Ezkurdia I, van Baren J, Brent M, Haussler D, Kellis M, Valencia A, Reymond A, Gerstein M, Guigó R, Hubbard TJ. GENCODE: the reference human genome annotation for The ENCODE Project. Genome Res 2012; 22:1760-74. [PMID: 22955987 PMCID: PMC3431492 DOI: 10.1101/gr.135350.111] [Citation(s) in RCA: 3491] [Impact Index Per Article: 317.4] [Indexed: 02/07/2023]
Abstract
The GENCODE Consortium aims to identify all gene features in the human genome using a combination of computational analysis, manual annotation, and experimental validation. Since the first public release of this annotation data set, few new protein-coding loci have been added, yet the number of alternative splicing transcripts annotated has steadily increased. The GENCODE 7 release contains 20,687 protein-coding and 9640 long noncoding RNA loci and has 33,977 coding transcripts not represented in UCSC genes and RefSeq. It also has the most comprehensive annotation of long noncoding RNA (lncRNA) loci publicly available with the predominant transcript form consisting of two exons. We have examined the completeness of the transcript annotation and found that 35% of transcriptional start sites are supported by CAGE clusters and 62% of protein-coding genes have annotated polyA sites. Over one-third of GENCODE protein-coding genes are supported by peptide hits derived from mass spectrometry spectra submitted to Peptide Atlas. New models derived from the Illumina Body Map 2.0 RNA-seq data identify 3689 new loci not currently in GENCODE, of which 3127 consist of two exon models indicating that they are possibly unannotated long noncoding loci. GENCODE 7 is publicly available from gencodegenes.org and via the Ensembl and UCSC Genome Browsers.
Affiliation(s)
- Jennifer Harrow
- Wellcome Trust Sanger Institute, Wellcome Trust Campus, Hinxton, Cambridge CB10 1SA, United Kingdom.
|
18
|
Bompfünewerer AF, Flamm C, Fried C, Fritzsch G, Hofacker IL, Lehmann J, Missal K, Mosig A, Müller B, Prohaska SJ, Stadler BMR, Stadler PF, Tanzer A, Washietl S, Witwer C. Evolutionary patterns of non-coding RNAs. Theory Biosci 2005; 123:301-69. [PMID: 18202870 DOI: 10.1016/j.thbio.2005.01.002] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.9] [Received: 12/22/2004] [Accepted: 01/24/2005] [Indexed: 01/04/2023]
Abstract
A plethora of new functions of non-coding RNAs (ncRNAs) have been discovered in the past few years. In fact, RNA is emerging as the central player in cellular regulation, taking on active roles in multiple regulatory layers from transcription, RNA maturation, and RNA modification to translational regulation. Nevertheless, very little is known about the evolution of this "Modern RNA World" and its components. In this contribution, we attempt to provide at least a cursory overview of the diversity of ncRNAs and functional RNA motifs in non-translated regions of regular messenger RNAs (mRNAs) with an emphasis on evolutionary questions. This survey is complemented by an in-depth analysis of examples from different classes of RNAs focusing mostly on their evolution in the vertebrate lineage. We present a survey of Y RNA genes in vertebrates and study the molecular evolution of the U7 snRNA, the snoRNAs E1/U17, E2, and E3, the Y RNA family, the let-7 microRNA (miRNA) family, and the mRNA-like evf-1 gene. We furthermore discuss the statistical distribution of miRNAs in metazoans, which suggests an explosive increase in the miRNA repertoire in vertebrates. The analysis of the transcription of ncRNAs suggests that small RNAs in general are genetically mobile in the sense that their association with a host gene (e.g. when transcribed from introns of an mRNA) can change on evolutionary time scales. The let-7 family demonstrates that even the mode of transcription (as intron or as exon) can change among paralogous ncRNAs.
|
19
|
Pei B, Sisu C, Frankish A, Howald C, Habegger L, Mu XJ, Harte R, Balasubramanian S, Tanzer A, Diekhans M, Reymond A, Hubbard TJ, Harrow J, Gerstein MB. The GENCODE pseudogene resource. Genome Biol 2012; 13:R51. [PMID: 22951037 PMCID: PMC3491395 DOI: 10.1186/gb-2012-13-9-r51] [Citation(s) in RCA: 253] [Impact Index Per Article: 21.1] [Received: 03/23/2012] [Revised: 05/30/2012] [Accepted: 06/25/2012] [Indexed: 12/11/2022] Open
Abstract
Background: Pseudogenes have long been considered as nonfunctional genomic sequences. However, recent evidence suggests that many of them might have some form of biological activity, and the possibility of functionality has increased interest in their accurate annotation and integration with functional genomics data. Results: As part of the GENCODE annotation of the human genome, we present the first genome-wide pseudogene assignment for protein-coding genes, based on both large-scale manual annotation and in silico pipelines. A key aspect of this coupled approach is that it allows us to identify pseudogenes in an unbiased fashion as well as untangle complex events through manual evaluation. We integrate the pseudogene annotations with the extensive ENCODE functional genomics information. In particular, we determine the expression level, transcription-factor and RNA polymerase II binding, and chromatin marks associated with each pseudogene. Based on their distribution, we develop simple statistical models for each type of activity, which we validate with large-scale RT-PCR-Seq experiments. Finally, we compare our pseudogenes with conservation and variation data from primate alignments and the 1000 Genomes project, producing lists of pseudogenes potentially under selection. Conclusions: At one extreme, some pseudogenes possess conventional characteristics of functionality; these may represent genes that have recently died. On the other hand, we find interesting patterns of partial activity, which may suggest that dead genes are being resurrected as functioning non-coding RNAs. The activity data of each pseudogene are stored in an associated resource, psiDR, which will be useful for the initial identification of potentially functional pseudogenes.
Affiliation(s)
- Baikang Pei
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
|
20
|
Djebali S, Davis CA, Merkel A, Dobin A, Lassmann T, Mortazavi AM, Tanzer A, Lagarde J, Lin W, Schlesinger F, Xue C, Marinov GK, Khatun J, Williams BA, Zaleski C, Rozowsky J, Röder M, Kokocinski F, Abdelhamid RF, Alioto T, Antoshechkin I, Baer MT, Bar NS, Batut P, Bell K, Bell I, Chakrabortty S, Chen X, Chrast J, Curado J, Derrien T, Drenkow J, Dumais E, Dumais J, Duttagupta R, Falconnet E, Fastuca M, Fejes-Toth K, Ferreira P, Foissac S, Fullwood MJ, Gao H, Gonzalez D, Gordon A, Gunawardena H, Howald C, Jha S, Johnson R, Kapranov P, King B, Kingswood C, Luo OJ, Park E, Persaud K, Preall JB, Ribeca P, Risk B, Robyr D, Sammeth M, Schaffer L, See LH, Shahab A, Skancke J, Suzuki AM, Takahashi H, Tilgner H, Trout D, Walters N, Wang H, Wrobel J, Yu Y, Ruan X, Hayashizaki Y, Harrow J, Gerstein M, Hubbard T, Reymond A, Antonarakis SE, Hannon G, Giddings MC, Ruan Y, Wold B, Carninci P, Guigó R, Gingeras TR. Landscape of transcription in human cells. Nature 2012; 489:101-8. [PMID: 22955620 PMCID: PMC3684276 DOI: 10.1038/nature11233] [Citation(s) in RCA: 3716] [Impact Index Per Article: 309.7] [Received: 12/10/2011] [Accepted: 05/15/2012] [Indexed: 02/07/2023]
Abstract
Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell's regulatory capabilities are focused on its synthesis, processing, transport, modification and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three-quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations, taken together, prompt a redefinition of the concept of a gene.
Affiliation(s)
- Sarah Djebali
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88. Barcelona, Catalunya, Spain 08003
- Carrie A. Davis
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Angelika Merkel
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88. Barcelona, Catalunya, Spain 08003
- Alex Dobin
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Timo Lassmann
- RIKEN Yokohama Institute, RIKEN Omics Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa Japan 230-0045
- Ali M. Mortazavi
- California Institute of Technology, Division of Biology, 91125. 2 Beckman Institute, Pasadena, CA USA 91125
- University of California Irvine, Dept. of Developmental and Cell Biology, 2300 Biological Sciences III, Irvine, CA USA 92697
- Andrea Tanzer
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88. Barcelona, Catalunya, Spain 08003
- Julien Lagarde
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88. Barcelona, Catalunya, Spain 08003
- Wei Lin
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Felix Schlesinger
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Chenghai Xue
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Georgi K. Marinov
- California Institute of Technology, Division of Biology, 91125. 2 Beckman Institute, Pasadena, CA USA 91125
- Jainab Khatun
- Boise State University, College of Arts & Sciences, 1910 University Dr. Boise, ID USA 83725
- Brian A. Williams
- California Institute of Technology, Division of Biology, 91125. 2 Beckman Institute, Pasadena, CA USA 91125
- Chris Zaleski
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Joel Rozowsky
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520
- Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520
- Maik Röder
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88. Barcelona, Catalunya, Spain 08003
- Felix Kokocinski
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire United Kingdom CB10 1SA
- Rehab F. Abdelhamid
- RIKEN Yokohama Institute, RIKEN Omics Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa Japan 230-0045
- Tyler Alioto
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88. Barcelona, Catalunya, Spain 08003
- Igor Antoshechkin
- California Institute of Technology, Division of Biology, 91125. 2 Beckman Institute, Pasadena, CA USA 91125
- Michael T. Baer
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Nadav S. Bar
- Department of Chemical Engineering, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
- Philippe Batut
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Kimberly Bell
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Ian Bell
- Affymetrix, Inc, 3380 Central Expressway, Santa Clara, CA. USA 95051
- Sudipto Chakrabortty
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Xian Chen
- University of North Carolina at Chapel Hill, Department of Biochemistry & Biophysics, 120 Mason Farm Rd., Chapel Hill, NC USA 27599
- Jacqueline Chrast
- University of Lausanne, Center for Integrative Genomics, Genopode building, Lausanne, Switzerland 1015
- Joao Curado
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88. Barcelona, Catalunya, Spain 08003
- Thomas Derrien
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88. Barcelona, Catalunya, Spain 08003
- Jorg Drenkow
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Erica Dumais
- Affymetrix, Inc, 3380 Central Expressway, Santa Clara, CA. USA 95051
- Jacqueline Dumais
- Affymetrix, Inc, 3380 Central Expressway, Santa Clara, CA. USA 95051
- Radha Duttagupta
- Affymetrix, Inc, 3380 Central Expressway, Santa Clara, CA. USA 95051
- Emilie Falconnet
- University of Geneva Medical School, Department of Genetic Medicine and Development and iGE3 Institute of Genetics and Genomics of Geneva, 1 rue Michel-Servet, Geneva, Switzerland 1015
- Meagan Fastuca
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Kata Fejes-Toth
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Pedro Ferreira
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88. Barcelona, Catalunya, Spain 08003
- Sylvain Foissac
- Affymetrix, Inc, 3380 Central Expressway, Santa Clara, CA. USA 95051
- Melissa J. Fullwood
- Genome Institute of Singapore, Genome Technology and Biology, 60 Biopolis Street, #02-01, Genome, Singapore, Singapore 138672
- Hui Gao
- Affymetrix, Inc, 3380 Central Expressway, Santa Clara, CA. USA 95051
- David Gonzalez
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88. Barcelona, Catalunya, Spain 08003
- Assaf Gordon
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Harsha Gunawardena
- University of North Carolina at Chapel Hill, Department of Biochemistry & Biophysics, 120 Mason Farm Rd., Chapel Hill, NC USA 27599
- Cedric Howald
- University of Lausanne, Center for Integrative Genomics, Genopode building, Lausanne, Switzerland 1015
- Sonali Jha
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Rory Johnson
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88. Barcelona, Catalunya, Spain 08003
- Philipp Kapranov
- Affymetrix, Inc, 3380 Central Expressway, Santa Clara, CA. USA 95051
- St. Laurent Institute, One Kendall Square, Cambridge, MA
- Brandon King
- California Institute of Technology, Division of Biology, 91125. 2 Beckman Institute, Pasadena, CA USA 91125
- Colin Kingswood
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88. Barcelona, Catalunya, Spain 08003
- Oscar J. Luo
- Genome Institute of Singapore, Genome Technology and Biology, 60 Biopolis Street, #02-01, Genome, Singapore, Singapore 138672
- Eddie Park
- University of California Irvine, Dept. of Developmental and Cell Biology, 2300 Biological Sciences III, Irvine, CA USA 92697
- Kimberly Persaud
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Jonathan B. Preall
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Paolo Ribeca
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88. Barcelona, Catalunya, Spain 08003
- Brian Risk
- Boise State University, College of Arts & Sciences, 1910 University Dr. Boise, ID USA 83725
- Daniel Robyr
- University of Geneva Medical School, Department of Genetic Medicine and Development and iGE3 Institute of Genetics and Genomics of Geneva, 1 rue Michel-Servet, Geneva, Switzerland 1015
- Michael Sammeth
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88. Barcelona, Catalunya, Spain 08003
- Lorian Schaffer
- California Institute of Technology, Division of Biology, 91125. 2 Beckman Institute, Pasadena, CA USA 91125
- Lei-Hoon See
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Atif Shahab
- Genome Institute of Singapore, Genome Technology and Biology, 60 Biopolis Street, #02-01, Genome, Singapore, Singapore 138672
- Jorgen Skancke
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88. Barcelona, Catalunya, Spain 08003
- Department of Chemical Engineering, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
- Ana Maria Suzuki
- RIKEN Yokohama Institute, RIKEN Omics Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa Japan 230-0045
- Hazuki Takahashi
- RIKEN Yokohama Institute, RIKEN Omics Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa Japan 230-0045
- Hagen Tilgner
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88. Barcelona, Catalunya, Spain 08003
- Diane Trout
- California Institute of Technology, Division of Biology, 91125. 2 Beckman Institute, Pasadena, CA USA 91125
- Nathalie Walters
- University of Lausanne, Center for Integrative Genomics, Genopode building, Lausanne, Switzerland 1015
- Huaien Wang
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- John Wrobel
- Boise State University, College of Arts & Sciences, 1910 University Dr. Boise, ID USA 83725
- Yanbao Yu
- University of North Carolina at Chapel Hill, Department of Biochemistry & Biophysics, 120 Mason Farm Rd., Chapel Hill, NC USA 27599
- Xiaoan Ruan
- Genome Institute of Singapore, Genome Technology and Biology, 60 Biopolis Street, #02-01, Genome, Singapore, Singapore 138672
- Yoshihide Hayashizaki
- RIKEN Yokohama Institute, RIKEN Omics Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa Japan 230-0045
- Jennifer Harrow
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire United Kingdom CB10 1SA
- Mark Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520
- Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520
- Department of Computer Science, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520
- Tim Hubbard
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire United Kingdom CB10 1SA
- Alexandre Reymond
- University of Lausanne, Center for Integrative Genomics, Genopode building, Lausanne, Switzerland 1015
- Stylianos E. Antonarakis
- University of Geneva Medical School, Department of Genetic Medicine and Development and iGE3 Institute of Genetics and Genomics of Geneva, 1 rue Michel-Servet, Geneva, Switzerland 1015
- Gregory Hannon
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Morgan C. Giddings
- Boise State University, College of Arts & Sciences, 1910 University Dr. Boise, ID USA 83725
- University of North Carolina at Chapel Hill, Department of Biochemistry & Biophysics, 120 Mason Farm Rd., Chapel Hill, NC USA 27599
- Yijun Ruan
- Genome Institute of Singapore, Genome Technology and Biology, 60 Biopolis Street, #02-01, Genome, Singapore, Singapore 138672
- Barbara Wold
- California Institute of Technology, Division of Biology, 91125. 2 Beckman Institute, Pasadena, CA USA 91125
- Piero Carninci
- RIKEN Yokohama Institute, RIKEN Omics Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa Japan 230-0045
- Roderic Guigó
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88. Barcelona, Catalunya, Spain 08003
- Thomas R. Gingeras
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Affymetrix, Inc, 3380 Central Expressway, Santa Clara, CA. USA 95051
|
21
|
Howald C, Tanzer A, Chrast J, Kokocinski F, Derrien T, Walters N, Gonzalez JM, Frankish A, Aken BL, Hourlier T, Vogel JH, White S, Searle S, Harrow J, Hubbard TJ, Guigó R, Reymond A. Combining RT-PCR-seq and RNA-seq to catalog all genic elements encoded in the human genome. Genome Res 2012; 22:1698-710. [PMID: 22955982 PMCID: PMC3431487 DOI: 10.1101/gr.134478.111] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.8] [Received: 11/07/2011] [Accepted: 05/01/2012] [Indexed: 12/21/2022]
Abstract
Within the ENCODE Consortium, GENCODE aimed to accurately annotate all protein-coding genes, pseudogenes, and noncoding transcribed loci in the human genome through manual curation and computational methods. Annotated transcript structures were assessed, and less well-supported loci were systematically, experimentally validated. Predicted exon-exon junctions were evaluated by RT-PCR amplification followed by highly multiplexed sequencing readout, a method we called RT-PCR-seq. Seventy-nine percent of all assessed junctions are confirmed by this evaluation procedure, demonstrating the high quality of the GENCODE gene set. RT-PCR-seq was also efficient for screening gene models predicted using the Human Body Map (HBM) RNA-seq data. We validated 73% of these predictions, thus confirming 1168 novel genes, mostly noncoding, which will further complement the GENCODE annotation. Our novel experimental validation pipeline is extremely sensitive, far more than unbiased transcriptome profiling through RNA sequencing, which is becoming the norm. For example, exon-exon junctions unique to GENCODE annotated transcripts are five times more likely to be corroborated with our targeted approach than with extensive large human transcriptome profiling. Data sets such as the HBM and ENCODE RNA-seq data fail to sample low-expressed transcripts. Our RT-PCR-seq targeted approach also has the advantage of identifying novel exons of known genes, as we discovered unannotated exons in ~11% of assessed introns. We thus estimate that at least 18% of known loci have yet-unannotated exons. Our work demonstrates that the cataloging of all of the genic elements encoded in the human genome will necessitate a coordinated effort between unbiased and targeted approaches, like RNA-seq and RT-PCR-seq.
Affiliation(s)
- Cédric Howald
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Andrea Tanzer
- Centre de Regulacio Genomica, Grup de Recerca en Informatica Biomedica, E-08003 Barcelona, Spain
- Jacqueline Chrast
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
- Felix Kokocinski
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
- Thomas Derrien
- Centre de Regulacio Genomica, Grup de Recerca en Informatica Biomedica, E-08003 Barcelona, Spain
- Nathalie Walters
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
- Jose M. Gonzalez
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
- Adam Frankish
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
- Bronwen L. Aken
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
- Thibaut Hourlier
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
- Jan-Hinnerk Vogel
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
- Simon White
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
- Stephen Searle
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
- Jennifer Harrow
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
- Tim J. Hubbard
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
- Roderic Guigó
- Centre de Regulacio Genomica, Grup de Recerca en Informatica Biomedica, E-08003 Barcelona, Spain
- Alexandre Reymond
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
|
22
|
Marz M, Gruber AR, Höner Zu Siederdissen C, Amman F, Badelt S, Bartschat S, Bernhart SH, Beyer W, Kehr S, Lorenz R, Tanzer A, Yusuf D, Tafer H, Hofacker IL, Stadler PF. Animal snoRNAs and scaRNAs with exceptional structures. RNA Biol 2011; 8:938-46. [PMID: 21955586 DOI: 10.4161/rna.8.6.16603] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Indexed: 12/31/2022] Open
Abstract
The overwhelming majority of small nucleolar RNAs (snoRNAs) fall into two clearly defined classes characterized by distinctive secondary structures and sequence motifs. A small group of diverse ncRNAs, however, shares the hallmarks of one or both classes of snoRNAs but differs substantially from the norm in some respects. Here, we compile the available information on these exceptional cases, conduct a thorough homology search throughout the available metazoan genomes, provide improved and expanded alignments, and investigate the evolutionary histories of these ncRNA families as well as their mutual relationships.
Affiliation(s)
- Manja Marz
- RNA Bioinformatik Gruppe, Institut für Pharmazeutische Chemie, Philipps Universität Marburg, Marburg, Germany
|
23
|
Boria I, Gruber AR, Tanzer A, Bernhart SH, Lorenz R, Mueller MM, Hofacker IL, Stadler PF. Nematode sbRNAs: Homologs of Vertebrate Y RNAs. J Mol Evol 2010; 70:346-58. [DOI: 10.1007/s00239-010-9332-4] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Received: 06/05/2009] [Accepted: 03/01/2010] [Indexed: 01/20/2023]
|
24
|
Rederstorff M, Bernhart SH, Tanzer A, Zywicki M, Perfler K, Lukasser M, Hofacker IL, Hüttenhofer A. RNPomics: defining the ncRNA transcriptome by cDNA library generation from ribonucleo-protein particles. Nucleic Acids Res 2010; 38:e113. [PMID: 20150415 PMCID: PMC2879528 DOI: 10.1093/nar/gkq057] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.6] [Indexed: 01/22/2023] Open
Abstract
Up to 450 000 non-coding RNAs (ncRNAs) have been predicted to be transcribed from the human genome. However, it still has to be elucidated which of these transcripts represent functional ncRNAs. Since all functional ncRNAs in Eukarya form ribonucleo-protein particles (RNPs), we generated specialized cDNA libraries from size-fractionated RNPs and validated the presence of selected ncRNAs within RNPs by glycerol gradient centrifugation. As a proof of concept, we applied the RNP method to human HeLa cells or total mouse brain, and subjected cDNA libraries, generated from the two model systems, to deep-sequencing. Bioinformatic analysis of cDNA sequences revealed several hundred ncRNP candidates. These ncRNA candidates were mainly located in intergenic as well as intronic regions of the genome, with a significant overrepresentation of intron-derived ncRNA sequences. Additionally, a number of ncRNAs mapped to repetitive sequences. Thus, our RNP approach provides an efficient way to identify new functional small ncRNA candidates involved in RNP formation.
Affiliation(s)
- Mathieu Rederstorff
- Division of Genomics and RNomics, Innsbruck Biocentre, Innsbruck Medical University, Innsbruck and Institute of Theoretical Chemistry, University of Vienna, Vienna, Austria
|
25
|
Hertel J, de Jong D, Marz M, Rose D, Tafer H, Tanzer A, Schierwater B, Stadler PF. Non-coding RNA annotation of the genome of Trichoplax adhaerens. Nucleic Acids Res 2009; 37:1602-15. [PMID: 19151082 PMCID: PMC2655684 DOI: 10.1093/nar/gkn1084] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.3] [Received: 11/13/2008] [Revised: 12/22/2008] [Accepted: 12/23/2008] [Indexed: 02/06/2023] Open
Abstract
A detailed annotation of non-protein coding RNAs is typically missing in initial releases of newly sequenced genomes. Here we report on a comprehensive ncRNA annotation of the genome of Trichoplax adhaerens, the presumably most basal metazoan whose genome has been published to date. Since BLAST identified only a small fraction of the best-conserved ncRNAs (in particular rRNAs, tRNAs and some snRNAs), we developed a semi-global dynamic programming tool, GotohScan, to increase the sensitivity of the homology search. It successfully identified the full complement of major and minor spliceosomal snRNAs, the genes for RNase P and MRP RNAs, the SRP RNA, as well as several small nucleolar RNAs. We did not find any microRNA candidates homologous to known eumetazoan sequences. Interestingly, most ncRNAs, including the pol-III transcripts, appear as single-copy genes or with very small copy numbers in the Trichoplax genome.
Affiliation(s)
- Jana Hertel, Danielle de Jong, Manja Marz, Dominic Rose, Hakim Tafer, Andrea Tanzer, Bernd Schierwater, Peter F. Stadler
- Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16-18, D-04107 Leipzig, Division of Ecology and Evolution, Institut für Tierökologie und Zellbiologie, Tierärztliche Hochschule Hannover, Bünteweg 17d, D-30559 Hannover, Germany, Department of Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Wien, Austria, Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT 06520, USA, RNomics Group, Fraunhofer Institut für Zelltherapie und Immunologie, Deutscher Platz 5e, D-04103 Leipzig, Germany and Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501, USA
Collapse
|
26
|
Geis M, Flamm C, Wolfinger MT, Tanzer A, Hofacker IL, Middendorf M, Mandl C, Stadler PF, Thurner C. Folding kinetics of large RNAs. J Mol Biol 2008; 379:160-73. [PMID: 18440024 DOI: 10.1016/j.jmb.2008.02.064] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.0] [Received: 11/09/2007] [Revised: 02/08/2008] [Accepted: 02/27/2008] [Indexed: 11/29/2022]
Abstract
We introduce here a heuristic approach to kinetic RNA folding that constructs secondary structures by stepwise combination of building blocks. These blocks correspond to subsequences and their thermodynamically optimal structures. These are determined by the standard dynamic programming approach to RNA folding. Folding trajectories are modeled at base-pair resolution using the Morgan-Higgs heuristic and a barrier tree-based heuristic to connect combinations of the local building blocks. Implemented in the program Kinwalker, the algorithm allows co-transcriptional folding and can be used to fold sequences of up to about 1500 nucleotides in length. A detailed comparison with several well-studied examples from the literature, including the delayed folding of bacteriophage cloverleaf structures, the adenine sensing riboswitch, and the hok RNA, shows an excellent agreement of predicted trajectories and experimental evidence. The software is available as part of the ViennaRNA Package.
Affiliation(s)
- Michael Geis
- Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstrasse 16-18, 04107 Leipzig, Germany.
|
27
|
Korbel JO, Urban AE, Affourtit JP, Godwin B, Grubert F, Simons JF, Kim PM, Palejev D, Carriero NJ, Du L, Taillon BE, Chen Z, Tanzer A, Saunders ACE, Chi J, Yang F, Carter NP, Hurles ME, Weissman SM, Harkins TT, Gerstein MB, Egholm M, Snyder M. Paired-end mapping reveals extensive structural variation in the human genome. Science 2007; 318:420-6. [PMID: 17901297 PMCID: PMC2674581 DOI: 10.1126/science.1149504] [Citation(s) in RCA: 900] [Impact Index Per Article: 52.9] [Indexed: 01/02/2023]
Abstract
Structural variation of the genome involves kilobase- to megabase-sized deletions, duplications, insertions, inversions, and complex combinations of rearrangements. We introduce high-throughput and massive paired-end mapping (PEM), a large-scale genome-sequencing method to identify structural variants (SVs) approximately 3 kilobases (kb) or larger that combines the rescue and capture of paired ends of 3-kb fragments, massive 454 sequencing, and a computational approach to map DNA reads onto a reference genome. PEM was used to map SVs in an African and in a putatively European individual and identified shared and divergent SVs relative to the reference genome. Overall, we fine-mapped more than 1000 SVs and documented that the number of SVs among humans is much larger than initially hypothesized; many of the SVs potentially affect gene function. The breakpoint junction sequences of more than 200 SVs were determined with a novel pooling strategy and computational analysis. Our analysis provided insights into the mechanisms of SV formation in humans.
Affiliation(s)
- Jan O Korbel
- Molecular Biophysics and Biochemistry Department, Yale University, New Haven, CT 06520, USA
|
28
|
Birney E, Stamatoyannopoulos JA, Dutta A, Guigó R, Gingeras TR, Margulies EH, Weng Z, Snyder M, Dermitzakis ET, Thurman RE, Kuehn MS, Taylor CM, Neph S, Koch CM, Asthana S, Malhotra A, Adzhubei I, Greenbaum JA, Andrews RM, Flicek P, Boyle PJ, Cao H, Carter NP, Clelland GK, Davis S, Day N, Dhami P, Dillon SC, Dorschner MO, Fiegler H, Giresi PG, Goldy J, Hawrylycz M, Haydock A, Humbert R, James KD, Johnson BE, Johnson EM, Frum TT, Rosenzweig ER, Karnani N, Lee K, Lefebvre GC, Navas PA, Neri F, Parker SCJ, Sabo PJ, Sandstrom R, Shafer A, Vetrie D, Weaver M, Wilcox S, Yu M, Collins FS, Dekker J, Lieb JD, Tullius TD, Crawford GE, Sunyaev S, Noble WS, Dunham I, Denoeud F, Reymond A, Kapranov P, Rozowsky J, Zheng D, Castelo R, Frankish A, Harrow J, Ghosh S, Sandelin A, Hofacker IL, Baertsch R, Keefe D, Dike S, Cheng J, Hirsch HA, Sekinger EA, Lagarde J, Abril JF, Shahab A, Flamm C, Fried C, Hackermüller J, Hertel J, Lindemeyer M, Missal K, Tanzer A, Washietl S, Korbel J, Emanuelsson O, Pedersen JS, Holroyd N, Taylor R, Swarbreck D, Matthews N, Dickson MC, Thomas DJ, Weirauch MT, Gilbert J, Drenkow J, Bell I, Zhao X, Srinivasan KG, Sung WK, Ooi HS, Chiu KP, Foissac S, Alioto T, Brent M, Pachter L, Tress ML, Valencia A, Choo SW, Choo CY, Ucla C, Manzano C, Wyss C, Cheung E, Clark TG, Brown JB, Ganesh M, Patel S, Tammana H, Chrast J, Henrichsen CN, Kai C, Kawai J, Nagalakshmi U, Wu J, Lian Z, Lian J, Newburger P, Zhang X, Bickel P, Mattick JS, Carninci P, Hayashizaki Y, Weissman S, Hubbard T, Myers RM, Rogers J, Stadler PF, Lowe TM, Wei CL, Ruan Y, Struhl K, Gerstein M, Antonarakis SE, Fu Y, Green ED, Karaöz U, Siepel A, Taylor J, Liefer LA, Wetterstrand KA, Good PJ, Feingold EA, Guyer MS, Cooper GM, Asimenos G, Dewey CN, Hou M, Nikolaev S, Montoya-Burgos JI, Löytynoja A, Whelan S, Pardi F, Massingham T, Huang H, Zhang NR, Holmes I, Mullikin JC, Ureta-Vidal A, Paten B, Seringhaus M, Church D, Rosenbloom K, Kent WJ, Stone EA, Batzoglou S, Goldman N, Hardison RC, Haussler D, 
Miller W, Sidow A, Trinklein ND, Zhang ZD, Barrera L, Stuart R, King DC, Ameur A, Enroth S, Bieda MC, Kim J, Bhinge AA, Jiang N, Liu J, Yao F, Vega VB, Lee CWH, Ng P, Shahab A, Yang A, Moqtaderi Z, Zhu Z, Xu X, Squazzo S, Oberley MJ, Inman D, Singer MA, Richmond TA, Munn KJ, Rada-Iglesias A, Wallerman O, Komorowski J, Fowler JC, Couttet P, Bruce AW, Dovey OM, Ellis PD, Langford CF, Nix DA, Euskirchen G, Hartman S, Urban AE, Kraus P, Van Calcar S, Heintzman N, Kim TH, Wang K, Qu C, Hon G, Luna R, Glass CK, Rosenfeld MG, Aldred SF, Cooper SJ, Halees A, Lin JM, Shulha HP, Zhang X, Xu M, Haidar JNS, Yu Y, Ruan Y, Iyer VR, Green RD, Wadelius C, Farnham PJ, Ren B, Harte RA, Hinrichs AS, Trumbower H, Clawson H, Hillman-Jackson J, Zweig AS, Smith K, Thakkapallayil A, Barber G, Kuhn RM, Karolchik D, Armengol L, Bird CP, de Bakker PIW, Kern AD, Lopez-Bigas N, Martin JD, Stranger BE, Woodroffe A, Davydov E, Dimas A, Eyras E, Hallgrímsdóttir IB, Huppert J, Zody MC, Abecasis GR, Estivill X, Bouffard GG, Guan X, Hansen NF, Idol JR, Maduro VVB, Maskeri B, McDowell JC, Park M, Thomas PJ, Young AC, Blakesley RW, Muzny DM, Sodergren E, Wheeler DA, Worley KC, Jiang H, Weinstock GM, Gibbs RA, Graves T, Fulton R, Mardis ER, Wilson RK, Clamp M, Cuff J, Gnerre S, Jaffe DB, Chang JL, Lindblad-Toh K, Lander ES, Koriabine M, Nefedov M, Osoegawa K, Yoshinaga Y, Zhu B, de Jong PJ. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 2007; 447:799-816. [PMID: 17571346 PMCID: PMC2212820 DOI: 10.1038/nature05874] [Citation(s) in RCA: 3782] [Impact Index Per Article: 222.5] [Indexed: 01/17/2023]
Abstract
We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.
|
29
|
Washietl S, Pedersen JS, Korbel JO, Stocsits C, Gruber AR, Hackermüller J, Hertel J, Lindemeyer M, Reiche K, Tanzer A, Ucla C, Wyss C, Antonarakis SE, Denoeud F, Lagarde J, Drenkow J, Kapranov P, Gingeras TR, Guigó R, Snyder M, Gerstein MB, Reymond A, Hofacker IL, Stadler PF. Structured RNAs in the ENCODE selected regions of the human genome. Genome Res 2007; 17:852-64. [PMID: 17568003 PMCID: PMC1891344 DOI: 10.1101/gr.5650707] [Citation(s) in RCA: 136] [Impact Index Per Article: 8.0] [Received: 06/16/2006] [Accepted: 12/12/2006] [Indexed: 12/16/2022]
Abstract
Functional RNA structures play an important role both in the context of noncoding RNA transcripts as well as regulatory elements in mRNAs. Here we present a computational study to detect functional RNA structures within the ENCODE regions of the human genome. Since structural RNAs in general lack characteristic signals in primary sequence, comparative approaches evaluating evolutionary conservation of structures are most promising. We have used three recently introduced programs based on either phylogenetic-stochastic context-free grammar (EvoFold) or energy directed folding (RNAz and AlifoldZ), yielding several thousand candidate structures (corresponding to approximately 2.7% of the ENCODE regions). EvoFold has its highest sensitivity in highly conserved and relatively AU-rich regions, while RNAz favors slightly GC-rich regions, resulting in a relatively small overlap between methods. Comparison with the GENCODE annotation points to functional RNAs in all genomic contexts, with a slightly increased density in 3'-UTRs. While we estimate a significant false discovery rate of approximately 50%-70% many of the predictions can be further substantiated by additional criteria: 248 loci are predicted by both RNAz and EvoFold, and an additional 239 RNAz or EvoFold predictions are supported by the (more stringent) AlifoldZ algorithm. Five hundred seventy RNAz structure predictions fall into regions that show signs of selection pressure also on the sequence level (i.e., conserved elements). More than 700 predictions overlap with noncoding transcripts detected by oligonucleotide tiling arrays. One hundred seventy-five selected candidates were tested by RT-PCR in six tissues, and expression could be verified in 43 cases (24.6%).
Affiliation(s)
- Stefan Washietl
- Institute for Theoretical Chemistry, University of Vienna, A-1090 Wien, Austria.
|
30
|
Backofen R, Bernhart SH, Flamm C, Fried C, Fritzsch G, Hackermüller J, Hertel J, Hofacker IL, Missal K, Mosig A, Prohaska SJ, Rose D, Stadler PF, Tanzer A, Washietl S, Will S. RNAs everywhere: genome-wide annotation of structured RNAs. J Exp Zool B Mol Dev Evol 2007; 308:1-25. [PMID: 17171697 DOI: 10.1002/jez.b.21130] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.2] [Indexed: 12/26/2022]
Abstract
Starting with the discovery of microRNAs and the advent of genome-wide transcriptomics, non-protein-coding transcripts have moved from a fringe topic to a central field of research in molecular biology. In this contribution we review the state of the art of "computational RNomics", i.e., the bioinformatics approaches to genome-wide RNA annotation. Instead of rehashing results from recently published surveys in detail, we focus here on the open problem in the field, namely (functional) annotation of the plethora of putative RNAs. A series of exploratory studies is used to provide non-trivial examples for the discussion of some of the difficulties.
|
31
|
Abstract
MicroRNAs (miRNAs) form a large class of small regulatory RNAs in eukaryotes. Although they share a common processing pathway and certain structural features, in general, there is no detectable sequence similarity among miRNAs from a given organism. On the other hand, many miRNAs are members of a family of a few, often very similar, paralogs. It is, thus, of interest to trace the evolutionary history of individual miRNAs, to identify the timing of gene duplications, and to study relationships between the histories of different miRNA families. Some miRNAs are transcribed from polycistronic primary transcripts. In these cases, we will study the evolution of entire clusters.
Affiliation(s)
- Andrea Tanzer
- Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
|
32
|
Hertel J, Lindemeyer M, Missal K, Fried C, Tanzer A, Flamm C, Hofacker IL, Stadler PF. The expansion of the metazoan microRNA repertoire. BMC Genomics 2006; 7:25. [PMID: 16480513 PMCID: PMC1388199 DOI: 10.1186/1471-2164-7-25] [Citation(s) in RCA: 257] [Impact Index Per Article: 14.3] [Received: 08/02/2005] [Accepted: 02/15/2006] [Indexed: 12/28/2022] Open
Abstract
BACKGROUND: MicroRNAs have been identified as crucial regulators in both animals and plants. Here we report on a comprehensive comparative study of all known miRNA families in animals. We expand the MicroRNA Registry 6.0 by more than 1000 new homologs of miRNA precursors whose expression has been verified in at least one species. Using this uniform data basis we analyze their evolutionary history in terms of individual gene phylogenies and in terms of preservation of genomic nearness across species. This allows us to reliably identify microRNA clusters that are derived from a common transcript. RESULTS: We identify three episodes of microRNA innovation that correspond to major developmental innovations: A class of about 20 miRNAs is common to protostomes and deuterostomes and might be related to the advent of bilaterians. A second large wave of innovations maps to the branch leading to the vertebrates. The third significant outburst of miRNA innovation coincides with placental (eutherian) mammals. In addition, we observe the expected expansion of the microRNA inventory due to genome duplications in early vertebrates and in an ancestral teleost. The non-local duplications in the vertebrate ancestor are predated by local (tandem) duplications leading to the formation of about a dozen ancient microRNA clusters. CONCLUSION: Our results suggest that microRNA innovation is an ongoing process. Major expansions of the metazoan miRNA repertoire coincide with the advent of bilaterians, vertebrates, and (placental) mammals.
Affiliation(s)
- Jana Hertel
- Bioinformatics Group, Department of Computer Science, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany
- Manuela Lindemeyer
- Bioinformatics Group, Department of Computer Science, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany
- Kristin Missal
- Bioinformatics Group, Department of Computer Science, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany
- Claudia Fried
- Bioinformatics Group, Department of Computer Science, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany
- Andrea Tanzer
- Bioinformatics Group, Department of Computer Science, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany
- Institute for Theoretical Chemistry, University of Vienna, Währingerstrasse 17, A-1090 Wien, Austria
- Christoph Flamm
- Bioinformatics Group, Department of Computer Science, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany
- Institute for Theoretical Chemistry, University of Vienna, Währingerstrasse 17, A-1090 Wien, Austria
- Ivo L Hofacker
- Institute for Theoretical Chemistry, University of Vienna, Währingerstrasse 17, A-1090 Wien, Austria
- Peter F Stadler
- Bioinformatics Group, Department of Computer Science, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany
- Institute for Theoretical Chemistry, University of Vienna, Währingerstrasse 17, A-1090 Wien, Austria
- The Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe NM 87501
|
33
|
Dieterich C, Grossmann S, Tanzer A, Röpcke S, Arndt PF, Stadler PF, Vingron M. Comparative promoter region analysis powered by CORG. BMC Genomics 2005; 6:24. [PMID: 15723697 PMCID: PMC555765 DOI: 10.1186/1471-2164-6-24] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Received: 10/26/2004] [Accepted: 02/21/2005] [Indexed: 11/10/2022] Open
Abstract
Background: Promoters are key players in gene regulation. They receive signals from various sources (e.g. cell surface receptors) and control the level of transcription initiation, which largely determines gene expression. In vertebrates, transcription start sites and surrounding regulatory elements are often poorly defined. To support promoter analysis, we present CORG, a framework for studying upstream regions including untranslated exons (5' UTR). Description: The automated annotation of promoter regions integrates information of two kinds. First, statistically significant cross-species conservation within upstream regions of orthologous genes is detected. Pairwise as well as multiple sequence comparisons are computed. Second, binding site descriptions (position-weight matrices) are employed to predict conserved regulatory elements with a novel approach. Assembled EST sequences and verified transcription start sites are incorporated to distinguish exonic from other sequences. As of now, we have included 5 species in our analysis pipeline (man, mouse, rat, fugu and zebrafish). We characterized promoter regions of 16,127 groups of orthologous genes. All data are presented in an intuitive way via our web site. Users are free to export data for single genes or access larger data sets via our DAS server. The benefits of our framework are exemplarily shown in the context of phylogenetic profiling of transcription factor binding sites and detection of microRNAs close to transcription start sites of our gene set. Conclusion: The CORG platform is a versatile tool to support analyses of gene regulation in vertebrate promoter regions. Applications for CORG cover a broad range from studying evolution of DNA binding sites and promoter constitution to the discovery of new regulatory sequence elements (e.g. microRNAs and binding sites).
Affiliation(s)
- Christoph Dieterich
- Computational Molecular Biology Department, Max Planck Institute for Molecular Genetics, Ihnestrasse 73, 14195 Berlin, Germany
- Steffen Grossmann
- Computational Molecular Biology Department, Max Planck Institute for Molecular Genetics, Ihnestrasse 73, 14195 Berlin, Germany
- Andrea Tanzer
- Institute for Theoretical Chemistry and Structural Biology, University of Vienna, Währingerstrasse 17, A-1090 Wien, Austria
- Bioinformatics Group, Department of Computer Science, University of Leipzig, Kreuzstraße 7b, D-04103 Leipzig, Germany
- Stefan Röpcke
- Computational Molecular Biology Department, Max Planck Institute for Molecular Genetics, Ihnestrasse 73, 14195 Berlin, Germany
- Peter F Arndt
- Computational Molecular Biology Department, Max Planck Institute for Molecular Genetics, Ihnestrasse 73, 14195 Berlin, Germany
- Peter F Stadler
- Institute for Theoretical Chemistry and Structural Biology, University of Vienna, Währingerstrasse 17, A-1090 Wien, Austria
- Bioinformatics Group, Department of Computer Science, University of Leipzig, Kreuzstraße 7b, D-04103 Leipzig, Germany
- Martin Vingron
- Computational Molecular Biology Department, Max Planck Institute for Molecular Genetics, Ihnestrasse 73, 14195 Berlin, Germany
|
34
|
Abstract
MicroRNAs (miRNAs) form an abundant class of non-coding RNA genes that have an important function in post-transcriptional gene regulation and in particular modulate the expression of developmentally important transcription factors including Hox genes. Two families of microRNAs are genomically located in intergenic regions in the Hox clusters of vertebrates. Here we describe their evolution in detail. We show that the microRNAs closely follow the patterns of protein evolution in the Hox clusters, which are characterized by cluster duplications followed by differential gene loss.
Affiliation(s)
- Andrea Tanzer
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Kreuzstrasse 7b, D 04103 Leipzig, Germany.
|
35
|
Abstract
Many of the known microRNAs are encoded in polycistronic transcripts. Here, we reconstruct the evolutionary history of the mir17 microRNA clusters which consist of miR-17, miR-18, miR-19a, miR-19b, miR-20, miR-25, miR-92, miR-93, miR-106a, and miR-106b. The history of this cluster is governed by an initial phase of local (tandem) duplications, a series of duplications of entire clusters and subsequent loss of individual microRNAs from the resulting paralogous clusters. The complex history of the mir17 microRNA family appears to be closely linked to the early evolution of the vertebrate lineage.
Affiliation(s)
- Andrea Tanzer
- Lehrstuhl für Bioinformatik am Institut für Informatik und Interdisziplinäres, Zentrum für Bioinformatik, Universität Leipzig, Germany
|
36
|
Tanzer A. [The neuro-radiology of carotid stenoses]. Rev Neurol (Paris) 1966; 115:711-21. [PMID: 5982593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/17/2023]
|