1
Mattick JS, Amaral PP, Carninci P, Carpenter S, Chang HY, Chen LL, Chen R, Dean C, Dinger ME, Fitzgerald KA, Gingeras TR, Guttman M, Hirose T, Huarte M, Johnson R, Kanduri C, Kapranov P, Lawrence JB, Lee JT, Mendell JT, Mercer TR, Moore KJ, Nakagawa S, Rinn JL, Spector DL, Ulitsky I, Wan Y, Wilusz JE, Wu M. Long non-coding RNAs: definitions, functions, challenges and recommendations. Nat Rev Mol Cell Biol 2023; 24:430-447. [PMID: 36596869 PMCID: PMC10213152 DOI: 10.1038/s41580-022-00566-8] [Citation(s) in RCA: 306] [Impact Index Per Article: 306.0] [Accepted: 11/16/2022] [Indexed: 01/05/2023]
Abstract
Genes specifying long non-coding RNAs (lncRNAs) occupy a large fraction of the genomes of complex organisms. The term 'lncRNAs' encompasses RNA polymerase I (Pol I), Pol II and Pol III transcribed RNAs, and RNAs from processed introns. The various functions of lncRNAs and their many isoforms and interleaved relationships with other genes make lncRNA classification and annotation difficult. Most lncRNAs evolve more rapidly than protein-coding sequences, are cell type specific and regulate many aspects of cell differentiation and development and other physiological processes. Many lncRNAs associate with chromatin-modifying complexes, are transcribed from enhancers and nucleate phase separation of nuclear condensates and domains, indicating an intimate link between lncRNA expression and the spatial control of gene expression during development. lncRNAs also have important roles in the cytoplasm and beyond, including in the regulation of translation, metabolism and signalling. lncRNAs often have a modular structure and are rich in repeats, which are increasingly being shown to be relevant to their function. In this Consensus Statement, we address the definition and nomenclature of lncRNAs and their conservation, expression, phenotypic visibility, structure and functions. We also discuss research challenges and provide recommendations to advance the understanding of the roles of lncRNAs in development, cell biology and disease.
Affiliation(s)
- John S Mattick
- School of Biotechnology and Biomolecular Sciences, UNSW, Sydney, NSW, Australia.
- UNSW RNA Institute, UNSW, Sydney, NSW, Australia.
| | - Paulo P Amaral
- INSPER Institute of Education and Research, São Paulo, Brazil
| | - Piero Carninci
- RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Human Technopole, Milan, Italy
| | - Susan Carpenter
- Department of Molecular, Cell and Developmental Biology, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Howard Y Chang
- Center for Personal Dynamic Regulomes, Stanford University School of Medicine, Stanford, CA, USA
- Department of Dermatology, Stanford, CA, USA
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, CA, USA
| | - Ling-Ling Chen
- CAS Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, China
| | - Runsheng Chen
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
| | - Caroline Dean
- John Innes Centre, Norwich Research Park, Norwich, UK
| | - Marcel E Dinger
- School of Biotechnology and Biomolecular Sciences, UNSW, Sydney, NSW, Australia
- UNSW RNA Institute, UNSW, Sydney, NSW, Australia
| | - Katherine A Fitzgerald
- Division of Innate Immunity, Department of Medicine, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | | | - Mitchell Guttman
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
| | - Tetsuro Hirose
- Graduate School of Frontier Biosciences, Osaka University, Osaka, Japan
| | - Maite Huarte
- Department of Gene Therapy and Regulation of Gene Expression, Center for Applied Medical Research, University of Navarra, Pamplona, Spain
- Institute of Health Research of Navarra, Pamplona, Spain
| | - Rory Johnson
- School of Biology and Environmental Science, University College Dublin, Dublin, Ireland
- Conway Institute for Biomolecular and Biomedical Research, University College Dublin, Dublin, Ireland
| | - Chandrasekhar Kanduri
- Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
| | - Philipp Kapranov
- Institute of Genomics, School of Medicine, Huaqiao University, Xiamen, China
| | - Jeanne B Lawrence
- Department of Neurology, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Jeannie T Lee
- Department of Molecular Biology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Joshua T Mendell
- Howard Hughes Medical Institute, UT Southwestern Medical Center, Dallas, TX, USA
- Department of Molecular Biology, UT Southwestern Medical Center, Dallas, TX, USA
| | - Timothy R Mercer
- Australian Institute for Bioengineering and Nanotechnology, University of Queensland, Brisbane, QLD, Australia
| | - Kathryn J Moore
- Department of Medicine, New York University Grossman School of Medicine, New York, NY, USA
| | - Shinichi Nakagawa
- RNA Biology Laboratory, Faculty of Pharmaceutical Sciences, Hokkaido University, Sapporo, Japan
| | - John L Rinn
- Department of Biochemistry, University of Colorado Boulder, Boulder, CO, USA
- BioFrontiers Institute, University of Colorado Boulder, Boulder, CO, USA
- Howard Hughes Medical Institute, University of Colorado Boulder, Boulder, CO, USA
| | - David L Spector
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
| | - Igor Ulitsky
- Department of Biological Regulation, Weizmann Institute of Science, Rehovot, Israel
| | - Yue Wan
- Laboratory of RNA Genomics and Structure, Genome Institute of Singapore, A*STAR, Singapore, Singapore
- Department of Biochemistry, National University of Singapore, Singapore, Singapore
| | - Jeremy E Wilusz
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Therapeutic Innovation Center, Baylor College of Medicine, Houston, TX, USA
| | - Mian Wu
- Translational Research Institute, Henan Provincial People's Hospital, Academy of Medical Science, Zhengzhou University, Zhengzhou, China
2
Rozowsky J, Gao J, Borsari B, Yang YT, Galeev T, Gürsoy G, Epstein CB, Xiong K, Xu J, Li T, Liu J, Yu K, Berthel A, Chen Z, Navarro F, Sun MS, Wright J, Chang J, Cameron CJF, Shoresh N, Gaskell E, Drenkow J, Adrian J, Aganezov S, Aguet F, Balderrama-Gutierrez G, Banskota S, Corona GB, Chee S, Chhetri SB, Cortez Martins GC, Danyko C, Davis CA, Farid D, Farrell NP, Gabdank I, Gofin Y, Gorkin DU, Gu M, Hecht V, Hitz BC, Issner R, Jiang Y, Kirsche M, Kong X, Lam BR, Li S, Li B, Li X, Lin KZ, Luo R, Mackiewicz M, Meng R, Moore JE, Mudge J, Nelson N, Nusbaum C, Popov I, Pratt HE, Qiu Y, Ramakrishnan S, Raymond J, Salichos L, Scavelli A, Schreiber JM, Sedlazeck FJ, See LH, Sherman RM, Shi X, Shi M, Sloan CA, Strattan JS, Tan Z, Tanaka FY, Vlasova A, Wang J, Werner J, Williams B, Xu M, Yan C, Yu L, Zaleski C, Zhang J, Ardlie K, Cherry JM, Mendenhall EM, Noble WS, Weng Z, Levine ME, Dobin A, Wold B, Mortazavi A, Ren B, Gillis J, Myers RM, Snyder MP, Choudhary J, Milosavljevic A, Schatz MC, Bernstein BE, Guigó R, Gingeras TR, Gerstein M. The EN-TEx resource of multi-tissue personal epigenomes & variant-impact models. Cell 2023; 186:1493-1511.e40. [PMID: 37001506 PMCID: PMC10074325 DOI: 10.1016/j.cell.2023.02.018] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 04/03/2022] [Revised: 10/16/2022] [Accepted: 02/10/2023] [Indexed: 04/03/2023]
Abstract
Understanding how genetic variants impact molecular phenotypes is a key goal of functional genomics, currently hindered by reliance on a single haploid reference genome. Here, we present the EN-TEx resource of 1,635 open-access datasets from four donors (∼30 tissues × ∼15 assays). The datasets are mapped to matched, diploid genomes with long-read phasing and structural variants, instantiating a catalog of >1 million allele-specific loci. These loci exhibit coordinated activity along haplotypes and are less conserved than corresponding, non-allele-specific ones. Surprisingly, a deep-learning transformer model can predict the allele-specific activity based only on local nucleotide-sequence context, highlighting the importance of transcription-factor-binding motifs particularly sensitive to variants. Furthermore, combining EN-TEx with existing genome annotations reveals strong associations between allele-specific and GWAS loci. It also enables models for transferring known eQTLs to difficult-to-profile tissues (e.g., from skin to heart). Overall, EN-TEx provides rich data and generalizable models for more accurate personal functional genomics.
Affiliation(s)
- Joel Rozowsky
- Section on Biomedical Informatics and Data Science, Yale University, New Haven, CT, USA; Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Jiahao Gao
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Beatrice Borsari
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA; Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
| | - Yucheng T Yang
- Institute of Science and Technology for Brain-Inspired Intelligence; MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence; MOE Frontiers Center for Brain Science, Fudan University, Shanghai 200433, China; Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Timur Galeev
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Gamze Gürsoy
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | | | - Kun Xiong
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Jinrui Xu
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Tianxiao Li
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Jason Liu
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Keyang Yu
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
| | - Ana Berthel
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Zhanlin Chen
- Department of Statistics and Data Science, Yale University, New Haven, CT, USA
| | - Fabio Navarro
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Maxwell S Sun
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | | | - Justin Chang
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Christopher J F Cameron
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Noam Shoresh
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | - Jorg Drenkow
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
| | - Jessika Adrian
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
| | - Sergey Aganezov
- Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA
| | | | | | | | | | - Sora Chee
- Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, CA, USA
| | - Surya B Chhetri
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
| | - Gabriel Conte Cortez Martins
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Cassidy Danyko
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
| | - Carrie A Davis
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
| | - Daniel Farid
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | | | - Idan Gabdank
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
| | - Yoel Gofin
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
| | - David U Gorkin
- Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, CA, USA
| | - Mengting Gu
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Vivian Hecht
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Benjamin C Hitz
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
| | - Robbyn Issner
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Yunzhe Jiang
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Melanie Kirsche
- Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA
| | - Xiangmeng Kong
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Bonita R Lam
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
| | - Shantao Li
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Bian Li
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Xiqi Li
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
| | - Khine Zin Lin
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
| | - Ruibang Luo
- Department of Computer Science, The University of Hong Kong, Hong Kong, CHN
| | - Mark Mackiewicz
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
| | - Ran Meng
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Jill E Moore
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA, USA
| | - Jonathan Mudge
- European Bioinformatics Institute, Cambridge, Cambridgeshire, GB
| | | | - Chad Nusbaum
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Ioann Popov
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Henry E Pratt
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA, USA
| | - Yunjiang Qiu
- Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, CA, USA
| | - Srividya Ramakrishnan
- Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA
| | - Joe Raymond
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Leonidas Salichos
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA; Department of Biological and Chemical Sciences, New York Institute of Technology, Old Westbury, NY, USA
| | - Alexandra Scavelli
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
| | - Jacob M Schreiber
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Fritz J Sedlazeck
- Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA; Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA; Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - Lei Hoon See
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
| | - Rachel M Sherman
- Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA
| | - Xu Shi
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Minyi Shi
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
| | - Cricket Alicia Sloan
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
| | - J Seth Strattan
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
| | - Zhen Tan
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Forrest Y Tanaka
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
| | - Anna Vlasova
- Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain; Comparative Genomics Group, Life Science Programme, Barcelona Supercomputing Centre, Barcelona, Spain; Institute of Research in Biomedicine, Barcelona, Spain
| | - Jun Wang
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Jonathan Werner
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
| | - Brian Williams
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
| | - Min Xu
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Chengfei Yan
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Lu Yu
- Institute of Cancer Research, London, UK
| | - Christopher Zaleski
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
| | - Jing Zhang
- Department of Computer Science, University of California, Irvine, Irvine, CA, USA
| | | | - J Michael Cherry
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
| | | | - William S Noble
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA, USA
| | - Morgan E Levine
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Pathology, Yale University School of Medicine, New Haven, CT, USA
| | - Alexander Dobin
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
| | - Barbara Wold
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
| | - Ali Mortazavi
- Department of Developmental and Cell Biology, University of California, Irvine, Irvine, CA, USA
| | - Bing Ren
- Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, CA, USA
| | - Jesse Gillis
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA; Department of Physiology, University of Toronto, Toronto, ON, Canada
| | - Richard M Myers
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
| | - Michael P Snyder
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
| | | | | | - Michael C Schatz
- Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA; Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA.
| | - Bradley E Bernstein
- Broad Institute of MIT and Harvard, Cambridge, MA, USA; Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA.
| | - Roderic Guigó
- Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain; Universitat Pompeu Fabra, Barcelona, Catalonia, Spain.
| | - Thomas R Gingeras
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA.
| | - Mark Gerstein
- Section on Biomedical Informatics and Data Science, Yale University, New Haven, CT, USA; Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA; Department of Statistics and Data Science, Yale University, New Haven, CT, USA; Department of Computer Science, Yale University, New Haven, CT, USA.
3
Moore JE, Purcaro MJ, Pratt HE, Epstein CB, Shoresh N, Adrian J, Kawli T, Davis CA, Dobin A, Kaul R, Halow J, Van Nostrand EL, Freese P, Gorkin DU, Shen Y, He Y, Mackiewicz M, Pauli-Behn F, Williams BA, Mortazavi A, Keller CA, Zhang XO, Elhajjajy SI, Huey J, Dickel DE, Snetkova V, Wei X, Wang X, Rivera-Mulia JC, Rozowsky J, Zhang J, Chhetri SB, Zhang J, Victorsen A, White KP, Visel A, Yeo GW, Burge CB, Lécuyer E, Gilbert DM, Dekker J, Rinn J, Mendenhall EM, Ecker JR, Kellis M, Klein RJ, Noble WS, Kundaje A, Guigó R, Farnham PJ, Cherry JM, Myers RM, Ren B, Graveley BR, Gerstein MB, Pennacchio LA, Snyder MP, Bernstein BE, Wold B, Hardison RC, Gingeras TR, Stamatoyannopoulos JA, Weng Z. Author Correction: Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature 2022; 605:E3. [PMID: 35474001 PMCID: PMC9095460 DOI: 10.1038/s41586-021-04226-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
Affiliation(s)
| | - Jill E Moore
- University of Massachusetts Medical School, Program in Bioinformatics and Integrative Biology, Worcester, MA, USA
| | - Michael J Purcaro
- University of Massachusetts Medical School, Program in Bioinformatics and Integrative Biology, Worcester, MA, USA
| | - Henry E Pratt
- University of Massachusetts Medical School, Program in Bioinformatics and Integrative Biology, Worcester, MA, USA
| | | | - Noam Shoresh
- The Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Jessika Adrian
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
| | - Trupti Kawli
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
| | - Carrie A Davis
- Cold Spring Harbor Laboratory, Functional Genomics, Cold Spring Harbor, NY, USA
| | - Alexander Dobin
- Cold Spring Harbor Laboratory, Functional Genomics, Cold Spring Harbor, NY, USA
| | - Rajinder Kaul
- Altius Institute for Biomedical Sciences, Seattle, WA, USA; Department of Medicine, University of Washington School of Medicine, Seattle, WA, USA
| | - Jessica Halow
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
| | - Eric L Van Nostrand
- Department of Cellular and Molecular Medicine, Institute for Genomic Medicine, Stem Cell Program, Sanford Consortium for Regenerative Medicine, University of California, San Diego, La Jolla, CA, USA
| | - Peter Freese
- Program in Computational and Systems Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - David U Gorkin
- Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA; Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, CA, USA
| | - Yin Shen
- Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, CA, USA; Institute for Human Genetics, Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
| | - Yupeng He
- Genomics Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
| | - Mark Mackiewicz
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
| | | | - Brian A Williams
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
| | - Ali Mortazavi
- Department of Developmental and Cell Biology, University of California Irvine, Irvine, CA, USA
| | - Cheryl A Keller
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, USA
| | - Xiao-Ou Zhang
- University of Massachusetts Medical School, Program in Bioinformatics and Integrative Biology, Worcester, MA, USA
| | - Shaimae I Elhajjajy
- University of Massachusetts Medical School, Program in Bioinformatics and Integrative Biology, Worcester, MA, USA
| | - Jack Huey
- University of Massachusetts Medical School, Program in Bioinformatics and Integrative Biology, Worcester, MA, USA
| | - Diane E Dickel
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Valentina Snetkova
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Xintao Wei
- Department of Genetics and Genome Sciences, Institute for Systems Genomics, UConn Health, Farmington, CT, USA
| | - Xiaofeng Wang
- Département de Biochimie et Médecine Moléculaire, Université de Montréal, Montréal, Quebec, Canada; Division of Experimental Medicine, McGill University, Montreal, Quebec, Canada; Institut de Recherches Cliniques de Montréal (IRCM), Montréal, Quebec, Canada
| | - Juan Carlos Rivera-Mulia
- Department of Biological Science, Florida State University, Tallahassee, FL, USA; Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Medical School, Minneapolis, MN, USA
| | | | | | - Surya B Chhetri
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA; Biological Sciences, University of Alabama in Huntsville, Huntsville, AL, USA
| | - Jialing Zhang
- Department of Genetics, School of Medicine, Yale University, New Haven, CT, USA
| | - Alec Victorsen
- Department of Human Genetics, Institute for Genomics and Systems Biology, The University of Chicago, Chicago, IL, USA
| | | | - Axel Visel
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA; US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA; School of Natural Sciences, University of California, Merced, Merced, CA, USA
| | - Gene W Yeo
- Department of Cellular and Molecular Medicine, Institute for Genomic Medicine, Stem Cell Program, Sanford Consortium for Regenerative Medicine, University of California, San Diego, La Jolla, CA, USA
| | - Christopher B Burge
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Eric Lécuyer
- Département de Biochimie et Médecine Moléculaire, Université de Montréal, Montréal, Quebec, Canada; Division of Experimental Medicine, McGill University, Montreal, Quebec, Canada; Institut de Recherches Cliniques de Montréal (IRCM), Montréal, Quebec, Canada
| | - David M Gilbert
- Department of Biological Science, Florida State University, Tallahassee, FL, USA
| | - Job Dekker
- HHMI and Program in Systems Biology, University of Massachusetts Medical School, Worcester, MA, USA
| | - John Rinn
- University of Colorado Boulder, Boulder, CO, USA
| | - Eric M Mendenhall
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA; Biological Sciences, University of Alabama in Huntsville, Huntsville, AL, USA
| | - Joseph R Ecker
- Genomics Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA; Howard Hughes Medical Institute, The Salk Institute for Biological Studies, La Jolla, CA, USA
| | - Manolis Kellis
- The Broad Institute of Harvard and MIT, Cambridge, MA, USA; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Robert J Klein
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - William S Noble
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Anshul Kundaje
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
| | - Roderic Guigó
- Bioinformatics and Genomics Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology and Universitat Pompeu Fabra, Barcelona, Spain
| | - Peggy J Farnham
- Department of Biochemistry and Molecular Medicine, Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - J Michael Cherry
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA.
| | - Richard M Myers
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA.
| | - Bing Ren
- Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA; Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, CA, USA.
| | - Brenton R Graveley
- Department of Genetics and Genome Sciences, Institute for Systems Genomics, UConn Health, Farmington, CT, USA.
| | | | - Len A Pennacchio
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA; US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA; Comparative Biochemistry Program, University of California, Berkeley, CA, USA.
| | - Michael P Snyder
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA; Cardiovascular Institute, Stanford School of Medicine, Stanford, CA, USA.
| | - Bradley E Bernstein
- Broad Institute and Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA.
| | - Barbara Wold
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA.
| | - Ross C Hardison
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, USA.
| | - Thomas R Gingeras
- Cold Spring Harbor Laboratory, Functional Genomics, Cold Spring Harbor, NY, USA.
| | - John A Stamatoyannopoulos
- Altius Institute for Biomedical Sciences, Seattle, WA, USA; Department of Medicine, University of Washington School of Medicine, Seattle, WA, USA; Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.
| | - Zhiping Weng
- University of Massachusetts Medical School, Program in Bioinformatics and Integrative Biology, Worcester, MA, USA; Department of Thoracic Surgery, Clinical Translational Research Center, Shanghai Pulmonary Hospital, The School of Life Sciences and Technology, Tongji University, Shanghai, China; Bioinformatics Program, Boston University, Boston, MA, USA.
|
4
|
Ortiz-Ramírez C, Guillotin B, Xu X, Rahni R, Zhang S, Yan Z, Coqueiro Dias Araujo P, Demesa-Arevalo E, Lee L, Van Eck J, Gingeras TR, Jackson D, Gallagher KL, Birnbaum KD. Ground tissue circuitry regulates organ complexity in maize and Setaria. Science 2021; 374:1247-1252. [PMID: 34855479 DOI: 10.1126/science.abj2327] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Indexed: 11/02/2022]
Abstract
[Figure: see text].
Affiliation(s)
- Carlos Ortiz-Ramírez
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY 10003, USA; UGA Laboratorio Nacional de Genómica para la Biodiversidad, CINVESTAV Irapuato, Guanajuato 36821, México
| | - Bruno Guillotin
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY 10003, USA
| | - Xiaosa Xu
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Ramin Rahni
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY 10003, USA
| | - Sanqiang Zhang
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY 10003, USA
| | - Zhe Yan
- School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA 1904, USA
| | | | | | - Laura Lee
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY 10003, USA
| | - Joyce Van Eck
- Boyce Thompson Institute, Ithaca, NY 14853, USA; Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
| | | | - David Jackson
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Kimberly L Gallagher
- School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA 1904, USA
| | - Kenneth D Birnbaum
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY 10003, USA
|
5
|
Dachet F, Brown JB, Valyi-Nagy T, Narayan KD, Serafini A, Boley N, Gingeras TR, Celniker SE, Mohapatra G, Loeb JA. Selective time-dependent changes in activity and cell-specific gene expression in human postmortem brain. Sci Rep 2021; 11:6078. [PMID: 33758256 PMCID: PMC7988150 DOI: 10.1038/s41598-021-85801-6] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 09/24/2020] [Accepted: 02/24/2021] [Indexed: 12/15/2022] Open
Abstract
As a means to understand human neuropsychiatric disorders from human brain samples, we compared the transcription patterns and histological features of postmortem brain to fresh human neocortex isolated immediately following surgical removal. Compared to a number of neuropsychiatric disease-associated postmortem transcriptomes, the fresh human brain transcriptome had an entirely unique transcriptional pattern. To understand this difference, we measured genome-wide transcription as a function of time after fresh tissue removal to mimic the postmortem interval. Within a few hours, a selective reduction in the number of neuronal activity-dependent transcripts occurred with relative preservation of housekeeping genes commonly used as a reference for RNA normalization. Gene clustering indicated a rapid reduction in neuronal gene expression with a reciprocal time-dependent increase in astroglial and microglial gene expression that continued to increase for at least 24 h after tissue resection. Predicted transcriptional changes were confirmed histologically on the same tissue demonstrating that while neurons were degenerating, glial cells underwent an outgrowth of their processes. The rapid loss of neuronal genes and reciprocal expression of glial genes highlights highly dynamic transcriptional and cellular changes that occur during the postmortem interval. Understanding these time-dependent changes in gene expression in postmortem brain samples is critical for the interpretation of research studies on human brain disorders.
Affiliation(s)
- Fabien Dachet
- University of Illinois at Chicago, Chicago, IL, 60612, USA.
| | - James B Brown
- Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
| | | | | | - Anna Serafini
- University of Illinois at Chicago, Chicago, IL, 60612, USA
| | - Nathan Boley
- University of California, Berkeley, CA, 94720, USA
| | | | | | | | - Jeffrey A Loeb
- University of Illinois at Chicago, Chicago, IL, 60612, USA.
|
6
|
Xu X, Crow M, Rice BR, Li F, Harris B, Liu L, Demesa-Arevalo E, Lu Z, Wang L, Fox N, Wang X, Drenkow J, Luo A, Char SN, Yang B, Sylvester AW, Gingeras TR, Schmitz RJ, Ware D, Lipka AE, Gillis J, Jackson D. Single-cell RNA sequencing of developing maize ears facilitates functional analysis and trait candidate gene discovery. Dev Cell 2021; 56:557-568.e6. [PMID: 33400914 DOI: 10.1016/j.devcel.2020.12.015] [Citation(s) in RCA: 102] [Impact Index Per Article: 34.0] [Received: 09/04/2020] [Revised: 10/31/2020] [Accepted: 12/15/2020] [Indexed: 12/30/2022]
Abstract
Crop productivity depends on activity of meristems that produce optimized plant architectures, including that of the maize ear. A comprehensive understanding of development requires insight into the full diversity of cell types and developmental domains and the gene networks required to specify them. Until now, these were identified primarily by morphology and insights from classical genetics, which are limited by genetic redundancy and pleiotropy. Here, we investigated the transcriptional profiles of 12,525 single cells from developing maize ears. The resulting developmental atlas provides a single-cell RNA sequencing (scRNA-seq) map of an inflorescence. We validated our results by mRNA in situ hybridization and by fluorescence-activated cell sorting (FACS) RNA-seq, and we show how these data may facilitate genetic studies by predicting genetic redundancy, integrating transcriptional networks, and identifying candidate genes associated with crop yield traits.
Affiliation(s)
- Xiaosa Xu
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Megan Crow
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Brian R Rice
- Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Forrest Li
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Benjamin Harris
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Lei Liu
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | | | - Zefu Lu
- Department of Genetics, University of Georgia, Athens, GA 30602, USA
| | - Liya Wang
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Nathan Fox
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Xiaofei Wang
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Jorg Drenkow
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Anding Luo
- Department of Molecular Biology, University of Wyoming, Laramie, WY 82071, USA
| | - Si Nian Char
- Division of Plant Sciences, Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
| | - Bing Yang
- Division of Plant Sciences, Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA; Donald Danforth Plant Science Center, St. Louis, MO 63132, USA
| | - Anne W Sylvester
- Department of Molecular Biology, University of Wyoming, Laramie, WY 82071, USA
| | | | - Robert J Schmitz
- Department of Genetics, University of Georgia, Athens, GA 30602, USA
| | - Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA; USDA-ARS, Robert W. Holley Center, Ithaca, NY 14853, USA
| | - Alexander E Lipka
- Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Jesse Gillis
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - David Jackson
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA.
|
7
|
Nechooshtan G, Yunusov D, Chang K, Gingeras TR. Processing by RNase 1 forms tRNA halves and distinct Y RNA fragments in the extracellular environment. Nucleic Acids Res 2020; 48:8035-8049. [PMID: 32609822 PMCID: PMC7430647 DOI: 10.1093/nar/gkaa526] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Received: 04/19/2020] [Revised: 06/07/2020] [Accepted: 06/26/2020] [Indexed: 12/11/2022] Open
Abstract
Extracellular RNAs participate in intercellular communication, and are being studied as promising minimally invasive diagnostic markers. Several studies in recent years showed that tRNA halves and distinct Y RNA fragments are abundant in the extracellular space, including in biofluids. While their regulatory and diagnostic potential has gained a substantial amount of attention, the biogenesis of these extracellular RNA fragments remains largely unexplored. Here, we demonstrate that these fragments are produced by RNase 1, a highly active secreted nuclease. We use RNA sequencing to investigate the effect of a null mutation of RNase 1 on the levels of tRNA halves and Y RNA fragments in the extracellular environment of cultured human cells. We complement and extend our RNA sequencing results with northern blots, showing that tRNAs and Y RNAs in the non-vesicular extracellular compartment are released from cells as full-length precursors and are subsequently cleaved to distinct fragments. In support of these results, formation of tRNA halves is recapitulated by recombinant human RNase 1 in our in vitro assay. These findings assign a novel function for RNase 1, and position it as a strong candidate for generation of tRNA halves and Y RNA fragments in biofluids.
Affiliation(s)
- Gal Nechooshtan
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Dinar Yunusov
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Kenneth Chang
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
|
8
|
Moore JE, Purcaro MJ, Pratt HE, Epstein CB, Shoresh N, Adrian J, Kawli T, Davis CA, Dobin A, Kaul R, Halow J, Van Nostrand EL, Freese P, Gorkin DU, Shen Y, He Y, Mackiewicz M, Pauli-Behn F, Williams BA, Mortazavi A, Keller CA, Zhang XO, Elhajjajy SI, Huey J, Dickel DE, Snetkova V, Wei X, Wang X, Rivera-Mulia JC, Rozowsky J, Zhang J, Chhetri SB, Zhang J, Victorsen A, White KP, Visel A, Yeo GW, Burge CB, Lécuyer E, Gilbert DM, Dekker J, Rinn J, Mendenhall EM, Ecker JR, Kellis M, Klein RJ, Noble WS, Kundaje A, Guigó R, Farnham PJ, Cherry JM, Myers RM, Ren B, Graveley BR, Gerstein MB, Pennacchio LA, Snyder MP, Bernstein BE, Wold B, Hardison RC, Gingeras TR, Stamatoyannopoulos JA, Weng Z. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature 2020; 583:699-710. [PMID: 32728249 PMCID: PMC7410828 DOI: 10.1038/s41586-020-2493-4] [Citation(s) in RCA: 919] [Impact Index Per Article: 229.8] [Received: 08/26/2017] [Accepted: 05/27/2020] [Indexed: 12/13/2022]
Abstract
The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements (ENCODE) Project has expanded analysis of the cell and tissue repertoires of RNA transcription, chromatin structure and modification, DNA methylation, chromatin looping, and occupancy by transcription factors and RNA-binding proteins. Here we summarize these efforts, which have produced 5,992 new experimental datasets, including systematic determinations across mouse fetal development. All data are available through the ENCODE data portal (https://www.encodeproject.org), including phase II ENCODE and Roadmap Epigenomics data. We have developed a registry of 926,535 human and 339,815 mouse candidate cis-regulatory elements, covering 7.9 and 3.4% of their respective genomes, by integrating selected datatypes associated with gene regulation, and constructed a web-based server (SCREEN; http://screen.encodeproject.org) to provide flexible, user-defined access to this resource. Collectively, the ENCODE data and registry provide an expansive resource for the scientific community to build a better understanding of the organization and function of the human and mouse genomes.
Affiliation(s)
- Jill E Moore
- University of Massachusetts Medical School, Program in Bioinformatics and Integrative Biology, Worcester, MA, USA
| | - Michael J Purcaro
- University of Massachusetts Medical School, Program in Bioinformatics and Integrative Biology, Worcester, MA, USA
| | - Henry E Pratt
- University of Massachusetts Medical School, Program in Bioinformatics and Integrative Biology, Worcester, MA, USA
| | | | - Noam Shoresh
- The Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Jessika Adrian
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
| | - Trupti Kawli
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
| | - Carrie A Davis
- Cold Spring Harbor Laboratory, Functional Genomics, Cold Spring Harbor, NY, USA
| | - Alexander Dobin
- Cold Spring Harbor Laboratory, Functional Genomics, Cold Spring Harbor, NY, USA
| | - Rajinder Kaul
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
- Department of Medicine, University of Washington School of Medicine, Seattle, WA, USA
| | - Jessica Halow
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
| | - Eric L Van Nostrand
- Department of Cellular and Molecular Medicine, Institute for Genomic Medicine, Stem Cell Program, Sanford Consortium for Regenerative Medicine, University of California, San Diego, La Jolla, CA, USA
| | - Peter Freese
- Program in Computational and Systems Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - David U Gorkin
- Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA
- Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, CA, USA
| | - Yin Shen
- Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, CA, USA
- Institute for Human Genetics, Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
| | - Yupeng He
- Genomics Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
| | - Mark Mackiewicz
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
| | | | - Brian A Williams
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
| | - Ali Mortazavi
- Department of Developmental and Cell Biology, University of California Irvine, Irvine, CA, USA
| | - Cheryl A Keller
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, USA
| | - Xiao-Ou Zhang
- University of Massachusetts Medical School, Program in Bioinformatics and Integrative Biology, Worcester, MA, USA
| | - Shaimae I Elhajjajy
- University of Massachusetts Medical School, Program in Bioinformatics and Integrative Biology, Worcester, MA, USA
| | - Jack Huey
- University of Massachusetts Medical School, Program in Bioinformatics and Integrative Biology, Worcester, MA, USA
| | - Diane E Dickel
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Valentina Snetkova
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Xintao Wei
- Department of Genetics and Genome Sciences, Institute for Systems Genomics, UConn Health, Farmington, CT, USA
| | - Xiaofeng Wang
- Département de Biochimie et Médecine Moléculaire, Université de Montréal, Montréal, Quebec, Canada
- Division of Experimental Medicine, McGill University, Montreal, Quebec, Canada
- Institut de Recherches Cliniques de Montréal (IRCM), Montréal, Quebec, Canada
| | - Juan Carlos Rivera-Mulia
- Department of Biological Science, Florida State University, Tallahassee, FL, USA
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Medical School, Minneapolis, MN, USA
| | | | | | - Surya B Chhetri
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- Biological Sciences, University of Alabama in Huntsville, Huntsville, AL, USA
| | - Jialing Zhang
- Department of Genetics, School of Medicine, Yale University, New Haven, CT, USA
| | - Alec Victorsen
- Department of Human Genetics, Institute for Genomics and Systems Biology, The University of Chicago, Chicago, IL, USA
| | | | - Axel Visel
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- School of Natural Sciences, University of California, Merced, Merced, CA, USA
| | - Gene W Yeo
- Department of Cellular and Molecular Medicine, Institute for Genomic Medicine, Stem Cell Program, Sanford Consortium for Regenerative Medicine, University of California, San Diego, La Jolla, CA, USA
| | - Christopher B Burge
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Eric Lécuyer
- Département de Biochimie et Médecine Moléculaire, Université de Montréal, Montréal, Quebec, Canada
- Division of Experimental Medicine, McGill University, Montreal, Quebec, Canada
- Institut de Recherches Cliniques de Montréal (IRCM), Montréal, Quebec, Canada
| | - David M Gilbert
- Department of Biological Science, Florida State University, Tallahassee, FL, USA
| | - Job Dekker
- HHMI and Program in Systems Biology, University of Massachusetts Medical School, Worcester, MA, USA
| | - John Rinn
- University of Colorado Boulder, Boulder, CO, USA
| | - Eric M Mendenhall
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- Biological Sciences, University of Alabama in Huntsville, Huntsville, AL, USA
| | - Joseph R Ecker
- Genomics Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Howard Hughes Medical Institute, The Salk Institute for Biological Studies, La Jolla, CA, USA
| | - Manolis Kellis
- The Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Robert J Klein
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - William S Noble
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Anshul Kundaje
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
| | - Roderic Guigó
- Bioinformatics and Genomics Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology and Universitat Pompeu Fabra, Barcelona, Spain
| | - Peggy J Farnham
- Department of Biochemistry and Molecular Medicine, Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - J Michael Cherry
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA.
| | - Richard M Myers
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA.
| | - Bing Ren
- Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA.
- Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, CA, USA.
| | - Brenton R Graveley
- Department of Genetics and Genome Sciences, Institute for Systems Genomics, UConn Health, Farmington, CT, USA.
| | | | - Len A Pennacchio
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
- Comparative Biochemistry Program, University of California, Berkeley, CA, USA.
| | - Michael P Snyder
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA.
- Cardiovascular Institute, Stanford School of Medicine, Stanford, CA, USA.
| | - Bradley E Bernstein
- Broad Institute and Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA.
| | - Barbara Wold
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA.
| | - Ross C Hardison
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, USA.
| | - Thomas R Gingeras
- Cold Spring Harbor Laboratory, Functional Genomics, Cold Spring Harbor, NY, USA.
| | - John A Stamatoyannopoulos
- Altius Institute for Biomedical Sciences, Seattle, WA, USA.
- Department of Medicine, University of Washington School of Medicine, Seattle, WA, USA.
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.
| | - Zhiping Weng
- University of Massachusetts Medical School, Program in Bioinformatics and Integrative Biology, Worcester, MA, USA.
- Department of Thoracic Surgery, Clinical Translational Research Center, Shanghai Pulmonary Hospital, The School of Life Sciences and Technology, Tongji University, Shanghai, China.
- Bioinformatics Program, Boston University, Boston, MA, USA.
|
9
|
Breschi A, Muñoz-Aguirre M, Wucher V, Davis CA, Garrido-Martín D, Djebali S, Gillis J, Pervouchine DD, Vlasova A, Dobin A, Zaleski C, Drenkow J, Danyko C, Scavelli A, Reverter F, Snyder MP, Gingeras TR, Guigó R. A limited set of transcriptional programs define major cell types. Genome Res 2020; 30:1047-1059. [PMID: 32759341 PMCID: PMC7397875 DOI: 10.1101/gr.263186.120] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 03/10/2020] [Accepted: 04/29/2020] [Indexed: 12/12/2022]
Abstract
We have produced RNA sequencing data for 53 primary cells from different locations in the human body. The clustering of these primary cells reveals that most cells in the human body share a few broad transcriptional programs, which define five major cell types: epithelial, endothelial, mesenchymal, neural, and blood cells. These act as basic components of many tissues and organs. Based on gene expression, these cell types redefine the basic histological types by which tissues have been traditionally classified. We identified genes whose expression is specific to these cell types, and from these genes, we estimated the contribution of the major cell types to the composition of human tissues. We found this cellular composition to be a characteristic signature of tissues and to reflect tissue morphological heterogeneity and histology. We identified changes in cellular composition in different tissues associated with age and sex, and found that departures from the normal cellular composition correlate with histological phenotypes associated with disease.
Affiliation(s)
- Alessandra Breschi
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, E-08003 Barcelona, Catalonia, Spain
- Universitat Pompeu Fabra (UPF), E-08003 Barcelona, Catalonia, Spain
- Department of Genetics, Stanford University, Stanford, California 94305, USA
| | - Manuel Muñoz-Aguirre
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, E-08003 Barcelona, Catalonia, Spain
- Universitat Politècnica de Catalunya. Departament d'Estadística i Investigació Operativa, 08034 Barcelona, Catalonia, Spain
| | - Valentin Wucher
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, E-08003 Barcelona, Catalonia, Spain
| | - Carrie A Davis
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11742, USA
| | - Diego Garrido-Martín
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, E-08003 Barcelona, Catalonia, Spain
- Universitat Pompeu Fabra (UPF), E-08003 Barcelona, Catalonia, Spain
| | - Sarah Djebali
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, E-08003 Barcelona, Catalonia, Spain
- Universitat Pompeu Fabra (UPF), E-08003 Barcelona, Catalonia, Spain
- Institut National de Recherche en Santé Digestive (IRSD), Université de Toulouse, Institut National de la Santé et de la Recherche Médicale (INSERM), Institut National de Recherche pour l'Agriculture, l'Alimentation et l'Environnement (INRAE), École Nationale Vétérinaire de Toulouse (ENVT), Université Paul Sabatier (UPS), 31024 Toulouse, France
| | - Jesse Gillis
- Department of Genetics, Stanford University, Stanford, California 94305, USA
| | - Dmitri D Pervouchine
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, E-08003 Barcelona, Catalonia, Spain
- Skolkovo Institute for Science and Technology, Moscow, Russia 143025
| | - Anna Vlasova
- Research Institute of Molecular Pathology (IMP), Vienna Biocenter (VBC), 1030 Vienna, Austria
| | - Alexander Dobin
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11742, USA
| | - Chris Zaleski
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11742, USA
| | - Jorg Drenkow
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11742, USA
| | - Cassidy Danyko
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11742, USA
| | | | - Ferran Reverter
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, E-08003 Barcelona, Catalonia, Spain
- Universitat Pompeu Fabra (UPF), E-08003 Barcelona, Catalonia, Spain
| | - Michael P Snyder
- Department of Genetics, Stanford University, Stanford, California 94305, USA
| | - Thomas R Gingeras
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11742, USA
| | - Roderic Guigó
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, E-08003 Barcelona, Catalonia, Spain
- Universitat Pompeu Fabra (UPF), E-08003 Barcelona, Catalonia, Spain
|
10
|
Snyder MP, Gingeras TR, Moore JE, Weng Z, Gerstein MB, Ren B, Hardison RC, Stamatoyannopoulos JA, Graveley BR, Feingold EA, Pazin MJ, Pagan M, Gilchrist DA, Hitz BC, Cherry JM, Bernstein BE, Mendenhall EM, Zerbino DR, Frankish A, Flicek P, Myers RM. Perspectives on ENCODE. Nature 2020; 583:693-698. [PMID: 32728248 PMCID: PMC7410827 DOI: 10.1038/s41586-020-2449-8] [Citation(s) in RCA: 81] [Impact Index Per Article: 20.3] [Received: 12/07/2019] [Accepted: 05/05/2020] [Indexed: 12/25/2022]
Abstract
The Encyclopedia of DNA Elements (ENCODE) Project launched in 2003 with the long-term goal of developing a comprehensive map of functional elements in the human genome. These included genes, biochemical regions associated with gene regulation (for example, transcription factor binding sites, open chromatin, and histone marks) and transcript isoforms. The marks serve as sites for candidate cis-regulatory elements (cCREs) that may serve functional roles in regulating gene expression. The project has been extended to model organisms, particularly the mouse. In the third phase of ENCODE, nearly a million and more than 300,000 cCRE annotations have been generated for human and mouse, respectively, and these have provided a valuable resource for the scientific community.
Affiliation(s)
- Michael P Snyder
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA.
- Cardiovascular Institute, Stanford School of Medicine, Stanford, CA, USA.
| | - Thomas R Gingeras
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
| | - Jill E Moore
- University of Massachusetts Medical School, Program in Bioinformatics and Integrative Biology, Worcester, MA, USA
| | - Zhiping Weng
- University of Massachusetts Medical School, Program in Bioinformatics and Integrative Biology, Worcester, MA, USA
- Department of Thoracic Surgery, Clinical Translational Research Center, Shanghai Pulmonary Hospital, The School of Life Sciences and Technology, Tongji University, Shanghai, China
- Bioinformatics Program, Boston University, Boston, MA, USA
| | | | - Bing Ren
- Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, CA, USA
- Center for Epigenomics, University of California, San Diego, La Jolla, CA, USA
| | - Ross C Hardison
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, USA
| | - John A Stamatoyannopoulos
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Department of Medicine, University of Washington, Seattle, WA, USA
| | - Brenton R Graveley
- Department of Genetics and Genome Sciences, Institute for Systems Genomics, UConn Health, Farmington, CT, USA
| | - Elise A Feingold
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Michael J Pazin
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Michael Pagan
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Daniel A Gilchrist
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Benjamin C Hitz
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
| | - J Michael Cherry
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
| | - Bradley E Bernstein
- Broad Institute and Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Eric M Mendenhall
- Biological Sciences, University of Alabama in Huntsville, Huntsville, AL, USA
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
| | - Daniel R Zerbino
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK
| | - Adam Frankish
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK
| | - Richard M Myers
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
| |
Collapse
|
11
|
Wang L, Lu Z, delaBastide M, Van Buren P, Wang X, Ghiban C, Regulski M, Drenkow J, Xu X, Ortiz-Ramirez C, Marco CF, Goodwin S, Dobin A, Birnbaum KD, Jackson DP, Martienssen RA, McCombie WR, Micklos DA, Schatz MC, Ware DH, Gingeras TR. Management, Analyses, and Distribution of the MaizeCODE Data on the Cloud. Front Plant Sci 2020; 11:289. [PMID: 32296450 PMCID: PMC7136414 DOI: 10.3389/fpls.2020.00289] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/08/2019] [Accepted: 02/26/2020] [Indexed: 06/11/2023]
Abstract
MaizeCODE is a project aimed at identifying and analyzing functional elements in the maize genome. In its initial phase, MaizeCODE assayed up to five tissues from four maize strains (B73, NC350, W22, TIL11) by RNA-seq, ChIP-seq, RAMPAGE, and small RNA sequencing. To facilitate reproducible science and provide both human and machine access to the MaizeCODE data, we enhanced SciApps, a cloud-based portal, for analysis and distribution of both raw data and analysis results. Based on the SciApps workflow platform, we generated new components to support the complete cycle of MaizeCODE data management. These include publicly accessible scientific workflows for the reproducible and shareable analysis of various functional data, a RESTful API for batch processing and distribution of data and metadata, a searchable data page that lists each MaizeCODE experiment as a reproducible workflow, and integrated JBrowse genome browser tracks linked with workflows and metadata. The SciApps portal is a flexible platform that allows the integration of new analysis tools, workflows, and genomic data from multiple projects. Through metadata and a ready-to-compute cloud-based platform, the portal experience improves access to the MaizeCODE data and facilitates its analysis.
Affiliation(s)
- Liya Wang
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, United States
- Zhenyuan Lu
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, United States
- Peter Van Buren
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, United States
- Xiaofei Wang
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, United States
- Cornel Ghiban
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, United States
- Michael Regulski
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, United States
- Jorg Drenkow
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, United States
- Xiaosa Xu
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, United States
- Cristina F. Marco
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, United States
- Sara Goodwin
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, United States
- Alexander Dobin
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, United States
- David P. Jackson
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, United States
- David A. Micklos
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, United States
- Michael C. Schatz
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, United States
  - Johns Hopkins University, Baltimore, MD, United States
- Doreen H. Ware
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, United States
  - USDA-ARS Robert W. Holley Center for Agriculture and Health, Ithaca, NY, United States
|
12
|
Rahmanian S, Murad R, Breschi A, Zeng W, Mackiewicz M, Williams B, Davis CA, Roberts B, Meadows S, Moore D, Trout D, Zaleski C, Dobin A, Sei LH, Drenkow J, Scavelli A, Gingeras TR, Wold BJ, Myers RM, Guigó R, Mortazavi A. Dynamics of microRNA expression during mouse prenatal development. Genome Res 2019; 29:1900-1909. [PMID: 31645363 PMCID: PMC6836743 DOI: 10.1101/gr.248997.119] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 01/30/2019] [Accepted: 08/29/2019] [Indexed: 12/15/2022]
Abstract
MicroRNAs (miRNAs) play a critical role as posttranscriptional regulators of gene expression. The ENCODE Project profiled the expression of miRNAs in an extensive set of organs during a time-course of mouse embryonic development and captured the expression dynamics of 785 miRNAs. We found distinct organ-specific and developmental stage–specific miRNA expression clusters, with an overall pattern of increasing organ-specific expression as embryonic development proceeds. Comparative analysis of conserved miRNAs in mouse and human revealed stronger clustering of expression patterns by organ type rather than by species. An analysis of messenger RNA expression clusters compared with miRNA expression clusters identifies the potential role of specific miRNA expression clusters in suppressing the expression of mRNAs specific to other developmental programs in the organ in which these miRNAs are expressed during embryonic development. Our results provide the most comprehensive time-course of miRNA expression as part of an integrated ENCODE reference data set for mouse embryonic development.
Affiliation(s)
- Sorena Rahmanian
  - Department of Developmental and Cell Biology, University of California Irvine, Irvine, California 92697, USA
  - Center for Complex Biological Systems, University of California Irvine, Irvine, California 92697, USA
- Rabi Murad
  - Department of Developmental and Cell Biology, University of California Irvine, Irvine, California 92697, USA
  - Center for Complex Biological Systems, University of California Irvine, Irvine, California 92697, USA
- Alessandra Breschi
  - Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Barcelona 08003, Catalonia, Spain
- Weihua Zeng
  - Department of Developmental and Cell Biology, University of California Irvine, Irvine, California 92697, USA
  - Center for Complex Biological Systems, University of California Irvine, Irvine, California 92697, USA
- Mark Mackiewicz
  - HudsonAlpha Institute for Biotechnology, Huntsville, Alabama 35806, USA
- Brian Williams
  - Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
- Carrie A Davis
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Brian Roberts
  - HudsonAlpha Institute for Biotechnology, Huntsville, Alabama 35806, USA
- Sarah Meadows
  - HudsonAlpha Institute for Biotechnology, Huntsville, Alabama 35806, USA
- Dianna Moore
  - HudsonAlpha Institute for Biotechnology, Huntsville, Alabama 35806, USA
- Diane Trout
  - Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
- Chris Zaleski
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Alex Dobin
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Lei-Hoon Sei
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Jorg Drenkow
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Alex Scavelli
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Thomas R Gingeras
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Barbara J Wold
  - Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
- Richard M Myers
  - HudsonAlpha Institute for Biotechnology, Huntsville, Alabama 35806, USA
- Roderic Guigó
  - Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Barcelona 08003, Catalonia, Spain
- Ali Mortazavi
  - Department of Developmental and Cell Biology, University of California Irvine, Irvine, California 92697, USA
  - Center for Complex Biological Systems, University of California Irvine, Irvine, California 92697, USA
|
13
|
Ballouz S, Dobin A, Gingeras TR, Gillis J. The fractured landscape of RNA-seq alignment: the default in our STARs. Nucleic Acids Res 2019; 46:5125-5138. [PMID: 29718481 PMCID: PMC6007662 DOI: 10.1093/nar/gky325] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 11/08/2017] [Accepted: 04/16/2018] [Indexed: 12/28/2022] Open
Abstract
Many tools are available for RNA-seq alignment and expression quantification, with comparative value being hard to establish. Benchmarking assessments often highlight methods’ good performance, but are focused on either model data or fail to explain variation in performance. This leaves us to ask, what is the most meaningful way to assess different alignment choices? And importantly, where is there room for progress? In this work, we explore the answers to these two questions by performing an exhaustive assessment of the STAR aligner. We assess STAR’s performance across a range of alignment parameters using common metrics, and then on biologically focused tasks. We find technical metrics such as fraction mapping or expression profile correlation to be uninformative, capturing properties unlikely to have any role in biological discovery. Surprisingly, we find that changes in alignment parameters within a wide range have little impact on both technical and biological performance. Yet, when performance finally does break, it happens in difficult regions, such as X-Y paralogs and MHC genes. We believe improved reporting by developers will help establish where results are likely to be robust or fragile, providing a better baseline to establish where methodological progress can still occur.
Affiliation(s)
- Sara Ballouz
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Woodbury, NY 11797, USA
- Alexander Dobin
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Woodbury, NY 11797, USA
- Thomas R Gingeras
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Woodbury, NY 11797, USA
- Jesse Gillis
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Woodbury, NY 11797, USA
|
14
|
Zhang XO, Gingeras TR, Weng Z. Genome-wide analysis of polymerase III-transcribed Alu elements suggests cell-type-specific enhancer function. Genome Res 2019; 29:1402-1414. [PMID: 31413151 PMCID: PMC6724667 DOI: 10.1101/gr.249789.119] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 02/22/2019] [Accepted: 07/24/2019] [Indexed: 01/09/2023]
Abstract
Alu elements are one of the most successful families of transposons in the human genome. A portion of Alu elements is transcribed by RNA Pol III, whereas the remaining ones are part of Pol II transcripts. Because Alu elements are highly repetitive, it has been difficult to identify the Pol III–transcribed elements and quantify their expression levels. In this study, we generated high-resolution, long-genomic-span RAMPAGE data in 155 biosamples all with matching RNA-seq data and built an atlas of 17,249 Pol III–transcribed Alu elements. We further performed an integrative analysis on the ChIP-seq data of 10 histone marks and hundreds of transcription factors, whole-genome bisulfite sequencing data, ChIA-PET data, and functional data in several biosamples, and our results revealed that although the human-specific Alu elements are transcriptionally repressed, the older, expressed Alu elements may be exapted by the human host to function as cell-type–specific enhancers for their nearby protein-coding genes.
Affiliation(s)
- Xiao-Ou Zhang
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts 01605, USA
- Thomas R Gingeras
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Zhiping Weng
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts 01605, USA
  - Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, Massachusetts 01605, USA
|
15
|
Zhang Q, Chao TC, Patil VS, Qin Y, Tiwari SK, Chiou J, Dobin A, Tsai CM, Li Z, Dang J, Gupta S, Urdahl KB, Nizet V, Gingeras TR, Gaulton KJ, Rana TM. Genome-wide analysis identifies pairs of cis-acting lncRNAs and protein-coding genes involved in innate immunity. The Journal of Immunology 2019. [DOI: 10.4049/jimmunol.202.supp.185.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/02/2023]
Abstract
Long noncoding RNAs (lncRNAs) can regulate target gene expression by acting in cis (locally) or in trans (non-locally). Here, we performed genome-wide expression analysis of Toll-like receptor (TLR)-stimulated human macrophages to identify pairs of cis-acting lncRNAs and protein-coding genes involved in innate immunity. A total of 229 gene pairs were identified, many of which were commonly regulated by signaling through multiple TLRs and were involved in the cytokine responses to infection by group B Streptococcus. We focused on elucidating the function of one lncRNA, named lnc-MARCKS or ROCKI (Regulator of Cytokines and Inflammation), which was induced by multiple TLR stimuli and acted as a master regulator of inflammatory responses. ROCKI interacted with APEX1 (apurinic/apyrimidinic endodeoxyribonuclease 1) to form a ribonucleoprotein complex at the MARCKS promoter. In turn, ROCKI–APEX1 recruited the histone deacetylase HDAC1, which removed the H3K27ac modification from the promoter, thus reducing MARCKS transcription and subsequent Ca2+ signaling and inflammatory gene expression. Finally, genetic variants affecting ROCKI expression were linked to a reduced risk of certain inflammatory and infectious disease in humans, including inflammatory bowel disease and tuberculosis. Collectively, these data highlight the importance of cis-acting lncRNAs in TLR signaling, innate immunity, and pathophysiological inflammation.
Affiliation(s)
- Qiong Zhang
  - Department of Pediatrics, University of California San Diego School of Medicine, 9500 Gilman Drive MC 0762, La Jolla, California 92093, USA
- Ti-chun Chao
  - Department of Pediatrics, University of California San Diego School of Medicine, 9500 Gilman Drive MC 0762, La Jolla, California 92093, USA
- Veena S. Patil
  - Department of Pediatrics, University of California San Diego School of Medicine, 9500 Gilman Drive MC 0762, La Jolla, California 92093, USA
- Yue Qin
  - Department of Pediatrics, University of California San Diego School of Medicine, 9500 Gilman Drive MC 0762, La Jolla, California 92093, USA
- Shashi Kant Tiwari
  - Department of Pediatrics, University of California San Diego School of Medicine, 9500 Gilman Drive MC 0762, La Jolla, California 92093, USA
- Joshua Chiou
  - Department of Pediatrics, University of California San Diego School of Medicine, 9500 Gilman Drive MC 0762, La Jolla, California 92093, USA
- Alexander Dobin
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA
- Chih-Ming Tsai
  - Department of Pediatrics, University of California San Diego School of Medicine, 9500 Gilman Drive MC 0762, La Jolla, California 92093, USA
- Zhonghan Li
  - Department of Pediatrics, University of California San Diego School of Medicine, 9500 Gilman Drive MC 0762, La Jolla, California 92093, USA
- Jason Dang
  - Department of Pediatrics, University of California San Diego School of Medicine, 9500 Gilman Drive MC 0762, La Jolla, California 92093, USA
- Shagun Gupta
  - Department of Pediatrics, University of California San Diego School of Medicine, 9500 Gilman Drive MC 0762, La Jolla, California 92093, USA
- Kevin B Urdahl
  - Center for Infectious Disease Research (CIDR), Seattle, WA 98109; Department of Immunology, University of Washington School of Medicine, Seattle, WA 98195, USA
- Victor Nizet
  - Department of Pediatrics, University of California San Diego School of Medicine, 9500 Gilman Drive MC 0762, La Jolla, California 92093, USA
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego School of Medicine, 9500 Gilman Drive MC 0760, La Jolla, California 92093, USA
- Kyle J Gaulton
  - Department of Pediatrics, University of California San Diego School of Medicine, 9500 Gilman Drive MC 0762, La Jolla, California 92093, USA
- Tariq M. Rana
  - Department of Pediatrics, University of California San Diego School of Medicine, 9500 Gilman Drive MC 0762, La Jolla, California 92093, USA
|
16
|
Zhang Q, Chao TC, Patil VS, Qin Y, Tiwari SK, Chiou J, Dobin A, Tsai CM, Li Z, Dang J, Gupta S, Urdahl K, Nizet V, Gingeras TR, Gaulton KJ, Rana TM. The long noncoding RNA ROCKI regulates inflammatory gene expression. EMBO J 2019; 38:embj.2018100041. [PMID: 30918008 PMCID: PMC6463213 DOI: 10.15252/embj.2018100041] [Citation(s) in RCA: 63] [Impact Index Per Article: 12.6] [Received: 06/13/2018] [Revised: 02/12/2019] [Accepted: 02/14/2019] [Indexed: 12/15/2022] Open
Abstract
Long noncoding RNAs (lncRNAs) can regulate target gene expression by acting in cis (locally) or in trans (non-locally). Here, we performed genome-wide expression analysis of Toll-like receptor (TLR)-stimulated human macrophages to identify pairs of cis-acting lncRNAs and protein-coding genes involved in innate immunity. A total of 229 gene pairs were identified, many of which were commonly regulated by signaling through multiple TLRs and were involved in the cytokine responses to infection by group B Streptococcus. We focused on elucidating the function of one lncRNA, named lnc-MARCKS or ROCKI (Regulator of Cytokines and Inflammation), which was induced by multiple TLR stimuli and acted as a master regulator of inflammatory responses. ROCKI interacted with APEX1 (apurinic/apyrimidinic endodeoxyribonuclease 1) to form a ribonucleoprotein complex at the MARCKS promoter. In turn, ROCKI-APEX1 recruited the histone deacetylase HDAC1, which removed the H3K27ac modification from the promoter, thus reducing MARCKS transcription and subsequent Ca2+ signaling and inflammatory gene expression. Finally, genetic variants affecting ROCKI expression were linked to a reduced risk of certain inflammatory and infectious disease in humans, including inflammatory bowel disease and tuberculosis. Collectively, these data highlight the importance of cis-acting lncRNAs in TLR signaling, innate immunity, and pathophysiological inflammation.
Affiliation(s)
- Qiong Zhang
  - Department of Pediatrics, University of California San Diego School of Medicine, La Jolla, CA, USA
- Ti-Chun Chao
  - Department of Pediatrics, University of California San Diego School of Medicine, La Jolla, CA, USA
- Veena S Patil
  - Department of Pediatrics, University of California San Diego School of Medicine, La Jolla, CA, USA
- Yue Qin
  - Department of Pediatrics, University of California San Diego School of Medicine, La Jolla, CA, USA
- Shashi Kant Tiwari
  - Department of Pediatrics, University of California San Diego School of Medicine, La Jolla, CA, USA
- Joshua Chiou
  - Department of Pediatrics, University of California San Diego School of Medicine, La Jolla, CA, USA
- Chih-Ming Tsai
  - Department of Pediatrics, University of California San Diego School of Medicine, La Jolla, CA, USA
- Zhonghan Li
  - Department of Pediatrics, University of California San Diego School of Medicine, La Jolla, CA, USA
- Jason Dang
  - Department of Pediatrics, University of California San Diego School of Medicine, La Jolla, CA, USA
- Shagun Gupta
  - Department of Pediatrics, University of California San Diego School of Medicine, La Jolla, CA, USA
- Kevin Urdahl
  - Center for Infectious Disease Research (CIDR), Seattle, WA, USA
  - Department of Immunology, University of Washington School of Medicine, Seattle, WA, USA
- Victor Nizet
  - Department of Pediatrics, University of California San Diego School of Medicine, La Jolla, CA, USA
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego School of Medicine, La Jolla, CA, USA
- Kyle J Gaulton
  - Department of Pediatrics, University of California San Diego School of Medicine, La Jolla, CA, USA
- Tariq M Rana
  - Department of Pediatrics, University of California San Diego School of Medicine, La Jolla, CA, USA
|
17
|
Batut PJ, Gingeras TR. Conserved noncoding transcription and core promoter regulatory code in early Drosophila development. eLife 2017; 6:29005. [PMID: 29260710 PMCID: PMC5754203 DOI: 10.7554/elife.29005] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 05/30/2017] [Accepted: 12/19/2017] [Indexed: 01/30/2023] Open
Abstract
Multicellular development is driven by regulatory programs that orchestrate the transcription of protein-coding and noncoding genes. To decipher this genomic regulatory code, and to investigate the developmental relevance of noncoding transcription, we compared genome-wide promoter activity throughout embryogenesis in 5 Drosophila species. Core promoters, generally not thought to play a significant regulatory role, in fact impart restrictions on the developmental timing of gene expression on a global scale. We propose a hierarchical regulatory model in which core promoters define broad windows of opportunity for expression, by defining a range of transcription factors from which they can receive regulatory inputs. This two-tiered mechanism globally orchestrates developmental gene expression, including extremely widespread noncoding transcription. The sequence and expression specificity of noncoding RNA promoters are evolutionarily conserved, implying biological relevance. Overall, this work introduces a hierarchical model for developmental gene regulation, and reveals a major role for noncoding transcription in animal development.
Affiliation(s)
- Philippe J Batut
  - Watson School of Biological Sciences, Cold Spring Harbor Laboratory, New York, United States
- Thomas R Gingeras
  - Watson School of Biological Sciences, Cold Spring Harbor Laboratory, New York, United States
|
18
|
Lagarde J, Uszczynska-Ratajczak B, Carbonell S, Pérez-Lluch S, Abad A, Davis C, Gingeras TR, Frankish A, Harrow J, Guigo R, Johnson R. High-throughput annotation of full-length long noncoding RNAs with capture long-read sequencing. Nat Genet 2017; 49:1731-1740. [PMID: 29106417 PMCID: PMC5709232 DOI: 10.1038/ng.3988] [Citation(s) in RCA: 166] [Impact Index Per Article: 23.7] [Received: 02/01/2017] [Accepted: 10/11/2017] [Indexed: 12/20/2022]
Abstract
Accurate annotation of genes and their transcripts is a foundation of genomics, but currently no annotation technique combines throughput and accuracy. As a result, reference gene collections remain incomplete: many gene models are fragmentary, and thousands more remain uncataloged, particularly for long noncoding RNAs (lncRNAs). To accelerate lncRNA annotation, the GENCODE consortium has developed RNA Capture Long Seq (CLS), which combines targeted RNA capture with third-generation long-read sequencing. Here we present an experimental reannotation of the GENCODE intergenic lncRNA populations in matched human and mouse tissues that resulted in novel transcript models for 3,574 and 561 gene loci, respectively. CLS approximately doubled the annotated complexity of targeted loci, outperforming existing short-read techniques. Full-length transcript models produced by CLS enabled us to definitively characterize the genomic features of lncRNAs, including promoter and gene structure, and protein-coding potential. Thus, CLS removes a long-standing bottleneck in transcriptome annotation and generates manual-quality full-length transcript models at high-throughput scales.
Affiliation(s)
- Julien Lagarde
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, 08003 Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Barbara Uszczynska-Ratajczak
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, 08003 Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Silvia Carbonell
  - R&D Department, Quantitative Genomic Medicine Laboratories (qGenomics), Barcelona, Spain
- Sílvia Pérez-Lluch
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, 08003 Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Amaya Abad
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, 08003 Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Carrie Davis
  - Functional Genomics Group, Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, New York 11724, USA
- Thomas R. Gingeras
  - Functional Genomics Group, Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, New York 11724, USA
- Adam Frankish
  - Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, UK CB10 1HH
- Jennifer Harrow
  - Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, UK CB10 1HH
- Roderic Guigo
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, 08003 Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Rory Johnson
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, 08003 Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
|
19
|
Breschi A, Djebali S, Gillis J, Pervouchine DD, Dobin A, Davis CA, Gingeras TR, Guigó R. Gene-specific patterns of expression variation across organs and species. Genome Biol 2016; 17:151. [PMID: 27391956 PMCID: PMC4937605 DOI: 10.1186/s13059-016-1008-y] [Citation(s) in RCA: 52] [Impact Index Per Article: 6.5] [Received: 04/12/2016] [Accepted: 06/14/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND A comparison of transcriptional profiles derived from different tissues in a given species or among different species assumes that commonalities reflect evolutionarily conserved programs and that differences reflect species or tissue responses to environmental conditions or developmental program staging. Apparently conflicting results have been published regarding whether organ-specific transcriptional patterns dominate over species-specific patterns, or vice versa, making it unclear to what extent the biology of a given organism can be extrapolated to another. These studies have in common that they treat the transcriptomes monolithically, implicitly ignoring that each gene is likely to have a specific pattern of transcriptional variation across organs and species. RESULTS We use linear models to quantify this pattern. We find a continuum in the spectrum of expression variation: the expression of some genes varies considerably across species and little across organs, and simply reflects evolutionary distance. At the other extreme are genes whose expression varies considerably across organs and little across species; these genes are much more likely to be associated with diseases than are genes whose expression varies predominantly across species. CONCLUSIONS Whether transcriptomes, when considered globally, cluster preferentially according to one component or the other may not be a property of the transcriptomes, but rather a consequence of the dominant behavior of a subset of genes. Therefore, the values of the components of the variance of expression for each gene could become a useful resource when planning, interpreting, and extrapolating experimental data from mouse to humans.
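The abstract does not spell out the linear models used, so the following is only an illustrative sketch of the general idea: for a single gene measured in every organ of every species, an additive model (expression ~ organ + species) splits the total sum of squares into an organ component, a species component, and a residual. The function name `variance_fractions`, the balanced organ-by-species layout, and the toy values are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def variance_fractions(expr):
    """Split one gene's expression variance into organ, species and residual parts.

    expr: 2-D array-like, rows = organs, columns = species (balanced design,
    one value per organ/species pair). Hypothetical helper for illustration.
    """
    expr = np.asarray(expr, dtype=float)
    n_organs, n_species = expr.shape
    grand_mean = expr.mean()
    ss_total = ((expr - grand_mean) ** 2).sum()
    if ss_total == 0:
        # A constant gene has no variance to attribute.
        return {"organ": 0.0, "species": 0.0, "residual": 0.0}
    # Sums of squares for the two main effects of the additive model.
    ss_organ = n_species * ((expr.mean(axis=1) - grand_mean) ** 2).sum()
    ss_species = n_organs * ((expr.mean(axis=0) - grand_mean) ** 2).sum()
    ss_residual = ss_total - ss_organ - ss_species
    return {
        "organ": ss_organ / ss_total,
        "species": ss_species / ss_total,
        "residual": ss_residual / ss_total,
    }

# Toy data (made up): 3 organs x 2 species.
organ_driven = [[1.0, 1.1], [5.0, 5.2], [9.0, 8.9]]    # varies mostly across organs
species_driven = [[1.0, 6.0], [1.2, 6.1], [0.9, 5.8]]  # varies mostly across species

organ_result = variance_fractions(organ_driven)
species_result = variance_fractions(species_driven)
print(organ_result)
print(species_result)
```

Under these assumptions, a gene whose organ fraction dominates sits at the organ-variable end of the continuum described in the abstract, and a gene whose species fraction dominates sits at the other end.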
Affiliation(s)
- Alessandra Breschi
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Sarah Djebali
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - GenPhySE, Université de Toulouse, INRA, INPT, INP-ENVT, Castanet Tolosan, France
- Jesse Gillis
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11742, USA
- Dmitri D Pervouchine
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Alex Dobin
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11742, USA
- Carrie A Davis
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11742, USA
- Roderic Guigó
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
|
20
|
Abstract
Recent advances in high-throughput sequencing technology have made it possible to probe cell transcriptomes by generating hundreds of millions of short reads that represent fragments of the transcribed RNA molecules. The first and most crucial task in RNA-seq data analysis is mapping the reads to the reference genome. STAR (Spliced Transcripts Alignment to a Reference) is an RNA-seq mapper that performs highly accurate spliced sequence alignment at an ultrafast speed. The STAR alignment algorithm can be controlled by many user-defined parameters. Here, we describe the most important STAR options and parameters, as well as best practices for achieving the maximum mapping accuracy and speed.
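The abstract does not list the options it covers, so the sketch below only illustrates what a basic, parameterized STAR mapping call can look like. It assembles a command line from a few widely used STAR options (`--runThreadN`, `--genomeDir`, `--readFilesIn`, `--readFilesCommand`, `--outFileNamePrefix`, `--outSAMtype`). The helper name `star_align_command`, the file names, and the chosen values are assumptions for illustration and are not recommendations from the chapter. The command is printed rather than executed.

```python
import shlex

def star_align_command(genome_dir, reads, out_prefix, threads=4, gzipped=False):
    """Build (but do not run) a STAR mapping command as an argument list.

    Hypothetical helper: genome_dir is a previously generated STAR index,
    reads is a list of one (single-end) or two (paired-end) FASTQ paths.
    """
    cmd = [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", *reads,
        "--outFileNamePrefix", out_prefix,
        # Write a coordinate-sorted BAM instead of the default SAM output.
        "--outSAMtype", "BAM", "SortedByCoordinate",
    ]
    if gzipped:
        # Tell STAR how to decompress gzipped FASTQ input.
        cmd += ["--readFilesCommand", "zcat"]
    return cmd

# Placeholder paths; substitute a real index directory and read files.
cmd = star_align_command(
    "star_index",
    ["sample_R1.fastq.gz", "sample_R2.fastq.gz"],
    "sample_",
    threads=8,
    gzipped=True,
)
print(shlex.join(cmd))
```

Returning an argument list keeps the call safe to hand to `subprocess.run(cmd, check=True)` on a machine where STAR and a matching genome index are available.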
Affiliation(s)
- Alexander Dobin
- Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, NY, 11746, USA.
| | - Thomas R Gingeras
- Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, NY, 11746, USA
21
|
Chakrabortty SK, Prakash A, Nechooshtan G, Hearn S, Gingeras TR. Extracellular vesicle-mediated transfer of processed and functional RNY5 RNA. RNA 2015; 21:1966-79. [PMID: 26392588 PMCID: PMC4604435 DOI: 10.1261/rna.053629.115] [Received: 07/31/2015] [Accepted: 08/03/2015] [Indexed: 05/22/2023]
Abstract
Extracellular vesicles (EVs) have been proposed as a means to promote intercellular communication. We show that when human primary cells are exposed to cancer cell EVs, rapid cell death of the primary cells is observed, while cancer cells treated with primary or cancer cell EVs do not display this response. The active agents that trigger cell death are 29- to 31-nucleotide (nt) or 22- to 23-nt processed fragments of an 83-nt primary transcript of the human RNY5 gene that are highly likely to be formed within the EVs. Primary cells treated with either cancer cell EVs, deproteinized total RNA from either primary or cancer cell EVs, or synthetic versions of 31- and 23-nt fragments trigger rapid cell death in a dose-dependent manner. The transfer of processed RNY5 fragments through EVs may reflect a novel strategy used by cancer cells toward the establishment of a favorable microenvironment for their proliferation and invasion.
Affiliation(s)
| | - Ashwin Prakash
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
| | - Gal Nechooshtan
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
| | - Stephen Hearn
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
| | - Thomas R Gingeras
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
22
|
Abstract
Mapping of large sets of high-throughput sequencing reads to a reference genome is one of the foundational steps in RNA-seq data analysis. The STAR software package performs this task with high levels of accuracy and speed. In addition to detecting annotated and novel splice junctions, STAR is capable of discovering more complex RNA sequence arrangements, such as chimeric and circular RNA. STAR can align spliced sequences of any length with moderate error rates, providing scalability for emerging sequencing technologies. STAR generates output files that can be used for many downstream analyses such as transcript/gene expression quantification, differential gene expression, novel isoform reconstruction, and signal visualization. In this unit, we describe computational protocols that produce various output files, use different RNA-seq datatypes, and utilize different mapping strategies. STAR is open source software that can be run on Unix, Linux, or Mac OS X systems.
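As a rough illustration of the kind of mapping run this abstract describes, the sketch below assembles a STAR alignment command line in Python without executing it. The helper name `star_align_command`, the file names, and the index directory are hypothetical; the options shown (`--runThreadN`, `--genomeDir`, `--readFilesIn`, `--outFileNamePrefix`, `--outSAMtype`, `--readFilesCommand`, `--quantMode`) are standard STAR parameters, but the values appropriate for a given dataset should be taken from the STAR manual rather than from this sketch.

```python
import shlex

def star_align_command(genome_dir, reads, out_prefix,
                       threads=4, gzipped=False, gene_counts=False):
    """Build (but do not run) a STAR alignment command as an argument list.

    genome_dir: directory holding a previously generated STAR genome index.
    reads: list with one FASTQ path (single-end) or two (paired-end).
    out_prefix: prefix STAR prepends to every output file name.
    """
    cmd = [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", *reads,
        "--outFileNamePrefix", out_prefix,
        # Write a coordinate-sorted BAM instead of the default SAM output.
        "--outSAMtype", "BAM", "SortedByCoordinate",
    ]
    if gzipped:
        # Tell STAR how to decompress gzipped FASTQ input on the fly.
        cmd += ["--readFilesCommand", "zcat"]
    if gene_counts:
        # Per-gene read counts; requires an index built with annotations.
        cmd += ["--quantMode", "GeneCounts"]
    return cmd

# Hypothetical paired-end sample and index directory
cmd = star_align_command(
    "star_index",
    ["sample_R1.fastq.gz", "sample_R2.fastq.gz"],
    "sample_",
    threads=8,
    gzipped=True,
    gene_counts=True,
)
print(shlex.join(cmd))
```

Keeping the command as a list makes it straightforward to pass to `subprocess.run` later, and `shlex.join` gives a copy-pastable string for logging.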
23
|
Yue F, Cheng Y, Breschi A, Vierstra J, Wu W, Ryba T, Sandstrom R, Ma Z, Davis C, Pope BD, Shen Y, Pervouchine DD, Djebali S, Thurman RE, Kaul R, Rynes E, Kirilusha A, Marinov GK, Williams BA, Trout D, Amrhein H, Fisher-Aylor K, Antoshechkin I, DeSalvo G, See LH, Fastuca M, Drenkow J, Zaleski C, Dobin A, Prieto P, Lagarde J, Bussotti G, Tanzer A, Denas O, Li K, Bender MA, Zhang M, Byron R, Groudine MT, McCleary D, Pham L, Ye Z, Kuan S, Edsall L, Wu YC, Rasmussen MD, Bansal MS, Kellis M, Keller CA, Morrissey CS, Mishra T, Jain D, Dogan N, Harris RS, Cayting P, Kawli T, Boyle AP, Euskirchen G, Kundaje A, Lin S, Lin Y, Jansen C, Malladi VS, Cline MS, Erickson DT, Kirkup VM, Learned K, Sloan CA, Rosenbloom KR, Lacerda de Sousa B, Beal K, Pignatelli M, Flicek P, Lian J, Kahveci T, Lee D, Kent WJ, Ramalho Santos M, Herrero J, Notredame C, Johnson A, Vong S, Lee K, Bates D, Neri F, Diegel M, Canfield T, Sabo PJ, Wilken MS, Reh TA, Giste E, Shafer A, Kutyavin T, Haugen E, Dunn D, Reynolds AP, Neph S, Humbert R, Hansen RS, De Bruijn M, Selleri L, Rudensky A, Josefowicz S, Samstein R, Eichler EE, Orkin SH, Levasseur D, Papayannopoulou T, Chang KH, Skoultchi A, Gosh S, Disteche C, Treuting P, Wang Y, Weiss MJ, Blobel GA, Cao X, Zhong S, Wang T, Good PJ, Lowdon RF, Adams LB, Zhou XQ, Pazin MJ, Feingold EA, Wold B, Taylor J, Mortazavi A, Weissman SM, Stamatoyannopoulos JA, Snyder MP, Guigo R, Gingeras TR, Gilbert DM, Hardison RC, Beer MA, Ren B. A comparative encyclopedia of DNA elements in the mouse genome. Nature 2014; 515:355-64. [PMID: 25409824 PMCID: PMC4266106 DOI: 10.1038/nature13992] [Received: 02/03/2014] [Accepted: 10/24/2014] [Indexed: 12/11/2022]
Abstract
The laboratory mouse shares the majority of its protein-coding genes with humans, making it the premier model organism in biomedical research, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases.
Affiliation(s)
- Feng Yue
- [1] Ludwig Institute for Cancer Research and University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA. [2] Department of Biochemistry and Molecular Biology, College of Medicine, The Pennsylvania State University, Hershey, Pennsylvania 17033, USA
| | - Yong Cheng
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Alessandra Breschi
- Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain
| | - Jeff Vierstra
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Weisheng Wu
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Tyrone Ryba
- Department of Biological Science, 319 Stadium Drive, Florida State University, Tallahassee, Florida 32306-4295, USA
| | - Richard Sandstrom
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Zhihai Ma
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Carrie Davis
- Functional Genomics, Cold Spring Harbor Laboratory, Bungtown Road, Cold Spring Harbor, New York 11724, USA
| | - Benjamin D Pope
- Department of Biological Science, 319 Stadium Drive, Florida State University, Tallahassee, Florida 32306-4295, USA
| | - Yin Shen
- Ludwig Institute for Cancer Research and University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA
| | - Dmitri D Pervouchine
- Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain
| | - Sarah Djebali
- Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain
| | - Robert E Thurman
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Rajinder Kaul
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Eric Rynes
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Anthony Kirilusha
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - Georgi K Marinov
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - Brian A Williams
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - Diane Trout
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - Henry Amrhein
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - Katherine Fisher-Aylor
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - Igor Antoshechkin
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - Gilberto DeSalvo
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - Lei-Hoon See
- Functional Genomics, Cold Spring Harbor Laboratory, Bungtown Road, Cold Spring Harbor, New York 11724, USA
| | - Meagan Fastuca
- Functional Genomics, Cold Spring Harbor Laboratory, Bungtown Road, Cold Spring Harbor, New York 11724, USA
| | - Jorg Drenkow
- Functional Genomics, Cold Spring Harbor Laboratory, Bungtown Road, Cold Spring Harbor, New York 11724, USA
| | - Chris Zaleski
- Functional Genomics, Cold Spring Harbor Laboratory, Bungtown Road, Cold Spring Harbor, New York 11724, USA
| | - Alex Dobin
- Functional Genomics, Cold Spring Harbor Laboratory, Bungtown Road, Cold Spring Harbor, New York 11724, USA
| | - Pablo Prieto
- Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain
| | - Julien Lagarde
- Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain
| | - Giovanni Bussotti
- Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain
| | - Andrea Tanzer
- [1] Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain. [2] Department of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Waehringerstrasse 17/3/303, A-1090 Vienna, Austria
| | - Olgert Denas
- Departments of Biology and Mathematics and Computer Science, Emory University, O. Wayne Rollins Research Center, 1510 Clifton Road NE, Atlanta, Georgia 30322, USA
| | - Kanwei Li
- Departments of Biology and Mathematics and Computer Science, Emory University, O. Wayne Rollins Research Center, 1510 Clifton Road NE, Atlanta, Georgia 30322, USA
| | - M A Bender
- [1] Department of Pediatrics, University of Washington, Seattle, Washington 98195, USA. [2] Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA
| | - Miaohua Zhang
- Basic Science Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA
| | - Rachel Byron
- Basic Science Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA
| | - Mark T Groudine
- [1] Basic Science Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA. [2] Department of Radiation Oncology, University of Washington, Seattle, Washington 98195, USA
| | - David McCleary
- Ludwig Institute for Cancer Research and University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA
| | - Long Pham
- Ludwig Institute for Cancer Research and University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA
| | - Zhen Ye
- Ludwig Institute for Cancer Research and University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA
| | - Samantha Kuan
- Ludwig Institute for Cancer Research and University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA
| | - Lee Edsall
- Ludwig Institute for Cancer Research and University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA
| | - Yi-Chieh Wu
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts 02139, USA
| | - Matthew D Rasmussen
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts 02139, USA
| | - Mukul S Bansal
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts 02139, USA
| | - Manolis Kellis
- [1] Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts 02139, USA. [2] Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
| | - Cheryl A Keller
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Christapher S Morrissey
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Tejaswini Mishra
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Deepti Jain
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Nergiz Dogan
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Robert S Harris
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Philip Cayting
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Trupti Kawli
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Alan P Boyle
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Ghia Euskirchen
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Anshul Kundaje
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Shin Lin
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Yiing Lin
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Camden Jansen
- Department of Developmental and Cell Biology, University of California, Irvine, Irvine, California 92697, USA
| | - Venkat S Malladi
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Melissa S Cline
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, California 95064, USA
| | - Drew T Erickson
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Vanessa M Kirkup
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, California 95064, USA
| | - Katrina Learned
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, California 95064, USA
| | - Cricket A Sloan
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Kate R Rosenbloom
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, California 95064, USA
| | - Beatriz Lacerda de Sousa
- Departments of Obstetrics/Gynecology and Pathology, and Center for Reproductive Sciences, University of California San Francisco, San Francisco, California 94143, USA
| | - Kathryn Beal
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Miguel Pignatelli
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Jin Lian
- Yale University, Department of Genetics, PO Box 208005, 333 Cedar Street, New Haven, Connecticut 06520-8005, USA
| | - Tamer Kahveci
- Computer & Information Sciences & Engineering, University of Florida, Gainesville, Florida 32611, USA
| | - Dongwon Lee
- McKusick-Nathans Institute of Genetic Medicine and Department of Biomedical Engineering, Johns Hopkins University, 733 N. Broadway, BRB 573 Baltimore, Maryland 21205, USA
| | - W James Kent
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, California 95064, USA
| | - Miguel Ramalho Santos
- Departments of Obstetrics/Gynecology and Pathology, and Center for Reproductive Sciences, University of California San Francisco, San Francisco, California 94143, USA
| | - Javier Herrero
- [1] European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK. [2] Bill Lyons Informatics Centre, UCL Cancer Institute, University College London, London WC1E 6DD, UK
| | - Cedric Notredame
- Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain
| | - Audra Johnson
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Shinny Vong
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Kristen Lee
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Daniel Bates
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Fidencio Neri
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Morgan Diegel
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Theresa Canfield
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Peter J Sabo
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Matthew S Wilken
- Department of Biological Structure, University of Washington, HSB I-516, 1959 NE Pacific Street, Seattle, Washington 98195, USA
| | - Thomas A Reh
- Department of Biological Structure, University of Washington, HSB I-516, 1959 NE Pacific Street, Seattle, Washington 98195, USA
| | - Erika Giste
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Anthony Shafer
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Tanya Kutyavin
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Eric Haugen
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Douglas Dunn
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Alex P Reynolds
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Shane Neph
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Richard Humbert
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - R Scott Hansen
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Marella De Bruijn
- MRC Molecular Haematology Unit, University of Oxford, Oxford OX3 9DS, UK
| | - Licia Selleri
- Department of Cell and Developmental Biology, Weill Cornell Medical College, New York, New York 10065, USA
| | - Alexander Rudensky
- HHMI and Ludwig Center at Memorial Sloan Kettering Cancer Center, Immunology Program, Memorial Sloan Kettering Cancer Center, New York, New York 10065, USA
| | - Steven Josefowicz
- HHMI and Ludwig Center at Memorial Sloan Kettering Cancer Center, Immunology Program, Memorial Sloan Kettering Cancer Center, New York, New York 10065, USA
| | - Robert Samstein
- HHMI and Ludwig Center at Memorial Sloan Kettering Cancer Center, Immunology Program, Memorial Sloan Kettering Cancer Center, New York, New York 10065, USA
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Stuart H Orkin
- Dana Farber Cancer Institute, Harvard Medical School, Cambridge, Massachusetts 02138, USA
| | - Dana Levasseur
- University of Iowa Carver College of Medicine, Department of Internal Medicine, Iowa City, Iowa 52242, USA
| | - Thalia Papayannopoulou
- Division of Hematology, Department of Medicine, University of Washington, Seattle, Washington 98195, USA
| | - Kai-Hsin Chang
- University of Iowa Carver College of Medicine, Department of Internal Medicine, Iowa City, Iowa 52242, USA
| | - Arthur Skoultchi
- Department of Cell Biology, Albert Einstein College of Medicine, Bronx, New York 10461, USA
| | - Srikanta Gosh
- Department of Cell Biology, Albert Einstein College of Medicine, Bronx, New York 10461, USA
| | - Christine Disteche
- Department of Pathology, University of Washington, Seattle, Washington 98195, USA
| | - Piper Treuting
- Department of Comparative Medicine, University of Washington, Seattle, Washington 98195, USA
| | - Yanli Wang
- Bioinformatics and Genomics program, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Mitchell J Weiss
- Department of Hematology, St Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
| | - Gerd A Blobel
- [1] Division of Hematology, The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA. [2] Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | - Xiaoyi Cao
- Department of Bioengineering, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093, USA
| | - Sheng Zhong
- Department of Bioengineering, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093, USA
| | - Ting Wang
- Department of Genetics, Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63108, USA
| | - Peter J Good
- NHGRI, National Institutes of Health, 5635 Fishers Lane, Bethesda, Maryland 20892-9307, USA
| | - Rebecca F Lowdon
- NHGRI, National Institutes of Health, 5635 Fishers Lane, Bethesda, Maryland 20892-9307, USA
| | - Leslie B Adams
- NHGRI, National Institutes of Health, 5635 Fishers Lane, Bethesda, Maryland 20892-9307, USA
| | - Xiao-Qiao Zhou
- NHGRI, National Institutes of Health, 5635 Fishers Lane, Bethesda, Maryland 20892-9307, USA
| | - Michael J Pazin
- NHGRI, National Institutes of Health, 5635 Fishers Lane, Bethesda, Maryland 20892-9307, USA
| | - Elise A Feingold
- NHGRI, National Institutes of Health, 5635 Fishers Lane, Bethesda, Maryland 20892-9307, USA
| | - Barbara Wold
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - James Taylor
- Departments of Biology and Mathematics and Computer Science, Emory University, O. Wayne Rollins Research Center, 1510 Clifton Road NE, Atlanta, Georgia 30322, USA
| | - Ali Mortazavi
- Department of Developmental and Cell Biology, University of California, Irvine, Irvine, California 92697, USA
| | - Sherman M Weissman
- Yale University, Department of Genetics, PO Box 208005, 333 Cedar Street, New Haven, Connecticut 06520-8005, USA
| | | | - Michael P Snyder
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Roderic Guigo
- Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain
| | - Thomas R Gingeras
- Functional Genomics, Cold Spring Harbor Laboratory, Bungtown Road, Cold Spring Harbor, New York 11724, USA
| | - David M Gilbert
- Department of Biological Science, 319 Stadium Drive, Florida State University, Tallahassee, Florida 32306-4295, USA
| | - Ross C Hardison
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Michael A Beer
- McKusick-Nathans Institute of Genetic Medicine and Department of Biomedical Engineering, Johns Hopkins University, 733 N. Broadway, BRB 573 Baltimore, Maryland 21205, USA
| | - Bing Ren
- Ludwig Institute for Cancer Research and University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA
24
|
Lin S, Lin Y, Nery JR, Urich MA, Breschi A, Davis CA, Dobin A, Zaleski C, Beer MA, Chapman WC, Gingeras TR, Ecker JR, Snyder MP. Comparison of the transcriptional landscapes between human and mouse tissues. Proc Natl Acad Sci U S A 2014; 111:17224-9. [PMID: 25413365 PMCID: PMC4260565 DOI: 10.1073/pnas.1413624111] [Indexed: 11/18/2022]
Abstract
Although the similarities between humans and mice are typically highlighted, morphologically and genetically, there are many differences. To better understand these two species on a molecular level, we performed a comparison of the expression profiles of 15 tissues by deep RNA sequencing and examined the similarities and differences in the transcriptome for both protein-coding and -noncoding transcripts. Although commonalities are evident in the expression of tissue-specific genes between the two species, the expression for many sets of genes was found to be more similar in different tissues within the same species than between species. These findings were further corroborated by associated epigenetic histone mark analyses. We also find that many noncoding transcripts are expressed at a low level and are not detectable at appreciable levels across individuals. Moreover, the majority lack obvious sequence homologs between species, even when we restrict our attention to those which are most highly reproducible across biological replicates. Overall, our results indicate that there is considerable RNA expression diversity between humans and mice, well beyond what was described previously, likely reflecting the fundamental physiological differences between these two organisms.
Affiliation(s)
- Shin Lin
- Department of Genetics, Stanford University, Stanford, CA 94305; Division of Cardiovascular Medicine, Stanford University, Stanford, CA 94305
| | - Yiing Lin
- Department of Surgery, Washington University School of Medicine, St. Louis, MO 63110
| | - Joseph R Nery
- Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037
| | - Mark A Urich
- Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037
| | - Alessandra Breschi
- Centre for Genomic Regulation and UPF, Catalonia, 08003 Barcelona, Spain; Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, 08003 Barcelona, Spain
| | - Carrie A Davis
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11742
| | - Alexander Dobin
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11742
| | - Christopher Zaleski
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11742
| | - Michael A Beer
- McKusick-Nathans Institute of Genetic Medicine and the Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21205
| | - William C Chapman
- Department of Surgery, Washington University School of Medicine, St. Louis, MO 63110
| | - Thomas R Gingeras
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11742; Affymetrix, Inc., Santa Clara, CA 95051
| | - Joseph R Ecker
- Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037; Howard Hughes Medical Institute, The Salk Institute for Biological Studies, La Jolla, CA 92037
| | - Michael P Snyder
- Department of Genetics, Stanford University, Stanford, CA 94305
25
|
Fagegaltier D, König A, Gordon A, Lai EC, Gingeras TR, Hannon GJ, Shcherbata HR. A genome-wide survey of sexually dimorphic expression of Drosophila miRNAs identifies the steroid hormone-induced miRNA let-7 as a regulator of sexual identity. Genetics 2014; 198:647-68. [PMID: 25081570 PMCID: PMC4196619 DOI: 10.1534/genetics.114.169268] [Received: 05/20/2014] [Accepted: 07/14/2014] [Indexed: 12/23/2022]
Abstract
MiRNAs bear an increasing number of functions throughout development and in the aging adult. Here we address their role in establishing sexually dimorphic traits and sexual identity in male and female Drosophila. Our survey of miRNA populations in each sex identifies sets of miRNAs differentially expressed in male and female tissues across various stages of development. The pervasive sex-biased expression of miRNAs generally increases with the complexity and sexual dimorphism of tissues, gonads revealing the most striking biases. We find that the male-specific regulation of the X chromosome is relevant to miRNA expression on two levels. First, in the male gonad, testis-biased miRNAs tend to reside on the X chromosome. Second, in the soma, X-linked miRNAs do not systematically rely on dosage compensation. We set out to address the importance of a sex-biased expression of miRNAs in establishing sexually dimorphic traits. Our study of the conserved let-7-C miRNA cluster controlled by the sex-biased hormone ecdysone places let-7 as a primary modulator of the sex-determination hierarchy. Flies with modified let-7 levels present doublesex-related phenotypes and express sex-determination genes normally restricted to the opposite sex. In testes and ovaries, alterations of the ecdysone-induced let-7 result in aberrant gonadal somatic cell behavior and non-cell-autonomous defects in early germline differentiation. Gonadal defects as well as aberrant expression of sex-determination genes persist in aging adults under hormonal control. Together, our findings place ecdysone and let-7 as modulators of a somatic systemic signal that helps establish and sustain sexual identity in males and females and differentiation in gonads. This work establishes the foundation for a role of miRNAs in sexual dimorphism and demonstrates that hormonal control of cellular sexual identity, similar to that in vertebrates, exists in Drosophila.
Affiliation(s)
- Delphine Fagegaltier: Howard Hughes Medical Institute, Watson School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724; Watson School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724
- Annekatrin König: Max Planck Research Group of Gene Expression and Signaling, Max Planck Institute for Biophysical Chemistry, Göttingen 37077, Germany
- Assaf Gordon: Howard Hughes Medical Institute, Watson School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724
- Eric C Lai: Department of Developmental Biology, Sloan-Kettering Institute, New York, New York 10065
- Thomas R Gingeras: Watson School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724
- Gregory J Hannon: Howard Hughes Medical Institute, Watson School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724; Watson School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724
- Halyna R Shcherbata: Max Planck Research Group of Gene Expression and Signaling, Max Planck Institute for Biophysical Chemistry, Göttingen 37077, Germany
|
26
|
Gerstein MB, Rozowsky J, Yan KK, Wang D, Cheng C, Brown JB, Davis CA, Hillier L, Sisu C, Li JJ, Pei B, Harmanci AO, Duff MO, Djebali S, Alexander RP, Alver BH, Auerbach R, Bell K, Bickel PJ, Boeck ME, Boley NP, Booth BW, Cherbas L, Cherbas P, Di C, Dobin A, Drenkow J, Ewing B, Fang G, Fastuca M, Feingold EA, Frankish A, Gao G, Good PJ, Guigó R, Hammonds A, Harrow J, Hoskins RA, Howald C, Hu L, Huang H, Hubbard TJP, Huynh C, Jha S, Kasper D, Kato M, Kaufman TC, Kitchen RR, Ladewig E, Lagarde J, Lai E, Leng J, Lu Z, MacCoss M, May G, McWhirter R, Merrihew G, Miller DM, Mortazavi A, Murad R, Oliver B, Olson S, Park PJ, Pazin MJ, Perrimon N, Pervouchine D, Reinke V, Reymond A, Robinson G, Samsonova A, Saunders GI, Schlesinger F, Sethi A, Slack FJ, Spencer WC, Stoiber MH, Strasbourger P, Tanzer A, Thompson OA, Wan KH, Wang G, Wang H, Watkins KL, Wen J, Wen K, Xue C, Yang L, Yip K, Zaleski C, Zhang Y, Zheng H, Brenner SE, Graveley BR, Celniker SE, Gingeras TR, Waterston R. Comparative analysis of the transcriptome across distant species. Nature 2014; 512:445-8. [PMID: 25164755 PMCID: PMC4155737 DOI: 10.1038/nature13424] [Citation(s) in RCA: 239] [Impact Index Per Article: 23.9] [Received: 04/10/2013] [Accepted: 04/30/2014] [Indexed: 12/30/2022]
Abstract
The transcriptome is the readout of the genome. Identifying common features in it across distant species can reveal fundamental principles. To this end, the ENCODE and modENCODE consortia have generated large amounts of matched RNA-sequencing data for human, worm and fly. Uniform processing and comprehensive annotation of these data allow comparison across metazoan phyla, extending beyond earlier within-phylum transcriptome comparisons and revealing ancient, conserved features. Specifically, we discover co-expression modules shared across animals, many of which are enriched in developmental genes. Moreover, we use expression patterns to align the stages in worm and fly development and find a novel pairing between worm embryo and fly pupae, in addition to the embryo-to-embryo and larvae-to-larvae pairings. Furthermore, we find that the extent of non-canonical, non-coding transcription is similar in each organism, per base pair. Finally, we find in all three organisms that the gene-expression levels, both coding and non-coding, can be quantitatively predicted from chromatin features at the promoter using a 'universal model' based on a single set of organism-independent parameters.
Affiliation(s)
- Mark B Gerstein: Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA; Department of Computer Science, Yale University, 51 Prospect Street, New Haven, Connecticut 06511, USA
- Joel Rozowsky: Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Koon-Kiu Yan: Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Daifeng Wang: Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Chao Cheng: Department of Genetics, Geisel School of Medicine at Dartmouth, Hanover, New Hampshire 03755, USA; Institute for Quantitative Biomedical Sciences, Norris Cotton Cancer Center, Geisel School of Medicine at Dartmouth, Lebanon, New Hampshire 03766, USA
- James B Brown: Department of Genome Dynamics, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA; Department of Statistics, University of California, Berkeley, 367 Evans Hall, Berkeley, California 94720-3860, USA
- Carrie A Davis: Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- LaDeana Hillier: Department of Genome Sciences and University of Washington School of Medicine, William H. Foege Building S350D, 1705 Northeast Pacific Street, Box 355065 Seattle, Washington 98195-5065, USA
- Cristina Sisu: Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Jingyi Jessica Li: Department of Statistics, University of California, Berkeley, 367 Evans Hall, Berkeley, California 94720-3860, USA; Department of Statistics, University of California, Los Angeles, California 90095-1554, USA; Department of Human Genetics, University of California, Los Angeles, California 90095-7088, USA
- Baikang Pei: Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Arif O Harmanci: Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Michael O Duff: Department of Genetics and Developmental Biology, Institute for Systems Genomics, University of Connecticut Health Center, 400 Farmington Avenue, Farmington, Connecticut 06030, USA
- Sarah Djebali: Centre for Genomic Regulation, Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain; Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
- Roger P Alexander: Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Burak H Alver: Center for Biomedical Informatics, Harvard Medical School, 10 Shattuck Street, Boston, Massachusetts 02115, USA
- Raymond Auerbach: Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Kimberly Bell: Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Peter J Bickel: Department of Statistics, University of California, Berkeley, 367 Evans Hall, Berkeley, California 94720-3860, USA
- Max E Boeck: Department of Genome Sciences and University of Washington School of Medicine, William H. Foege Building S350D, 1705 Northeast Pacific Street, Box 355065 Seattle, Washington 98195-5065, USA
- Nathan P Boley: Department of Genome Dynamics, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA; Department of Biostatistics, University of California, Berkeley, 367 Evans Hall, Berkeley, California 94720-3860, USA
- Benjamin W Booth: Department of Genome Dynamics, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- Lucy Cherbas: Department of Biology, Indiana University, 1001 East 3rd Street, Bloomington, Indiana 47405-7005, USA; Center for Genomics and Bioinformatics, Indiana University, 1001 East 3rd Street, Bloomington, Indiana 47405-7005, USA
- Peter Cherbas: Department of Biology, Indiana University, 1001 East 3rd Street, Bloomington, Indiana 47405-7005, USA; Center for Genomics and Bioinformatics, Indiana University, 1001 East 3rd Street, Bloomington, Indiana 47405-7005, USA
- Chao Di: MOE Key Lab of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Alex Dobin: Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Jorg Drenkow: Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Brent Ewing: Department of Genome Sciences and University of Washington School of Medicine, William H. Foege Building S350D, 1705 Northeast Pacific Street, Box 355065 Seattle, Washington 98195-5065, USA
- Gang Fang: Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Megan Fastuca: Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Elise A Feingold: National Human Genome Research Institute, National Institutes of Health, 5635 Fishers Lane, Bethesda, Maryland 20892-9307, USA
- Adam Frankish: Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Guanjun Gao: MOE Key Lab of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Peter J Good: National Human Genome Research Institute, National Institutes of Health, 5635 Fishers Lane, Bethesda, Maryland 20892-9307, USA
- Roderic Guigó: Centre for Genomic Regulation, Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain; Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
- Ann Hammonds: Department of Genome Dynamics, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- Jen Harrow: Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Roger A Hoskins: Department of Genome Dynamics, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- Cédric Howald: Center for Integrative Genomics, University of Lausanne, Genopode building, Lausanne 1015, Switzerland; Swiss Institute of Bioinformatics, Genopode building, Lausanne 1015, Switzerland
- Long Hu: MOE Key Lab of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Haiyan Huang: Department of Statistics, University of California, Berkeley, 367 Evans Hall, Berkeley, California 94720-3860, USA
- Tim J P Hubbard: Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK; Medical and Molecular Genetics, King's College London, London WC2R 2LS, UK
- Chau Huynh: Department of Genome Sciences and University of Washington School of Medicine, William H. Foege Building S350D, 1705 Northeast Pacific Street, Box 355065 Seattle, Washington 98195-5065, USA
- Sonali Jha: Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Dionna Kasper: Department of Genetics, Yale University School of Medicine, New Haven, Connecticut 06520-8005, USA
- Masaomi Kato: Department of Molecular, Cellular and Developmental Biology, PO Box 208103, Yale University, New Haven, Connecticut 06520, USA
- Thomas C Kaufman: Department of Biology, Indiana University, 1001 East 3rd Street, Bloomington, Indiana 47405-7005, USA
- Robert R Kitchen: Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Erik Ladewig: Sloan-Kettering Institute, 1275 York Avenue, Box 252, New York, New York 10065, USA
- Julien Lagarde: Centre for Genomic Regulation, Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain; Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
- Eric Lai: Sloan-Kettering Institute, 1275 York Avenue, Box 252, New York, New York 10065, USA
- Jing Leng: Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Zhi Lu: MOE Key Lab of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Michael MacCoss: Department of Genome Sciences and University of Washington School of Medicine, William H. Foege Building S350D, 1705 Northeast Pacific Street, Box 355065 Seattle, Washington 98195-5065, USA
- Gemma May: Department of Genetics and Developmental Biology, Institute for Systems Genomics, University of Connecticut Health Center, 400 Farmington Avenue, Farmington, Connecticut 06030, USA; Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213 USA
- Rebecca McWhirter: Department of Cell and Developmental Biology, Vanderbilt University, 465 21st Avenue South, Nashville, Tennessee 37232-8240, USA
- Gennifer Merrihew: Department of Genome Sciences and University of Washington School of Medicine, William H. Foege Building S350D, 1705 Northeast Pacific Street, Box 355065 Seattle, Washington 98195-5065, USA
- David M Miller: Department of Cell and Developmental Biology, Vanderbilt University, 465 21st Avenue South, Nashville, Tennessee 37232-8240, USA
- Ali Mortazavi: Developmental and Cell Biology, University of California, Irvine, California 92697, USA; Center for Complex Biological Systems, University of California, Irvine, California 92697, USA
- Rabi Murad: Developmental and Cell Biology, University of California, Irvine, California 92697, USA; Center for Complex Biological Systems, University of California, Irvine, California 92697, USA
- Brian Oliver: Section of Developmental Genomics, Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
- Sara Olson: Department of Genetics and Developmental Biology, Institute for Systems Genomics, University of Connecticut Health Center, 400 Farmington Avenue, Farmington, Connecticut 06030, USA
- Peter J Park: Center for Biomedical Informatics, Harvard Medical School, 10 Shattuck Street, Boston, Massachusetts 02115, USA
- Michael J Pazin: National Human Genome Research Institute, National Institutes of Health, 5635 Fishers Lane, Bethesda, Maryland 20892-9307, USA
- Norbert Perrimon: Department of Genetics and Drosophila RNAi Screening Center, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, Massachusetts 02115, USA; Howard Hughes Medical Institute, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, Massachusetts 02115, USA
- Dmitri Pervouchine: Centre for Genomic Regulation, Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain; Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
- Valerie Reinke: Department of Genetics, Yale University School of Medicine, New Haven, Connecticut 06520-8005, USA
- Alexandre Reymond: Center for Integrative Genomics, University of Lausanne, Genopode building, Lausanne 1015, Switzerland
- Garrett Robinson: Department of Statistics, University of California, Berkeley, 367 Evans Hall, Berkeley, California 94720-3860, USA
- Anastasia Samsonova: Department of Genetics and Drosophila RNAi Screening Center, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, Massachusetts 02115, USA; Howard Hughes Medical Institute, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, Massachusetts 02115, USA
- Gary I Saunders: Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK; European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
- Felix Schlesinger: Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Anurag Sethi: Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Frank J Slack: Department of Molecular, Cellular and Developmental Biology, PO Box 208103, Yale University, New Haven, Connecticut 06520, USA
- William C Spencer: Department of Cell and Developmental Biology, Vanderbilt University, 465 21st Avenue South, Nashville, Tennessee 37232-8240, USA
- Marcus H Stoiber: Department of Genome Dynamics, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA; Department of Biostatistics, University of California, Berkeley, 367 Evans Hall, Berkeley, California 94720-3860, USA
- Pnina Strasbourger: Department of Genome Sciences and University of Washington School of Medicine, William H. Foege Building S350D, 1705 Northeast Pacific Street, Box 355065 Seattle, Washington 98195-5065, USA
- Andrea Tanzer: Bioinformatics and Genomics Programme, Center for Genomic Regulation, Universitat Pompeu Fabra (CRG-UPF), 08003 Barcelona, Catalonia, Spain; Institute for Theoretical Chemistry, Theoretical Biochemistry Group (TBI), University of Vienna, Währingerstrasse 17/3/303, A-1090 Vienna, Austria
- Owen A Thompson: Department of Genome Sciences and University of Washington School of Medicine, William H. Foege Building S350D, 1705 Northeast Pacific Street, Box 355065 Seattle, Washington 98195-5065, USA
- Kenneth H Wan: Department of Genome Dynamics, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- Guilin Wang: Department of Genetics, Yale University School of Medicine, New Haven, Connecticut 06520-8005, USA
- Huaien Wang: Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Kathie L Watkins: Department of Cell and Developmental Biology, Vanderbilt University, 465 21st Avenue South, Nashville, Tennessee 37232-8240, USA
- Jiayu Wen: Sloan-Kettering Institute, 1275 York Avenue, Box 252, New York, New York 10065, USA
- Kejia Wen: MOE Key Lab of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Chenghai Xue: Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Li Yang: Department of Genetics and Developmental Biology, Institute for Systems Genomics, University of Connecticut Health Center, 400 Farmington Avenue, Farmington, Connecticut 06030, USA; Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Kevin Yip: Hong Kong Bioinformatics Centre, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong; CUHK-BGI Innovation Institute of Trans-omics, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
- Chris Zaleski: Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Yan Zhang: Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Henry Zheng: Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, Connecticut 06520, USA
- Steven E Brenner: Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA; Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
- Brenton R Graveley: Department of Genetics and Developmental Biology, Institute for Systems Genomics, University of Connecticut Health Center, 400 Farmington Avenue, Farmington, Connecticut 06030, USA
- Susan E Celniker: Department of Genome Dynamics, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- Thomas R Gingeras: Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Robert Waterston: Department of Genome Sciences and University of Washington School of Medicine, William H. Foege Building S350D, 1705 Northeast Pacific Street, Box 355065 Seattle, Washington 98195-5065, USA
|
27
|
Brown JB, Boley N, Eisman R, May GE, Stoiber MH, Duff MO, Booth BW, Wen J, Park S, Suzuki AM, Wan KH, Yu C, Zhang D, Carlson JW, Cherbas L, Eads BD, Miller D, Mockaitis K, Roberts J, Davis CA, Frise E, Hammonds AS, Olson S, Shenker S, Sturgill D, Samsonova AA, Weiszmann R, Robinson G, Hernandez J, Andrews J, Bickel PJ, Carninci P, Cherbas P, Gingeras TR, Hoskins RA, Kaufman TC, Lai EC, Oliver B, Perrimon N, Graveley BR, Celniker SE. Diversity and dynamics of the Drosophila transcriptome. Nature 2014; 512:393-9. [PMID: 24670639 PMCID: PMC4152413 DOI: 10.1038/nature12962] [Citation(s) in RCA: 470] [Impact Index Per Article: 47.0] [Received: 04/20/2013] [Accepted: 12/18/2013] [Indexed: 01/10/2023]
Abstract
Animal transcriptomes are dynamic, with each cell type, tissue and organ system expressing an ensemble of transcript isoforms that give rise to substantial diversity. Here we have identified new genes, transcripts and proteins using poly(A)+ RNA sequencing from Drosophila melanogaster in cultured cell lines, dissected organ systems and under environmental perturbations. We found that a small set of mostly neural-specific genes has the potential to encode thousands of transcripts each through extensive alternative promoter usage and RNA splicing. The magnitudes of splicing changes are larger between tissues than between developmental stages, and most sex-specific splicing is gonad-specific. Gonads express hundreds of previously unknown coding and long non-coding RNAs (lncRNAs), some of which are antisense to protein-coding genes and produce short regulatory RNAs. Furthermore, previously identified pervasive intergenic transcription occurs primarily within newly identified introns. The fly transcriptome is substantially more complex than previously recognized, with this complexity arising from combinatorial usage of promoters, splice sites and polyadenylation sites.
|
28
|
Bassett AR, Akhtar A, Barlow DP, Bird AP, Brockdorff N, Duboule D, Ephrussi A, Ferguson-Smith AC, Gingeras TR, Haerty W, Higgs DR, Miska EA, Ponting CP. Considerations when investigating lncRNA function in vivo. eLife 2014; 3:e03058. [PMID: 25124674 PMCID: PMC4132285 DOI: 10.7554/elife.03058] [Citation(s) in RCA: 264] [Impact Index Per Article: 26.4] [Indexed: 12/15/2022] [Open Access]
Abstract
Although a small number of the vast array of animal long non-coding RNAs (lncRNAs) have known effects on cellular processes examined in vitro, the extent of their contributions to normal cell processes throughout development, differentiation and disease for the most part remains less clear. Phenotypes arising from deletion of an entire genomic locus cannot be unequivocally attributed either to the loss of the lncRNA per se or to the associated loss of other overlapping DNA regulatory elements. The distinction between cis- or trans-effects is also often problematic. We discuss the advantages and challenges associated with the current techniques for studying the in vivo function of lncRNAs in the light of different models of lncRNA molecular mechanism, and reflect on the design of experiments to mutate lncRNA loci. These considerations should assist in the further investigation of these transcriptional products of the genome.
Affiliation(s)
- Andrew R Bassett is in the MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Asifa Akhtar is in the Department of Chromatin Regulation, Max-Planck-Institut für Immunbiologie und Epigenetik, Freiburg im Breisgau, Germany
- Denise P Barlow is in the CeMM, Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Adrian P Bird is in the Wellcome Trust Centre for Cell Biology, University of Edinburgh, Edinburgh, United Kingdom
- Neil Brockdorff is in the Department of Biochemistry, University of Oxford, Oxford, United Kingdom
- Denis Duboule is in the School of Life Sciences, Ecole Polytechnique Fédérale Lausanne, Lausanne, Switzerland; Department of Genetics and Evolution, Université de Genève, Geneva, Switzerland
- Anne Ephrussi is in the Developmental Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Anne C Ferguson-Smith is in the Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Thomas R Gingeras is in the Functional Genomics Group, Cold Spring Harbor Laboratory, Cold Spring Harbor, United States
- Wilfried Haerty is in the MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Douglas R Higgs is in the MRC Molecular Haematology Unit, Weatherall Institute of Molecular Medicine, Oxford, United Kingdom
- Eric A Miska is in the Wellcome Trust Cancer Research UK Gurdon Institute, University of Cambridge, Cambridge, United Kingdom; Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Chris P Ponting is in the MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom; Wellcome Trust Sanger Institute, Cambridge, United Kingdom
|
29
|
Abstract
RNA annotation and mapping of promoters for analysis of gene expression (RAMPAGE) is a method that harnesses highly specific sequencing of 5'-complete complementary DNAs to identify transcription start sites (TSSs) genome-wide. Although TSS mapping has historically relied on detection of 5'-complete cDNAs, current genome-wide approaches typically have limited specificity and provide only scarce information regarding transcript structure. RAMPAGE allows for highly stringent selection of 5'-complete molecules, thus allowing base-resolution TSS identification with a high signal-to-noise ratio. Paired-end sequencing of medium-length cDNAs yields transcript structure information that is essential to interpreting the relationship of TSSs to annotated genes and transcripts. As opposed to standard RNA-seq, RAMPAGE explicitly yields accurate and highly reproducible expression level estimates for individual promoters. Moreover, this approach offers a streamlined 2- to 3-day protocol that is optimized for extensive sample multiplexing, and is therefore adapted for large-scale projects. This method has been applied successfully to human and Drosophila samples, and in principle should be applicable to any eukaryotic system.
Affiliation(s)
- Philippe Batut: Watson School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York
|
30
|
Schlesinger F, Smith AD, Gingeras TR, Hannon GJ, Hodges E. De novo DNA demethylation and noncoding transcription define active intergenic regulatory elements. Genome Res 2013; 23:1601-14. [PMID: 23811145 PMCID: PMC3787258 DOI: 10.1101/gr.157271.113] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Indexed: 11/24/2022]
Abstract
Deep sequencing of mammalian DNA methylomes has uncovered a previously unpredicted number of discrete hypomethylated regions in intergenic space (iHMRs). Here, we combined whole-genome bisulfite sequencing data with extensive gene expression and chromatin-state data to define functional classes of iHMRs, and to reconstruct the dynamics of their establishment in a developmental setting. Comparing HMR profiles in embryonic stem and primary blood cells, we show that iHMRs mark an exclusive subset of active DNase hypersensitive sites (DHS), and that both developmentally constitutive and cell-type-specific iHMRs display chromatin states typical of distinct regulatory elements. We also observe that iHMR changes are more predictive of nearby gene activity than the promoter HMR itself, and that expression of noncoding RNAs within the iHMR accompanies full activation and complete demethylation of mature B cell enhancers. Conserved sequence features corresponding to iHMR transcript start sites, including a discernible TATA motif, suggest a conserved, functional role for transcription in these regions. Similarly, we explored both primate-specific and human population variation at iHMRs, finding that while enhancer iHMRs are more variable in sequence and methylation status than any other functional class, conservation of the TATA box is highly predictive of iHMR maintenance, reflecting the impact of sequence plasticity and transcriptional signals on iHMR establishment. Overall, our analysis allowed us to construct a three-step timeline in which (1) intergenic DHS are pre-established in the stem cell, (2) partial demethylation of blood-specific intergenic DHSs occurs in blood progenitors, and (3) complete iHMR formation and transcription coincide with enhancer activation in lymphoid-specified cells.
Affiliation(s)
- Felix Schlesinger
- Watson School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
|
31
|
Livyatan I, Harikumar A, Nissim-Rafinia M, Duttagupta R, Gingeras TR, Meshorer E. Non-polyadenylated transcription in embryonic stem cells reveals novel non-coding RNA related to pluripotency and differentiation. Nucleic Acids Res 2013; 41:6300-15. [PMID: 23630323 PMCID: PMC3695530 DOI: 10.1093/nar/gkt316] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Indexed: 11/14/2022] Open
Abstract
The transcriptional landscape in embryonic stem cells (ESCs) and during ESC differentiation has received considerable attention, albeit mostly confined to the polyadenylated fraction of RNA, whereas the non-polyadenylated (NPA) fraction remained largely unexplored. Notwithstanding, the NPA RNA super-family has every potential to participate in the regulation of pluripotency and stem cell fate. We conducted a comprehensive analysis of NPA RNA in ESCs using a combination of whole-genome tiling arrays and deep sequencing technologies. In addition to identifying previously characterized and new non-coding RNA members, we describe a group of novel conserved RNAs (snacRNAs: small NPA conserved), some of which are differentially expressed between ESC and neuronal progenitor cells, providing the first evidence of a novel group of potentially functional NPA RNA involved in the regulation of pluripotency and stem cell fate. We further show that minor spliceosomal small nuclear RNAs, which are NPA, are almost completely absent in ESCs and are upregulated in differentiation. Finally, we show differential processing of the minor intron of the polycomb group gene Eed. Our data suggest that NPA RNA, both known and novel, play important roles in ESCs.
|
32
|
Derrien T, Johnson R, Bussotti G, Tanzer A, Djebali S, Tilgner H, Guernec G, Martin D, Merkel A, Knowles DG, Lagarde J, Veeravalli L, Ruan X, Ruan Y, Lassmann T, Carninci P, Brown JB, Lipovich L, Gonzalez JM, Thomas M, Davis CA, Shiekhattar R, Gingeras TR, Hubbard TJ, Notredame C, Harrow J, Guigó R. The GENCODE v7 catalog of human long noncoding RNAs: analysis of their gene structure, evolution, and expression. Genome Res 2012; 22:1775-89. [PMID: 22955988 PMCID: PMC3431493 DOI: 10.1101/gr.132159.111] [Citation(s) in RCA: 3740] [Impact Index Per Article: 340.0] [Indexed: 11/24/2022]
Abstract
The human genome contains many thousands of long noncoding RNAs (lncRNAs). While several studies have demonstrated compelling biological and disease roles for individual examples, analytical and experimental approaches to investigate these genes have been hampered by the lack of comprehensive lncRNA annotation. Here, we present and analyze the most complete human lncRNA annotation to date, produced by the GENCODE consortium within the framework of the ENCODE project and comprising 9277 manually annotated genes producing 14,880 transcripts. Our analyses indicate that lncRNAs are generated through pathways similar to those of protein-coding genes, with similar histone-modification profiles, splicing signals, and exon/intron lengths. In contrast to protein-coding genes, however, lncRNAs display a striking bias toward two-exon transcripts, they are predominantly localized in the chromatin and nucleus, and a fraction appear to be preferentially processed into small RNAs. They are under stronger selective pressure than neutrally evolving sequences, particularly in their promoter regions, which display levels of selection comparable to protein-coding genes. Importantly, about one-third seem to have arisen within the primate lineage. Comprehensive analysis of their expression in multiple human organs and brain regions shows that lncRNAs are generally expressed at lower levels than protein-coding genes, and display more tissue-specific expression patterns, with a large fraction of tissue-specific lncRNAs expressed in the brain. Expression correlation analysis indicates that lncRNAs show particularly striking positive correlation with the expression of antisense coding genes. This GENCODE annotation represents a valuable resource for future studies of lncRNAs.
Affiliation(s)
- Thomas Derrien
- Bioinformatics and Genomics, Centre for Genomic Regulation and UPF, 08003 Barcelona, Catalonia, Spain
|
33
|
Cheng C, Alexander R, Min R, Leng J, Yip KY, Rozowsky J, Yan KK, Dong X, Djebali S, Ruan Y, Davis CA, Carninci P, Lassman T, Gingeras TR, Guigó R, Birney E, Weng Z, Snyder M, Gerstein M. Understanding transcriptional regulation by integrative analysis of transcription factor binding data. Genome Res 2012; 22:1658-67. [PMID: 22955978 PMCID: PMC3431483 DOI: 10.1101/gr.136838.111] [Citation(s) in RCA: 138] [Impact Index Per Article: 12.5] [Indexed: 12/31/2022]
Abstract
Statistical models have been used to quantify the relationship between gene expression and transcription factor (TF) binding signals. Here we apply the models to the large-scale data generated by the ENCODE project to study transcriptional regulation by TFs. Our results reveal a notable difference in the prediction accuracy of expression levels of transcription start sites (TSSs) captured by different technologies and RNA extraction protocols. In general, the expression levels of TSSs with high CpG content are more predictable than those with low CpG content. For genes with alternative TSSs, the expression levels of downstream TSSs are more predictable than those of the upstream ones. Different TF categories and specific TFs vary substantially in their contributions to predicting expression. Between two cell lines, the differential expression of TSS can be precisely reflected by the difference of TF-binding signals in a quantitative manner, arguing against the conventional on-and-off model of TF binding. Finally, we explore the relationships between TF-binding signals and other chromatin features such as histone modifications and DNase hypersensitivity for determining expression. The models imply that these features regulate transcription in a highly coordinated manner.
Affiliation(s)
- Chao Cheng
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06520, USA
|
34
|
Batut P, Dobin A, Plessy C, Carninci P, Gingeras TR. High-fidelity promoter profiling reveals widespread alternative promoter usage and transposon-driven developmental gene expression. Genome Res 2013; 23:169-80. [PMID: 22936248 PMCID: PMC3530677 DOI: 10.1101/gr.139618.112] [Citation(s) in RCA: 135] [Impact Index Per Article: 12.3] [Received: 02/23/2012] [Accepted: 08/29/2012] [Indexed: 12/20/2022]
Abstract
Many eukaryotic genes possess multiple alternative promoters with distinct expression specificities. Therefore, comprehensively annotating promoters and deciphering their individual regulatory dynamics is critical for gene expression profiling applications and for our understanding of regulatory complexity. We introduce RAMPAGE, a novel promoter activity profiling approach that combines extremely specific 5'-complete cDNA sequencing with an integrated data analysis workflow, to address the limitations of current techniques. RAMPAGE features a streamlined protocol for fast and easy generation of highly multiplexed sequencing libraries, offers very high transcription start site specificity, generates accurate and reproducible promoter expression measurements, and yields extensive transcript connectivity information through paired-end cDNA sequencing. We used RAMPAGE in a genome-wide study of promoter activity throughout 36 stages of the life cycle of Drosophila melanogaster, and describe here a comprehensive data set that represents the first available developmental time-course of promoter usage. We found that >40% of developmentally expressed genes have at least two promoters and that alternative promoters generally implement distinct regulatory programs. Transposable elements, long proposed to play a central role in the evolution of their host genomes through their ability to regulate gene expression, contribute at least 1300 promoters shaping the developmental transcriptome of D. melanogaster. Hundreds of these promoters drive the expression of annotated genes, and transposons often impart their own expression specificity upon the genes they regulate. These observations provide support for the theory that transposons may drive regulatory innovation through the distribution of stereotyped cis-regulatory modules throughout their host genomes.
Affiliation(s)
- Philippe Batut
- Watson School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA.
|
35
|
Abstract
MOTIVATION Accurate alignment of high-throughput RNA-seq data is a challenging and yet unsolved problem because of the non-contiguous transcript structure, relatively short read lengths and constantly increasing throughput of the sequencing technologies. Currently available RNA-seq aligners suffer from high mapping error rates, low mapping speed, read length limitation and mapping biases. RESULTS To align our large (>80 billion reads) ENCODE Transcriptome RNA-seq dataset, we developed the Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by a seed clustering and stitching procedure. STAR outperforms other aligners by a factor of >50 in mapping speed, aligning to the human genome 550 million 2 × 76 bp paired-end reads per hour on a modest 12-core server, while at the same time improving alignment sensitivity and precision. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences. Using Roche 454 sequencing of reverse transcription polymerase chain reaction amplicons, we experimentally validated 1960 novel intergenic splice junctions with an 80-90% success rate, corroborating the high precision of the STAR mapping strategy. AVAILABILITY AND IMPLEMENTATION STAR is implemented as a standalone C++ code. STAR is free open source software distributed under GPLv3 license and can be downloaded from http://code.google.com/p/rna-star/.
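The "sequential maximum mappable seed search" named in the abstract can be illustrated with a toy sketch. This is not STAR's implementation: the function names (`build_suffix_array`, `maximal_mappable_prefix`, `seed_search`) are hypothetical, the suffix array is built naively, only exact matches are considered, and the clustering and stitching stage is omitted. The sketch only shows how repeatedly taking the longest genome-matching prefix of the unmapped part of a read can split a spliced read at an exon junction.

```python
from typing import List, Tuple


def build_suffix_array(text: str) -> List[int]:
    """Naive suffix array (sort all suffixes); suitable for toy inputs only."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def _prefix_range(text: str, sa: List[int], pattern: str) -> Tuple[int, int]:
    """Half-open range of suffix-array entries whose suffix starts with pattern."""
    k = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # lower bound: first truncated suffix >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + k] < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo
    hi = len(sa)
    while lo < hi:  # upper bound: first truncated suffix > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + k] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return first, lo


def maximal_mappable_prefix(read: str, text: str, sa: List[int]) -> Tuple[int, List[int]]:
    """Longest prefix of read that occurs exactly in text, with its positions."""
    best_len, best_hits = 0, []
    for length in range(1, len(read) + 1):
        lo, hi = _prefix_range(text, sa, read[:length])
        if lo == hi:  # no suffix starts with this prefix: stop extending
            break
        best_len, best_hits = length, sorted(sa[lo:hi])
    return best_len, best_hits


def seed_search(read: str, text: str, sa: List[int]) -> List[Tuple[int, int, List[int]]]:
    """Sequentially collect (read_offset, length, genome_positions) seeds."""
    seeds = []
    start = 0
    while start < len(read):
        length, hits = maximal_mappable_prefix(read[start:], text, sa)
        if length == 0:  # base absent from the genome: skip it
            start += 1
            continue
        seeds.append((start, length, hits))
        start += length  # resume the search at the first unmapped base
    return seeds


# Toy genome: two "exons" separated by an "intron" of G's.
genome = "ACGTAC" + "GGGGGGGG" + "TTCATG"
sa = build_suffix_array(genome)
print(seed_search("ACGTAC" + "TTCATG", genome, sa))
# [(0, 6, [0]), (6, 6, [14])]
```

In this toy case the read is split into two seeds, one per exon, and the gap between their genome positions corresponds to the intron. A real aligner would additionally tolerate mismatches and score combinations of seeds.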
|
36
|
Djebali S, Davis CA, Merkel A, Dobin A, Lassmann T, Mortazavi AM, Tanzer A, Lagarde J, Lin W, Schlesinger F, Xue C, Marinov GK, Khatun J, Williams BA, Zaleski C, Rozowsky J, Röder M, Kokocinski F, Abdelhamid RF, Alioto T, Antoshechkin I, Baer MT, Bar NS, Batut P, Bell K, Bell I, Chakrabortty S, Chen X, Chrast J, Curado J, Derrien T, Drenkow J, Dumais E, Dumais J, Duttagupta R, Falconnet E, Fastuca M, Fejes-Toth K, Ferreira P, Foissac S, Fullwood MJ, Gao H, Gonzalez D, Gordon A, Gunawardena H, Howald C, Jha S, Johnson R, Kapranov P, King B, Kingswood C, Luo OJ, Park E, Persaud K, Preall JB, Ribeca P, Risk B, Robyr D, Sammeth M, Schaffer L, See LH, Shahab A, Skancke J, Suzuki AM, Takahashi H, Tilgner H, Trout D, Walters N, Wang H, Wrobel J, Yu Y, Ruan X, Hayashizaki Y, Harrow J, Gerstein M, Hubbard T, Reymond A, Antonarakis SE, Hannon G, Giddings MC, Ruan Y, Wold B, Carninci P, Guigó R, Gingeras TR. Landscape of transcription in human cells. Nature 2012; 489:101-8. [PMID: 22955620 PMCID: PMC3684276 DOI: 10.1038/nature11233] [Citation(s) in RCA: 3720] [Impact Index Per Article: 310.0] [Received: 12/10/2011] [Accepted: 05/15/2012] [Indexed: 02/07/2023]
Abstract
Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell's regulatory capabilities are focused on its synthesis, processing, transport, modification and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three-quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations, taken together, prompt a redefinition of the concept of a gene.
Affiliation(s)
- Sarah Djebali
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
- Carrie A. Davis
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Angelika Merkel
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
- Alex Dobin
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Timo Lassmann
- RIKEN Yokohama Institute, RIKEN Omics Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa Japan 230-0045
- Ali M. Mortazavi
- California Institute of Technology, Division of Biology, 91125. 2 Beckman Institute, Pasadena, CA USA 91125
- University of California Irvine, Dept. of Developmental and Cell Biology, 2300 Biological Sciences III, Irvine, CA USA 92697
- Andrea Tanzer
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
- Julien Lagarde
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
- Wei Lin
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Felix Schlesinger
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Chenghai Xue
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Georgi K. Marinov
- California Institute of Technology, Division of Biology, 91125. 2 Beckman Institute, Pasadena, CA USA 91125
- Jainab Khatun
- Boise State University, College of Arts & Sciences, 1910 University Dr. Boise, ID USA 83725
- Brian A. Williams
- California Institute of Technology, Division of Biology, 91125. 2 Beckman Institute, Pasadena, CA USA 91125
- Chris Zaleski
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Joel Rozowsky
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520
- Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520
- Maik Röder
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
- Felix Kokocinski
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire United Kingdom CB10 1SA
- Rehab F. Abdelhamid
- RIKEN Yokohama Institute, RIKEN Omics Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa Japan 230-0045
- Tyler Alioto
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
- Igor Antoshechkin
- California Institute of Technology, Division of Biology, 91125. 2 Beckman Institute, Pasadena, CA USA 91125
- Michael T. Baer
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Nadav S. Bar
- Department of Chemical Engineering, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
- Philippe Batut
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Kimberly Bell
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Ian Bell
- Affymetrix, Inc, 3380 Central Expressway, Santa Clara, CA. USA 95051
- Sudipto Chakrabortty
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Xian Chen
- University of North Carolina at Chapel Hill, Department of Biochemistry & Biophysics, 120 Mason Farm Rd., Chapel Hill, NC USA 27599
- Jacqueline Chrast
- University of Lausanne, Center for Integrative Genomics, Genopode building, Lausanne, Switzerland 1015
- Joao Curado
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
- Thomas Derrien
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
- Jorg Drenkow
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Erica Dumais
- Affymetrix, Inc, 3380 Central Expressway, Santa Clara, CA. USA 95051
- Jacqueline Dumais
- Affymetrix, Inc, 3380 Central Expressway, Santa Clara, CA. USA 95051
- Radha Duttagupta
- Affymetrix, Inc, 3380 Central Expressway, Santa Clara, CA. USA 95051
- Emilie Falconnet
- University of Geneva Medical School, Department of Genetic Medicine and Development and iGE3 Institute of Genetics and Genomics of Geneva, 1 rue Michel-Servet, Geneva, Switzerland 1015
- Meagan Fastuca
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Kata Fejes-Toth
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Pedro Ferreira
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
- Sylvain Foissac
- Affymetrix, Inc, 3380 Central Expressway, Santa Clara, CA. USA 95051
- Melissa J. Fullwood
- Genome Institute of Singapore, Genome Technology and Biology, 60 Biopolis Street, #02-01, Genome, Singapore, Singapore 138672
- Hui Gao
- Affymetrix, Inc, 3380 Central Expressway, Santa Clara, CA. USA 95051
- David Gonzalez
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
- Assaf Gordon
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Harsha Gunawardena
- University of North Carolina at Chapel Hill, Department of Biochemistry & Biophysics, 120 Mason Farm Rd., Chapel Hill, NC USA 27599
- Cedric Howald
- University of Lausanne, Center for Integrative Genomics, Genopode building, Lausanne, Switzerland 1015
- Sonali Jha
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Rory Johnson
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
- Philipp Kapranov
- Affymetrix, Inc, 3380 Central Expressway, Santa Clara, CA. USA 95051
- St. Laurent Institute, One Kendall Square, Cambridge, MA
- Brandon King
- California Institute of Technology, Division of Biology, 91125. 2 Beckman Institute, Pasadena, CA USA 91125
- Colin Kingswood
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
- Oscar J. Luo
- Genome Institute of Singapore, Genome Technology and Biology, 60 Biopolis Street, #02-01, Genome, Singapore, Singapore 138672
- Eddie Park
- University of California Irvine, Dept. of Developmental and Cell Biology, 2300 Biological Sciences III, Irvine, CA USA 92697
- Kimberly Persaud
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Jonathan B. Preall
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Paolo Ribeca
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
- Brian Risk
- Boise State University, College of Arts & Sciences, 1910 University Dr. Boise, ID USA 83725
- Daniel Robyr
- University of Geneva Medical School, Department of Genetic Medicine and Development and iGE3 Institute of Genetics and Genomics of Geneva, 1 rue Michel-Servet, Geneva, Switzerland 1015
- Michael Sammeth
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
- Lorian Schaffer
- California Institute of Technology, Division of Biology, 91125. 2 Beckman Institute, Pasadena, CA USA 91125
- Lei-Hoon See
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Atif Shahab
- Genome Institute of Singapore, Genome Technology and Biology, 60 Biopolis Street, #02-01, Genome, Singapore, Singapore 138672
- Jorgen Skancke
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
- Department of Chemical Engineering, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
- Ana Maria Suzuki
- RIKEN Yokohama Institute, RIKEN Omics Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa Japan 230-0045
- Hazuki Takahashi
- RIKEN Yokohama Institute, RIKEN Omics Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa Japan 230-0045
- Hagen Tilgner
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
- Diane Trout
- California Institute of Technology, Division of Biology, 91125. 2 Beckman Institute, Pasadena, CA USA 91125
- Nathalie Walters
- University of Lausanne, Center for Integrative Genomics, Genopode building, Lausanne, Switzerland 1015
- Huaien Wang
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- John Wrobel
- Boise State University, College of Arts & Sciences, 1910 University Dr. Boise, ID USA 83725
- Yanbao Yu
- University of North Carolina at Chapel Hill, Department of Biochemistry & Biophysics, 120 Mason Farm Rd., Chapel Hill, NC USA 27599
- Xiaoan Ruan
- Genome Institute of Singapore, Genome Technology and Biology, 60 Biopolis Street, #02-01, Genome, Singapore, Singapore 138672
- Yoshihide Hayashizaki
- RIKEN Yokohama Institute, RIKEN Omics Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa Japan 230-0045
- Jennifer Harrow
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire United Kingdom CB10 1SA
- Mark Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520
- Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520
- Department of Computer Science, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520
- Tim Hubbard
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire United Kingdom CB10 1SA
- Alexandre Reymond
- University of Lausanne, Center for Integrative Genomics, Genopode building, Lausanne, Switzerland 1015
- Stylianos E. Antonarakis
- University of Geneva Medical School, Department of Genetic Medicine and Development and iGE3 Institute of Genetics and Genomics of Geneva, 1 rue Michel-Servet, Geneva, Switzerland 1015
- Gregory Hannon
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Morgan C. Giddings
- Boise State University, College of Arts & Sciences, 1910 University Dr. Boise, ID USA 83725
- University of North Carolina at Chapel Hill, Department of Biochemistry & Biophysics, 120 Mason Farm Rd., Chapel Hill, NC USA 27599
- Yijun Ruan
- Genome Institute of Singapore, Genome Technology and Biology, 60 Biopolis Street, #02-01, Genome, Singapore, Singapore 138672
- Barbara Wold
- California Institute of Technology, Division of Biology, 91125. 2 Beckman Institute, Pasadena, CA USA 91125
- Piero Carninci
- RIKEN Yokohama Institute, RIKEN Omics Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa Japan 230-0045
- Roderic Guigó
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
- Thomas R. Gingeras
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Affymetrix, Inc, 3380 Central Expressway, Santa Clara, CA. USA 95051
|
37
|
Abstract
Analyses of bacterial transcriptomes have shown the existence of a genome-wide process of overlapping transcription due to the presence of antisense RNAs, as well as mRNAs that overlap along their entire length or in some portion of the 5′- and 3′-UTR regions. The biological advantages of such overlapping transcription are unclear, but it may play important regulatory roles at the level of transcription, RNA stability and translation. In a recent report, the human pathogen Staphylococcus aureus was observed to generate genome-wide overlapping transcription in the same bacterial cells, leading to a collection of short RNA fragments generated by the endoribonuclease RNase III. This processing appears most prominently in Gram-positive bacteria. The implications of both the use of pervasive overlapping transcription and the processing of these double-stranded templates into short RNAs are explored and the consequences discussed.
Affiliation(s)
- Iñigo Lasa
- Laboratory of Microbial Biofilms, Idab-CSIC-Universidad Pública de Navarra-Gobierno de Navarra, Pamplona, Spain.
|
38
|
Dong X, Greven MC, Kundaje A, Djebali S, Brown JB, Cheng C, Gingeras TR, Gerstein M, Guigó R, Birney E, Weng Z. Modeling gene expression using chromatin features in various cellular contexts. Genome Biol 2012; 13:R53. [PMID: 22950368 PMCID: PMC3491397 DOI: 10.1186/gb-2012-13-9-r53] [Citation(s) in RCA: 175] [Impact Index Per Article: 14.6] [Received: 03/29/2012] [Revised: 06/13/2012] [Accepted: 06/19/2012] [Indexed: 01/01/2023] Open
Abstract
BACKGROUND Previous work has demonstrated that chromatin feature levels correlate with gene expression. The ENCODE project enables us to further explore this relationship using an unprecedented volume of data. Expression levels from more than 100,000 promoters were measured using a variety of high-throughput techniques applied to RNA extracted by different protocols from different cellular compartments of several human cell lines. ENCODE also generated the genome-wide mapping of eleven histone marks, one histone variant, and DNase I hypersensitivity sites in seven cell lines. RESULTS We built a novel quantitative model to study the relationship between chromatin features and expression levels. Our study not only confirms that the general relationships found in previous studies hold across various cell lines, but also makes new suggestions about the relationship between chromatin features and gene expression levels. We found that expression status and expression levels can be predicted by different groups of chromatin features, both with high accuracy. We also found that expression levels measured by CAGE are better predicted than by RNA-PET or RNA-Seq, and different categories of chromatin features are the most predictive of expression for different RNA measurement methods. Additionally, PolyA+ RNA is overall more predictable than PolyA- RNA among different cell compartments, and PolyA+ cytosolic RNA measured with RNA-Seq is more predictable than PolyA+ nuclear RNA, while the opposite is true for PolyA- RNA. CONCLUSIONS Our study provides new insights into transcriptional regulation by analyzing chromatin features in different cellular contexts.
Affiliation(s)
- Xianjun Dong
- Program in Bioinformatics and Integrative Biology, Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, MA 01605, USA
- Melissa C Greven
- Program in Bioinformatics and Integrative Biology, Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, MA 01605, USA
- Anshul Kundaje
- Department of Computer Science, Stanford University, 318 Campus Drive, Stanford, CA 94304, USA
- Sarah Djebali
- Centre for Genomic Regulation (CRG) and UPF, Dr. Aiguader, 88, 08003 Barcelona, Spain
- James B Brown
- Department of Statistics, University of California, Berkeley, 367 Evans Hall, University of California, Berkeley, Berkeley, CA 94720, USA
- Chao Cheng
- Computational Biology and Bioinformatics Program, Yale University, 266 Whitney Ave, New Haven, CT 06511, USA
- Thomas R Gingeras
- Cold Spring Harbor Laboratory, Genome Center, Woodbury, New York 11797, USA
- Mark Gerstein
- Computational Biology and Bioinformatics Program, Yale University, 266 Whitney Ave, New Haven, CT 06511, USA
- Roderic Guigó
- Centre for Genomic Regulation (CRG) and UPF, Dr. Aiguader, 88, 08003 Barcelona, Spain
- Ewan Birney
- Vertebrate Genomics Group, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
- Zhiping Weng
- Program in Bioinformatics and Integrative Biology, Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, MA 01605, USA
|
39
|
Djebali S, Lagarde J, Kapranov P, Lacroix V, Borel C, Mudge JM, Howald C, Foissac S, Ucla C, Chrast J, Ribeca P, Martin D, Murray RR, Yang X, Ghamsari L, Lin C, Bell I, Dumais E, Drenkow J, Tress ML, Gelpí JL, Orozco M, Valencia A, van Berkum NL, Lajoie BR, Vidal M, Stamatoyannopoulos J, Batut P, Dobin A, Harrow J, Hubbard T, Dekker J, Frankish A, Salehi-Ashtiani K, Reymond A, Antonarakis SE, Guigó R, Gingeras TR. Evidence for transcript networks composed of chimeric RNAs in human cells. PLoS One 2012; 7:e28213. [PMID: 22238572 PMCID: PMC3251577 DOI: 10.1371/journal.pone.0028213] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.4] [Received: 07/06/2011] [Accepted: 11/03/2011] [Indexed: 12/03/2022] Open
Abstract
The classic organization of a gene structure has followed the Jacob and Monod bacterial gene model proposed more than 50 years ago. Since then, empirical determinations of the complexity of the transcriptomes found in organisms from yeast to human have blurred the definition and physical boundaries of genes. Using multiple analysis approaches we have characterized individual gene boundaries mapping on human chromosomes 21 and 22. Analyses of the locations of the 5′ and 3′ transcriptional termini of 492 protein coding genes revealed that for 85% of these genes the boundaries extend beyond the current annotated termini, most often connecting with exons of transcripts from other well annotated genes. The biological and evolutionary importance of these chimeric transcripts is underscored by (1) the non-random interconnections of genes involved, (2) the greater phylogenetic depth of the genes involved in many chimeric interactions, (3) the coordination of the expression of connected genes and (4) the close in vivo and three-dimensional proximity of the genomic regions being transcribed and contributing to parts of the chimeric RNAs. The non-random nature of the connection of the genes involved suggests that chimeric transcripts should not be studied in isolation, but together, as an RNA network.
Affiliation(s)
- Sarah Djebali
- Bioinformatics and Genomics, Centre for Genomic Regulation and Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
|
40
|
Deng X, Hiatt JB, Nguyen DK, Ercan S, Sturgill D, Hillier LW, Schlesinger F, Davis CA, Reinke VJ, Gingeras TR, Shendure J, Waterston RH, Oliver B, Lieb JD, Disteche CM. Evidence for compensatory upregulation of expressed X-linked genes in mammals, Caenorhabditis elegans and Drosophila melanogaster. Nat Genet 2011; 43:1179-85. [PMID: 22019781 DOI: 10.1038/ng.948] [Citation(s) in RCA: 208] [Impact Index Per Article: 16.0] [Received: 02/07/2011] [Accepted: 08/25/2011] [Indexed: 12/12/2022]
Abstract
Many animal species use a chromosome-based mechanism of sex determination, which has led to the coordinate evolution of dosage-compensation systems. Dosage compensation not only corrects the imbalance in the number of X chromosomes between the sexes but also is hypothesized to correct dosage imbalance within cells that is due to monoallelic X-linked expression and biallelic autosomal expression, by upregulating X-linked genes twofold (termed 'Ohno's hypothesis'). Although this hypothesis is well supported by expression analyses of individual X-linked genes and by microarray-based transcriptome analyses, it was challenged by a recent study using RNA sequencing and proteomics. We obtained new, independent RNA-seq data, measured RNA polymerase distribution and reanalyzed published expression data in mammals, C. elegans and Drosophila. Our analyses, which take into account the skewed gene content of the X chromosome, support the hypothesis of upregulation of expressed X-linked genes to balance expression of the genome.
Affiliation(s)
- Xinxian Deng
- Department of Pathology, University of Washington School of Medicine, Seattle, USA
|
41
|
Jiang L, Schlesinger F, Davis CA, Zhang Y, Li R, Salit M, Gingeras TR, Oliver B. Synthetic spike-in standards for RNA-seq experiments. Genome Res 2011; 21:1543-51. [PMID: 21816910 DOI: 10.1101/gr.121095.111] [Citation(s) in RCA: 439] [Impact Index Per Article: 33.8] [Indexed: 01/28/2023]
Abstract
High-throughput sequencing of cDNA (RNA-seq) is a widely deployed transcriptome profiling and annotation technique, but questions about the performance of different protocols and platforms remain. We used a newly developed pool of 96 synthetic RNAs with various lengths and GC content, covering a 2^20 concentration range, as spike-in controls to measure sensitivity, accuracy, and biases in RNA-seq experiments as well as to derive standard curves for quantifying the abundance of transcripts. We observed linearity between read density and RNA input over the entire detection range and excellent agreement between replicates, but we observed significantly larger imprecision than expected under pure Poisson sampling errors. We use the control RNAs to directly measure reproducible protocol-dependent biases due to GC content and transcript length as well as stereotypic heterogeneity in coverage across transcripts correlated with position relative to RNA termini and priming sequence bias. These effects lead to biased quantification for short transcripts and individual exons, which is a serious problem for measurements of isoform abundances but can be partially corrected using appropriate models of bias. By using the control RNAs, we derive limits for the discovery and detection of rare transcripts in RNA-seq experiments. By using data collected as part of the model organism and human Encyclopedia of DNA Elements projects (ENCODE and modENCODE), we demonstrate that external RNA controls are a useful resource for evaluating sensitivity and accuracy of RNA-seq experiments for transcriptome discovery and quantification. These quality metrics facilitate comparable analysis across different samples, protocols, and platforms.
Affiliation(s)
- Lichun Jiang
- Section of Developmental Genomics, Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD 20892, USA
|
42
|
Roy S, Ernst J, Kharchenko PV, Kheradpour P, Negre N, Eaton ML, Landolin JM, Bristow CA, Ma L, Lin MF, Washietl S, Arshinoff BI, Ay F, Meyer PE, Robine N, Washington NL, Di Stefano L, Berezikov E, Brown CD, Candeias R, Carlson JW, Carr A, Jungreis I, Marbach D, Sealfon R, Tolstorukov MY, Will S, Alekseyenko AA, Artieri C, Booth BW, Brooks AN, Dai Q, Davis CA, Duff MO, Feng X, Gorchakov AA, Gu T, Henikoff JG, Kapranov P, Li R, MacAlpine HK, Malone J, Minoda A, Nordman J, Okamura K, Perry M, Powell SK, Riddle NC, Sakai A, Samsonova A, Sandler JE, Schwartz YB, Sher N, Spokony R, Sturgill D, van Baren M, Wan KH, Yang L, Yu C, Feingold E, Good P, Guyer M, Lowdon R, Ahmad K, Andrews J, Berger B, Brenner SE, Brent MR, Cherbas L, Elgin SCR, Gingeras TR, Grossman R, Hoskins RA, Kaufman TC, Kent W, Kuroda MI, Orr-Weaver T, Perrimon N, Pirrotta V, Posakony JW, Ren B, Russell S, Cherbas P, Graveley BR, Lewis S, Micklem G, Oliver B, Park PJ, Celniker SE, Henikoff S, Karpen GH, Lai EC, MacAlpine DM, Stein LD, White KP, Kellis M. Identification of functional elements and regulatory circuits by Drosophila modENCODE. Science 2010; 330:1787-97. [PMID: 21177974 PMCID: PMC3192495 DOI: 10.1126/science.1198374] [Citation(s) in RCA: 899] [Impact Index Per Article: 64.2] [Indexed: 12/15/2022]
Abstract
To gain insight into how genomic information is translated into cellular and developmental programs, the Drosophila model organism Encyclopedia of DNA Elements (modENCODE) project is comprehensively mapping transcripts, histone modifications, chromosomal proteins, transcription factors, replication proteins and intermediates, and nucleosome properties across a developmental time course and in multiple cell lines. We have generated more than 700 data sets and discovered protein-coding, noncoding, RNA regulatory, replication, and chromatin elements, more than tripling the annotated portion of the Drosophila genome. Correlated activity patterns of these elements reveal a functional regulatory network, which predicts putative new functions for genes, reveals stage- and tissue-specific regulators, and enables gene-expression prediction. Our results provide a foundation for directed experimental and computational studies in Drosophila and related species and also a model for systematic data integration toward comprehensive genomic and functional annotation.
Affiliation(s)
- Sushmita Roy
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02140, USA
- Jason Ernst
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02140, USA
- Peter V. Kharchenko
- Center for Biomedical Informatics, Harvard Medical School, 10 Shattuck Street, Boston, MA 02115, USA
- Pouya Kheradpour
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02140, USA
- Nicolas Negre
- Institute for Genomics and Systems Biology, Department of Human Genetics, The University of Chicago, 900 East 57th Street, Chicago, IL 60637, USA
- Matthew L. Eaton
- Department of Pharmacology and Cancer Biology, Duke University Medical Center, Durham, NC 27710, USA
- Jane M. Landolin
- Department of Genome Dynamics, Lawrence Berkeley National Laboratory (LBNL), 1 Cyclotron Road, Berkeley, CA 94720 USA
- Christopher A. Bristow
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02140, USA
- Lijia Ma
- Institute for Genomics and Systems Biology, Department of Human Genetics, The University of Chicago, 900 East 57th Street, Chicago, IL 60637, USA
- Michael F. Lin
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02140, USA
- Stefan Washietl
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA
- Bradley I. Arshinoff
- Department of Molecular Genetics, University of Toronto, 27 King’s College Circle, Toronto, Ontario M5S 1A1, Canada
- Ontario Institute for Cancer Research, 101 College Street, Suite 800, Toronto, Ontario M5G 0A3, Canada
- Ferhat Ay
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA
- Computer and Information Science and Engineering, University of Florida, Gainesville, FL 32611, USA
- Patrick E. Meyer
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA
- Machine Learning Group, Université Libre de Bruxelles, CP212, Brussels 1050, Belgium
- Nicolas Robine
- Sloan-Kettering Institute, 1275 York Avenue, Box 252, New York, NY 10065, USA
- Luisa Di Stefano
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA
- Massachusetts General Hospital Cancer Center, Harvard Medical School, Charlestown, MA 02129, USA
- Eugene Berezikov
- Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences and University Medical Center Utrecht, Utrecht, Netherlands
- Christopher D. Brown
- Institute for Genomics and Systems Biology, Department of Human Genetics, The University of Chicago, 900 East 57th Street, Chicago, IL 60637, USA
- Rogerio Candeias
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA
- Joseph W. Carlson
- Department of Genome Dynamics, Lawrence Berkeley National Laboratory (LBNL), 1 Cyclotron Road, Berkeley, CA 94720 USA
- Adrian Carr
- Department of Genetics and Cambridge Systems Biology Centre, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK
- Irwin Jungreis
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02140, USA
- Daniel Marbach
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02140, USA
- Rachel Sealfon
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02140, USA
- Michael Y. Tolstorukov
- Center for Biomedical Informatics, Harvard Medical School, 10 Shattuck Street, Boston, MA 02115, USA
- Sebastian Will
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA
- Artyom A. Alekseyenko
- Department of Medicine and Department of Genetics, Brigham and Women’s Hospital, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
- Carlo Artieri
- Section of Developmental Genomics, Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), National Institutes of Health (NIH), Bethesda, MD 20892, USA
- Benjamin W. Booth
- Department of Genome Dynamics, Lawrence Berkeley National Laboratory (LBNL), 1 Cyclotron Road, Berkeley, CA 94720 USA
- Angela N. Brooks
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA
- Qi Dai
- Sloan-Kettering Institute, 1275 York Avenue, Box 252, New York, NY 10065, USA
- Carrie A. Davis
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Michael O. Duff
- Department of Genetics and Developmental Biology, University of Connecticut Stem Cell Institute, 263 Farmington, CT 06030–6403, USA
- Xin Feng
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Ontario Institute for Cancer Research, 101 College Street, Suite 800, Toronto, Ontario M5G 0A3, Canada
- Department of Biomedical Engineering, Stony Brook University, Stony Brook, NY 11794, USA
- Andrey A. Gorchakov
- Department of Medicine and Department of Genetics, Brigham and Women’s Hospital, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
- Tingting Gu
- Department of Biology CB-1137, Washington University, Saint Louis, MO 63130, USA
- Jorja G. Henikoff
- Sloan-Kettering Institute, 1275 York Avenue, Box 252, New York, NY 10065, USA
- Renhua Li
- Division of Extramural Research, National Human Genome Research Institute, NIH, 5635 Fishers Lane, Suite 4076, Bethesda, MD 20892–9305, USA
- Heather K. MacAlpine
- Department of Pharmacology and Cancer Biology, Duke University Medical Center, Durham, NC 27710, USA
- John Malone
- Section of Developmental Genomics, Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), National Institutes of Health (NIH), Bethesda, MD 20892, USA
- Aki Minoda
- Department of Genome Dynamics, Lawrence Berkeley National Laboratory (LBNL), 1 Cyclotron Road, Berkeley, CA 94720 USA
- Katsutomo Okamura
- Sloan-Kettering Institute, 1275 York Avenue, Box 252, New York, NY 10065, USA
- Marc Perry
- Ontario Institute for Cancer Research, 101 College Street, Suite 800, Toronto, Ontario M5G 0A3, Canada
- Sara K. Powell
- Department of Pharmacology and Cancer Biology, Duke University Medical Center, Durham, NC 27710, USA
- Nicole C. Riddle
- Department of Biology CB-1137, Washington University, Saint Louis, MO 63130, USA
- Akiko Sakai
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, 240 Longwood Avenue, Boston, MA 02115, USA
- Anastasia Samsonova
- Department of Genetics and Drosophila RNAi Screening Center, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
- Jeremy E. Sandler
- Department of Genome Dynamics, Lawrence Berkeley National Laboratory (LBNL), 1 Cyclotron Road, Berkeley, CA 94720 USA
- Yuri B. Schwartz
- Center for Biomedical Informatics, Harvard Medical School, 10 Shattuck Street, Boston, MA 02115, USA
- Noa Sher
- Whitehead Institute, Cambridge, MA 02142, USA
- Rebecca Spokony
- Institute for Genomics and Systems Biology, Department of Human Genetics, The University of Chicago, 900 East 57th Street, Chicago, IL 60637, USA
- David Sturgill
- Section of Developmental Genomics, Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), National Institutes of Health (NIH), Bethesda, MD 20892, USA
- Marijke van Baren
- Center for Genome Sciences, Washington University, 4444 Forest Park Boulevard, Saint Louis, MO 63108, USA
- Kenneth H. Wan
- Department of Genome Dynamics, Lawrence Berkeley National Laboratory (LBNL), 1 Cyclotron Road, Berkeley, CA 94720 USA
- Li Yang
- Department of Genetics and Developmental Biology, University of Connecticut Stem Cell Institute, 263 Farmington, CT 06030–6403, USA
- Charles Yu
- Department of Genome Dynamics, Lawrence Berkeley National Laboratory (LBNL), 1 Cyclotron Road, Berkeley, CA 94720 USA
- Elise Feingold
- Division of Extramural Research, National Human Genome Research Institute, NIH, 5635 Fishers Lane, Suite 4076, Bethesda, MD 20892–9305, USA
- Peter Good
- Division of Extramural Research, National Human Genome Research Institute, NIH, 5635 Fishers Lane, Suite 4076, Bethesda, MD 20892–9305, USA
- Mark Guyer
- Division of Extramural Research, National Human Genome Research Institute, NIH, 5635 Fishers Lane, Suite 4076, Bethesda, MD 20892–9305, USA
- Rebecca Lowdon
- Division of Extramural Research, National Human Genome Research Institute, NIH, 5635 Fishers Lane, Suite 4076, Bethesda, MD 20892–9305, USA
- Kami Ahmad
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, 240 Longwood Avenue, Boston, MA 02115, USA
- Justen Andrews
- Department of Biology, Indiana University, 1001 East 3rd Street, Bloomington, IN 47405–7005, USA
- Bonnie Berger
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02140, USA
- Steven E. Brenner
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA
- Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA
- Michael R. Brent
- Center for Genome Sciences, Washington University, 4444 Forest Park Boulevard, Saint Louis, MO 63108, USA
- Lucy Cherbas
- Department of Biology, Indiana University, 1001 East 3rd Street, Bloomington, IN 47405–7005, USA
- Center for Genomics and Bioinformatics, Indiana University, 1001 East 3rd Street, Bloomington, IN 47405–7005, USA
- Sarah C. R. Elgin
- Department of Biology CB-1137, Washington University, Saint Louis, MO 63130, USA
- Thomas R. Gingeras
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Affymetrix, Santa Clara, CA 95051, USA
- Robert Grossman
- Institute for Genomics and Systems Biology, Department of Human Genetics, The University of Chicago, 900 East 57th Street, Chicago, IL 60637, USA
- Roger A. Hoskins
- Department of Genome Dynamics, Lawrence Berkeley National Laboratory (LBNL), 1 Cyclotron Road, Berkeley, CA 94720 USA
- Thomas C. Kaufman
- Department of Biology, Indiana University, 1001 East 3rd Street, Bloomington, IN 47405–7005, USA
- William Kent
- Center for Biomolecular Science and Engineering, School of Engineering and Howard Hughes Medical Institute (HHMI), University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Mitzi I. Kuroda
- Department of Medicine and Department of Genetics, Brigham and Women’s Hospital, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
- Norbert Perrimon
- Department of Genetics and Drosophila RNAi Screening Center, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
- Vincenzo Pirrotta
- Department of Molecular Biology and Biochemistry, Rutgers University, Piscataway, NJ 08854, USA
- James W. Posakony
- Division of Biological Sciences, Section of Cell and Developmental Biology, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
- Bing Ren
- Division of Biological Sciences, Section of Cell and Developmental Biology, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
- Steven Russell
- Department of Genetics and Cambridge Systems Biology Centre, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK
- Peter Cherbas
- Department of Biology, Indiana University, 1001 East 3rd Street, Bloomington, IN 47405–7005, USA
- Center for Genomics and Bioinformatics, Indiana University, 1001 East 3rd Street, Bloomington, IN 47405–7005, USA
- Brenton R. Graveley
- Department of Genetics and Developmental Biology, University of Connecticut Stem Cell Institute, 263 Farmington, CT 06030–6403, USA
- Suzanna Lewis
- Genome Sciences Division, LBNL, 1 Cyclotron Road, Berkeley, CA 94720, USA
- Gos Micklem
- Department of Genetics and Cambridge Systems Biology Centre, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK
- Brian Oliver
- Section of Developmental Genomics, Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), National Institutes of Health (NIH), Bethesda, MD 20892, USA
- Peter J. Park
- Center for Biomedical Informatics, Harvard Medical School, 10 Shattuck Street, Boston, MA 02115, USA
- Susan E. Celniker
- Department of Genome Dynamics, Lawrence Berkeley National Laboratory (LBNL), 1 Cyclotron Road, Berkeley, CA 94720 USA
- Steven Henikoff
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, Seattle, WA 98109, USA
- Gary H. Karpen
- Department of Genome Dynamics, Lawrence Berkeley National Laboratory (LBNL), 1 Cyclotron Road, Berkeley, CA 94720 USA
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA
- Eric C. Lai
- Sloan-Kettering Institute, 1275 York Avenue, Box 252, New York, NY 10065, USA
- David M. MacAlpine
- Department of Pharmacology and Cancer Biology, Duke University Medical Center, Durham, NC 27710, USA
- Lincoln D. Stein
- Ontario Institute for Cancer Research, 101 College Street, Suite 800, Toronto, Ontario M5G 0A3, Canada
- Kevin P. White
- Institute for Genomics and Systems Biology, Department of Human Genetics, The University of Chicago, 900 East 57th Street, Chicago, IL 60637, USA
- Manolis Kellis
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02140, USA
|
43
|
Cherbas L, Willingham A, Zhang D, Yang L, Zou Y, Eads BD, Carlson JW, Landolin JM, Kapranov P, Dumais J, Samsonova A, Choi JH, Roberts J, Davis CA, Tang H, van Baren MJ, Ghosh S, Dobin A, Bell K, Lin W, Langton L, Duff MO, Tenney AE, Zaleski C, Brent MR, Hoskins RA, Kaufman TC, Andrews J, Graveley BR, Perrimon N, Celniker SE, Gingeras TR, Cherbas P. The transcriptional diversity of 25 Drosophila cell lines. Genome Res 2010; 21:301-14. [PMID: 21177962 DOI: 10.1101/gr.112961.110] [Citation(s) in RCA: 213] [Impact Index Per Article: 15.2] [Indexed: 12/17/2022]
Abstract
Drosophila melanogaster cell lines are important resources for cell biologists. Here, we catalog the expression of exons, genes, and unannotated transcriptional signals for 25 lines. Unannotated transcription is substantial (typically 19% of euchromatic signal). Conservatively, we identify 1405 novel transcribed regions; 684 of these appear to be new exons of neighboring, often distant, genes. Sixty-four percent of genes are expressed detectably in at least one line, but only 21% are detected in all lines. Each cell line expresses, on average, 5885 genes, including a common set of 3109. Expression levels vary over several orders of magnitude. Major signaling pathways are well represented: most differentiation pathways are "off" and survival/growth pathways "on." Roughly 50% of the genes expressed by each line are not part of the common set, and these show considerable individuality. Thirty-one percent are expressed at a higher level in at least one cell line than in any single developmental stage, suggesting that each line is enriched for genes characteristic of small sets of cells. Most remarkable is that imaginal disc-derived lines can generally be assigned, on the basis of expression, to small territories within developing discs. These mappings reveal unexpected stability of even fine-grained spatial determination. No two cell lines show identical transcription factor expression. We conclude that each line has retained features of an individual founder cell superimposed on a common "cell line" gene expression pattern.
Affiliation(s)
- Lucy Cherbas
- Center for Genomics and Bioinformatics, Indiana University, Bloomington, Indiana 47405, USA.
|
44
|
Yang A, Zhu Z, Kettenbach A, Kapranov P, McKeon F, Gingeras TR, Struhl K. Genome-wide mapping indicates that p73 and p63 co-occupy target sites and have similar DNA-binding profiles in vivo. PLoS One 2010; 5:e11572. [PMID: 20644729 PMCID: PMC2904373 DOI: 10.1371/journal.pone.0011572] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Received: 02/12/2010] [Accepted: 06/21/2010] [Indexed: 11/19/2022]
Abstract
Background: The p53 homologs, p63 and p73, share ∼85% amino acid identity in their DNA-binding domains, but they have distinct biological functions. Principal Findings: Using chromatin immunoprecipitation and high-resolution tiling arrays covering the human genome, we identify p73 DNA binding sites on a genome-wide level in ME180 human cervical carcinoma cells. Strikingly, the p73 binding profile is indistinguishable from the previously described binding profile for p63 in the same cells. Moreover, the p73:p63 binding ratio is similar at all genomic loci tested, suggesting that there are few, if any, targets that are specific for one of these factors. As assayed by sequential chromatin immunoprecipitation, p63 and p73 co-occupy DNA target sites in vivo, suggesting that p63 and p73 bind primarily as heterotetrameric complexes in ME180 cells. Conclusions: The observation that p63 and p73 associate with the same genomic targets suggests that their distinct biological functions are due to cell-type specific expression and/or protein domains that involve functions other than DNA binding.
Affiliation(s)
- Annie Yang
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, Massachusetts, United States of America
- Zhou Zhu
- Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
- Arminja Kettenbach
- Department of Cell Biology, Harvard Medical School, Boston, Massachusetts, United States of America
- Frank McKeon
- Department of Cell Biology, Harvard Medical School, Boston, Massachusetts, United States of America
- Kevin Struhl
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, Massachusetts, United States of America
|
45
|
Plessy C, Bertin N, Takahashi H, Simone R, Salimullah M, Lassmann T, Vitezic M, Severin J, Olivarius S, Lazarevic D, Hornig N, Orlando V, Bell I, Gao H, Dumais J, Kapranov P, Wang H, Davis CA, Gingeras TR, Kawai J, Daub CO, Hayashizaki Y, Gustincich S, Carninci P. Linking promoters to functional transcripts in small samples with nanoCAGE and CAGEscan. Nat Methods 2010; 7:528-34. [PMID: 20543846 PMCID: PMC2906222 DOI: 10.1038/nmeth.1470] [Citation(s) in RCA: 116] [Impact Index Per Article: 8.3] [Received: 12/07/2009] [Accepted: 05/05/2010] [Indexed: 01/18/2023]
Abstract
Large-scale sequencing projects have revealed an unexpected complexity in the origins, structures and functions of mammalian transcripts. Many loci are known to produce overlapping coding and noncoding RNAs with capped 5' ends that vary in size. Methods to identify the 5' ends of transcripts will facilitate the discovery of new promoters and 5' ends derived from secondary capping events. Such methods often require high input amounts of RNA not obtainable from highly refined samples such as tissue microdissections and subcellular fractions. Therefore, we developed nano-cap analysis of gene expression (nanoCAGE), a method that captures the 5' ends of transcripts from as little as 10 ng of total RNA, and CAGEscan, a mate-pair adaptation of nanoCAGE that captures the transcript 5' ends linked to a downstream region. Both of these methods allow further annotation-agnostic studies of the complex human transcriptome.
Affiliation(s)
- Charles Plessy
- RIKEN Yokohama Institute, Omics Science Center, Yokohama, Japan.
|
46
|
Makrythanasis P, Kapranov P, Bartoloni L, Reymond A, Deutsch S, Guigó R, Denoeud F, Drenkow J, Rossier C, Ariani F, Capra V, Excoffier L, Renieri A, Gingeras TR, Antonarakis SE. Variation in novel exons (RACEfrags) of the MECP2 gene in Rett syndrome patients and controls. Hum Mutat 2009; 30:E866-79. [PMID: 19562714 DOI: 10.1002/humu.21073] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/10/2023]
Abstract
The study of transcription using genomic tiling arrays has led to the identification of numerous additional exons. One example is the MECP2 gene on the X chromosome; using 5'RACE and RT-PCR in human tissues and cell lines, we have found more than 70 novel exons (RACEfrags) connecting to at least one annotated exon. We sequenced all MECP2-connected exons and flanking sequences in 3 groups: 46 patients with Rett syndrome and without mutations in the currently annotated exons of the MECP2 and CDKL5 genes; 32 patients with Rett syndrome and identified mutations in the MECP2 gene; and 100 control individuals from the same geoethnic group. Approximately 13 kb were sequenced per sample (2.4 Mb of DNA resequencing). A total of 75 individuals had novel rare variants (mostly private variants), but no statistically significant difference was found among the 3 groups. These results suggest that variants in the newly discovered exons may not contribute to Rett syndrome. Interestingly, however, there are about twice as many variants in the novel exons as in the flanking sequences (44 vs. 21 for approximately 1.3 Mb sequenced for each class of sequences, p=0.0025). Thus the evolutionary forces that shape these novel exons may be different from those of neighboring sequences.
Affiliation(s)
- Periklis Makrythanasis
- Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
|
47
|
Abstract
Deep sequencing of 'transcriptomes' (the collection of all RNA transcripts produced at a given time) from worms to humans reveals that some transcripts are composed of sequence segments that are not co-linear, with pieces of sequence coming from distant regions of DNA, even different chromosomes. Some of these 'chimaeric' transcripts are formed by genetic rearrangements, but others arise during post-transcriptional events. The 'trans-splicing' process in lower eukaryotes is well understood, but events in higher eukaryotes are not. The existence of such chimaeric RNAs has far-reaching implications for the potential information content of genomes and the way it is arranged.
Affiliation(s)
- Thomas R Gingeras
- Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, New York 11724, USA.
|
48
|
Efroni S, Duttagupta R, Cheng J, Dehghani H, Hoeppner DJ, Dash C, Bazett-Jones DP, Le Grice S, McKay RDG, Buetow KH, Gingeras TR, Misteli T, Meshorer E. Global transcription in pluripotent embryonic stem cells. Cell Stem Cell 2009. [PMID: 18462694 DOI: 10.1016/j.stem.2008.03.021] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/12/2023]
Abstract
The molecular mechanisms underlying pluripotency and lineage specification from embryonic stem cells (ESCs) are largely unclear. Differentiation pathways may be determined by the targeted activation of lineage-specific genes or by selective silencing of genome regions. Here we show that the ESC genome is transcriptionally globally hyperactive and undergoes large-scale silencing as cells differentiate. Normally silent repeat regions are active in ESCs, and tissue-specific genes are sporadically expressed at low levels. Whole-genome tiling arrays demonstrate widespread transcription in coding and noncoding regions in ESCs, whereas the transcriptional landscape becomes more discrete as differentiation proceeds. The transcriptional hyperactivity in ESCs is accompanied by disproportionate expression of chromatin-remodeling genes and the general transcription machinery. We propose that global transcription is a hallmark of pluripotent ESCs, contributing to their plasticity, and that lineage specification is driven by reduction of the transcribed portion of the genome.
Affiliation(s)
- Sol Efroni
- National Cancer Institute Center for Bioinformatics, National Institutes of Health, Rockville, MD 20852, USA
|
49
|
Efroni S, Duttagupta R, Cheng J, Dehghani H, Hoeppner DJ, Dash C, Bazett-Jones DP, Le Grice S, McKay RDG, Buetow KH, Gingeras TR, Misteli T, Meshorer E. Global transcription in pluripotent embryonic stem cells. Cell Stem Cell 2009; 2:437-47. [PMID: 18462694 DOI: 10.1016/j.stem.2008.03.021] [Citation(s) in RCA: 496] [Impact Index Per Article: 33.1] [Received: 06/18/2007] [Revised: 11/09/2007] [Accepted: 03/28/2008] [Indexed: 12/21/2022]
Abstract
The molecular mechanisms underlying pluripotency and lineage specification from embryonic stem cells (ESCs) are largely unclear. Differentiation pathways may be determined by the targeted activation of lineage-specific genes or by selective silencing of genome regions. Here we show that the ESC genome is transcriptionally globally hyperactive and undergoes large-scale silencing as cells differentiate. Normally silent repeat regions are active in ESCs, and tissue-specific genes are sporadically expressed at low levels. Whole-genome tiling arrays demonstrate widespread transcription in coding and noncoding regions in ESCs, whereas the transcriptional landscape becomes more discrete as differentiation proceeds. The transcriptional hyperactivity in ESCs is accompanied by disproportionate expression of chromatin-remodeling genes and the general transcription machinery. We propose that global transcription is a hallmark of pluripotent ESCs, contributing to their plasticity, and that lineage specification is driven by reduction of the transcribed portion of the genome.
Affiliation(s)
- Sol Efroni
- National Cancer Institute Center for Bioinformatics, National Institutes of Health, Rockville, MD 20852, USA
|
50
|
Djebali S, Kapranov P, Foissac S, Lagarde J, Reymond A, Ucla C, Wyss C, Drenkow J, Dumais E, Murray RR, Lin C, Szeto D, Denoeud F, Calvo M, Frankish A, Harrow J, Makrythanasis P, Vidal M, Salehi-Ashtiani K, Antonarakis SE, Gingeras TR, Guigó R. Efficient targeted transcript discovery via array-based normalization of RACE libraries. Nat Methods 2008; 5:629-35. [PMID: 18500348 PMCID: PMC2713501 DOI: 10.1038/nmeth.1216] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.3] [Received: 03/12/2008] [Accepted: 04/24/2008] [Indexed: 11/09/2022]
Abstract
RACE (Rapid Amplification of cDNA Ends) is a widely used approach for transcript identification. Random clone selection from the RACE mixture, however, is an ineffective sampling strategy if the dynamic range of transcript abundances is large. Here, we describe a strategy that uses array hybridization to improve sampling efficiency of human transcripts. The products of the RACE reaction are hybridized onto tiling arrays, and the exons detected are used to delineate a series of RT-PCR reactions, through which the original RACE mixture is segregated into simpler RT-PCR reactions. These are independently cloned, and randomly selected clones are sequenced. This approach is superior to direct cloning and sequencing of RACE products: it specifically targets novel transcripts, and often results in overall normalization of transcript abundances. We show theoretically and experimentally that this strategy leads indeed to efficient sampling of novel transcripts, and we investigate multiplexing it by pooling RACE reactions from multiple interrogated loci prior to hybridization.
Affiliation(s)
- Sarah Djebali
- Grup de Recerca en Informàtica Biomèdica, Institut Municipal d'Investigació Mèdica/Universitat Pompeu Fabra, Dr. Aiguader 88, 08003 Barcelona, Spain
|