1
Clawson H, Lee BT, Raney BJ, Barber GP, Casper J, Diekhans M, Fischer C, Gonzalez JN, Hinrichs AS, Lee CM, Nassar LR, Perez G, Wick B, Schmelter D, Speir ML, Armstrong J, Zweig AS, Kuhn RM, Kirilenko BM, Hiller M, Haussler D, Kent WJ, Haeussler M. GenArk: towards a million UCSC genome browsers. Genome Biol 2023; 24:217. [PMID: 37784172 PMCID: PMC10544498 DOI: 10.1186/s13059-023-03057-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2023] [Accepted: 09/11/2023] [Indexed: 10/04/2023]
Abstract
Interactive graphical genome browsers are essential tools in genomics, but they do not contain all the recent genome assemblies. We create the Genome Archive (GenArk) collection of UCSC Genome Browsers from NCBI assemblies. Built on our established track hub system, this enables fast visualization of annotations. Assemblies come with gene models, repeat masks, BLAT, and in silico PCR. Users can add annotations via track hubs and custom tracks. We can bulk-import third-party resources, demonstrated with TOGA and Ensembl gene models for hundreds of assemblies. Three thousand two hundred sixty-nine GenArk assemblies are listed at https://hgdownload.soe.ucsc.edu/hubs/ and can be searched for on the Genome Browser gateway page.
Affiliation(s)
- Hiram Clawson
  - Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
- Brian T Lee
  - Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
- Brian J Raney
  - Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
- Galt P Barber
  - Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
- Jonathan Casper
  - Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
- Mark Diekhans
  - Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
- Clay Fischer
  - Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
- Angie S Hinrichs
  - Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
- Christopher M Lee
  - Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
- Luis R Nassar
  - Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
- Gerardo Perez
  - Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
- Brittney Wick
  - Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
- Daniel Schmelter
  - Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
- Matthew L Speir
  - Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
- Joel Armstrong
  - Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
- Ann S Zweig
  - Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
- Robert M Kuhn
  - Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
- Bogdan M Kirilenko
  - LOEWE Centre for Translational Biodiversity Genomics, Senckenberganlage 25, 60325, Frankfurt, Germany
  - Senckenberg Research Institute, Senckenberganlage 25, 60325, Frankfurt, Germany
  - Institute of Cell Biology and Neuroscience, Faculty of Biosciences, Goethe University Frankfurt, Max-von-Laue-Str. 9, 60438, Frankfurt, Germany
- Michael Hiller
  - LOEWE Centre for Translational Biodiversity Genomics, Senckenberganlage 25, 60325, Frankfurt, Germany
  - Senckenberg Research Institute, Senckenberganlage 25, 60325, Frankfurt, Germany
  - Institute of Cell Biology and Neuroscience, Faculty of Biosciences, Goethe University Frankfurt, Max-von-Laue-Str. 9, 60438, Frankfurt, Germany
- David Haussler
  - Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
- W James Kent
  - Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
2
Hitz BC, Lee JW, Jolanki O, Kagda MS, Graham K, Sud P, Gabdank I, Strattan JS, Sloan CA, Dreszer T, Rowe LD, Podduturi NR, Malladi VS, Chan ET, Davidson JM, Ho M, Miyasato S, Simison M, Tanaka F, Luo Y, Whaling I, Hong EL, Lee BT, Sandstrom R, Rynes E, Nelson J, Nishida A, Ingersoll A, Buckley M, Frerker M, Kim DS, Boley N, Trout D, Dobin A, Rahmanian S, Wyman D, Balderrama-Gutierrez G, Reese F, Durand NC, Dudchenko O, Weisz D, Rao SSP, Blackburn A, Gkountaroulis D, Sadr M, Olshansky M, Eliaz Y, Nguyen D, Bochkov I, Shamim MS, Mahajan R, Aiden E, Gingeras T, Heath S, Hirst M, Kent WJ, Kundaje A, Mortazavi A, Wold B, Cherry JM. The ENCODE Uniform Analysis Pipelines. Res Sq 2023:rs.3.rs-3111932. [PMID: 37503119 PMCID: PMC10371165 DOI: 10.21203/rs.3.rs-3111932/v1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 07/29/2023]
Abstract
The Encyclopedia of DNA Elements (ENCODE) project is a collaborative effort to create a comprehensive catalog of functional elements in the human genome. The current database comprises more than 19,000 functional genomics experiments across more than 1,000 cell lines and tissues, using a wide array of experimental techniques to study the chromatin structure and the regulatory and transcriptional landscape of the Homo sapiens and Mus musculus genomes. All experimental data, metadata, and associated computational analyses created by the ENCODE consortium are submitted to the Data Coordination Center (DCC) for validation, tracking, storage, and distribution to community resources and the scientific community. The ENCODE project has engineered and distributed uniform processing pipelines to promote data provenance and reproducibility and to allow interoperability between genomic resources and other consortia. All data files, reference genome versions, software versions, and parameters used by the pipelines are captured and available via the ENCODE Portal. The pipeline code, developed using Docker and Workflow Description Language (WDL; https://openwdl.org/), is publicly available on GitHub, with images available on Docker Hub (https://hub.docker.com), enabling access for a diverse range of biomedical researchers. ENCODE pipelines maintained and used by the DCC can be installed to run on personal computers, local HPC clusters, or in cloud computing environments via Cromwell. Access to the pipelines and data via the cloud lets small labs use the data or software without access to institutional compute clusters. Standardization of the computational methodologies for analysis and quality control leads to comparable results from different ENCODE collections, a prerequisite for successful integrative analyses.
Affiliation(s)
- Benjamin C Hitz
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Jin-Wook Lee
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Otto Jolanki
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Meenakshi S Kagda
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Keenan Graham
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Paul Sud
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Idan Gabdank
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- J Seth Strattan
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Cricket A Sloan
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Timothy Dreszer
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Laurence D Rowe
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Nikhil R Podduturi
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Venkat S Malladi
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Esther T Chan
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Jean M Davidson
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Marcus Ho
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Stuart Miyasato
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Matt Simison
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Forrest Tanaka
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Yunhai Luo
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Ian Whaling
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Eurie L Hong
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Brian T Lee
  - Genomics Institute, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Richard Sandstrom
  - Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
- Eric Rynes
  - Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
- Jemma Nelson
  - Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
- Andrew Nishida
  - Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
- Alyssa Ingersoll
  - Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
- Michael Buckley
  - Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
- Mark Frerker
  - Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
- Daniel S Kim
  - Department of Genetics, Department of Computer Science, Stanford University, 240 Pasteur Drive, Palo Alto, CA 94304, USA
- Nathan Boley
  - Department of Genetics, Department of Computer Science, Stanford University, 240 Pasteur Drive, Palo Alto, CA 94304, USA
- Diane Trout
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125 USA
- Alex Dobin
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Sorena Rahmanian
  - Center for Complex Biological Systems, University of California, Irvine, Irvine, CA 92697, USA
- Dana Wyman
  - Center for Complex Biological Systems, University of California, Irvine, Irvine, CA 92697, USA
- Fairlie Reese
  - Center for Complex Biological Systems, University of California, Irvine, Irvine, CA 92697, USA
- Neva C Durand
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, TX 77030, USA
  - Department of Computer Science, Rice University, Houston, TX 77030, USA
- Olga Dudchenko
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- David Weisz
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Suhas S P Rao
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Structural Biology, Stanford University School of Medicine, Stanford, CA 94305, USA
  - Department of Medicine, University of California San Francisco, San Francisco, CA 94143, USA
- Alyssa Blackburn
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, TX 77030, USA
- Dimos Gkountaroulis
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, TX 77030, USA
- Mahdi Sadr
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Moshe Olshansky
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Yossi Eliaz
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Dat Nguyen
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Ivan Bochkov
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Muhammad Saad Shamim
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, TX 77030, USA
  - Department of Bioengineering, Rice University, Houston, TX 77030, USA
  - Medical Scientist Training Program, Baylor College of Medicine, Houston, TX 77030, USA
- Ragini Mahajan
  - Center for Theoretical Biological Physics, Rice University, Houston, TX 77030, USA
  - Department of BioSciences, Rice University, Houston, TX 77005, USA
- Erez Aiden
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, TX 77030, USA
- Tom Gingeras
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Simon Heath
  - CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain. Universitat Pompeu Fabra, Barcelona, Spain
- Martin Hirst
  - Michael Smith Laboratories, University of British Columbia, British Columbia, Canada
- W James Kent
  - Genomics Institute, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Anshul Kundaje
  - Department of Genetics, Department of Computer Science, Stanford University, 240 Pasteur Drive, Palo Alto, CA 94304, USA
- Ali Mortazavi
  - Center for Complex Biological Systems, University of California, Irvine, Irvine, CA 92697, USA
- Barbara Wold
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125 USA
- J Michael Cherry
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
3
Hitz BC, Jin-Wook L, Jolanki O, Kagda MS, Graham K, Sud P, Gabdank I, Strattan JS, Sloan CA, Dreszer T, Rowe LD, Podduturi NR, Malladi VS, Chan ET, Davidson JM, Ho M, Miyasato S, Simison M, Tanaka F, Luo Y, Whaling I, Hong EL, Lee BT, Sandstrom R, Rynes E, Nelson J, Nishida A, Ingersoll A, Buckley M, Frerker M, Kim DS, Boley N, Trout D, Dobin A, Rahmanian S, Wyman D, Balderrama-Gutierrez G, Reese F, Durand NC, Dudchenko O, Weisz D, Rao SSP, Blackburn A, Gkountaroulis D, Sadr M, Olshansky M, Eliaz Y, Nguyen D, Bochkov I, Shamim MS, Mahajan R, Aiden E, Gingeras T, Heath S, Hirst M, Kent WJ, Kundaje A, Mortazavi A, Wold B, Cherry JM. The ENCODE Uniform Analysis Pipelines. bioRxiv 2023:2023.04.04.535623. [PMID: 37066421 PMCID: PMC10104020 DOI: 10.1101/2023.04.04.535623] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Indexed: 04/18/2023]
Abstract
The Encyclopedia of DNA Elements (ENCODE) project is a collaborative effort to create a comprehensive catalog of functional elements in the human genome. The current database comprises more than 19,000 functional genomics experiments across more than 1,000 cell lines and tissues, using a wide array of experimental techniques to study the chromatin structure and the regulatory and transcriptional landscape of the Homo sapiens and Mus musculus genomes. All experimental data, metadata, and associated computational analyses created by the ENCODE consortium are submitted to the Data Coordination Center (DCC) for validation, tracking, storage, and distribution to community resources and the scientific community. The ENCODE project has engineered and distributed uniform processing pipelines to promote data provenance and reproducibility and to allow interoperability between genomic resources and other consortia. All data files, reference genome versions, software versions, and parameters used by the pipelines are captured and available via the ENCODE Portal. The pipeline code, developed using Docker and Workflow Description Language (WDL; https://openwdl.org/), is publicly available on GitHub, with images available on Docker Hub (https://hub.docker.com), enabling access for a diverse range of biomedical researchers. ENCODE pipelines maintained and used by the DCC can be installed to run on personal computers, local HPC clusters, or in cloud computing environments via Cromwell. Access to the pipelines and data via the cloud lets small labs use the data or software without access to institutional compute clusters. Standardization of the computational methodologies for analysis and quality control leads to comparable results from different ENCODE collections, a prerequisite for successful integrative analyses.
Affiliation(s)
- Benjamin C Hitz
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Lee Jin-Wook
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Otto Jolanki
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Meenakshi S Kagda
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Keenan Graham
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Paul Sud
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Idan Gabdank
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- J Seth Strattan
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Cricket A Sloan
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Timothy Dreszer
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Laurence D Rowe
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Nikhil R Podduturi
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Venkat S Malladi
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Esther T Chan
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Jean M Davidson
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Marcus Ho
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Stuart Miyasato
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Matt Simison
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Forrest Tanaka
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Yunhai Luo
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Ian Whaling
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Eurie L Hong
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Brian T Lee
  - Genomics Institute, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Richard Sandstrom
  - Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
- Eric Rynes
  - Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
- Jemma Nelson
  - Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
- Andrew Nishida
  - Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
- Alyssa Ingersoll
  - Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
- Michael Buckley
  - Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
- Mark Frerker
  - Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
- Daniel S Kim
  - Dept. of Genetics, Dept. of Computer Science, Stanford University, 240 Pasteur Drive, Palo Alto, CA 94304, USA
- Nathan Boley
  - Dept. of Genetics, Dept. of Computer Science, Stanford University, 240 Pasteur Drive, Palo Alto, CA 94304, USA
- Diane Trout
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125 USA
- Alex Dobin
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Sorena Rahmanian
  - Center for Complex Biological Systems, University of California, Irvine, Irvine, CA 92697, USA
- Dana Wyman
  - Center for Complex Biological Systems, University of California, Irvine, Irvine, CA 92697, USA
- Fairlie Reese
  - Center for Complex Biological Systems, University of California, Irvine, Irvine, CA 92697, USA
- Neva C Durand
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, TX 77030, USA
  - Department of Computer Science, Rice University, Houston, TX 77030, USA
- Olga Dudchenko
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- David Weisz
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Suhas S P Rao
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Structural Biology, Stanford University School of Medicine, Stanford, CA 94305, USA
  - Department of Medicine, University of California San Francisco, San Francisco, CA 94143, USA
- Alyssa Blackburn
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, TX 77030, USA
- Dimos Gkountaroulis
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, TX 77030, USA
- Mahdi Sadr
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Moshe Olshansky
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Yossi Eliaz
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Dat Nguyen
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Ivan Bochkov
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Muhammad Saad Shamim
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, TX 77030, USA
  - Department of Bioengineering, Rice University, Houston, TX 77030, USA
  - Medical Scientist Training Program, Baylor College of Medicine, Houston, TX 77030, USA
- Ragini Mahajan
  - Center for Theoretical Biological Physics, Rice University, Houston, TX 77030, USA
  - Department of BioSciences, Rice University, Houston, TX 77005, USA
- Erez Aiden
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, TX 77030, USA
- Tom Gingeras
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Simon Heath
  - CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain. Universitat Pompeu Fabra, Barcelona, Spain
- Martin Hirst
  - Michael Smith Laboratories, University of British Columbia, British Columbia, Canada
- W James Kent
  - Genomics Institute, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Anshul Kundaje
  - Dept. of Genetics, Dept. of Computer Science, Stanford University, 240 Pasteur Drive, Palo Alto, CA 94304, USA
- Ali Mortazavi
  - Center for Complex Biological Systems, University of California, Irvine, Irvine, CA 92697, USA
- Barbara Wold
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125 USA
- J Michael Cherry
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
4
Clawson H, Lee BT, Raney BJ, Barber GP, Casper J, Diekhans M, Fischer C, Gonzalez JN, Hinrichs AS, Lee CM, Nassar LR, Perez G, Wick B, Schmelter D, Speir ML, Armstrong J, Zweig AS, Kuhn RM, Kirilenko BM, Hiller M, Haussler D, Kent WJ, Haeussler M. GenArk: Towards a million UCSC Genome Browsers. Res Sq 2023:rs.3.rs-2697398. [PMID: 37066427 PMCID: PMC10104252 DOI: 10.21203/rs.3.rs-2697398/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/18/2023]
Abstract
Interactive graphical genome browsers are essential tools for biologists working with DNA sequences. Although tens of thousands of new genome assemblies have become available over the last decade, accessibility is limited by the work involved in manually creating browsers and curating annotations. The results can push the limits of data storage infrastructure. To facilitate managing this increasing number of genome assemblies, we created the Genome Archive (GenArk) collection of UCSC Genome Browsers from assemblies hosted at NCBI(1). Built on our established assembly hub system, this collection enables fast, on-demand visualization of chromosome regions without requiring a database server. Available annotations include gene models, some mapped through whole-genome alignments, repeat masks, GC content, and others. We also modified our popular BLAT(2) aligner and in silico PCR to support a large number of genomes using limited RAM. Users can upload additional annotations themselves via track hubs(3) and custom tracks. We can import more annotations in bulk from third-party resources, demonstrated here with TOGA(4) gene models. 2,430 GenArk assemblies are listed at https://hgdownload.soe.ucsc.edu/hubs/ and can be found by searching on the main UCSC gateway page. We will continue to add high-quality human assemblies; for other organisms, we look forward to receiving requests from the research community for more browsers and whole-genome alignments via http://genome.ucsc.edu/assemblyRequest.html.
Affiliation(s)
- Hiram Clawson
  - Genomics Institute, University of California, Santa Cruz, CA 95064, USA
- Brian T Lee
  - Genomics Institute, University of California, Santa Cruz, CA 95064, USA
- Brian J Raney
  - Genomics Institute, University of California, Santa Cruz, CA 95064, USA
- Galt P Barber
  - Genomics Institute, University of California, Santa Cruz, CA 95064, USA
- Jonathan Casper
  - Genomics Institute, University of California, Santa Cruz, CA 95064, USA
- Mark Diekhans
  - Genomics Institute, University of California, Santa Cruz, CA 95064, USA
- Clay Fischer
  - Genomics Institute, University of California, Santa Cruz, CA 95064, USA
- Angie S Hinrichs
  - Genomics Institute, University of California, Santa Cruz, CA 95064, USA
- Christopher M Lee
  - Genomics Institute, University of California, Santa Cruz, CA 95064, USA
- Luis R Nassar
  - Genomics Institute, University of California, Santa Cruz, CA 95064, USA
- Gerardo Perez
  - Genomics Institute, University of California, Santa Cruz, CA 95064, USA
- Brittney Wick
  - Genomics Institute, University of California, Santa Cruz, CA 95064, USA
- Daniel Schmelter
  - Genomics Institute, University of California, Santa Cruz, CA 95064, USA
- Matthew L Speir
  - Genomics Institute, University of California, Santa Cruz, CA 95064, USA
- Joel Armstrong
  - Genomics Institute, University of California, Santa Cruz, CA 95064, USA
- Ann S Zweig
  - Genomics Institute, University of California, Santa Cruz, CA 95064, USA
- Robert M Kuhn
  - Genomics Institute, University of California, Santa Cruz, CA 95064, USA
- Bogdan M Kirilenko
  - LOEWE Centre for Translational Biodiversity Genomics, Senckenberganlage 25, 60325 Frankfurt, Germany
  - Senckenberg Research Institute, Senckenberganlage 25, 60325 Frankfurt, Germany
  - Institute of Cell Biology and Neuroscience, Faculty of Biosciences, Goethe University Frankfurt, Max-von-Laue-Str. 9, 60438 Frankfurt, Germany
- Michael Hiller
  - LOEWE Centre for Translational Biodiversity Genomics, Senckenberganlage 25, 60325 Frankfurt, Germany
  - Senckenberg Research Institute, Senckenberganlage 25, 60325 Frankfurt, Germany
  - Institute of Cell Biology and Neuroscience, Faculty of Biosciences, Goethe University Frankfurt, Max-von-Laue-Str. 9, 60438 Frankfurt, Germany
- David Haussler
  - Genomics Institute, University of California, Santa Cruz, CA 95064, USA
- W James Kent
  - Genomics Institute, University of California, Santa Cruz, CA 95064, USA
5
Benet-Pagès A, Rosenbloom KR, Nassar LR, Lee CM, Raney BJ, Clawson H, Schmelter D, Casper J, Gonzalez JN, Perez G, Lee BT, Zweig AS, Kent WJ, Haeussler M, Kuhn RM. Variant Interpretation: UCSC Genome Browser Recommended Track Sets. Hum Mutat 2022; 43:998-1011. [PMID: 35088925 PMCID: PMC9288501 DOI: 10.1002/humu.24335] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2021] [Revised: 11/30/2021] [Accepted: 01/25/2022] [Indexed: 11/11/2022]
Abstract
The UCSC Genome Browser has been an important tool for genomics and clinical genetics since the sequence of the human genome was first released in 2000. As it has grown in scope to display more types of data, it has also grown more complicated. The data, which are dispersed at many locations worldwide, are collected into one view on the Browser, where the graphical interface presents the data in one location. This supports the expertise of the researcher to interpret variants in the genome. Because the analysis of Single Nucleotide Variants (SNVs) and Copy Number Variants (CNVs) requires interpretation of data at very different genomic scales, different data resources are required. We present here several Recommended Track Sets designed to facilitate the interpretation of variants in the clinic, offering quick access to datasets relevant to the appropriate scale.
Affiliation(s)
- Anna Benet-Pagès
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, 95064, USA; Medical Genetics Center (MGZ), Munich, Germany
| | - Kate R Rosenbloom
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, 95064, USA
| | - Luis R Nassar
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, 95064, USA
| | - Christopher M Lee
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, 95064, USA
| | - Brian J Raney
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, 95064, USA
| | - Hiram Clawson
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, 95064, USA
| | - Daniel Schmelter
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, 95064, USA
| | - Jonathan Casper
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, 95064, USA
| | | | - Gerardo Perez
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, 95064, USA
| | - Brian T Lee
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, 95064, USA
| | - Ann S Zweig
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, 95064, USA
| | - W James Kent
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, 95064, USA
| | | | - Robert M Kuhn
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, 95064, USA
| |
|
6
|
Speir ML, Bhaduri A, Markov NS, Moreno P, Nowakowski TJ, Papatheodorou I, Pollen AA, Raney BJ, Seninge L, Kent WJ, Haeussler M. UCSC Cell Browser: Visualize Your Single-Cell Data. Bioinformatics 2021; 37:4578-4580. [PMID: 34244710 PMCID: PMC8652023 DOI: 10.1093/bioinformatics/btab503] [Citation(s) in RCA: 79] [Impact Index Per Article: 26.3] [Received: 06/03/2021] [Accepted: 07/05/2021] [Indexed: 02/07/2023] Open
Abstract
Summary: As the use of single-cell technologies has grown, so has the need for tools to explore these large, complicated datasets. The UCSC Cell Browser is a tool that allows scientists to visualize gene expression and metadata annotation distribution throughout a single-cell dataset or multiple datasets. Availability and implementation: We provide the UCSC Cell Browser as a free website where scientists can explore a growing collection of single-cell datasets and a freely available Python package for scientists to create stable, self-contained visualizations for their own single-cell datasets. Learn more at https://cells.ucsc.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Matthew L Speir
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Aparna Bhaduri
- Department of Biological Chemistry, University of California, Los Angeles, CA, USA
| | - Nikolay S Markov
- Division of Pulmonary and Critical Care, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Pablo Moreno
- EMBL-EBI European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Tomasz J Nowakowski
- The Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California San Francisco, San Francisco, CA, USA; Department of Anatomy, University of California San Francisco, San Francisco, CA, USA; Department of Psychiatry and Behavioral Sciences, University of California San Francisco, San Francisco, CA, USA; Chan Zuckerberg Biohub, San Francisco, CA, USA
| | - Irene Papatheodorou
- EMBL-EBI European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Alex A Pollen
- The Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California San Francisco, San Francisco, CA, USA; Department of Neurology, University of California San Francisco, San Francisco, CA, USA
| | - Brian J Raney
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Lucas Seninge
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA; Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
| | - W James Kent
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | | |
|
7
|
Navarro Gonzalez J, Zweig AS, Speir ML, Schmelter D, Rosenbloom KR, Raney BJ, Powell CC, Nassar LR, Maulding ND, Lee CM, Lee BT, Hinrichs AS, Fyfe AC, Fernandes JD, Diekhans M, Clawson H, Casper J, Benet-Pagès A, Barber GP, Haussler D, Kuhn RM, Haeussler M, Kent WJ. The UCSC Genome Browser database: 2021 update. Nucleic Acids Res 2021; 49:D1046-D1057. [PMID: 33221922 PMCID: PMC7779060 DOI: 10.1093/nar/gkaa1070] [Citation(s) in RCA: 273] [Impact Index Per Article: 91.0] [Received: 09/21/2020] [Revised: 10/19/2020] [Accepted: 11/18/2020] [Indexed: 12/11/2022] Open
Abstract
For more than two decades, the UCSC Genome Browser database (https://genome.ucsc.edu) has provided high-quality genomics data visualization and genome annotations to the research community. As the field of genomics grows and more data become available, new modes of display are required to accommodate new technologies. New features released this past year include a Hi-C heatmap display, a phased family trio display for VCF files, and various track visualization improvements. Striving to keep data up-to-date, new updates to gene annotations include GENCODE Genes, NCBI RefSeq Genes, and Ensembl Genes. New data tracks added for human and mouse genomes include the ENCODE registry of candidate cis-regulatory elements, promoters from the Eukaryotic Promoter Database, and NCBI RefSeq Select and Matched Annotation from NCBI and EMBL-EBI (MANE). Within weeks of learning about the outbreak of coronavirus, UCSC released a genome browser, with detailed annotation tracks, for the SARS-CoV-2 RNA reference assembly.
Affiliation(s)
| | - Ann S Zweig
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Matthew L Speir
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Daniel Schmelter
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Kate R Rosenbloom
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Brian J Raney
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Conner C Powell
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Luis R Nassar
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Nathan D Maulding
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Christopher M Lee
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Brian T Lee
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Angie S Hinrichs
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Alastair C Fyfe
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Jason D Fernandes
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Mark Diekhans
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Hiram Clawson
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Jonathan Casper
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Anna Benet-Pagès
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA; Medical Genetics Center (MGZ), Munich, Germany
| | - Galt P Barber
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - David Haussler
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA; Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Robert M Kuhn
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Maximilian Haeussler
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - W James Kent
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| |
|
8
|
Fernandes JD, Hinrichs AS, Clawson H, Gonzalez JN, Lee BT, Nassar LR, Raney BJ, Rosenbloom KR, Nerli S, Rao AA, Schmelter D, Fyfe A, Maulding N, Zweig AS, Lowe TM, Ares M, Corbet-Detig R, Kent WJ, Haussler D, Haeussler M. The UCSC SARS-CoV-2 Genome Browser. Nat Genet 2020; 52:991-998. [PMID: 32908258 PMCID: PMC8016453 DOI: 10.1038/s41588-020-0700-8] [Citation(s) in RCA: 58] [Impact Index Per Article: 14.5] [Indexed: 11/13/2022]
Abstract
Background: Researchers are generating molecular data pertaining to the SARS-CoV-2 RNA genome and its proteins at an unprecedented rate during the COVID-19 pandemic. As a result, there is a critical need for rapid and continuously updated access to the latest molecular data in a format in which all data can be quickly cross-referenced and compared. We adapted our genome browser visualization tool to the viral genome for this purpose. Molecular data, curated from published studies or from database submissions, are mapped to the viral genome and grouped together into “annotation tracks” where they can be visualized along the linear map of the viral genome sequence and programmatically downloaded in standard format for analysis. Results: The UCSC Genome Browser for SARS-CoV-2 (https://genome.ucsc.edu/covid19.html) provides continuously updated access to the mutations in the many thousands of SARS-CoV-2 genomes deposited in GISAID and the international nucleotide sequencing databases, displayed alongside phylogenetic trees. These data are augmented with alignments of bat, pangolin, and other animal and human coronavirus genomes, including per-base evolutionary rate analysis. All available annotations are cross-referenced on the virus genome, including those from major databases (PDB, RFAM, IEDB, UniProt) as well as up-to-date individual results from preprints. Annotated data include predicted and validated immune epitopes, promising antibodies, RT-PCR and sequencing primers, CRISPR guides (from research, diagnostics, vaccines, and therapies), and points of interaction between human and viral genes. As a community resource, any user can add manual annotations which are quality checked and shared publicly on the browser the next day. Conclusions: We invite all investigators to contribute additional data and annotations to this resource to accelerate research and development activities globally. Contact us at genome-www@soe.ucsc.edu with data suggestions or requests for support for adding data. Rapid sharing of data will accelerate SARS-CoV-2 research, especially when researchers take time to integrate their data with those from other labs on a widely-used community browser platform with standardized machine-readable data formats, such as the SARS-CoV-2 Genome Browser.
Affiliation(s)
- Jason D Fernandes
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Howard Hughes Medical Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Angie S Hinrichs
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Hiram Clawson
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | | | - Brian T Lee
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Luis R Nassar
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Brian J Raney
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Kate R Rosenbloom
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Santrupti Nerli
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Arjun A Rao
- ImmunoX Initiative, University of California San Francisco, San Francisco, CA, USA
| | - Daniel Schmelter
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Alastair Fyfe
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Nathan Maulding
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Ann S Zweig
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Todd M Lowe
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Center for Molecular Biology of RNA, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Manuel Ares
- Molecular, Cell and Developmental Biology, University of California, Santa Cruz, Santa Cruz, CA, USA
- Center for Molecular Biology of RNA, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Russ Corbet-Detig
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - W James Kent
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - David Haussler
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA.
- Howard Hughes Medical Institute, University of California, Santa Cruz, Santa Cruz, CA, USA.
- Center for Molecular Biology of RNA, University of California Santa Cruz, Santa Cruz, CA, USA.
| | | |
|
9
|
Lee CM, Barber GP, Casper J, Clawson H, Diekhans M, Gonzalez JN, Hinrichs AS, Lee BT, Nassar LR, Powell CC, Raney BJ, Rosenbloom KR, Schmelter D, Speir ML, Zweig AS, Haussler D, Haeussler M, Kuhn RM, Kent WJ. UCSC Genome Browser enters 20th year. Nucleic Acids Res 2020; 48:D756-D761. [PMID: 31691824 PMCID: PMC7145642 DOI: 10.1093/nar/gkz1012] [Citation(s) in RCA: 77] [Impact Index Per Article: 19.3] [Received: 09/11/2019] [Revised: 10/16/2019] [Accepted: 10/25/2019] [Indexed: 12/27/2022] Open
Abstract
The University of California Santa Cruz Genome Browser website (https://genome.ucsc.edu) enters its 20th year of providing high-quality genomics data visualization and genome annotations to the research community. In the past year, we have added a new option to our web BLAT tool that allows search against all genomes, a single-cell expression viewer (https://cells.ucsc.edu), a ‘lollipop’ plot display mode for high-density variation data, a RESTful API for data extraction and a custom-track backup feature. New datasets include Tabula Muris single-cell expression data, GeneHancer regulatory annotations, The Cancer Genome Atlas Pan-Cancer variants, Genome Reference Consortium Patch sequences, new ENCODE transcription factor binding site peaks and clusters, the Database of Genomic Variants Gold Standard Variants, Genomenon Mastermind variants and three new multi-species alignment tracks.
Affiliation(s)
- Christopher M Lee
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Galt P Barber
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Jonathan Casper
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Hiram Clawson
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Mark Diekhans
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | | | - Angie S Hinrichs
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Brian T Lee
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Luis R Nassar
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Conner C Powell
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Brian J Raney
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Kate R Rosenbloom
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Daniel Schmelter
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Matthew L Speir
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Ann S Zweig
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - David Haussler
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA; Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Maximilian Haeussler
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Robert M Kuhn
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - W James Kent
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| |
|
10
|
Haeussler M, Zweig AS, Tyner C, Speir ML, Rosenbloom KR, Raney BJ, Lee CM, Lee BT, Hinrichs AS, Gonzalez JN, Gibson D, Diekhans M, Clawson H, Casper J, Barber GP, Haussler D, Kuhn RM, Kent WJ. The UCSC Genome Browser database: 2019 update. Nucleic Acids Res 2019; 47:D853-D858. [PMID: 30407534 PMCID: PMC6323953 DOI: 10.1093/nar/gky1095] [Citation(s) in RCA: 505] [Impact Index Per Article: 126.3] [Received: 09/14/2018] [Accepted: 10/19/2018] [Indexed: 01/17/2023] Open
Abstract
The UCSC Genome Browser (https://genome.ucsc.edu) is a graphical viewer for exploring genome annotations. For almost two decades, the Browser has provided visualization tools for genetics and molecular biology and continues to add new data and features. This year, we added a new tool that lets users interactively arrange existing graphing tracks into new groups. Other software additions include new formats for chromosome interactions, a ChIP-Seq peak display for track hubs and improved support for HGVS. On the annotation side, we have added gnomAD, TCGA expression, RefSeq Functional elements, GTEx eQTLs, CRISPR Guides, SNPpedia and created a 30-way primate alignment on the human genome. Nine assemblies now have RefSeq-mapped gene models.
Affiliation(s)
- Maximilian Haeussler
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Ann S Zweig
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Cath Tyner
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Matthew L Speir
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Kate R Rosenbloom
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Brian J Raney
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Christopher M Lee
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Brian T Lee
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Angie S Hinrichs
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | | | - David Gibson
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Mark Diekhans
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Hiram Clawson
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Jonathan Casper
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Galt P Barber
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - David Haussler
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA; Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Robert M Kuhn
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - W James Kent
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| |
|
11
|
Casper J, Zweig AS, Villarreal C, Tyner C, Speir ML, Rosenbloom KR, Raney BJ, Lee CM, Lee BT, Karolchik D, Hinrichs AS, Haeussler M, Guruvadoo L, Navarro Gonzalez J, Gibson D, Fiddes IT, Eisenhart C, Diekhans M, Clawson H, Barber GP, Armstrong J, Haussler D, Kuhn RM, Kent WJ. The UCSC Genome Browser database: 2018 update. Nucleic Acids Res 2018; 46:D762-D769. [PMID: 29106570 PMCID: PMC5753355 DOI: 10.1093/nar/gkx1020] [Citation(s) in RCA: 338] [Impact Index Per Article: 67.6] [Received: 09/15/2017] [Accepted: 10/18/2017] [Indexed: 12/14/2022] Open
Abstract
The UCSC Genome Browser (https://genome.ucsc.edu) provides a web interface for exploring annotated genome assemblies. The assemblies and annotation tracks are updated on an ongoing basis—12 assemblies and more than 28 tracks were added in the past year. Two recent additions are a display of CRISPR/Cas9 guide sequences and an interactive navigator for gene interactions. Other upgrades from the past year include a command-line version of the Variant Annotation Integrator, support for Human Genome Variation Society variant nomenclature input and output, and a revised highlighting tool that now supports multiple simultaneous regions and colors.
Affiliation(s)
- Jonathan Casper
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Ann S Zweig
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Chris Villarreal
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Cath Tyner
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Matthew L Speir
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Kate R Rosenbloom
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Brian J Raney
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Christopher M Lee
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Brian T Lee
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Donna Karolchik
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Angie S Hinrichs
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Maximilian Haeussler
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Luvina Guruvadoo
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | | | - David Gibson
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Ian T Fiddes
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | | | - Mark Diekhans
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Hiram Clawson
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Galt P Barber
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Joel Armstrong
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - David Haussler
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA; Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Robert M Kuhn
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - W James Kent
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| |
|
12
|
Nowakowski TJ, Bhaduri A, Pollen AA, Alvarado B, Mostajo-Radji MA, Di Lullo E, Haeussler M, Sandoval-Espinosa C, Liu SJ, Velmeshev D, Ounadjela JR, Shuga J, Wang X, Lim DA, West JA, Leyrat AA, Kent WJ, Kriegstein AR. Spatiotemporal gene expression trajectories reveal developmental hierarchies of the human cortex. Science 2017; 358:1318-1323. [PMID: 29217575 DOI: 10.1126/science.aap8809] [Citation(s) in RCA: 516] [Impact Index Per Article: 86.0] [Received: 09/05/2017] [Accepted: 11/10/2017] [Indexed: 12/16/2022]
Abstract
Systematic analyses of spatiotemporal gene expression trajectories during organogenesis have been challenging because diverse cell types at different stages of maturation and differentiation coexist in the emerging tissues. We identified discrete cell types as well as temporally and spatially restricted trajectories of radial glia maturation and neurogenesis in developing human telencephalon. These lineage-specific trajectories reveal the expression of neurogenic transcription factors in early radial glia and enriched activation of mammalian target of rapamycin signaling in outer radial glia. Across cortical areas, modest transcriptional differences among radial glia cascade into robust typological distinctions among maturing neurons. Together, our results support a mixed model of topographical, typological, and temporal hierarchies governing cell-type diversity in the developing human telencephalon, including distinct excitatory lineages emerging in rostral and caudal cerebral cortex.
Affiliation(s)
- Tomasz J Nowakowski
- Department of Neurology, University of California, San Francisco (UCSF), San Francisco, CA, USA; The Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, UCSF, San Francisco, CA, USA; Department of Anatomy, UCSF, San Francisco, CA, USA
| | - Aparna Bhaduri
- Department of Neurology, University of California, San Francisco (UCSF), San Francisco, CA, USA; The Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, UCSF, San Francisco, CA, USA
| | - Alex A Pollen
- Department of Neurology, University of California, San Francisco (UCSF), San Francisco, CA, USA; The Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, UCSF, San Francisco, CA, USA
| | - Beatriz Alvarado
- Department of Neurology, University of California, San Francisco (UCSF), San Francisco, CA, USA; The Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, UCSF, San Francisco, CA, USA
| | - Mohammed A Mostajo-Radji
- Department of Neurology, University of California, San Francisco (UCSF), San Francisco, CA, USA; The Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, UCSF, San Francisco, CA, USA
| | - Elizabeth Di Lullo
- Department of Neurology, University of California, San Francisco (UCSF), San Francisco, CA, USA; The Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, UCSF, San Francisco, CA, USA
| | | | - Carmen Sandoval-Espinosa
- Department of Neurology, University of California, San Francisco (UCSF), San Francisco, CA, USA; The Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, UCSF, San Francisco, CA, USA
| | - Siyuan John Liu
- The Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, UCSF, San Francisco, CA, USA; Department of Neurosurgery, UCSF, San Francisco, CA, USA
| | - Dmitry Velmeshev
- Department of Neurology, University of California, San Francisco (UCSF), San Francisco, CA, USA; The Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, UCSF, San Francisco, CA, USA
| | - Johain Ryad Ounadjela
- The Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, UCSF, San Francisco, CA, USA; Department of Anatomy, UCSF, San Francisco, CA, USA
| | - Joe Shuga
- New Technologies, Fluidigm, South San Francisco, CA, USA
| | - Xiaohui Wang
- New Technologies, Fluidigm, South San Francisco, CA, USA
| | - Daniel A Lim
- The Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, UCSF, San Francisco, CA, USA; Department of Neurosurgery, UCSF, San Francisco, CA, USA
| | - Jay A West
- New Technologies, Fluidigm, South San Francisco, CA, USA
| | - Anne A Leyrat
- New Technologies, Fluidigm, South San Francisco, CA, USA
| | - W James Kent
- Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Arnold R Kriegstein
- Department of Neurology, University of California, San Francisco (UCSF), San Francisco, CA, USA; The Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, UCSF, San Francisco, CA, USA; Department of Anatomy, UCSF, San Francisco, CA, USA
| |
|
13
|
Tyner C, Barber GP, Casper J, Clawson H, Diekhans M, Eisenhart C, Fischer CM, Gibson D, Gonzalez JN, Guruvadoo L, Haeussler M, Heitner S, Hinrichs AS, Karolchik D, Lee BT, Lee CM, Nejad P, Raney BJ, Rosenbloom KR, Speir ML, Villarreal C, Vivian J, Zweig AS, Haussler D, Kuhn RM, Kent WJ. The UCSC Genome Browser database: 2017 update. Nucleic Acids Res 2017; 45:D626-D634. [PMID: 27899642 PMCID: PMC5210591 DOI: 10.1093/nar/gkw1134] [Citation(s) in RCA: 197] [Impact Index Per Article: 28.1] [Received: 09/14/2016] [Revised: 10/17/2016] [Accepted: 10/31/2016] [Indexed: 12/14/2022] Open
Abstract
Since its 2001 debut, the University of California, Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu/) team has provided continuous support to the international genomics and biomedical communities through a web-based, open source platform designed for the fast, scalable display of sequence alignments and annotations landscaped against a vast collection of quality reference genome assemblies. The browser's publicly accessible databases are the backbone of a rich, integrated bioinformatics tool suite that includes a graphical interface for data queries and downloads, alignment programs, command-line utilities and more. This year's highlights include newly designed home and gateway pages; a new 'multi-region' track display configuration for exon-only, gene-only and custom regions visualization; new genome browsers for three species (brown kiwi, crab-eating macaque and Malayan flying lemur); eight updated genome assemblies; extended support for new data types such as CRAM, RNA-seq expression data and long-range chromatin interaction pairs; and the unveiling of a new supported mirror site in Japan.
Affiliation(s)
- Cath Tyner
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Galt P Barber
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Jonathan Casper
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Hiram Clawson
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Mark Diekhans
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | | | - Clayton M Fischer
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - David Gibson
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | | | - Luvina Guruvadoo
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Maximilian Haeussler
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Steve Heitner
- Emory University School of Medicine, Atlanta, GA 30322, USA
| | - Angie S Hinrichs
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Donna Karolchik
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Brian T Lee
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Christopher M Lee
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Parisa Nejad
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Brian J Raney
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Kate R Rosenbloom
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Matthew L Speir
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Chris Villarreal
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - John Vivian
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Ann S Zweig
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - David Haussler
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Howard Hughes Medical Institute, University of California Santa Cruz, CA 95064, USA
| | - Robert M Kuhn
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - W James Kent
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| |
|
14
|
Hinrichs AS, Raney BJ, Speir ML, Rhead B, Casper J, Karolchik D, Kuhn RM, Rosenbloom KR, Zweig AS, Haussler D, Kent WJ. UCSC Data Integrator and Variant Annotation Integrator. Bioinformatics 2016; 32:1430-2. [PMID: 26740527 PMCID: PMC4848401 DOI: 10.1093/bioinformatics/btv766] [Citation(s) in RCA: 50] [Impact Index Per Article: 6.3] [Received: 09/22/2015] [Accepted: 12/28/2015] [Indexed: 01/29/2023] Open
Abstract
Summary: Two new tools on the UCSC Genome Browser web site provide improved ways of combining information from multiple datasets, optionally including the user's own custom track data and/or data from track hubs. The Data Integrator combines columns from multiple data tracks, showing all items from the first track along with overlapping items from the other tracks. The Variant Annotation Integrator is tailored to adding functional annotations to variant calls; it offers a more restricted set of underlying data tracks but adds predictions of each variant's consequences for any overlapping or nearby gene transcript. When available, it optionally adds additional annotations including effect prediction scores from dbNSFP for missense mutations, ENCODE regulatory summary tracks and conservation scores. Availability and implementation: The web tools are freely available at http://genome.ucsc.edu/ and the underlying database is available for download at http://hgdownload.cse.ucsc.edu/. The software (written in C and JavaScript) is available from https://genome-store.ucsc.edu/ and is freely available for academic and non-profit usage; commercial users must obtain a license. Contact: angie@soe.ucsc.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Angie S Hinrichs
- Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Brian J Raney
- Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Matthew L Speir
- Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Brooke Rhead
- Computational Biology Graduate Group, University of California, Berkeley, CA, USA
| | - Jonathan Casper
- Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Donna Karolchik
- Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Robert M Kuhn
- Genomics Institute, University of California, Santa Cruz, CA, USA
| | | | - Ann S Zweig
- Genomics Institute, University of California, Santa Cruz, CA, USA
| | - David Haussler
- Genomics Institute, University of California, Santa Cruz, CA, USA
- Howard Hughes Medical Institute, University of California, Santa Cruz, CA, USA
| | - W James Kent
- Genomics Institute, University of California, Santa Cruz, CA, USA
| |
|
15
|
Speir ML, Zweig AS, Rosenbloom KR, Raney BJ, Paten B, Nejad P, Lee BT, Learned K, Karolchik D, Hinrichs AS, Heitner S, Harte RA, Haeussler M, Guruvadoo L, Fujita PA, Eisenhart C, Diekhans M, Clawson H, Casper J, Barber GP, Haussler D, Kuhn RM, Kent WJ. The UCSC Genome Browser database: 2016 update. Nucleic Acids Res 2015; 44:D717-25. [PMID: 26590259 PMCID: PMC4702902 DOI: 10.1093/nar/gkv1275] [Citation(s) in RCA: 334] [Impact Index Per Article: 37.1] [Received: 09/11/2015] [Accepted: 11/03/2015] [Indexed: 01/19/2023] Open
Abstract
For the past 15 years, the UCSC Genome Browser (http://genome.ucsc.edu/) has served the international research community by offering an integrated platform for viewing and analyzing information from a large database of genome assemblies and their associated annotations. The UCSC Genome Browser has been under continuous development since its inception with new data sets and software features added frequently. Some release highlights of this year include new and updated genome browsers for various assemblies, including bonobo and zebrafish; new gene annotation sets; improvements to track and assembly hub support; and a new interactive tool, the “Data Integrator”, for intersecting data from multiple tracks. We have greatly expanded the data sets available on the most recent human assembly, hg38/GRCh38, to include updated gene prediction sets from GENCODE, more phenotype- and disease-associated variants from ClinVar and ClinGen, more genomic regulatory data, and a new multiple genome alignment.
Affiliation(s)
- Matthew L Speir
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Ann S Zweig
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Kate R Rosenbloom
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Brian J Raney
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Benedict Paten
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Parisa Nejad
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Brian T Lee
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Katrina Learned
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Donna Karolchik
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Angie S Hinrichs
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Steve Heitner
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | | | - Maximilian Haeussler
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Luvina Guruvadoo
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Pauline A Fujita
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA 94143, USA
| | | | - Mark Diekhans
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Hiram Clawson
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Jonathan Casper
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Galt P Barber
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - David Haussler
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Howard Hughes Medical Institute, University of California Santa Cruz, CA 95064, USA
| | - Robert M Kuhn
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - W James Kent
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| |
|
16
|
Paten B, Diekhans M, Druker BJ, Friend S, Guinney J, Gassner N, Guttman M, Kent WJ, Mantey P, Margolin AA, Massie M, Novak AM, Nothaft F, Pachter L, Patterson D, Smuga-Otto M, Stuart JM, Van't Veer L, Wold B, Haussler D. The NIH BD2K center for big data in translational genomics. J Am Med Inform Assoc 2015; 22:1143-7. [PMID: 26174866 DOI: 10.1093/jamia/ocv047] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 02/16/2015] [Accepted: 04/20/2015] [Indexed: 11/14/2022] Open
Abstract
The world's genomics data will never be stored in a single repository - rather, it will be distributed among many sites in many countries. No one site will have enough data to explain genotype to phenotype relationships in rare diseases; therefore, sites must share data. To accomplish this, the genetics community must forge common standards and protocols to make sharing and computing data among many sites a seamless activity. Through the Global Alliance for Genomics and Health, we are pioneering the development of shared application programming interfaces (APIs) to connect the world's genome repositories. In parallel, we are developing an open source software stack (ADAM) that uses these APIs. This combination will create a cohesive genome informatics ecosystem. Using containers, we are facilitating the deployment of this software in a diverse array of environments. Through benchmarking efforts and big data driver projects, we are ensuring ADAM's performance and utility.
Affiliation(s)
- Benedict Paten
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Mark Diekhans
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Brian J Druker
- Knight Cancer Institute, Oregon Health & Science University, Portland, OR, USA
| | - Stephen Friend
- Sage Bionetworks, Fairview Ave North, Seattle 98109, WA, USA
| | - Justin Guinney
- Sage Bionetworks, Fairview Ave North, Seattle 98109, WA, USA
| | - Nadine Gassner
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Mitchell Guttman
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
| | - W James Kent
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Patrick Mantey
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
- Jack Baskin School of Engineering, University of California, Santa Cruz, CA, USA
| | - Adam A Margolin
- Computational Biology Program, Oregon Health & Science University, Portland, OR, USA
| | - Matt Massie
- Department of Electrical Engineering and Computer Science, University of California, Berkeley, CA, USA
| | - Adam M Novak
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Frank Nothaft
- Department of Electrical Engineering and Computer Science, University of California, Berkeley, CA, USA
| | - Lior Pachter
- Department of Mathematics, University of California Berkeley, Berkeley, CA, USA
- Department of Molecular & Cellular Biology, University of California Berkeley, Berkeley, CA, USA
| | - David Patterson
- Department of Electrical Engineering and Computer Science, University of California, Berkeley, CA, USA
| | - Maciej Smuga-Otto
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Joshua M Stuart
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Laura Van't Veer
- Department of Laboratory Medicine, University of California, San Francisco, CA, USA
| | - Barbara Wold
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
| | - David Haussler
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
- Howard Hughes Medical Institute, Bethesda, MD, USA
| |
|
17
|
Miga KH, Eisenhart C, Kent WJ. Utilizing mapping targets of sequences underrepresented in the reference assembly to reduce false positive alignments. Nucleic Acids Res 2015; 43:e133. [PMID: 26163063 PMCID: PMC4787761 DOI: 10.1093/nar/gkv671] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 11/07/2014] [Accepted: 06/18/2015] [Indexed: 11/14/2022] Open
Abstract
The human reference assembly remains incomplete due to the underrepresentation of repeat-rich sequences that are found within centromeric regions and acrocentric short arms. Although these sequences are marginally represented in the assembly, they are often fully represented in whole-genome short-read datasets and contribute to inappropriate alignments and high read-depth signals that localize to a small number of assembled homologous regions. As a consequence, these regions often provide artifactual peak calls that confound hypothesis testing and large-scale genomic studies. To address this problem, we have constructed mapping targets that represent roughly 8% of the human genome generally omitted from the human reference assembly. By integrating these data into standard mapping and peak-calling pipelines we demonstrate a 10-fold reduction in signals in regions common to the blacklisted region and identify a comprehensive set of regions that exhibit mapping sensitivity with the presence of the repeat-rich targets.
Affiliation(s)
- Karen H Miga
- Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Christopher Eisenhart
- Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - W James Kent
- Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| |
|
18
|
Malladi VS, Erickson DT, Podduturi NR, Rowe LD, Chan ET, Davidson JM, Hitz BC, Ho M, Lee BT, Miyasato S, Roe GR, Simison M, Sloan CA, Strattan JS, Tanaka F, Kent WJ, Cherry JM, Hong EL. Ontology application and use at the ENCODE DCC. Database (Oxford) 2015; 2015:bav010. [PMID: 25776021 PMCID: PMC4360730 DOI: 10.1093/database/bav010] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Indexed: 11/14/2022]
Abstract
The Encyclopedia of DNA elements (ENCODE) project is an ongoing collaborative effort to create a catalog of genomic annotations. To date, the project has generated over 4000 experiments across more than 350 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory network and transcriptional landscape of the Homo sapiens and Mus musculus genomes. All ENCODE experimental data, metadata and associated computational analyses are submitted to the ENCODE Data Coordination Center (DCC) for validation, tracking, storage and distribution to community resources and the scientific community. As the volume of data increases, the organization of experimental details becomes increasingly complicated and demands careful curation to identify related experiments. Here, we describe the ENCODE DCC’s use of ontologies to standardize experimental metadata. We discuss how ontologies, when used to annotate metadata, provide improved searching capabilities and facilitate the ability to find connections within a set of experiments. Additionally, we provide examples of how ontologies are used to annotate ENCODE metadata and how the annotations can be identified via ontology-driven searches at the ENCODE portal. As genomic datasets grow larger and more interconnected, standardization of metadata becomes increasingly vital to allow for exploration and comparison of data between different scientific projects. Database URL: https://www.encodeproject.org/
Affiliation(s)
- Venkat S Malladi
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA and Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Drew T Erickson
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA and Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Nikhil R Podduturi
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA and Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Laurence D Rowe
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA and Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Esther T Chan
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA and Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Jean M Davidson
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA and Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Benjamin C Hitz
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA and Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Marcus Ho
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA and Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Brian T Lee
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA and Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Stuart Miyasato
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA and Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Gregory R Roe
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA and Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Matt Simison
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA and Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Cricket A Sloan
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA and Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - J Seth Strattan
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA and Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Forrest Tanaka
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA and Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - W James Kent
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA and Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - J Michael Cherry
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA and Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Eurie L Hong
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA and Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| |
|
19
|
|
20
|
Yue F, Cheng Y, Breschi A, Vierstra J, Wu W, Ryba T, Sandstrom R, Ma Z, Davis C, Pope BD, Shen Y, Pervouchine DD, Djebali S, Thurman RE, Kaul R, Rynes E, Kirilusha A, Marinov GK, Williams BA, Trout D, Amrhein H, Fisher-Aylor K, Antoshechkin I, DeSalvo G, See LH, Fastuca M, Drenkow J, Zaleski C, Dobin A, Prieto P, Lagarde J, Bussotti G, Tanzer A, Denas O, Li K, Bender MA, Zhang M, Byron R, Groudine MT, McCleary D, Pham L, Ye Z, Kuan S, Edsall L, Wu YC, Rasmussen MD, Bansal MS, Kellis M, Keller CA, Morrissey CS, Mishra T, Jain D, Dogan N, Harris RS, Cayting P, Kawli T, Boyle AP, Euskirchen G, Kundaje A, Lin S, Lin Y, Jansen C, Malladi VS, Cline MS, Erickson DT, Kirkup VM, Learned K, Sloan CA, Rosenbloom KR, Lacerda de Sousa B, Beal K, Pignatelli M, Flicek P, Lian J, Kahveci T, Lee D, Kent WJ, Ramalho Santos M, Herrero J, Notredame C, Johnson A, Vong S, Lee K, Bates D, Neri F, Diegel M, Canfield T, Sabo PJ, Wilken MS, Reh TA, Giste E, Shafer A, Kutyavin T, Haugen E, Dunn D, Reynolds AP, Neph S, Humbert R, Hansen RS, De Bruijn M, Selleri L, Rudensky A, Josefowicz S, Samstein R, Eichler EE, Orkin SH, Levasseur D, Papayannopoulou T, Chang KH, Skoultchi A, Gosh S, Disteche C, Treuting P, Wang Y, Weiss MJ, Blobel GA, Cao X, Zhong S, Wang T, Good PJ, Lowdon RF, Adams LB, Zhou XQ, Pazin MJ, Feingold EA, Wold B, Taylor J, Mortazavi A, Weissman SM, Stamatoyannopoulos JA, Snyder MP, Guigo R, Gingeras TR, Gilbert DM, Hardison RC, Beer MA, Ren B. A comparative encyclopedia of DNA elements in the mouse genome. Nature 2015; 515:355-64. [PMID: 25409824 PMCID: PMC4266106 DOI: 10.1038/nature13992] [Citation(s) in RCA: 1135] [Impact Index Per Article: 126.1] [Received: 02/03/2014] [Accepted: 10/24/2014] [Indexed: 12/11/2022]
Abstract
The laboratory mouse shares the majority of its protein-coding genes with humans, making it the premier model organism in biomedical research, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases.
Affiliation(s)
- Feng Yue
- [1] Ludwig Institute for Cancer Research and University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA. [2] Department of Biochemistry and Molecular Biology, College of Medicine, The Pennsylvania State University, Hershey, Pennsylvania 17033, USA
| | - Yong Cheng
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Alessandra Breschi
- Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain
| | - Jeff Vierstra
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Weisheng Wu
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Tyrone Ryba
- Department of Biological Science, 319 Stadium Drive, Florida State University, Tallahassee, Florida 32306-4295, USA
| | - Richard Sandstrom
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Zhihai Ma
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Carrie Davis
- Functional Genomics, Cold Spring Harbor Laboratory, Bungtown Road, Cold Spring Harbor, New York 11724, USA
| | - Benjamin D Pope
- Department of Biological Science, 319 Stadium Drive, Florida State University, Tallahassee, Florida 32306-4295, USA
| | - Yin Shen
- Ludwig Institute for Cancer Research and University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA
| | - Dmitri D Pervouchine
- Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain
| | - Sarah Djebali
- Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain
| | - Robert E Thurman
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Rajinder Kaul
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Eric Rynes
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Anthony Kirilusha
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - Georgi K Marinov
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - Brian A Williams
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - Diane Trout
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - Henry Amrhein
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - Katherine Fisher-Aylor
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - Igor Antoshechkin
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - Gilberto DeSalvo
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - Lei-Hoon See
- Functional Genomics, Cold Spring Harbor Laboratory, Bungtown Road, Cold Spring Harbor, New York 11724, USA
| | - Meagan Fastuca
- Functional Genomics, Cold Spring Harbor Laboratory, Bungtown Road, Cold Spring Harbor, New York 11724, USA
| | - Jorg Drenkow
- Functional Genomics, Cold Spring Harbor Laboratory, Bungtown Road, Cold Spring Harbor, New York 11724, USA
| | - Chris Zaleski
- Functional Genomics, Cold Spring Harbor Laboratory, Bungtown Road, Cold Spring Harbor, New York 11724, USA
| | - Alex Dobin
- Functional Genomics, Cold Spring Harbor Laboratory, Bungtown Road, Cold Spring Harbor, New York 11724, USA
| | - Pablo Prieto
- Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain
| | - Julien Lagarde
- Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain
| | - Giovanni Bussotti
- Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain
| | - Andrea Tanzer
- [1] Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain. [2] Department of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Waehringerstrasse 17/3/303, A-1090 Vienna, Austria
| | - Olgert Denas
- Departments of Biology and Mathematics and Computer Science, Emory University, O. Wayne Rollins Research Center, 1510 Clifton Road NE, Atlanta, Georgia 30322, USA
| | - Kanwei Li
- Departments of Biology and Mathematics and Computer Science, Emory University, O. Wayne Rollins Research Center, 1510 Clifton Road NE, Atlanta, Georgia 30322, USA
| | - M A Bender
- [1] Department of Pediatrics, University of Washington, Seattle, Washington 98195, USA. [2] Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA
| | - Miaohua Zhang
- Basic Science Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA
| | - Rachel Byron
- Basic Science Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA
| | - Mark T Groudine
- [1] Basic Science Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA. [2] Department of Radiation Oncology, University of Washington, Seattle, Washington 98195, USA
| | - David McCleary
- Ludwig Institute for Cancer Research and University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA
| | - Long Pham
- Ludwig Institute for Cancer Research and University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA
| | - Zhen Ye
- Ludwig Institute for Cancer Research and University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA
| | - Samantha Kuan
- Ludwig Institute for Cancer Research and University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA
| | - Lee Edsall
- Ludwig Institute for Cancer Research and University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA
| | - Yi-Chieh Wu
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts 02139, USA
| | - Matthew D Rasmussen
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts 02139, USA
| | - Mukul S Bansal
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts 02139, USA
| | - Manolis Kellis
- [1] Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts 02139, USA. [2] Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
| | - Cheryl A Keller
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Christapher S Morrissey
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Tejaswini Mishra
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Deepti Jain
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Nergiz Dogan
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Robert S Harris
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Philip Cayting
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Trupti Kawli
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Alan P Boyle
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Ghia Euskirchen
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Anshul Kundaje
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Shin Lin
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Yiing Lin
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Camden Jansen
- Department of Developmental and Cell Biology, University of California, Irvine, Irvine, California 92697, USA
| | - Venkat S Malladi
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Melissa S Cline
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, California 95064, USA
| | - Drew T Erickson
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Vanessa M Kirkup
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, California 95064, USA
| | - Katrina Learned
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, California 95064, USA
| | - Cricket A Sloan
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Kate R Rosenbloom
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, California 95064, USA
| | - Beatriz Lacerda de Sousa
- Departments of Obstetrics/Gynecology and Pathology, and Center for Reproductive Sciences, University of California San Francisco, San Francisco, California 94143, USA
| | - Kathryn Beal
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Miguel Pignatelli
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Jin Lian
- Yale University, Department of Genetics, PO Box 208005, 333 Cedar Street, New Haven, Connecticut 06520-8005, USA
| | - Tamer Kahveci
- Computer & Information Sciences & Engineering, University of Florida, Gainesville, Florida 32611, USA
| | - Dongwon Lee
- McKusick-Nathans Institute of Genetic Medicine and Department of Biomedical Engineering, Johns Hopkins University, 733 N. Broadway, BRB 573 Baltimore, Maryland 21205, USA
| | - W James Kent
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, California 95064, USA
| | - Miguel Ramalho Santos
- Departments of Obstetrics/Gynecology and Pathology, and Center for Reproductive Sciences, University of California San Francisco, San Francisco, California 94143, USA
| | - Javier Herrero
- [1] European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK. [2] Bill Lyons Informatics Centre, UCL Cancer Institute, University College London, London WC1E 6DD, UK
| | - Cedric Notredame
- Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain
| | - Audra Johnson
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Shinny Vong
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Kristen Lee
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Daniel Bates
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Fidencio Neri
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Morgan Diegel
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Theresa Canfield
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Peter J Sabo
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Matthew S Wilken
- Department of Biological Structure, University of Washington, HSB I-516, 1959 NE Pacific Street, Seattle, Washington 98195, USA
| | - Thomas A Reh
- Department of Biological Structure, University of Washington, HSB I-516, 1959 NE Pacific Street, Seattle, Washington 98195, USA
| | - Erika Giste
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Anthony Shafer
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Tanya Kutyavin
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Eric Haugen
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Douglas Dunn
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Alex P Reynolds
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Shane Neph
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Richard Humbert
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - R Scott Hansen
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Marella De Bruijn
- MRC Molecular Haematology Unit, University of Oxford, Oxford OX3 9DS, UK
| | - Licia Selleri
- Department of Cell and Developmental Biology, Weill Cornell Medical College, New York, New York 10065, USA
| | - Alexander Rudensky
- HHMI and Ludwig Center at Memorial Sloan Kettering Cancer Center, Immunology Program, Memorial Sloan Kettering Cancer Center, New York, New York 10065, USA
| | - Steven Josefowicz
- HHMI and Ludwig Center at Memorial Sloan Kettering Cancer Center, Immunology Program, Memorial Sloan Kettering Cancer Center, New York, New York 10065, USA
| | - Robert Samstein
- HHMI and Ludwig Center at Memorial Sloan Kettering Cancer Center, Immunology Program, Memorial Sloan Kettering Cancer Center, New York, New York 10065, USA
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Stuart H Orkin
- Dana Farber Cancer Institute, Harvard Medical School, Cambridge, Massachusetts 02138, USA
| | - Dana Levasseur
- University of Iowa Carver College of Medicine, Department of Internal Medicine, Iowa City, Iowa 52242, USA
| | - Thalia Papayannopoulou
- Division of Hematology, Department of Medicine, University of Washington, Seattle, Washington 98195, USA
| | - Kai-Hsin Chang
- University of Iowa Carver College of Medicine, Department of Internal Medicine, Iowa City, Iowa 52242, USA
| | - Arthur Skoultchi
- Department of Cell Biology, Albert Einstein College of Medicine, Bronx, New York 10461, USA
| | - Srikanta Gosh
- Department of Cell Biology, Albert Einstein College of Medicine, Bronx, New York 10461, USA
| | - Christine Disteche
- Department of Pathology, University of Washington, Seattle, Washington 98195, USA
| | - Piper Treuting
- Department of Comparative Medicine, University of Washington, Seattle, Washington 98195, USA
| | - Yanli Wang
- Bioinformatics and Genomics program, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Mitchell J Weiss
- Department of Hematology, St Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
| | - Gerd A Blobel
- [1] Division of Hematology, The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA. [2] Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | - Xiaoyi Cao
- Department of Bioengineering, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093, USA
| | - Sheng Zhong
- Department of Bioengineering, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093, USA
| | - Ting Wang
- Department of Genetics, Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63108, USA
| | - Peter J Good
- NHGRI, National Institutes of Health, 5635 Fishers Lane, Bethesda, Maryland 20892-9307, USA
| | - Rebecca F Lowdon
- NHGRI, National Institutes of Health, 5635 Fishers Lane, Bethesda, Maryland 20892-9307, USA
| | - Leslie B Adams
- NHGRI, National Institutes of Health, 5635 Fishers Lane, Bethesda, Maryland 20892-9307, USA
| | - Xiao-Qiao Zhou
- NHGRI, National Institutes of Health, 5635 Fishers Lane, Bethesda, Maryland 20892-9307, USA
| | - Michael J Pazin
- NHGRI, National Institutes of Health, 5635 Fishers Lane, Bethesda, Maryland 20892-9307, USA
| | - Elise A Feingold
- NHGRI, National Institutes of Health, 5635 Fishers Lane, Bethesda, Maryland 20892-9307, USA
| | - Barbara Wold
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - James Taylor
- Departments of Biology and Mathematics and Computer Science, Emory University, O. Wayne Rollins Research Center, 1510 Clifton Road NE, Atlanta, Georgia 30322, USA
| | - Ali Mortazavi
- Department of Developmental and Cell Biology, University of California, Irvine, Irvine, California 92697, USA
| | - Sherman M Weissman
- Yale University, Department of Genetics, PO Box 208005, 333 Cedar Street, New Haven, Connecticut 06520-8005, USA
| | | | - Michael P Snyder
- Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477 Stanford, California 94305, USA
| | - Roderic Guigo
- Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain
| | - Thomas R Gingeras
- Functional Genomics, Cold Spring Harbor Laboratory, Bungtown Road, Cold Spring Harbor, New York 11724, USA
| | - David M Gilbert
- Department of Biological Science, 319 Stadium Drive, Florida State University, Tallahassee, Florida 32306-4295, USA
| | - Ross C Hardison
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Michael A Beer
- McKusick-Nathans Institute of Genetic Medicine and Department of Biomedical Engineering, Johns Hopkins University, 733 N. Broadway, BRB 573 Baltimore, Maryland 21205, USA
| | - Bing Ren
- Ludwig Institute for Cancer Research and University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA
| | | |
|
21
|
Nguyen N, Hickey G, Zerbino DR, Raney B, Earl D, Armstrong J, Kent WJ, Haussler D, Paten B. Building a pan-genome reference for a population. J Comput Biol 2015; 22:387-401. [PMID: 25565268 DOI: 10.1089/cmb.2014.0146] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.6] [Indexed: 01/05/2023] Open
Abstract
A reference genome is a high quality individual genome that is used as a coordinate system for the genomes of a population, or genomes of closely related subspecies. Given a set of genomes partitioned by homology into alignment blocks we formalize the problem of ordering and orienting the blocks such that the resulting ordering maximally agrees with the underlying genomes' ordering and orientation, creating a pan-genome reference ordering. We show this problem is NP-hard, but also demonstrate, empirically and within simulations, the performance of heuristic algorithms based upon a cactus graph decomposition to find locally maximal solutions. We describe an extension of our Cactus software to create a pan-genome reference for whole genome alignments, and demonstrate how it can be used to create novel genome browser visualizations using human variation data as a test. In addition, we test the use of a pan-genome for describing variations and as a reference for read mapping.
Affiliation(s)
- Ngan Nguyen
- Center for Biomolecular Science and Engineering, University of California, Santa Cruz, California
|
22
|
Rosenbloom KR, Armstrong J, Barber GP, Casper J, Clawson H, Diekhans M, Dreszer TR, Fujita PA, Guruvadoo L, Haeussler M, Harte RA, Heitner S, Hickey G, Hinrichs AS, Hubley R, Karolchik D, Learned K, Lee BT, Li CH, Miga KH, Nguyen N, Paten B, Raney BJ, Smit AFA, Speir ML, Zweig AS, Haussler D, Kuhn RM, Kent WJ. The UCSC Genome Browser database: 2015 update. Nucleic Acids Res 2014; 43:D670-81. [PMID: 25428374 PMCID: PMC4383971 DOI: 10.1093/nar/gku1177] [Citation(s) in RCA: 690] [Impact Index Per Article: 69.0] [Indexed: 12/15/2022] Open
Abstract
Launched in 2001 to showcase the draft human genome assembly, the UCSC Genome Browser database (http://genome.ucsc.edu) and associated tools continue to grow, providing a comprehensive resource of genome assemblies and annotations to scientists and students worldwide. Highlights of the past year include the release of a browser for the first new human genome reference assembly in 4 years in December 2013 (GRCh38, UCSC hg38), a watershed comparative genomics annotation (100-species multiple alignment and conservation) and a novel distribution mechanism for the browser (GBiB: Genome Browser in a Box). We created browsers for new species (Chinese hamster, elephant shark, minke whale), 'mined the web' for DNA sequences and expanded the browser display with stacked color graphs and region highlighting. As our user community increasingly adopts the UCSC track hub and assembly hub representations for sharing large-scale genomic annotation data sets and genome sequencing projects, our menu of public data hubs has tripled.
Affiliation(s)
- Kate R Rosenbloom
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| | - Joel Armstrong
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| | - Galt P Barber
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| | - Jonathan Casper
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| | - Hiram Clawson
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| | - Mark Diekhans
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| | - Timothy R Dreszer
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| | - Pauline A Fujita
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| | - Luvina Guruvadoo
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| | - Maximilian Haeussler
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| | - Rachel A Harte
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| | - Steve Heitner
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| | - Glenn Hickey
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| | - Angie S Hinrichs
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| | - Robert Hubley
- Institute for Systems Biology, Seattle, WA 98109, USA
| | - Donna Karolchik
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| | - Katrina Learned
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| | - Brian T Lee
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| | - Chin H Li
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| | - Karen H Miga
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| | - Ngan Nguyen
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| | - Benedict Paten
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| | - Brian J Raney
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| | | | - Matthew L Speir
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| | - Ann S Zweig
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| | - David Haussler
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA; Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Robert M Kuhn
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| | - W James Kent
- Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
| |
|
23
|
Haeussler M, Karolchik D, Clawson H, Raney BJ, Rosenbloom KR, Fujita PA, Hinrichs AS, Speir ML, Eisenhart C, Zweig AS, Haussler D, Kent WJ. The UCSC Ebola Genome Portal. PLoS Curr 2014; 6. [PMID: 25685613 PMCID: PMC4318873 DOI: 10.1371/currents.outbreaks.386ab0964ab4d6c8cb550bfb6071d822] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 01/09/2023]
Abstract
Background:
With the Ebola epidemic raging out of control in West Africa, there has been a flurry of research into the Ebola virus, resulting in the generation of much genomic data.
Methods:
In response to the clear need for tools that integrate multiple strands of research around molecular sequences, we have created the University of California Santa Cruz (UCSC) Ebola Genome Browser, an adaptation of our popular UCSC Genome Browser web tool, which can be used to view the Ebola virus genome sequence from GenBank and nearly 30 annotation tracks generated by mapping external data to the reference sequence. Significant annotations include a multiple alignment comprising 102 Ebola genomes from the current outbreak, 56 from previous outbreaks, and 2 Marburg genomes as an outgroup; a gene track curated by NCBI; protein annotations curated by UniProt and antibody-binding epitopes curated by IEDB. We have extended the Genome Browser’s multiple alignment color-coding scheme to distinguish mutations resulting from non-synonymous coding changes, synonymous changes, or changes in untranslated regions.
Discussion:
Our Ebola Genome portal at http://genome.ucsc.edu/ebolaPortal/ links to the Ebola virus Genome Browser and an aggregate of useful information, including a collection of Ebola antibodies we are curating.
Affiliation(s)
| | - Donna Karolchik
- CBSE, University of California Santa Cruz, Santa Cruz, California, USA
| | - Hiram Clawson
- CBSE, University of California Santa Cruz, Santa Cruz, California, USA
| | - Brian J Raney
- CBSE, University of California Santa Cruz, Santa Cruz, California, USA
| | - Kate R Rosenbloom
- CBSE, University of California Santa Cruz, Santa Cruz, California, USA
| | - Pauline A Fujita
- CBSE, University of California Santa Cruz, Santa Cruz, California, USA
| | - Angie S Hinrichs
- Genomics Institute, University of California Santa Cruz, Santa Cruz, California, USA
| | | | - Chris Eisenhart
- CBSE, University of California Santa Cruz, Santa Cruz, California, USA
| | - Ann S Zweig
- CBSE, University of California Santa Cruz, Santa Cruz, California, USA
| | - David Haussler
- CBSE, University of California Santa Cruz, Santa Cruz, California, USA
| | - W James Kent
- CBSE, University of California Santa Cruz, Santa Cruz, California, USA
| |
|
24
|
Haeussler M, Raney BJ, Hinrichs AS, Clawson H, Zweig AS, Karolchik D, Casper J, Speir ML, Haussler D, Kent WJ. Navigating protected genomics data with UCSC Genome Browser in a Box. Bioinformatics 2014; 31:764-6. [PMID: 25348212 PMCID: PMC4341066 DOI: 10.1093/bioinformatics/btu712] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Indexed: 01/17/2023] Open
Abstract
Summary: Genome Browser in a Box (GBiB) is a small virtual machine version of the popular University of California Santa Cruz (UCSC) Genome Browser that can be run on a researcher's own computer. Once GBiB is installed, a standard web browser is used to access the virtual server and add personal data files from the local hard disk. Annotation data are loaded on demand through the Internet from UCSC or can be downloaded to the local computer for faster access. Availability and implementation: Software downloads and installation instructions are freely available for non-commercial use at https://genome-store.ucsc.edu/. GBiB requires the installation of open-source software VirtualBox, available for all major operating systems, and the UCSC Genome Browser, which is open source and free for non-commercial use. Commercial use of GBiB and the Genome Browser requires a license (http://genome.ucsc.edu/license/). Contact: genome@soe.ucsc.edu
Affiliation(s)
- Maximilian Haeussler
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA and Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Brian J Raney
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA and Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Angie S Hinrichs
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA and Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Hiram Clawson
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA and Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Ann S Zweig
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA and Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Donna Karolchik
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA and Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Jonathan Casper
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA and Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Matthew L Speir
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA and Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - David Haussler
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA and Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - W James Kent
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA and Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| |
|
25
|
Abstract
The human genome sequence remains incomplete, with multimegabase-sized gaps representing the endogenous centromeres and other heterochromatic regions. Available sequence-based studies within these sites in the genome have demonstrated a role in centromere function and chromosome pairing, necessary to ensure proper chromosome segregation during cell division. A common genomic feature of these regions is the enrichment of long arrays of near-identical tandem repeats, known as satellite DNAs, which offer a limited number of variant sites to differentiate individual repeat copies across millions of bases. This substantial sequence homogeneity challenges available assembly strategies and, as a result, centromeric regions are omitted from ongoing genomic studies. To address this problem, we utilize monomer sequence and ordering information obtained from whole-genome shotgun reads to model two haploid human satellite arrays on chromosomes X and Y, resulting in an initial characterization of 3.83 Mb of centromeric DNA within an individual genome. To further expand the utility of each centromeric reference sequence model, we evaluate sites within the arrays for short-read mappability and chromosome specificity. Because satellite DNAs evolve in a concerted manner, we use these centromeric assemblies to assess the extent of sequence variation among 366 individuals from distinct human populations. We thus identify two satellite array variants in both X and Y centromeres, as determined by array length and sequence composition. This study provides an initial sequence characterization of a regional centromere and establishes a foundation to extend genomic characterization to these sites as well as to other repeat-rich regions within complex genomes.
Affiliation(s)
- Karen H Miga
- Duke Institute for Genome Sciences & Policy, Duke University, Durham, North Carolina 27708, USA
|
26
|
Karolchik D, Barber GP, Casper J, Clawson H, Cline MS, Diekhans M, Dreszer TR, Fujita PA, Guruvadoo L, Haeussler M, Harte RA, Heitner S, Hinrichs AS, Learned K, Lee BT, Li CH, Raney BJ, Rhead B, Rosenbloom KR, Sloan CA, Speir ML, Zweig AS, Haussler D, Kuhn RM, Kent WJ. The UCSC Genome Browser database: 2014 update. Nucleic Acids Res 2014; 42:D764-70. [PMID: 24270787 PMCID: PMC3964947 DOI: 10.1093/nar/gkt1168] [Citation(s) in RCA: 550] [Impact Index Per Article: 55.0] [Received: 09/18/2013] [Revised: 10/30/2013] [Accepted: 10/30/2013] [Indexed: 12/17/2022] Open
Abstract
The University of California Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu) offers online public access to a growing database of genomic sequence and annotations for a large collection of organisms, primarily vertebrates, with an emphasis on the human and mouse genomes. The Browser's web-based tools provide an integrated environment for visualizing, comparing, analysing and sharing both publicly available and user-generated genomic data sets. As of September 2013, the database contained genomic sequence and a basic set of annotation 'tracks' for ∼90 organisms. Significant new annotations include a 60-species multiple alignment conservation track on the mouse, updated UCSC Genes tracks for human and mouse, and several new sets of variation and ENCODE data. New software tools include a Variant Annotation Integrator that returns predicted functional effects of a set of variants uploaded as a custom track, an extension to UCSC Genes that displays haplotype alleles for protein-coding genes and an expansion of data hubs that includes the capability to display remotely hosted user-provided assembly sequence in addition to annotation data. To improve European access, we have added a Genome Browser mirror (http://genome-euro.ucsc.edu) hosted at Bielefeld University in Germany.
Affiliation(s)
- Donna Karolchik
- Galt P. Barber
- Jonathan Casper
- Hiram Clawson
- Melissa S. Cline
- Mark Diekhans
- Timothy R. Dreszer
- Pauline A. Fujita
- Luvina Guruvadoo
- Maximilian Haeussler
- Rachel A. Harte
- Steve Heitner
- Angie S. Hinrichs
- Katrina Learned
- Brian T. Lee
- Chin H. Li
- Brian J. Raney
- Brooke Rhead
- Kate R. Rosenbloom
- Cricket A. Sloan
- Matthew L. Speir
- Ann S. Zweig
- David Haussler
- Robert M. Kuhn
- W. James Kent

Affiliations listed for all authors: Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), 1156 High Street, Santa Cruz, CA 95064, USA; Computational Biology Graduate Group, University of California Berkeley, Berkeley, CA 94720, USA; Department of Genetics, Stanford University School of Medicine, 3165 Porter Drive, Stanford, CA 94305, USA; and Howard Hughes Medical Institute, Center for Biomolecular Science and Engineering, UCSC, 1156 High Street, Santa Cruz, CA 95064, USA
27
Raney BJ, Dreszer TR, Barber GP, Clawson H, Fujita PA, Wang T, Nguyen N, Paten B, Zweig AS, Karolchik D, Kent WJ. Track data hubs enable visualization of user-defined genome-wide annotations on the UCSC Genome Browser. Bioinformatics 2013; 30:1003-5. [PMID: 24227676 PMCID: PMC3967101 DOI: 10.1093/bioinformatics/btt637] [Citation(s) in RCA: 285] [Impact Index Per Article: 25.9] [Indexed: 12/28/2022] Open
Abstract
SUMMARY: Track data hubs provide an efficient mechanism for visualizing remotely hosted Internet-accessible collections of genome annotations. Hub datasets can be organized, configured and fully integrated into the University of California Santa Cruz (UCSC) Genome Browser and accessed through the familiar browser interface. For the first time, individuals can use the complete browser feature set to view custom datasets without the overhead of setting up and maintaining a mirror. AVAILABILITY AND IMPLEMENTATION: Source code for the BigWig, BigBed and Genome Browser software is freely available for non-commercial use at http://hgdownload.cse.ucsc.edu/admin/jksrc.zip, implemented in C and supported on Linux. Binaries for the BigWig and BigBed creation and parsing utilities may be downloaded at http://hgdownload.cse.ucsc.edu/admin/exe/. Binary Alignment/Map (BAM) and Variant Call Format (VCF)/tabix utilities are available from http://samtools.sourceforge.net/ and http://vcftools.sourceforge.net/. The UCSC Genome Browser is publicly accessible at http://genome.ucsc.edu.
Affiliation(s)
- Brian J Raney
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA and Department of Genetics, Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO 63108, USA
28
Abstract
The University of California Santa Cruz (UCSC) Genome Browser is a popular Web-based tool for quickly displaying a requested portion of a genome at any scale, accompanied by a series of aligned annotation "tracks." The annotations generated by the UCSC Genome Bioinformatics Group and external collaborators include gene predictions, mRNA and expressed sequence tag alignments, simple nucleotide polymorphisms, expression and regulatory data, phenotype and variation data, and pairwise and multiple-species comparative genomics data. All information relevant to a region is presented in one window, facilitating biological analysis and interpretation. The database tables underlying the Genome Browser tracks can be viewed, downloaded, and manipulated using another Web-based application, the UCSC Table Browser. Users can upload personal datasets in a wide variety of formats as custom annotation tracks in both browsers for research or educational purposes. This unit describes how to use the Genome Browser and Table Browser for genome analysis, download the underlying database tables, and create and display custom annotation tracks.
Affiliation(s)
- Donna Karolchik
- Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, California, USA
29
Meyer LR, Zweig AS, Hinrichs AS, Karolchik D, Kuhn RM, Wong M, Sloan CA, Rosenbloom KR, Roe G, Rhead B, Raney BJ, Pohl A, Malladi VS, Li CH, Lee BT, Learned K, Kirkup V, Hsu F, Heitner S, Harte RA, Haeussler M, Guruvadoo L, Goldman M, Giardine BM, Fujita PA, Dreszer TR, Diekhans M, Cline MS, Clawson H, Barber GP, Haussler D, Kent WJ. The UCSC Genome Browser database: extensions and updates 2013. Nucleic Acids Res 2013; 41:D64-9. [PMID: 23155063 PMCID: PMC3531082 DOI: 10.1093/nar/gks1048] [Citation(s) in RCA: 612] [Impact Index Per Article: 55.6] [Received: 09/19/2012] [Accepted: 10/08/2012] [Indexed: 11/14/2022] Open
Abstract
The University of California Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu) offers online public access to a growing database of genomic sequence and annotations for a wide variety of organisms. The Browser is an integrated tool set for visualizing, comparing, analysing and sharing both publicly available and user-generated genomic datasets. As of September 2012, genomic sequence and a basic set of annotation 'tracks' are provided for 63 organisms, including 26 mammals, 13 non-mammal vertebrates, 3 invertebrate deuterostomes, 13 insects, 6 worms, yeast and sea hare. In the past year 19 new genome assemblies have been added, and we anticipate releasing another 28 in early 2013. Further, a large number of annotation tracks have been either added, updated by contributors or remapped to the latest human reference genome. Among these are an updated UCSC Genes track for human and mouse assemblies. We have also introduced several features to improve usability, including new navigation menus. This article provides an update to the UCSC Genome Browser database, which has been previously featured in the Database issue of this journal.
Affiliation(s)
- Laurence R. Meyer
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Ann S. Zweig
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Angie S. Hinrichs
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Donna Karolchik
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Robert M. Kuhn
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Matthew Wong
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Cricket A. Sloan
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Kate R. Rosenbloom
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Greg Roe
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Brooke Rhead
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Brian J. Raney
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Andy Pohl
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Venkat S. Malladi
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Chin H. Li
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Brian T. Lee
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Katrina Learned
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Vanessa Kirkup
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Fan Hsu
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Steve Heitner
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Rachel A. Harte
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Maximilian Haeussler
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Luvina Guruvadoo
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Mary Goldman
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Belinda M. Giardine
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Pauline A. Fujita
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Timothy R. Dreszer
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Mark Diekhans
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Melissa S. Cline
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Hiram Clawson
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Galt P. Barber
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - David Haussler
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - W. James Kent
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
|
30
|
Rosenbloom KR, Sloan CA, Malladi VS, Dreszer TR, Learned K, Kirkup VM, Wong MC, Maddren M, Fang R, Heitner SG, Lee BT, Barber GP, Harte RA, Diekhans M, Long JC, Wilder SP, Zweig AS, Karolchik D, Kuhn RM, Haussler D, Kent WJ. ENCODE data in the UCSC Genome Browser: year 5 update. Nucleic Acids Res 2013; 41:D56-63. [PMID: 23193274 PMCID: PMC3531152 DOI: 10.1093/nar/gks1172] [Citation(s) in RCA: 612] [Impact Index Per Article: 55.6] [Received: 09/21/2012] [Revised: 10/26/2012] [Accepted: 10/28/2012] [Indexed: 02/07/2023] Open
Abstract
The Encyclopedia of DNA Elements (ENCODE), http://encodeproject.org, has completed its fifth year of scientific collaboration to create a comprehensive catalog of functional elements in the human genome, and its third year of investigations in the mouse genome. Since the last report in this journal, the ENCODE human data repertoire has grown by 898 new experiments (totaling 2886), accompanied by a major integrative analysis. In the mouse genome, results from 404 new experiments became available this year, increasing the total to 583, collected during the course of the project. The University of California, Santa Cruz, makes this data available on the public Genome Browser http://genome.ucsc.edu for visual browsing and data mining. Downloads of raw and processed data files are all supported. The ENCODE portal provides specialized tools and information about the ENCODE data sets.
Affiliation(s)
- Kate R Rosenbloom
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA.
|
31
|
Abstract
The UCSC Genome Browser (http://genome.ucsc.edu) is a graphical viewer for genomic data now in its 13th year. Since the early days of the Human Genome Project, it has presented an integrated view of genomic data of many kinds. Now home to assemblies for 58 organisms, the Browser presents visualization of annotations mapped to genomic coordinates. The ability to juxtapose annotations of many types facilitates inquiry-driven data mining. Gene predictions, mRNA alignments, epigenomic data from the ENCODE project, conservation scores from vertebrate whole-genome alignments and variation data may be viewed at any scale from a single base to an entire chromosome. The Browser also includes many other widely used tools, including BLAT, which is useful for alignments from high-throughput sequencing experiments. Private data uploaded as Custom Tracks and Data Hubs in many formats may be displayed alongside the rich compendium of precomputed data in the UCSC database. The Table Browser is a full-featured graphical interface, which allows querying, filtering and intersection of data tables. The Saved Session feature allows users to store and share customized views, enhancing the utility of the system for organizing multiple trains of thought. Binary Alignment/Map (BAM), Variant Call Format and the Personal Genome Single Nucleotide Polymorphisms (SNPs) data formats are useful for visualizing a large sequencing experiment (whole-genome or whole-exome), where the differences between the data set and the reference assembly may be displayed graphically. Support for high-throughput sequencing extends to compact, indexed data formats, such as BAM, bigBed and bigWig, allowing rapid visualization of large datasets from RNA-seq and ChIP-seq experiments via local hosting.
Affiliation(s)
- Robert M Kuhn
- Center for Biomolecular Science and Engineering, University of California, Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA.
|
32
|
Stamatoyannopoulos JA, Snyder M, Hardison R, Ren B, Gingeras T, Gilbert DM, Groudine M, Bender M, Kaul R, Canfield T, Giste E, Johnson A, Zhang M, Balasundaram G, Byron R, Roach V, Sabo PJ, Sandstrom R, Stehling AS, Thurman RE, Weissman SM, Cayting P, Hariharan M, Lian J, Cheng Y, Landt SG, Ma Z, Wold BJ, Dekker J, Crawford GE, Keller CA, Wu W, Morrissey C, Kumar SA, Mishra T, Jain D, Byrska-Bishop M, Blankenberg D, Lajoie BR, Jain G, Sanyal A, Chen KB, Denas O, Taylor J, Blobel GA, Weiss MJ, Pimkin M, Deng W, Marinov GK, Williams BA, Fisher-Aylor KI, Desalvo G, Kiralusha A, Trout D, Amrhein H, Mortazavi A, Edsall L, McCleary D, Kuan S, Shen Y, Yue F, Ye Z, Davis CA, Zaleski C, Jha S, Xue C, Dobin A, Lin W, Fastuca M, Wang H, Guigo R, Djebali S, Lagarde J, Ryba T, Sasaki T, Malladi VS, Cline MS, Kirkup VM, Learned K, Rosenbloom KR, Kent WJ, Feingold EA, Good PJ, Pazin M, Lowdon RF, Adams LB. An encyclopedia of mouse DNA elements (Mouse ENCODE). Genome Biol 2012; 13:418. [PMID: 22889292 PMCID: PMC3491367 DOI: 10.1186/gb-2012-13-8-418] [Citation(s) in RCA: 343] [Impact Index Per Article: 28.6] [Indexed: 12/04/2022] Open
Abstract
To complement the human Encyclopedia of DNA Elements (ENCODE) project and to enable a broad range of mouse genomics efforts, the Mouse ENCODE Consortium is applying the same experimental pipelines developed for human ENCODE to annotate the mouse genome.
Affiliation(s)
- John A Stamatoyannopoulos
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA
| | - Michael Snyder
- Department of Genetics, Stanford University School of Medicine, Stanford, California, USA
| | - Ross Hardison
- Center for Comparative Genomics and Bioinformatics, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania, USA
| | - Bing Ren
- Department of Cellular and Molecular Medicine, Institute of Genomic Medicine, University of California San Diego, La Jolla, California, USA
| | - Thomas Gingeras
- Dept. of Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA
| | - David M Gilbert
- Department of Biological Science, Florida State University, Tallahassee, Florida, USA
| | - Mark Groudine
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
| | - Michael Bender
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
| | - Rajinder Kaul
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA
| | - Theresa Canfield
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA
| | - Erica Giste
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA
| | - Audra Johnson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA
| | - Mia Zhang
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
| | - Gayathri Balasundaram
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
| | - Rachel Byron
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
| | - Vaughan Roach
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA
| | - Peter J Sabo
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA
| | - Richard Sandstrom
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA
| | - A Sandra Stehling
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA
| | - Robert E Thurman
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA
| | | | - Philip Cayting
- Department of Genetics, Yale University, New Haven, Connecticut, USA
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, USA
| | - Manoj Hariharan
- Department of Genetics, Stanford University School of Medicine, Stanford, California, USA
| | - Jin Lian
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, USA
| | - Yong Cheng
- Department of Genetics, Stanford University School of Medicine, Stanford, California, USA
| | - Stephen G Landt
- Department of Genetics, Stanford University School of Medicine, Stanford, California, USA
| | - Zhihai Ma
- Department of Genetics, Stanford University School of Medicine, Stanford, California, USA
| | - Barbara J Wold
- Div. of Biology, California Institute of Technology, Pasadena, California, USA
| | - Job Dekker
- Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, Massachusetts, USA
| | - Gregory E Crawford
- Institute for Genome Sciences and Policy, Duke University, Durham, North Carolina, USA
- Department of Pediatrics, Duke University, Durham, North Carolina, USA
| | - Cheryl A Keller
- Center for Comparative Genomics and Bioinformatics, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania, USA
| | - Weisheng Wu
- Center for Comparative Genomics and Bioinformatics, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania, USA
| | - Christopher Morrissey
- Center for Comparative Genomics and Bioinformatics, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania, USA
| | - Swathi A Kumar
- Center for Comparative Genomics and Bioinformatics, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania, USA
| | - Tejaswini Mishra
- Center for Comparative Genomics and Bioinformatics, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania, USA
| | - Deepti Jain
- Center for Comparative Genomics and Bioinformatics, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania, USA
| | - Marta Byrska-Bishop
- Center for Comparative Genomics and Bioinformatics, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania, USA
| | - Daniel Blankenberg
- Center for Comparative Genomics and Bioinformatics, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania, USA
| | - Bryan R Lajoie
- Department of Genetics, Stanford University School of Medicine, Stanford, California, USA
| | - Gaurav Jain
- Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, Massachusetts, USA
| | - Amartya Sanyal
- Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, Massachusetts, USA
| | - Kaun-Bei Chen
- Institute for Genome Sciences and Policy, Duke University, Durham, North Carolina, USA
| | - Olgert Denas
- Institute for Genome Sciences and Policy, Duke University, Durham, North Carolina, USA
| | - James Taylor
- Department of Mathematics and Computer Science, Emory University, Atlanta, Georgia, USA
| | - Gerd A Blobel
- Div. of Hematology, Children's Hospital of Philadelphia, Abramson Research Center, Philadelphia, Pennsylvania, USA
| | - Mitchell J Weiss
- Div. of Hematology, Children's Hospital of Philadelphia, Abramson Research Center, Philadelphia, Pennsylvania, USA
| | - Max Pimkin
- Div. of Hematology, Children's Hospital of Philadelphia, Abramson Research Center, Philadelphia, Pennsylvania, USA
| | - Wulan Deng
- Div. of Hematology, Children's Hospital of Philadelphia, Abramson Research Center, Philadelphia, Pennsylvania, USA
| | - Georgi K Marinov
- Div. of Biology, California Institute of Technology, Pasadena, California, USA
| | - Brian A Williams
- Div. of Biology, California Institute of Technology, Pasadena, California, USA
| | | | - Gilberto Desalvo
- Div. of Biology, California Institute of Technology, Pasadena, California, USA
| | - Anthony Kiralusha
- Div. of Biology, California Institute of Technology, Pasadena, California, USA
| | - Diane Trout
- Div. of Biology, California Institute of Technology, Pasadena, California, USA
| | - Henry Amrhein
- Div. of Biology, California Institute of Technology, Pasadena, California, USA
| | - Ali Mortazavi
- Dept. of Developmental and Cell Biology, University of California Irvine, Irvine, California, USA
| | - Lee Edsall
- Department of Cellular and Molecular Medicine, Institute of Genomic Medicine, University of California San Diego, La Jolla, California, USA
| | - David McCleary
- Department of Cellular and Molecular Medicine, Institute of Genomic Medicine, University of California San Diego, La Jolla, California, USA
| | - Samantha Kuan
- Department of Cellular and Molecular Medicine, Institute of Genomic Medicine, University of California San Diego, La Jolla, California, USA
| | - Yin Shen
- Department of Cellular and Molecular Medicine, Institute of Genomic Medicine, University of California San Diego, La Jolla, California, USA
| | - Feng Yue
- Department of Cellular and Molecular Medicine, Institute of Genomic Medicine, University of California San Diego, La Jolla, California, USA
| | - Zhen Ye
- Department of Cellular and Molecular Medicine, Institute of Genomic Medicine, University of California San Diego, La Jolla, California, USA
| | - Carrie A Davis
- Dept. of Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA
| | - Chris Zaleski
- Dept. of Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA
| | - Sonali Jha
- Dept. of Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA
| | - Chenghai Xue
- Dept. of Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA
| | - Alex Dobin
- Dept. of Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA
| | - Wei Lin
- Dept. of Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA
| | - Meagan Fastuca
- Dept. of Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA
| | - Huaien Wang
- Dept. of Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA
| | - Roderic Guigo
- Division of Bioinformatics and Genomics, Center for Genomic Regulation, Barcelona, Catalunya, Spain
| | - Sarah Djebali
- Division of Bioinformatics and Genomics, Center for Genomic Regulation, Barcelona, Catalunya, Spain
| | - Julien Lagarde
- Division of Bioinformatics and Genomics, Center for Genomic Regulation, Barcelona, Catalunya, Spain
| | - Tyrone Ryba
- Department of Biological Science, Florida State University, Tallahassee, Florida, USA
| | - Takayo Sasaki
- Department of Biological Science, Florida State University, Tallahassee, Florida, USA
| | - Venkat S Malladi
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, California, USA
| | - Melissa S Cline
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, California, USA
| | - Vanessa M Kirkup
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, California, USA
| | - Katrina Learned
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, California, USA
| | - Kate R Rosenbloom
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, California, USA
| | - W James Kent
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, California, USA
| | - Elise A Feingold
- National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, USA
| | - Peter J Good
- National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, USA
| | - Michael Pazin
- National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, USA
| | - Rebecca F Lowdon
- National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, USA
| | - Leslie B Adams
- National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, USA
|
33
|
Rosenbloom KR, Dreszer TR, Long JC, Malladi VS, Sloan CA, Raney BJ, Cline MS, Karolchik D, Barber GP, Clawson H, Diekhans M, Fujita PA, Goldman M, Gravell RC, Harte RA, Hinrichs AS, Kirkup VM, Kuhn RM, Learned K, Maddren M, Meyer LR, Pohl A, Rhead B, Wong MC, Zweig AS, Haussler D, Kent WJ. ENCODE whole-genome data in the UCSC Genome Browser: update 2012. Nucleic Acids Res 2012; 40:D912-7. [PMID: 22075998 PMCID: PMC3245183 DOI: 10.1093/nar/gkr1012] [Citation(s) in RCA: 207] [Impact Index Per Article: 17.3] [Received: 09/15/2011] [Revised: 10/18/2011] [Accepted: 10/20/2011] [Indexed: 11/23/2022] Open
Abstract
The Encyclopedia of DNA Elements (ENCODE) Consortium is entering its 5th year of production-level effort generating high-quality whole-genome functional annotations of the human genome. The past year has brought the ENCODE compendium of functional elements to critical mass, with a diverse set of 27 biochemical assays now covering 200 distinct human cell types. Within the mouse genome, which has been under study by ENCODE groups for the past 2 years, 37 cell types have been assayed. Over 2000 individual experiments have been completed and submitted to the Data Coordination Center for public use. UCSC makes this data available on the quality-reviewed public Genome Browser (http://genome.ucsc.edu) and on an early-access Preview Browser (http://genome-preview.ucsc.edu). Visual browsing, data mining and download of raw and processed data files are all supported. An ENCODE portal (http://encodeproject.org) provides specialized tools and information about the ENCODE data sets.
Affiliation(s)
- Kate R Rosenbloom
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA.
|
34
|
Dreszer TR, Karolchik D, Zweig AS, Hinrichs AS, Raney BJ, Kuhn RM, Meyer LR, Wong M, Sloan CA, Rosenbloom KR, Roe G, Rhead B, Pohl A, Malladi VS, Li CH, Learned K, Kirkup V, Hsu F, Harte RA, Guruvadoo L, Goldman M, Giardine BM, Fujita PA, Diekhans M, Cline MS, Clawson H, Barber GP, Haussler D, Kent WJ. The UCSC Genome Browser database: extensions and updates 2011. Nucleic Acids Res 2012; 40:D918-23. [PMID: 22086951 PMCID: PMC3245018 DOI: 10.1093/nar/gkr1055] [Citation(s) in RCA: 273] [Impact Index Per Article: 22.8] [Received: 09/15/2011] [Revised: 10/18/2011] [Accepted: 10/25/2011] [Indexed: 01/05/2023] Open
Abstract
The University of California Santa Cruz Genome Browser (http://genome.ucsc.edu) offers online public access to a growing database of genomic sequence and annotations for a wide variety of organisms. The Browser is an integrated tool set for visualizing, comparing, analyzing and sharing both publicly available and user-generated genomic data sets. In the past year, the local database has been updated with four new species assemblies, and we anticipate another four will be released by the end of 2011. Further, a large number of annotation tracks have been either added, updated by contributors, or remapped to the latest human reference genome. Among these are new phenotype and disease annotations, UCSC genes, and a major dbSNP update, which required new visualization methods. Growing beyond the local database, this year we have introduced 'track data hubs', which allow the Genome Browser to provide access to remotely located sets of annotations. This feature is designed to significantly extend the number and variety of annotation tracks that are publicly available for visualization and analysis from within our site. We have also introduced several usability features including track search and a context-sensitive menu of options available with a right-click anywhere on the Browser's image.
Affiliation(s)
- Timothy R. Dreszer
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Donna Karolchik
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Ann S. Zweig
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Angie S. Hinrichs
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Brian J. Raney
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Robert M. Kuhn
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Laurence R. Meyer
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Mathew Wong
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Cricket A. Sloan
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Kate R. Rosenbloom
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Greg Roe
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Brooke Rhead
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Andy Pohl
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Venkat S. Malladi
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Chin H. Li
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Katrina Learned
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Vanessa Kirkup
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Fan Hsu
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Rachel A. Harte
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Luvina Guruvadoo
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Mary Goldman
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Belinda M. Giardine
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Pauline A. Fujita
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Mark Diekhans
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Melissa S. Cline
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Hiram Clawson
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Galt P. Barber
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - David Haussler
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - W. James Kent
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| |
|
35
|
Zhou X, Maricque B, Xie M, Li D, Sundaram V, Martin EA, Koebbe BC, Nielsen C, Hirst M, Farnham P, Kuhn RM, Zhu J, Smirnov I, Kent WJ, Haussler D, Madden PAF, Costello JF, Wang T. The Human Epigenome Browser at Washington University. Nat Methods 2011; 8:989-90. [PMID: 22127213 DOI: 10.1038/nmeth.1772] [Citation(s) in RCA: 236] [Impact Index Per Article: 18.2] [Indexed: 11/09/2022]
|
36
|
Abstract
The University of California Santa Cruz (UCSC) Genome Browser is a popular Web-based tool for quickly displaying a requested portion of a genome at any scale, accompanied by a series of aligned annotation "tracks." The annotations generated by the UCSC Genome Bioinformatics Group and external collaborators include gene predictions, mRNA and expressed sequence tag alignments, simple nucleotide polymorphisms, expression and regulatory data, phenotype and variation data, and pairwise and multiple-species comparative genomics data. All information relevant to a region is presented in one window, facilitating biological analysis and interpretation. The database tables underlying the Genome Browser tracks can be viewed, downloaded, and manipulated using another Web-based application, the UCSC Table Browser. Users can upload personal datasets in a wide variety of formats as custom annotation tracks in both browsers for research or educational purposes. This unit describes how to use the Genome Browser and Table Browser for genome analysis, download the underlying database tables, and create and display custom annotation tracks.
Affiliation(s)
- Donna Karolchik
- Center for Biomolecular Science and Engineering, University of California Santa Cruz
| | - Angie S. Hinrichs
- Center for Biomolecular Science and Engineering, University of California Santa Cruz
| | - W. James Kent
- Center for Biomolecular Science and Engineering, University of California Santa Cruz
| |
|
37
|
Sanborn JZ, Benz SC, Craft B, Szeto C, Kober KM, Meyer L, Vaske CJ, Goldman M, Smith KE, Kuhn RM, Karolchik D, Kent WJ, Stuart JM, Haussler D, Zhu J. The UCSC Cancer Genomics Browser: update 2011. Nucleic Acids Res 2011; 39:D951-9. [PMID: 21059681 PMCID: PMC3013705 DOI: 10.1093/nar/gkq1113] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.8] [Received: 10/01/2010] [Accepted: 10/18/2010] [Indexed: 12/12/2022] Open
Abstract
The UCSC Cancer Genomics Browser (https://genome-cancer.ucsc.edu) comprises a suite of web-based tools to integrate, visualize and analyze cancer genomics and clinical data. The browser displays whole-genome views of genome-wide experimental measurements for multiple samples alongside their associated clinical information. Multiple data sets can be viewed simultaneously as coordinated 'heatmap tracks' to compare across studies or different data modalities. Users can order, filter, aggregate, classify and display data interactively based on any given feature set including clinical features, annotated biological pathways and user-contributed collections of genes. Integrated standard statistical tools provide dynamic quantitative analysis within all available data sets. The browser hosts a growing body of publicly available cancer genomics data from a variety of cancer types, including data generated from the Cancer Genome Atlas project. Multiple consortiums use the browser on confidential prepublication data enabled by private installations. Many new features have been added, including the hgMicroscope tumor image viewer, hgSignature for real-time genomic signature evaluation on any browser track, and 'PARADIGM' pathway tracks to display integrative pathway activities. The browser is integrated with the UCSC Genome Browser; thus inheriting and integrating the Genome Browser's rich set of human biology and genetics data that enhances the interpretability of the cancer genomics data.
Affiliation(s)
- J. Zachary Sanborn
- Department of Biomolecular Engineering, Center for Biomolecular Science and Engineering and Howard Hughes Medical Institute, University of California at Santa Cruz, Santa Cruz, CA 95064, USA
| | - Stephen C. Benz
- Department of Biomolecular Engineering, Center for Biomolecular Science and Engineering and Howard Hughes Medical Institute, University of California at Santa Cruz, Santa Cruz, CA 95064, USA
| | - Brian Craft
- Department of Biomolecular Engineering, Center for Biomolecular Science and Engineering and Howard Hughes Medical Institute, University of California at Santa Cruz, Santa Cruz, CA 95064, USA
| | - Christopher Szeto
- Department of Biomolecular Engineering, Center for Biomolecular Science and Engineering and Howard Hughes Medical Institute, University of California at Santa Cruz, Santa Cruz, CA 95064, USA
| | - Kord M. Kober
- Department of Biomolecular Engineering, Center for Biomolecular Science and Engineering and Howard Hughes Medical Institute, University of California at Santa Cruz, Santa Cruz, CA 95064, USA
| | - Laurence Meyer
- Department of Biomolecular Engineering, Center for Biomolecular Science and Engineering and Howard Hughes Medical Institute, University of California at Santa Cruz, Santa Cruz, CA 95064, USA
| | - Charles J. Vaske
- Department of Biomolecular Engineering, Center for Biomolecular Science and Engineering and Howard Hughes Medical Institute, University of California at Santa Cruz, Santa Cruz, CA 95064, USA
| | - Mary Goldman
- Department of Biomolecular Engineering, Center for Biomolecular Science and Engineering and Howard Hughes Medical Institute, University of California at Santa Cruz, Santa Cruz, CA 95064, USA
| | - Kayla E. Smith
- Department of Biomolecular Engineering, Center for Biomolecular Science and Engineering and Howard Hughes Medical Institute, University of California at Santa Cruz, Santa Cruz, CA 95064, USA
| | - Robert M. Kuhn
- Department of Biomolecular Engineering, Center for Biomolecular Science and Engineering and Howard Hughes Medical Institute, University of California at Santa Cruz, Santa Cruz, CA 95064, USA
| | - Donna Karolchik
- Department of Biomolecular Engineering, Center for Biomolecular Science and Engineering and Howard Hughes Medical Institute, University of California at Santa Cruz, Santa Cruz, CA 95064, USA
| | - W. James Kent
- Department of Biomolecular Engineering, Center for Biomolecular Science and Engineering and Howard Hughes Medical Institute, University of California at Santa Cruz, Santa Cruz, CA 95064, USA
| | - Joshua M. Stuart
- Department of Biomolecular Engineering, Center for Biomolecular Science and Engineering and Howard Hughes Medical Institute, University of California at Santa Cruz, Santa Cruz, CA 95064, USA
| | - David Haussler
- Department of Biomolecular Engineering, Center for Biomolecular Science and Engineering and Howard Hughes Medical Institute, University of California at Santa Cruz, Santa Cruz, CA 95064, USA
| | - Jingchun Zhu
- Department of Biomolecular Engineering, Center for Biomolecular Science and Engineering and Howard Hughes Medical Institute, University of California at Santa Cruz, Santa Cruz, CA 95064, USA
| |
|
38
|
Gerstein MB, Lu ZJ, Van Nostrand EL, Cheng C, Arshinoff BI, Liu T, Yip KY, Robilotto R, Rechtsteiner A, Ikegami K, Alves P, Chateigner A, Perry M, Morris M, Auerbach RK, Feng X, Leng J, Vielle A, Niu W, Rhrissorrakrai K, Agarwal A, Alexander RP, Barber G, Brdlik CM, Brennan J, Brouillet JJ, Carr A, Cheung MS, Clawson H, Contrino S, Dannenberg LO, Dernburg AF, Desai A, Dick L, Dosé AC, Du J, Egelhofer T, Ercan S, Euskirchen G, Ewing B, Feingold EA, Gassmann R, Good PJ, Green P, Gullier F, Gutwein M, Guyer MS, Habegger L, Han T, Henikoff JG, Henz SR, Hinrichs A, Holster H, Hyman T, Iniguez AL, Janette J, Jensen M, Kato M, Kent WJ, Kephart E, Khivansara V, Khurana E, Kim JK, Kolasinska-Zwierz P, Lai EC, Latorre I, Leahey A, Lewis S, Lloyd P, Lochovsky L, Lowdon RF, Lubling Y, Lyne R, MacCoss M, Mackowiak SD, Mangone M, McKay S, Mecenas D, Merrihew G, Miller DM, Muroyama A, Murray JI, Ooi SL, Pham H, Phippen T, Preston EA, Rajewsky N, Rätsch G, Rosenbaum H, Rozowsky J, Rutherford K, Ruzanov P, Sarov M, Sasidharan R, Sboner A, Scheid P, Segal E, Shin H, Shou C, Slack FJ, Slightam C, Smith R, Spencer WC, Stinson EO, Taing S, Takasaki T, Vafeados D, Voronina K, Wang G, Washington NL, Whittle CM, Wu B, Yan KK, Zeller G, Zha Z, Zhong M, Zhou X, Ahringer J, Strome S, Gunsalus KC, Micklem G, Liu XS, Reinke V, Kim SK, Hillier LW, Henikoff S, Piano F, Snyder M, Stein L, Lieb JD, Waterston RH. Integrative analysis of the Caenorhabditis elegans genome by the modENCODE project. Science 2010; 330:1775-87. [PMID: 21177976 PMCID: PMC3142569 DOI: 10.1126/science.1196914] [Citation(s) in RCA: 741] [Impact Index Per Article: 52.9] [Indexed: 12/26/2022]
Abstract
We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor-binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor-binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome.
Affiliation(s)
- Mark B. Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
- Department of Computer Science, Yale University, 51 Prospect Street, New Haven, CT 06511, USA
| | - Zhi John Lu
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
| | - Eric L. Van Nostrand
- Department of Genetics, Stanford University Medical Center, Stanford, CA 94305, USA
| | - Chao Cheng
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
| | - Bradley I. Arshinoff
- Ontario Institute for Cancer Research, 101 College Street, Suite 800, Toronto, Ontario M5G 0A3, Canada
- Department of Molecular Genetics, University of Toronto, 27 King's College Circle, Toronto, Ontario M5S 1A1, Canada
| | - Tao Liu
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, 44 Binney Street, Boston, MA 02115, USA
- Department of Biostatistics, Harvard School of Public Health, 677 Huntington Avenue, Boston, MA 02115, USA
| | - Kevin Y. Yip
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
| | - Rebecca Robilotto
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
| | - Andreas Rechtsteiner
- Molecular, Cell, and Developmental Biology, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
| | - Kohta Ikegami
- Department of Biology and Carolina Center for Genome Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Pedro Alves
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
| | - Aurelien Chateigner
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK, and Cambridge Systems Biology Centre, Tennis Court Road, Cambridge CB2 1QR, UK
| | - Marc Perry
- Ontario Institute for Cancer Research, 101 College Street, Suite 800, Toronto, Ontario M5G 0A3, Canada
| | - Mitzi Morris
- Center for Genomics and Systems Biology, Department of Biology, New York University, 1009 Silver Center, 100 Washington Square East, New York, NY 10003–6688, USA
| | - Raymond K. Auerbach
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
| | - Xin Feng
- Ontario Institute for Cancer Research, 101 College Street, Suite 800, Toronto, Ontario M5G 0A3, Canada
- Department of Biomedical Engineering, State University of New York at Stony Brook, Stony Brook, NY 11794, USA
| | - Jing Leng
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
| | - Anne Vielle
- Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge CB2 1QN, UK
| | - Wei Niu
- Department of Molecular, Cellular, and Developmental Biology, Yale University, New Haven, CT 06824, USA
- Department of Genetics, Yale University School of Medicine, New Haven, CT 06520–8005, USA
| | - Kahn Rhrissorrakrai
- Center for Genomics and Systems Biology, Department of Biology, New York University, 1009 Silver Center, 100 Washington Square East, New York, NY 10003–6688, USA
| | - Ashish Agarwal
- Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
- Department of Computer Science, Yale University, 51 Prospect Street, New Haven, CT 06511, USA
| | - Roger P. Alexander
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
| | - Galt Barber
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064 USA
| | - Cathleen M. Brdlik
- Department of Genetics, Stanford University Medical Center, Stanford, CA 94305, USA
| | - Jennifer Brennan
- Department of Biology and Carolina Center for Genome Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Adrian Carr
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK, and Cambridge Systems Biology Centre, Tennis Court Road, Cambridge CB2 1QR, UK
| | - Ming-Sin Cheung
- Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge CB2 1QN, UK
| | - Hiram Clawson
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064 USA
| | - Sergio Contrino
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK, and Cambridge Systems Biology Centre, Tennis Court Road, Cambridge CB2 1QR, UK
| | - Abby F. Dernburg
- Howard Hughes Medical Institute, Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA, and Life Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Arshad Desai
- Ludwig Institute Cancer Research/Department of Cellular and Molecular Medicine, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093–0653, USA
| | - Lindsay Dick
- David Rockefeller Graduate Program, Rockefeller University, 1230 York Avenue New York, NY 10065, USA
| | - Andréa C. Dosé
- Howard Hughes Medical Institute, Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA, and Life Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Jiang Du
- Department of Computer Science, Yale University, 51 Prospect Street, New Haven, CT 06511, USA
| | - Thea Egelhofer
- Molecular, Cell, and Developmental Biology, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
| | - Sevinc Ercan
- Department of Biology and Carolina Center for Genome Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Ghia Euskirchen
- Department of Molecular, Cellular, and Developmental Biology, Yale University, New Haven, CT 06824, USA
| | - Brent Ewing
- Department of Genome Sciences, University of Washington School of Medicine, William H. Foege Building S350D, 1705 NE Pacific Street, Post Office Box 355065, Seattle, WA 98195–5065, USA
| | - Elise A. Feingold
- Division of Extramural Research, National Human Genome Research Institute, National Institutes of Health, 5635 Fishers Lane, Suite 4076, Bethesda, MD 20892–9305, USA
| | - Reto Gassmann
- Ludwig Institute Cancer Research/Department of Cellular and Molecular Medicine, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093–0653, USA
| | - Peter J. Good
- Division of Extramural Research, National Human Genome Research Institute, National Institutes of Health, 5635 Fishers Lane, Suite 4076, Bethesda, MD 20892–9305, USA
| | - Phil Green
- Department of Genome Sciences, University of Washington School of Medicine, William H. Foege Building S350D, 1705 NE Pacific Street, Post Office Box 355065, Seattle, WA 98195–5065, USA
| | - Francois Gullier
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK, and Cambridge Systems Biology Centre, Tennis Court Road, Cambridge CB2 1QR, UK
| | - Michelle Gutwein
- Center for Genomics and Systems Biology, Department of Biology, New York University, 1009 Silver Center, 100 Washington Square East, New York, NY 10003–6688, USA
| | - Mark S. Guyer
- Division of Extramural Research, National Human Genome Research Institute, National Institutes of Health, 5635 Fishers Lane, Suite 4076, Bethesda, MD 20892–9305, USA
| | - Lukas Habegger
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
| | - Ting Han
- Life Sciences Institute, Department of Human Genetics, University of Michigan, 210 Washtenaw Avenue, Ann Arbor, MI 48109–2216, USA
| | - Jorja G. Henikoff
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, Seattle, WA 98109, USA
| | - Stefan R. Henz
- Max Planck Institute for Developmental Biology, Spemannstrasse 37-39, 72076 Tübingen, Germany
| | - Angie Hinrichs
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064 USA
| | - Heather Holster
- Roche NimbleGen, 500 South Rosa Road, Madison, WI 53719, USA
| | - Tony Hyman
- Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstrasse 108, 01307 Dresden, Germany
| | - A. Leo Iniguez
- Roche NimbleGen, 500 South Rosa Road, Madison, WI 53719, USA
| | - Judith Janette
- Department of Genetics, Yale University School of Medicine, New Haven, CT 06520–8005, USA
| | - Morten Jensen
- Department of Biology and Carolina Center for Genome Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Masaomi Kato
- Department of Molecular, Cellular and Developmental Biology, Post Office Box 208103, Yale University, New Haven, CT 06520, USA
| | - W. James Kent
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064 USA
| | - Ellen Kephart
- Ontario Institute for Cancer Research, 101 College Street, Suite 800, Toronto, Ontario M5G 0A3, Canada
| | - Vishal Khivansara
- Life Sciences Institute, Department of Human Genetics, University of Michigan, 210 Washtenaw Avenue, Ann Arbor, MI 48109–2216, USA
| | - Ekta Khurana
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
| | - John K. Kim
- Life Sciences Institute, Department of Human Genetics, University of Michigan, 210 Washtenaw Avenue, Ann Arbor, MI 48109–2216, USA
| | - Paulina Kolasinska-Zwierz
- Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge CB2 1QN, UK
| | - Eric C. Lai
- Sloan-Kettering Institute, 1275 York Avenue, Post Office Box 252, New York, NY 10065, USA
| | - Isabel Latorre
- Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge CB2 1QN, UK
| | - Amber Leahey
- Department of Genome Sciences, University of Washington School of Medicine, William H. Foege Building S350D, 1705 NE Pacific Street, Post Office Box 355065, Seattle, WA 98195–5065, USA
| | - Suzanna Lewis
- Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Mailstop 64-121, Berkeley, CA 94720 USA
| | - Paul Lloyd
- Ontario Institute for Cancer Research, 101 College Street, Suite 800, Toronto, Ontario M5G 0A3, Canada
| | - Lucas Lochovsky
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
| | - Rebecca F. Lowdon
- Division of Extramural Research, National Human Genome Research Institute, National Institutes of Health, 5635 Fishers Lane, Suite 4076, Bethesda, MD 20892–9305, USA
| | - Yaniv Lubling
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, 76100, Israel
| | - Rachel Lyne
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK, and Cambridge Systems Biology Centre, Tennis Court Road, Cambridge CB2 1QR, UK
| | - Michael MacCoss
- Department of Genome Sciences, University of Washington School of Medicine, William H. Foege Building S350D, 1705 NE Pacific Street, Post Office Box 355065, Seattle, WA 98195–5065, USA
| | - Sebastian D. Mackowiak
- Max-Delbrück-Centrum für Molekulare Medizin, Division of Systems Biology, Robert-Rössle-Strasse 10, D-13125 Berlin-Buch, Germany
| | - Marco Mangone
- Center for Genomics and Systems Biology, Department of Biology, New York University, 1009 Silver Center, 100 Washington Square East, New York, NY 10003–6688, USA
| | - Sheldon McKay
- Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11542 USA
| | - Desirea Mecenas
- Center for Genomics and Systems Biology, Department of Biology, New York University, 1009 Silver Center, 100 Washington Square East, New York, NY 10003–6688, USA
| | - Gennifer Merrihew
- Department of Genome Sciences, University of Washington School of Medicine, William H. Foege Building S350D, 1705 NE Pacific Street, Post Office Box 355065, Seattle, WA 98195–5065, USA
| | - David M. Miller
- Department of Cell and Developmental Biology, Vanderbilt University, 465 21st Avenue South, Nashville, TN 37232–8240, USA
| | - Andrew Muroyama
- Ludwig Institute Cancer Research/Department of Cellular and Molecular Medicine, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093–0653, USA
| | - John I. Murray
- Department of Genome Sciences, University of Washington School of Medicine, William H. Foege Building S350D, 1705 NE Pacific Street, Post Office Box 355065, Seattle, WA 98195–5065, USA
| | - Siew-Loon Ooi
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, Seattle, WA 98109, USA
| | - Hoang Pham
- Howard Hughes Medical Institute, Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA, and Life Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Taryn Phippen
- Molecular, Cell, and Developmental Biology, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
| | - Elicia A. Preston
- Department of Genome Sciences, University of Washington School of Medicine, William H. Foege Building S350D, 1705 NE Pacific Street, Post Office Box 355065, Seattle, WA 98195–5065, USA
| | - Nikolaus Rajewsky
- Max-Delbrück-Centrum für Molekulare Medizin, Division of Systems Biology, Robert-Rössle-Strasse 10, D-13125 Berlin-Buch, Germany
| | - Gunnar Rätsch
- Friedrich Miescher Laboratory of the Max Planck Society, Spemannstrasse 39, 72076 Tübingen, Germany
| | - Heidi Rosenbaum
- Roche NimbleGen, 500 South Rosa Road, Madison, WI 53719, USA
| | - Joel Rozowsky
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
| | - Kim Rutherford
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK, and Cambridge Systems Biology Centre, Tennis Court Road, Cambridge CB2 1QR, UK
| | - Peter Ruzanov
- Ontario Institute for Cancer Research, 101 College Street, Suite 800, Toronto, Ontario M5G 0A3, Canada
| | - Mihail Sarov
- Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstrasse 108, 01307 Dresden, Germany
| | - Rajkumar Sasidharan
- Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
| | - Andrea Sboner
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
| | - Paul Scheid
- Center for Genomics and Systems Biology, Department of Biology, New York University, 1009 Silver Center, 100 Washington Square East, New York, NY 10003–6688, USA
| | - Eran Segal
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, 76100, Israel
| | - Hyunjin Shin
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, 44 Binney Street, Boston, MA 02115, USA
- Department of Biostatistics, Harvard School of Public Health, 677 Huntington Avenue, Boston, MA 02115, USA
| | - Chong Shou
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
| | - Frank J. Slack
- Department of Molecular, Cellular and Developmental Biology, Post Office Box 208103, Yale University, New Haven, CT 06520, USA
| | - Cindie Slightam
- Department of Developmental Biology, Stanford University Medical Center, 279 Campus Drive, Stanford, CA 94305–5329, USA
| | - Richard Smith
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK, and Cambridge Systems Biology Centre, Tennis Court Road, Cambridge CB2 1QR, UK
| | - William C. Spencer
- Department of Cell and Developmental Biology, Vanderbilt University, 465 21st Avenue South, Nashville, TN 37232–8240, USA
| | - E. O. Stinson
- Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Mailstop 64-121, Berkeley, CA 94720 USA
| | - Scott Taing
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, 44 Binney Street, Boston, MA 02115, USA
| | - Teruaki Takasaki
- Molecular, Cell, and Developmental Biology, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
| | - Dionne Vafeados
- Department of Genome Sciences, University of Washington School of Medicine, William H. Foege Building S350D, 1705 NE Pacific Street, Post Office Box 355065, Seattle, WA 98195–5065, USA
| | - Ksenia Voronina
- Ludwig Institute Cancer Research/Department of Cellular and Molecular Medicine, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093–0653, USA
| | - Guilin Wang
- Department of Genetics, Yale University School of Medicine, New Haven, CT 06520–8005, USA
| | - Nicole L. Washington
- Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Mailstop 64-121, Berkeley, CA 94720 USA
| | - Christina M. Whittle
- Department of Biology and Carolina Center for Genome Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Beijing Wu
- Department of Developmental Biology, Stanford University Medical Center, 279 Campus Drive, Stanford, CA 94305–5329, USA
| | - Koon-Kiu Yan
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520, USA
| | - Georg Zeller
- Friedrich Miescher Laboratory of the Max Planck Society, Spemannstrasse 39, 72076 Tübingen, Germany
- European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | - Zheng Zha
- Ontario Institute for Cancer Research, 101 College Street, Suite 800, Toronto, Ontario M5G 0A3, Canada
| | - Mei Zhong
- Department of Molecular, Cellular, and Developmental Biology, Yale University, New Haven, CT 06824, USA
| | - Xingliang Zhou
- Department of Biology and Carolina Center for Genome Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Julie Ahringer
- Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge CB2 1QN, UK
| | - Susan Strome
- Molecular, Cell, and Developmental Biology, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
| | - Kristin C. Gunsalus
- Center for Genomics and Systems Biology, Department of Biology, New York University, 1009 Silver Center, 100 Washington Square East, New York, NY 10003–6688, USA
- New York University, Abu Dhabi, United Arab Emirates
| | - Gos Micklem
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK, and Cambridge Systems Biology Centre, Tennis Court Road, Cambridge CB2 1QR, UK
| | - X. Shirley Liu
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, 44 Binney Street, Boston, MA 02115, USA
- Department of Biostatistics, Harvard School of Public Health, 677 Huntington Avenue, Boston, MA 02115, USA
| | - Valerie Reinke
- Department of Genetics, Yale University School of Medicine, New Haven, CT 06520–8005, USA
| | - Stuart K. Kim
- Department of Genetics, Stanford University Medical Center, Stanford, CA 94305, USA
- Department of Developmental Biology, Stanford University Medical Center, 279 Campus Drive, Stanford, CA 94305–5329, USA
| | - LaDeana W. Hillier
- Department of Genome Sciences, University of Washington School of Medicine, William H. Foege Building S350D, 1705 NE Pacific Street, Post Office Box 355065, Seattle, WA 98195–5065, USA
| | - Steven Henikoff
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, Seattle, WA 98109, USA
| | - Fabio Piano
- Center for Genomics and Systems Biology, Department of Biology, New York University, 1009 Silver Center, 100 Washington Square East, New York, NY 10003–6688, USA
- New York University, Abu Dhabi, United Arab Emirates
| | - Michael Snyder
- Department of Genetics, Stanford University Medical Center, Stanford, CA 94305, USA
- Department of Molecular, Cellular, and Developmental Biology, Yale University, New Haven, CT 06824, USA
| | - Lincoln Stein
- Ontario Institute for Cancer Research, 101 College Street, Suite 800, Toronto, Ontario M5G 0A3, Canada
- Department of Molecular Genetics, University of Toronto, 27 King's College Circle, Toronto, Ontario M5S 1A1, Canada
- Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11542, USA
| | - Jason D. Lieb
- Department of Biology and Carolina Center for Genome Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Robert H. Waterston
- Department of Genome Sciences, University of Washington School of Medicine, William H. Foege Building S350D, 1705 NE Pacific Street, Post Office Box 355065, Seattle, WA 98195–5065, USA
|
39
|
Raney BJ, Cline MS, Rosenbloom KR, Dreszer TR, Learned K, Barber GP, Meyer LR, Sloan CA, Malladi VS, Roskin KM, Suh BB, Hinrichs AS, Clawson H, Zweig AS, Kirkup V, Fujita PA, Rhead B, Smith KE, Pohl A, Kuhn RM, Karolchik D, Haussler D, Kent WJ. ENCODE whole-genome data in the UCSC genome browser (2011 update). Nucleic Acids Res 2010; 39:D871-5. [PMID: 21037257 PMCID: PMC3013645 DOI: 10.1093/nar/gkq1017] [Citation(s) in RCA: 155] [Impact Index Per Article: 11.1] [Indexed: 01/19/2023]
Abstract
The ENCODE project is an international consortium with a goal of cataloguing all the functional elements in the human genome. The ENCODE Data Coordination Center (DCC) at the University of California, Santa Cruz serves as the central repository for ENCODE data. In this role, the DCC offers a collection of high-throughput, genome-wide data generated with technologies such as ChIP-Seq, RNA-Seq, DNA digestion and others. This data helps illuminate transcription factor-binding sites, histone marks, chromatin accessibility, DNA methylation, RNA expression, RNA binding and other cell-state indicators. It includes sequences with quality scores, alignments, signals calculated from the alignments, and in most cases, element or peak calls calculated from the signal data. Each data set is available for visualization and download via the UCSC Genome Browser (http://genome.ucsc.edu/). ENCODE data can also be retrieved using a metadata system that captures the experimental parameters of each assay. The ENCODE web portal at UCSC (http://encodeproject.org/) provides information about the ENCODE data and links for access.
Affiliation(s)
- Brian J Raney
- Center for Biomolecular Science and Engineering, School of Engineering and Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA.
|
40
|
Fujita PA, Rhead B, Zweig AS, Hinrichs AS, Karolchik D, Cline MS, Goldman M, Barber GP, Clawson H, Coelho A, Diekhans M, Dreszer TR, Giardine BM, Harte RA, Hillman-Jackson J, Hsu F, Kirkup V, Kuhn RM, Learned K, Li CH, Meyer LR, Pohl A, Raney BJ, Rosenbloom KR, Smith KE, Haussler D, Kent WJ. The UCSC Genome Browser database: update 2011. Nucleic Acids Res 2010; 39:D876-82. [PMID: 20959295 PMCID: PMC3242726 DOI: 10.1093/nar/gkq963] [Citation(s) in RCA: 841] [Impact Index Per Article: 60.1] [Indexed: 12/18/2022]
Abstract
The University of California, Santa Cruz Genome Browser (http://genome.ucsc.edu) offers online access to a database of genomic sequence and annotation data for a wide variety of organisms. The Browser also has many tools for visualizing, comparing and analyzing both publicly available and user-generated genomic data sets, aligning sequences and uploading user data. Among the features released this year are a gene search tool and annotation track drag-reorder functionality as well as support for BAM and BigWig/BigBed file formats. New display enhancements include overlay of multiple wiggle tracks through use of transparent coloring, options for displaying transformed wiggle data, a 'mean+whiskers' windowing function for display of wiggle data at high zoom levels, and more color schemes for microarray data. New data highlights include seven new genome assemblies, a Neandertal genome data portal, phenotype and disease association data, a human RNA editing track, and a zebrafish Conservation track. We also describe updates to existing tracks.
Affiliation(s)
- Pauline A Fujita
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
|
41
|
Abstract
Summary: BigWig and BigBed files are compressed binary indexed files containing data at several resolutions that allow the high-performance display of next-generation sequencing experiment results in the UCSC Genome Browser. The visualization is implemented using a multi-layered software approach that takes advantage of specific capabilities of web-based protocols and Linux and UNIX operating systems files, R trees and various indexing and compression tricks. As a result, only the data needed to support the current browser view is transmitted rather than the entire file, enabling fast remote access to large distributed data sets. Availability and implementation: Binaries for the BigWig and BigBed creation and parsing utilities may be downloaded at http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/. Source code for the creation and visualization software is freely available for non-commercial use at http://hgdownload.cse.ucsc.edu/admin/jksrc.zip, implemented in C and supported on Linux. The UCSC Genome Browser is available at http://genome.ucsc.edu. Contact: ann@soe.ucsc.edu. Supplementary information: Supplementary byte-level details of the BigWig and BigBed file formats are available at Bioinformatics online. For an in-depth description of UCSC data file formats and custom tracks, see http://genome.ucsc.edu/FAQ/FAQformat.html and http://genome.ucsc.edu/goldenPath/help/hgTracksHelp.html.
Affiliation(s)
- W J Kent
- Center for Biomolecular Science and Engineering, School of Engineering, University of California, Santa Cruz (UCSC), Santa Cruz, CA 95064, USA
|
42
|
Abstract
The University of California Santa Cruz (UCSC) Genome Browser is a popular Web-based tool for quickly displaying a requested portion of a genome at any scale, accompanied by a series of aligned annotation "tracks." The annotations, generated by the UCSC Genome Bioinformatics Group and external collaborators, display gene predictions, mRNA and expressed sequence tag alignments, simple nucleotide polymorphisms, expression and regulatory data, phenotype and variation data, and pairwise and multiple-species comparative genomics data. All information relevant to a region is presented in one window, facilitating biological analysis and interpretation. The database tables underlying the Genome Browser tracks can be viewed, downloaded, and manipulated using another Web-based application, the UCSC Table Browser. Users can upload data as custom annotation tracks in both browsers for research or educational use. This unit describes how to use the Genome Browser and Table Browser for genome analysis, download the underlying database tables, and create and display custom annotation tracks.
Affiliation(s)
- Donna Karolchik
- Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, Phone: (831) 459-1571, Fax: (831) 459-1809
| | - Angie S. Hinrichs
- Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, Phone: (831) 459-1544, Fax: (831) 459-1809
| | - W. James Kent
- Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, Phone: (831) 459-1401, Fax: (831) 459-1809
|
43
|
Rosenbloom KR, Dreszer TR, Pheasant M, Barber GP, Meyer LR, Pohl A, Raney BJ, Wang T, Hinrichs AS, Zweig AS, Fujita PA, Learned K, Rhead B, Smith KE, Kuhn RM, Karolchik D, Haussler D, Kent WJ. ENCODE whole-genome data in the UCSC Genome Browser. Nucleic Acids Res 2009; 38:D620-5. [PMID: 19920125 PMCID: PMC2808953 DOI: 10.1093/nar/gkp961] [Citation(s) in RCA: 204] [Impact Index Per Article: 13.6] [Indexed: 11/17/2022]
Abstract
The Encyclopedia of DNA Elements (ENCODE) project is an international consortium of investigators funded to analyze the human genome with the goal of producing a comprehensive catalog of functional elements. The ENCODE Data Coordination Center at The University of California, Santa Cruz (UCSC) is the primary repository for experimental results generated by ENCODE investigators. These results are captured in the UCSC Genome Bioinformatics database and download server for visualization and data mining via the UCSC Genome Browser and companion tools (Rhead et al. The UCSC Genome Browser Database: update 2010, in this issue). The ENCODE web portal at UCSC (http://encodeproject.org or http://genome.ucsc.edu/ENCODE) provides information about the ENCODE data and convenient links for access.
Affiliation(s)
- Kate R Rosenbloom
- Center for Biomolecular Science and Engineering, School of Engineering, University of California, Santa Cruz, CA 95064, USA.
|
44
|
Rhead B, Karolchik D, Kuhn RM, Hinrichs AS, Zweig AS, Fujita PA, Diekhans M, Smith KE, Rosenbloom KR, Raney BJ, Pohl A, Pheasant M, Meyer LR, Learned K, Hsu F, Hillman-Jackson J, Harte RA, Giardine B, Dreszer TR, Clawson H, Barber GP, Haussler D, Kent WJ. The UCSC Genome Browser database: update 2010. Nucleic Acids Res 2009; 38:D613-9. [PMID: 19906737 PMCID: PMC2808870 DOI: 10.1093/nar/gkp939] [Citation(s) in RCA: 500] [Impact Index Per Article: 33.3] [Indexed: 11/14/2022]
Abstract
The University of California, Santa Cruz (UCSC) Genome Browser website (http://genome.ucsc.edu/) provides a large database of publicly available sequence and annotation data along with an integrated tool set for examining and comparing the genomes of organisms, aligning sequence to genomes, and displaying and sharing users’ own annotation data. As of September 2009, genomic sequence and a basic set of annotation ‘tracks’ are provided for 47 organisms, including 14 mammals, 10 non-mammal vertebrates, 3 invertebrate deuterostomes, 13 insects, 6 worms and a yeast. New data highlights this year include an updated human genome browser, a 44-species multiple sequence alignment track, improved variation and phenotype tracks and 16 new genome-wide ENCODE tracks. New features include drag-and-zoom navigation, a Wiki track for user-added annotations, new custom track formats for large datasets (bigBed and bigWig), a new multiple alignment output tool, links to variation and protein structure tools, in silico PCR utility enhancements, and improved track configuration tools.
Affiliation(s)
- Brooke Rhead
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA.
|
45
|
Zhu J, Sanborn JZ, Benz S, Szeto C, Hsu F, Kuhn RM, Karolchik D, Archie J, Lenburg ME, Esserman LJ, Kent WJ, Haussler D, Wang T. The UCSC Cancer Genomics Browser. Nat Methods 2009; 6:239-40. [PMID: 19333237 DOI: 10.1038/nmeth0409-239] [Citation(s) in RCA: 141] [Impact Index Per Article: 9.4] [Indexed: 12/22/2022]
|
46
|
Kuhn RM, Karolchik D, Zweig AS, Wang T, Smith KE, Rosenbloom KR, Rhead B, Raney BJ, Pohl A, Pheasant M, Meyer L, Hsu F, Hinrichs AS, Harte RA, Giardine B, Fujita P, Diekhans M, Dreszer T, Clawson H, Barber GP, Haussler D, Kent WJ. The UCSC Genome Browser Database: update 2009. Nucleic Acids Res 2008; 37:D755-61. [PMID: 18996895 PMCID: PMC2686463 DOI: 10.1093/nar/gkn875] [Citation(s) in RCA: 303] [Impact Index Per Article: 18.9] [Indexed: 11/12/2022]
Abstract
The UCSC Genome Browser Database (GBD, http://genome.ucsc.edu) is a publicly available collection of genome assembly sequence data and integrated annotations for a large number of organisms, including extensive comparative-genomic resources. In the past year, 13 new genome assemblies have been added, including two important primate species, orangutan and marmoset, bringing the total to 46 assemblies for 24 different vertebrates and 39 assemblies for 22 different invertebrate animals. The GBD datasets may be viewed graphically with the UCSC Genome Browser, which uses a coordinate-based display system allowing users to juxtapose a wide variety of data. These data include all mRNAs from GenBank mapped to all organisms, RefSeq alignments, gene predictions, regulatory elements, gene expression data, repeats, SNPs and other variation data, as well as pairwise and multiple-genome alignments. A variety of other bioinformatics tools are also provided, including BLAT, the Table Browser, the Gene Sorter, the Proteome Browser, VisiGene and Genome Graphs.
Affiliation(s)
- R M Kuhn
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA.
|
47
|
Baertsch R, Diekhans M, Kent WJ, Haussler D, Brosius J. Retrocopy contributions to the evolution of the human genome. BMC Genomics 2008; 9:466. [PMID: 18842134 PMCID: PMC2584115 DOI: 10.1186/1471-2164-9-466] [Citation(s) in RCA: 92] [Impact Index Per Article: 5.8] [Received: 03/17/2008] [Accepted: 10/08/2008] [Indexed: 02/06/2023]
Abstract
Background: Evolution via point mutations is a relatively slow process and is unlikely to completely explain the differences between primates and other mammals. By contrast, 45% of the human genome is composed of retroposed elements, many of which were inserted in the primate lineage. A subset of retroposed mRNAs (retrocopies) shows strong evidence of expression in primates, often yielding functional retrogenes. Results: To identify and analyze the relatively recently evolved retrogenes, we carried out BLASTZ alignments of all human mRNAs against the human genome and scored a set of features indicative of retroposition. Of over 12,000 putative retrocopy-derived genes that arose mainly in the primate lineage, 726 with strong evidence of transcript expression were examined in detail. These mRNA retroposition events fall into three categories: I) 34 retrocopies and antisense retrocopies that added potential protein coding space and UTRs to existing genes; II) 682 complete retrocopy duplications inserted into new loci; and III) an unexpected set of 13 retrocopies that contributed out-of-frame, or antisense sequences in combination with other types of transposed elements (SINEs, LINEs, LTRs), even unannotated sequence to form potentially novel genes with no homologs outside primates. In addition to their presence in human, several of the gene candidates also had potentially viable ORFs in chimpanzee, orangutan, and rhesus macaque, underscoring their potential of function. Conclusion: mRNA-derived retrocopies provide raw material for the evolution of genes in a wide variety of ways, duplicating and amending the protein coding region of existing genes as well as generating the potential for new protein coding space, or non-protein coding RNAs, by unexpected contributions out of frame, in reverse orientation, or from previously non-protein coding sequence.
Affiliation(s)
- Robert Baertsch
- Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, California 95064, USA.
|
48
|
Zweig AS, Karolchik D, Kuhn RM, Haussler D, Kent WJ. UCSC genome browser tutorial. Genomics 2008; 92:75-84. [PMID: 18514479 DOI: 10.1016/j.ygeno.2008.02.003] [Citation(s) in RCA: 84] [Impact Index Per Article: 5.3] [Received: 07/12/2007] [Revised: 02/01/2008] [Accepted: 02/18/2008] [Indexed: 11/30/2022]
Abstract
The University of California Santa Cruz (UCSC) Genome Bioinformatics website consists of a suite of free, open-source, on-line tools that can be used to browse, analyze, and query genomic data. These tools are available to anyone who has an Internet browser and an interest in genomics. The website provides a quick and easy-to-use visual display of genomic data. It places annotation tracks beneath genome coordinate positions, allowing rapid visual correlation of different types of information. Many of the annotation tracks are submitted by scientists worldwide; the others are computed by the UCSC Genome Bioinformatics group from publicly available sequence data. It also allows users to upload and display their own experimental results or annotation sets by creating a custom track. The suite of tools, downloadable data files, and links to documentation and other information can be found at http://genome.ucsc.edu/.
Affiliation(s)
- Ann S Zweig
- UCSC Genome Bioinformatics Group, Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA.
|
49
|
Abstract
The University of California Santa Cruz (UCSC) Genome Browser (genome.ucsc.edu) is a popular Web-based tool for quickly displaying a requested portion of a genome at any scale, accompanied by a series of aligned annotation "tracks". The annotations, generated by the UCSC Genome Bioinformatics Group and external collaborators, display gene predictions, mRNA and expressed sequence tag alignments, simple nucleotide polymorphisms, expression and regulatory data, and pairwise and multiple-species comparative genomics data. All information relevant to a region is presented in one window, facilitating biological analysis and interpretation. The database tables underlying the Genome Browser tracks can be viewed, downloaded, and manipulated using another Web-based application, the UCSC Table Browser. Users can upload personal data as custom annotation tracks in both browsers for research or educational use. This unit describes how to use the Genome Browser and Table Browser for genome analysis, download the underlying database tables, and create and display custom annotation tracks.
Affiliation(s)
- Donna Karolchik
- University of California Santa Cruz, Santa Cruz, California, USA
|
50
|
Karolchik D, Kuhn RM, Baertsch R, Barber GP, Clawson H, Diekhans M, Giardine B, Harte RA, Hinrichs AS, Hsu F, Kober KM, Miller W, Pedersen JS, Pohl A, Raney BJ, Rhead B, Rosenbloom KR, Smith KE, Stanke M, Thakkapallayil A, Trumbower H, Wang T, Zweig AS, Haussler D, Kent WJ. The UCSC Genome Browser Database: 2008 update. Nucleic Acids Res 2007; 36:D773-9. [PMID: 18086701 PMCID: PMC2238835 DOI: 10.1093/nar/gkm966] [Citation(s) in RCA: 403] [Impact Index Per Article: 23.7] [Indexed: 01/06/2023]
Abstract
The University of California, Santa Cruz, Genome Browser Database (GBD) provides integrated sequence and annotation data for a large collection of vertebrate and model organism genomes. Seventeen new assemblies have been added to the database in the past year, for a total coverage of 19 vertebrate and 21 invertebrate species as of September 2007. For each assembly, the GBD contains a collection of annotation data aligned to the genomic sequence. Highlights of this year's additions include a 28-species human-based vertebrate conservation annotation, an enhanced UCSC Genes set, and more human variation, MGC, and ENCODE data. The database is optimized for fast interactive performance with a set of web-based tools that may be used to view, manipulate, filter and download the annotation data. New toolset features include the Genome Graphs tool for displaying genome-wide data sets, session saving and sharing, better custom track management, expanded Genome Browser configuration options and a Genome Browser wiki site. The downloadable GBD data, the companion Genome Browser toolset and links to documentation and related information can be found at: http://genome.ucsc.edu/.
Affiliation(s)
- D Karolchik
- Center for Biomolecular Science and Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA.
|