101
Mathelier A, Fornes O, Arenillas DJ, Chen CY, Denay G, Lee J, Shi W, Shyr C, Tan G, Worsley-Hunt R, Zhang AW, Parcy F, Lenhard B, Sandelin A, Wasserman WW. JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles. Nucleic Acids Res 2015; 44:D110-5. [PMID: 26531826 PMCID: PMC4702842 DOI: 10.1093/nar/gkv1176] [Citation(s) in RCA: 737] [Impact Index Per Article: 81.9] [Received: 09/12/2015] [Accepted: 10/22/2015] [Indexed: 11/28/2022]
Abstract
JASPAR (http://jaspar.genereg.net) is an open-access database storing curated, non-redundant transcription factor (TF) binding profiles representing transcription factor binding preferences as position frequency matrices for multiple species in six taxonomic groups. For this 2016 release, we expanded the JASPAR CORE collection with 494 new TF binding profiles (315 in vertebrates, 11 in nematodes, 3 in insects, 1 in fungi and 164 in plants) and updated 59 profiles (58 in vertebrates and 1 in fungi). The introduced profiles represent an 83% expansion and 10% update when compared to the previous release. We updated the structural annotation of the TF DNA binding domains (DBDs) following a published hierarchical structural classification. In addition, we introduced 130 transcription factor flexible models trained on ChIP-seq data for vertebrates, which capture dinucleotide dependencies within TF binding sites. This new JASPAR release is accompanied by a new web tool to infer JASPAR TF binding profiles recognized by a given TF protein sequence. Moreover, we provide the users with a Ruby module complementing the JASPAR API to ease programmatic access and use of the JASPAR collection of profiles. Finally, we provide the JASPAR2016 R/Bioconductor data package with the data of this release.
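As a rough illustration of what a position frequency matrix encodes, the sketch below converts a small PFM into log-odds scores and scores two candidate sites. The matrix values, the pseudocount of 0.8, the uniform background, and the function names `pfm_to_pwm` and `score` are all assumptions made for this example; none of them come from JASPAR or from the paper.

```python
import math

# Toy position frequency matrix (PFM): one row of counts per nucleotide,
# one column per position in the binding site. The counts are invented
# for illustration and do not correspond to any JASPAR profile.
TOY_PFM = {
    "A": [8, 0, 1, 10],
    "C": [1, 0, 9, 0],
    "G": [1, 10, 0, 0],
    "T": [0, 0, 0, 0],
}


def pfm_to_pwm(pfm, pseudocount=0.8, background=0.25):
    """Convert counts to log2-odds scores against a uniform background.

    The pseudocount and background values are arbitrary choices made
    for this sketch.
    """
    width = len(next(iter(pfm.values())))
    pwm = {base: [] for base in pfm}
    for pos in range(width):
        total = sum(pfm[base][pos] for base in pfm)
        for base in pfm:
            # Give each base a share of the pseudocount proportional to
            # its background frequency, so probabilities still sum to 1.
            prob = (pfm[base][pos] + pseudocount * background) / (total + pseudocount)
            pwm[base].append(math.log2(prob / background))
    return pwm


def score(pwm, site):
    """Sum the per-position log-odds scores for a candidate site."""
    return sum(pwm[base][pos] for pos, base in enumerate(site))


pwm = pfm_to_pwm(TOY_PFM)
print(round(score(pwm, "AGCA"), 2))  # site matching the most frequent bases
print(round(score(pwm, "TTTT"), 2))  # site never observed in the toy counts
```

A site built from the most frequent base at each position scores higher than one built from bases that never occur in the counts, which is the sense in which the matrix represents a binding preference.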
Affiliation(s)
- Anthony Mathelier
  - Centre for Molecular Medicine and Therapeutics at the Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, V5Z 4H4, BC, Canada
- Oriol Fornes
  - Centre for Molecular Medicine and Therapeutics at the Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, V5Z 4H4, BC, Canada
- David J Arenillas
  - Centre for Molecular Medicine and Therapeutics at the Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, V5Z 4H4, BC, Canada
- Chih-Yu Chen
  - Centre for Molecular Medicine and Therapeutics at the Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, V5Z 4H4, BC, Canada
- Grégoire Denay
  - Laboratoire Physiologie Cellulaire & Végétale, Université Grenoble Alpes, CNRS, CEA, iRTSV, INRA, 38054 Grenoble, France
- Jessica Lee
  - Centre for Molecular Medicine and Therapeutics at the Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, V5Z 4H4, BC, Canada
- Wenqiang Shi
  - Centre for Molecular Medicine and Therapeutics at the Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, V5Z 4H4, BC, Canada
- Casper Shyr
  - Centre for Molecular Medicine and Therapeutics at the Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, V5Z 4H4, BC, Canada
- Ge Tan
  - Computational Regulatory Genomics, MRC Clinical Sciences Centre, Imperial College London, Du Cane Road, London W12 0NN, UK
- Rebecca Worsley-Hunt
  - Centre for Molecular Medicine and Therapeutics at the Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, V5Z 4H4, BC, Canada
- Allen W Zhang
  - Centre for Molecular Medicine and Therapeutics at the Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, V5Z 4H4, BC, Canada
- François Parcy
  - Laboratoire Physiologie Cellulaire & Végétale, Université Grenoble Alpes, CNRS, CEA, iRTSV, INRA, 38054 Grenoble, France
- Boris Lenhard
  - Computational Regulatory Genomics, MRC Clinical Sciences Centre, Imperial College London, Du Cane Road, London W12 0NN, UK
- Albin Sandelin
  - The Bioinformatics Centre, Department of Biology and Biotech Research and Innovation Centre, Copenhagen University, Ole Maaloes Vej 5, DK-2200, Denmark
- Wyeth W Wasserman
  - Centre for Molecular Medicine and Therapeutics at the Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, V5Z 4H4, BC, Canada
122
Mathelier A, Zhao X, Zhang AW, Parcy F, Worsley-Hunt R, Arenillas DJ, Buchman S, Chen CY, Chou A, Ienasescu H, Lim J, Shyr C, Tan G, Zhou M, Lenhard B, Sandelin A, Wasserman WW. JASPAR 2014: an extensively expanded and updated open-access database of transcription factor binding profiles. Nucleic Acids Res 2014; 42:D142-7. [PMID: 24194598 PMCID: PMC3965086 DOI: 10.1093/nar/gkt997] [Citation(s) in RCA: 786] [Impact Index Per Article: 78.6] [Received: 09/15/2013] [Accepted: 10/03/2013] [Indexed: 11/14/2022]
Abstract
JASPAR (http://jaspar.genereg.net) is the largest open-access database of matrix-based nucleotide profiles describing the binding preference of transcription factors from multiple species. The fifth major release greatly expands the heart of JASPAR (the JASPAR CORE subcollection, which contains curated, non-redundant profiles) with 135 new curated profiles (74 in vertebrates, 8 in Drosophila melanogaster, 10 in Caenorhabditis elegans and 43 in Arabidopsis thaliana; a 30% increase in total) and 43 older updated profiles (36 in vertebrates, 3 in D. melanogaster and 4 in A. thaliana; a 9% update in total). The new and updated profiles are mainly derived from published chromatin immunoprecipitation-seq experimental datasets. In addition, the web interface has been enhanced with advanced capabilities in browsing, searching and subsetting. Finally, the new JASPAR release is accompanied by a new BioPython package, a new R tool package and a new R/Bioconductor data package to facilitate access for both manual and automated methods.
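The per-species counts quoted in this abstract can be tallied to confirm the stated totals. The short sketch below does only that arithmetic; the dictionary names are hypothetical and the numbers are copied from the abstract.

```python
# Per-species counts copied from the abstract; the variable names are
# hypothetical and exist only for this check.
new_profiles = {
    "vertebrates": 74,
    "D. melanogaster": 8,
    "C. elegans": 10,
    "A. thaliana": 43,
}
updated_profiles = {
    "vertebrates": 36,
    "D. melanogaster": 3,
    "A. thaliana": 4,
}

total_new = sum(new_profiles.values())
total_updated = sum(updated_profiles.values())

print(total_new, total_updated)  # prints: 135 43
```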
Affiliation(s)
- Anthony Mathelier, Xiaobei Zhao, Allen W. Zhang, François Parcy, Rebecca Worsley-Hunt, David J. Arenillas, Sorana Buchman, Chih-yu Chen, Alice Chou, Hans Ienasescu, Jonathan Lim, Casper Shyr, Ge Tan, Michelle Zhou, Boris Lenhard, Albin Sandelin, Wyeth W. Wasserman
  - Department of Medical Genetics, Centre for Molecular Medicine and Therapeutics at the Child and Family Research Institute, University of British Columbia, Vancouver, BC, Canada
  - Department of Biology and Biotech Research and Innovation Centre, The Bioinformatics Centre, Copenhagen University, Ole Maaloes Vej 5, DK-2200, Denmark
  - Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, USA
  - Laboratoire Physiologie Cellulaire & Végétale, Université Grenoble Alpes, CNRS, CEA, iRTSV, INRA, 38054 Grenoble, France
  - Computational Regulatory Genomics, MRC Clinical Sciences Centre, Imperial College London, Du Cane Road, London W12 0NN, UK
  - Department of Informatics, University of Bergen, Thormøhlensgate 55, N-5008 Bergen, Norway