51
Mi H, Dong Q, Muruganujan A, Gaudet P, Lewis S, Thomas PD. PANTHER version 7: improved phylogenetic trees, orthologs and collaboration with the Gene Ontology Consortium. Nucleic Acids Res 2010; 38:D204-10. [PMID: 20015972 PMCID: PMC2808919 DOI: 10.1093/nar/gkp1019] [Citation(s) in RCA: 448] [Impact Index Per Article: 32.0] [Received: 09/15/2009] [Revised: 10/16/2009] [Accepted: 10/19/2009] [Indexed: 11/12/2022]
Abstract
Protein Analysis THrough Evolutionary Relationships (PANTHER) is a comprehensive software system for inferring the functions of genes based on their evolutionary relationships. Phylogenetic trees of gene families form the basis for PANTHER and these trees are annotated with ontology terms describing the evolution of gene function from ancestral to modern day genes. One of the main applications of PANTHER is in accurate prediction of the functions of uncharacterized genes, based on their evolutionary relationships to genes with functions known from experiment. The PANTHER website, freely available at http://www.pantherdb.org, also includes software tools for analyzing genomic data relative to known and inferred gene functions. Since 2007, there have been several new developments to PANTHER: (i) improved phylogenetic trees, explicitly representing speciation and gene duplication events, (ii) identification of gene orthologs, including least diverged orthologs (best one-to-one pairs), (iii) coverage of more genomes (48 genomes, up to 87% of genes in each genome; see http://www.pantherdb.org/panther/summaryStats.jsp), (iv) improved support for alternative database identifiers for genes, proteins and microarray probes and (v) adoption of the SBGN standard for display of biological pathways. In addition, PANTHER trees are being annotated with gene function as part of the Gene Ontology Reference Genome project, resulting in an increasing number of curated functional annotations.
Affiliation(s)
- Huaiyu Mi
- Evolutionary Systems Biology Group, SRI International, dictyBase, Northwestern University and Berkeley Bioinformatics and Open-source Projects (BBOP), Lawrence Berkeley National Laboratory, USA
- Qing Dong
- Evolutionary Systems Biology Group, SRI International, dictyBase, Northwestern University and Berkeley Bioinformatics and Open-source Projects (BBOP), Lawrence Berkeley National Laboratory, USA
- Anushya Muruganujan
- Evolutionary Systems Biology Group, SRI International, dictyBase, Northwestern University and Berkeley Bioinformatics and Open-source Projects (BBOP), Lawrence Berkeley National Laboratory, USA
- Pascale Gaudet
- Evolutionary Systems Biology Group, SRI International, dictyBase, Northwestern University and Berkeley Bioinformatics and Open-source Projects (BBOP), Lawrence Berkeley National Laboratory, USA
- Suzanna Lewis
- Evolutionary Systems Biology Group, SRI International, dictyBase, Northwestern University and Berkeley Bioinformatics and Open-source Projects (BBOP), Lawrence Berkeley National Laboratory, USA
- Paul D. Thomas
- Evolutionary Systems Biology Group, SRI International, dictyBase, Northwestern University and Berkeley Bioinformatics and Open-source Projects (BBOP), Lawrence Berkeley National Laboratory, USA
52
Goll J, Montgomery R, Brinkac LM, Schobel S, Harkins DM, Sebastian Y, Shrivastava S, Durkin S, Sutton G. The Protein Naming Utility: a rules database for protein nomenclature. Nucleic Acids Res 2009; 38:D336-9. [PMID: 20007151 PMCID: PMC2808875 DOI: 10.1093/nar/gkp958] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 12/14/2022]
Abstract
Generation of syntactically correct and unambiguous names for proteins is a challenging, yet vital task for functional annotation processes. Proteins are often named based on homology to known proteins, many of which have problematic names. To address the need to generate high-quality protein names, and capture our significant experience correcting protein names manually, we have developed the Protein Naming Utility (PNU, http://www.jcvi.org/pn-utility). The PNU is a web-based database for storing and applying naming rules to identify and correct syntactically incorrect protein names, or to replace synonyms with their preferred name. The PNU allows users to generate and manage collections of naming rules, optionally building upon the growing body of rules generated at the J. Craig Venter Institute (JCVI). Since communities often enforce disparate conventions for naming proteins, the PNU supports grouping rules into user-managed collections. Users can check their protein names against a selected PNU rule collection, generating both statistics and corrected names. The PNU can also be used to correct GenBank table files prior to submission to GenBank. Currently, the database features 3080 manual rules that have been entered by JCVI Bioinformatics Analysts as well as 7458 automatically imported names.
Affiliation(s)
- Johannes Goll
- The J Craig Venter Institute, Rockville, MD 20850, USA.
53
Abstract
The University of California Santa Cruz (UCSC) Genome Browser is a popular Web-based tool for quickly displaying a requested portion of a genome at any scale, accompanied by a series of aligned annotation "tracks." The annotations, generated by the UCSC Genome Bioinformatics Group and external collaborators, display gene predictions, mRNA and expressed sequence tag alignments, simple nucleotide polymorphisms, expression and regulatory data, phenotype and variation data, and pairwise and multiple-species comparative genomics data. All information relevant to a region is presented in one window, facilitating biological analysis and interpretation. The database tables underlying the Genome Browser tracks can be viewed, downloaded, and manipulated using another Web-based application, the UCSC Table Browser. Users can upload data as custom annotation tracks in both browsers for research or educational use. This unit describes how to use the Genome Browser and Table Browser for genome analysis, download the underlying database tables, and create and display custom annotation tracks.
Affiliation(s)
- Donna Karolchik
- Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, Phone: (831) 459-1571, Fax: (831) 459-1809
- Angie S. Hinrichs
- Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, Phone: (831) 459-1544, Fax: (831) 459-1809
- W. James Kent
- Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, Phone: (831) 459-1401, Fax: (831) 459-1809
54
Yamazaki Y, Akashi R, Banno Y, Endo T, Ezura H, Fukami-Kobayashi K, Inaba K, Isa T, Kamei K, Kasai F, Kobayashi M, Kurata N, Kusaba M, Matuzawa T, Mitani S, Nakamura T, Nakamura Y, Nakatsuji N, Naruse K, Niki H, Nitasaka E, Obata Y, Okamoto H, Okuma M, Sato K, Serikawa T, Shiroishi T, Sugawara H, Urushibara H, Yamamoto M, Yaoita Y, Yoshiki A, Kohara Y. NBRP databases: databases of biological resources in Japan. Nucleic Acids Res 2009; 38:D26-32. [PMID: 19934255 PMCID: PMC2808968 DOI: 10.1093/nar/gkp996] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.5] [Indexed: 11/14/2022]
Abstract
The National BioResource Project (NBRP) is a Japanese project that aims to establish a system for collecting, preserving and providing bioresources for use as experimental materials for life science research. It is promoted by 27 core resource facilities, each concerned with a particular group of organisms, and by one information center. The NBRP database is a product of this project. Thirty databases and an integrated database-retrieval system (BioResource World: BRW) have been created and made available through the NBRP home page (http://www.nbrp.jp). The 30 independent databases have individual features which directly reflect the data maintained by each resource facility. The BRW is designed for users who need to search across several resources without moving from one database to another. BRW provides access to a collection of 4.5 million records on bioresources including wild species, inbred lines, mutants, genetically engineered lines, DNA clones and so on. BRW supports summary browsing, keyword searching, and searching by DNA sequences or gene ontology. The results of searches provide links to online requests for distribution of research materials. A circulation system allows users to submit details of papers published on research conducted using NBRP resources.
Affiliation(s)
- Yukiko Yamazaki
- National Institute of Genetics, University of Miyazaki, Japan.
55
Flicek P, Aken BL, Ballester B, Beal K, Bragin E, Brent S, Chen Y, Clapham P, Coates G, Fairley S, Fitzgerald S, Fernandez-Banet J, Gordon L, Gräf S, Haider S, Hammond M, Howe K, Jenkinson A, Johnson N, Kähäri A, Keefe D, Keenan S, Kinsella R, Kokocinski F, Koscielny G, Kulesha E, Lawson D, Longden I, Massingham T, McLaren W, Megy K, Overduin B, Pritchard B, Rios D, Ruffier M, Schuster M, Slater G, Smedley D, Spudich G, Tang YA, Trevanion S, Vilella A, Vogel J, White S, Wilder SP, Zadissa A, Birney E, Cunningham F, Dunham I, Durbin R, Fernández-Suarez XM, Herrero J, Hubbard TJP, Parker A, Proctor G, Smith J, Searle SMJ. Ensembl's 10th year. Nucleic Acids Res 2009; 38:D557-62. [PMID: 19906699 PMCID: PMC2808936 DOI: 10.1093/nar/gkp972] [Citation(s) in RCA: 238] [Impact Index Per Article: 15.9] [Indexed: 02/02/2023]
Abstract
Ensembl (http://www.ensembl.org) integrates genomic information for a comprehensive set of chordate genomes with a particular focus on resources for human, mouse, rat, zebrafish and other high-value sequenced genomes. We provide complete gene annotations for all supported species in addition to specific resources that target genome variation, function and evolution. Ensembl data is accessible in a variety of formats including via our genome browser, API and BioMart. This year marks the tenth anniversary of Ensembl and in that time the project has grown with advances in genome technology. As of release 56 (September 2009), Ensembl supports 51 species including marmoset, pig, zebra finch, lizard, gorilla and wallaby, which were added in the past year. Major additions and improvements to Ensembl since our previous report include the incorporation of the human GRCh37 assembly, enhanced visualisation and data-mining options for the Ensembl regulatory features and continued development of our software infrastructure.
Affiliation(s)
- Paul Flicek
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
56
Shimoyama M, Hayman GT, Laulederkind SJF, Nigam R, Lowry TF, Petri V, Smith JR, Wang SJ, Munzenmaier DH, Dwinell MR, Twigger SN, Jacob HJ. The rat genome database curators: who, what, where, why. PLoS Comput Biol 2009; 5:e1000582. [PMID: 19956751 PMCID: PMC2775909 DOI: 10.1371/journal.pcbi.1000582] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 01/24/2023]
Affiliation(s)
- Mary Shimoyama
- Rat Genome Database, Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, Wisconsin, USA.
57
Berriz GF, Beaver JE, Cenik C, Tasan M, Roth FP. Next generation software for functional trend analysis. Bioinformatics 2009; 25:3043-4. [PMID: 19717575 DOI: 10.1093/bioinformatics/btp498] [Citation(s) in RCA: 198] [Impact Index Per Article: 13.2] [Indexed: 11/14/2022]
Abstract
FuncAssociate is a web application that discovers properties enriched in lists of genes or proteins that emerge from large-scale experimentation. Here we describe an updated application with a new interface and several new features. For example, enrichment analysis can now be performed within multiple gene- and protein-naming systems. This feature avoids potentially serious translation artifacts to which other enrichment analysis strategies are subject. Availability: The FuncAssociate web application is freely available to all users at http://llama.med.harvard.edu/funcassociate.
Affiliation(s)
- Gabriel F Berriz
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, 250 Longwood Avenue and Center for Cancer Systems Biology, Dana Farber Cancer Institute, 44 Binney Street, Boston, MA 02115, USA
58
Yoshida Y, Makita Y, Heida N, Asano S, Matsushima A, Ishii M, Mochizuki Y, Masuya H, Wakana S, Kobayashi N, Toyoda T. PosMed (Positional Medline): prioritizing genes with an artificial neural network comprising medical documents to accelerate positional cloning. Nucleic Acids Res 2009; 37:W147-52. [PMID: 19468046 PMCID: PMC2703941 DOI: 10.1093/nar/gkp384] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Indexed: 11/14/2022]
Abstract
PosMed (http://omicspace.riken.jp/) prioritizes candidate genes for positional cloning by employing our original database search engine GRASE, which uses an inferential process similar to an artificial neural network comprising documental neurons (or 'documentrons') that represent each document contained in databases such as MEDLINE and OMIM. Given a user-specified query, PosMed initially performs a full-text search of each documentron in the first-layer artificial neurons and then calculates the statistical significance of the connections between the hit documentrons and the second-layer artificial neurons representing each gene. When a chromosomal interval(s) is specified, PosMed explores the second-layer and third-layer artificial neurons representing genes within the chromosomal interval by evaluating the combined significance of the connections from the hit documentrons to the genes. PosMed is, therefore, a powerful tool that immediately ranks the candidate genes by connecting phenotypic keywords to the genes through connections representing not only gene-gene interactions but also other biological interactions (e.g. metabolite-gene, mutant mouse-gene, drug-gene, disease-gene and protein-protein interactions) and ortholog data. By utilizing orthologous connections, PosMed facilitates the ranking of human genes based on evidence found in other model species such as mouse. Currently, PosMed, an artificial superbrain that has learned a vast amount of biological knowledge ranging from genomes to phenomes (or 'omic space'), supports the prioritization of positional candidate genes in humans, mouse, rat and Arabidopsis thaliana.
Affiliation(s)
- Yuko Yoshida
- Bioinformatics And Systems Engineering division, RIKEN. 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan