1
Tong S, Ventola P, Frater CH, Klotz J, Phillips JM, Muppidi S, Dwight SS, Mueller WF, Beahm BJ, Wilsey M, Lee KJ. NGLY1 deficiency: a prospective natural history study. Hum Mol Genet 2023; 32:2787-2796. [PMID: 37379343 PMCID: PMC10481101 DOI: 10.1093/hmg/ddad106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Revised: 06/20/2023] [Accepted: 06/21/2023] [Indexed: 06/30/2023]
Abstract
N-glycanase 1 (NGLY1) deficiency is a debilitating, ultra-rare autosomal recessive disorder caused by loss of function of NGLY1, a cytosolic enzyme that deglycosylates other proteins. It is characterized by severe global developmental delay and/or intellectual disability, hyperkinetic movement disorder, transient elevation of transaminases, (hypo)alacrima and progressive, diffuse, length-dependent sensorimotor polyneuropathy. A prospective natural history study (NHS) was conducted to elucidate clinical features and disease course. Twenty-nine participants were enrolled (15 onsite, 14 remotely) and followed for up to 32 months, representing ~29% of the ~100 patients identified worldwide. Participants exhibited profound developmental delays, with almost all developmental quotients below 20 on the Mullen Scales of Early Learning, well below the normative score of 100. Increased difficulties with sitting and standing suggested decline in motor function over time. Most patients presented with (hypo)alacrima and reduced sweat response. Pediatric quality of life was poor except for emotional function. Language/communication and motor skill problems including hand use were reported by caregivers as the most bothersome symptoms. Levels of the substrate biomarker, GlcNAc-Asn (aspartylglucosamine; GNA), were consistently elevated in all participants over time, independent of age. Liver enzymes were elevated for some participants but improved, especially in younger patients, and did not reach levels indicating severe liver disease. Three participants died during the study period. Data from this NHS inform selection of endpoints and assessments for future clinical trials for NGLY1 deficiency interventions. Potential endpoints include GNA biomarker levels, neurocognitive assessments, autonomic and motor function (particularly hand use), (hypo)alacrima and quality of life.
Affiliation(s)
- Sandra Tong
- Grace Science Foundation, Menlo Park, CA 94026, USA
- Pamela Ventola
- Cogstate, New Haven, CT 06510, USA
- Yale Child Study Center, New Haven, CT 06519, USA
- Jenna Klotz
- Department of Neurology, Stanford University, Stanford, CA 94305, USA
- Srikanth Muppidi
- Department of Neurology, Stanford University, Stanford, CA 94305, USA
- Matt Wilsey
- Grace Science Foundation, Menlo Park, CA 94026, USA
- Kevin J Lee
- Grace Science Foundation, Menlo Park, CA 94026, USA
2
Stanclift CR, Dwight SS, Lee K, Eijkenboom QL, Wilsey M, Wilsey K, Kobayashi ES, Tong S, Bainbridge MN. NGLY1 deficiency: estimated incidence, clinical features, and genotypic spectrum from the NGLY1 Registry. Orphanet J Rare Dis 2022; 17:440. [PMID: 36528660 PMCID: PMC9759919 DOI: 10.1186/s13023-022-02592-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2022] [Accepted: 12/07/2022] [Indexed: 12/23/2022]
Abstract
PURPOSE: NGLY1 Deficiency is an ultra-rare, multisystemic disease caused by biallelic pathogenic NGLY1 variants. The aims of this study were to (1) characterize the variants and clinical features of the largest cohort of NGLY1 Deficiency patients reported to date, and (2) estimate the incidence of this disorder. METHODS: The Grace Science Foundation collected genotypic data from 74 NGLY1 Deficiency patients, of whom 37 also provided phenotypic data. We analyzed NGLY1 variants and clinical features and estimated NGLY1 disease incidence in the United States (U.S.). RESULTS: Analysis of patient genotypes, including 10 previously unreported NGLY1 variants, showed strong statistical enrichment for missense variants in the transglutaminase-like domain of NGLY1 (p < 1.96E-11). Caregivers reported global developmental delay, movement disorder, and alacrima in over 85% of patients. Some phenotypic differences were noted between males and females. Regression was reported for all patients over 14 years old by their caregivers. The calculated U.S. incidence of NGLY1 Deficiency was ~12 individuals born per year. CONCLUSION: The estimated U.S. incidence of NGLY1 Deficiency indicates the disease may be more common than the number of patients reported in the literature suggests. Given the low frequency of most variants and proportion of compound heterozygotes, genotype/phenotype correlations were not distinguishable.
Affiliation(s)
- Kevin Lee
- Grace Science Foundation, P.O. Box 114, Menlo Park, CA USA
- Matt Wilsey
- Grace Science Foundation, P.O. Box 114, Menlo Park, CA USA
- Kristen Wilsey
- Grace Science Foundation, P.O. Box 114, Menlo Park, CA USA
- Erica Sanford Kobayashi
- Rady Children’s Institute for Genomic Medicine, 3020 Children’s Way, San Diego, CA USA
- Sandra Tong
- Grace Science Foundation, P.O. Box 114, Menlo Park, CA USA
- Matthew N. Bainbridge
- Rady Children’s Institute for Genomic Medicine, 3020 Children’s Way, San Diego, CA USA
3
Zhu L, Tan B, Dwight SS, Beahm B, Wilsey M, Crawford BE, Schweighardt B, Cook JW, Wechsler T, Mueller WF. AAV9-NGLY1 gene replacement therapy improves phenotypic and biomarker endpoints in a rat model of NGLY1 Deficiency. Mol Ther Methods Clin Dev 2022; 27:259-271. [PMID: 36320418 PMCID: PMC9593239 DOI: 10.1016/j.omtm.2022.09.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2022] [Accepted: 09/29/2022] [Indexed: 11/06/2022]
Abstract
N-glycanase 1 (NGLY1) Deficiency is a progressive, ultra-rare, autosomal recessive disorder with no approved therapy and five core clinical features: severe global developmental delay, hyperkinetic movement disorder, elevated liver transaminases, alacrima, and peripheral neuropathy. Here, we confirmed and characterized the Ngly1-/- rat as a relevant disease model. GS-100, a gene therapy candidate, is a recombinant, single-stranded adeno-associated virus (AAV) 9 vector designed to deliver a functional copy of the human NGLY1 gene. Using the Ngly1-/- rat, we tested different administration routes for GS-100: intracerebroventricular (ICV), intravenous (IV), or the dual route (IV + ICV). ICV and IV + ICV administration resulted in widespread biodistribution of human NGLY1 DNA and corresponding mRNA and protein expression in CNS tissues. GS-100 delivered by ICV or IV + ICV significantly reduced levels of the substrate biomarker N-acetylglucosamine-asparagine (GlcNAc-Asn or GNA) in CSF and brain tissue compared with untreated Ngly1-/- rats. ICV and IV + ICV administration of GS-100 resulted in behavioral improvements in rotarod and rearing tests, whereas IV-only administration did not. IV + ICV did not provide additional benefit compared with ICV administration alone. These data provide evidence that GS-100 could be an effective therapy for NGLY1 Deficiency using the ICV route of administration.
Affiliation(s)
- Lei Zhu
- Grace Science, LLC, Menlo Park, CA 94025, USA
- Brandon Tan
- Grace Science, LLC, Menlo Park, CA 94025, USA
- Matt Wilsey
- Grace Science, LLC, Menlo Park, CA 94025, USA
- William F. Mueller
- Grace Science, LLC, Menlo Park, CA 94025, USA
- Corresponding author: William F. Mueller, Grace Science, LLC, 1142 Crane Street, Ste 4, Menlo Park, CA 94025, USA.
4
Preston CG, Wright MW, Madhavrao R, Harrison SM, Goldstein JL, Luo X, Wand H, Wulf B, Cheung G, Mandell ME, Tong H, Cheng S, Iacocca MA, Pineda AL, Popejoy AB, Dalton K, Zhen J, Dwight SS, Babb L, DiStefano M, O’Daniel JM, Lee K, Riggs ER, Zastrow DB, Mester JL, Ritter DI, Patel RY, Subramanian SL, Milosavljevic A, Berg JS, Rehm HL, Plon SE, Cherry JM, Bustamante CD, Costa HA. ClinGen Variant Curation Interface: a variant classification platform for the application of evidence criteria from ACMG/AMP guidelines. Genome Med 2022; 14:6. [PMID: 35039090 PMCID: PMC8764818 DOI: 10.1186/s13073-021-01004-8] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 02/13/2021] [Accepted: 11/12/2021] [Indexed: 12/29/2022]
Abstract
BACKGROUND: Identification of clinically significant genetic alterations involved in human disease has been dramatically accelerated by developments in next-generation sequencing technologies. However, the infrastructure and accessible comprehensive curation tools necessary for analyzing an individual patient genome and interpreting genetic variants to inform healthcare management have been lacking. RESULTS: Here we present the ClinGen Variant Curation Interface (VCI), a global open-source variant classification platform for supporting the application of evidence criteria and classification of variants based on the ACMG/AMP variant classification guidelines. The VCI is among a suite of tools developed by the NIH-funded Clinical Genome Resource (ClinGen) Consortium and supports an FDA-recognized human variant curation process. Essential to this is the ability to enable collaboration and peer review across ClinGen Expert Panels supporting users in comprehensively identifying, annotating, and sharing relevant evidence while making variant pathogenicity assertions. To facilitate evidence-based improvements in human variant classification, the VCI is publicly available to the genomics community. Navigation workflows support users providing guidance to comprehensively apply the ACMG/AMP evidence criteria and document provenance for asserting variant classifications. CONCLUSIONS: The VCI offers a central platform for clinical variant classification that fills a gap in the learning healthcare system, facilitates widespread adoption of standards for clinical curation, and is available at https://curation.clinicalgenome.org.
Affiliation(s)
- Christine G. Preston
- Department of Pathology, Stanford University School of Medicine, 300 Pasteur Drive, MSOB x313, Stanford, CA 94305 USA
- Matt W. Wright
- Department of Pathology, Stanford University School of Medicine, 300 Pasteur Drive, MSOB x313, Stanford, CA 94305 USA
- Rao Madhavrao
- Department of Pathology, Stanford University School of Medicine, 300 Pasteur Drive, MSOB x313, Stanford, CA 94305 USA
- Steven M. Harrison
- Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142 USA
- Jennifer L. Goldstein
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599 USA
- Xi Luo
- Department of Pediatrics/Hematology-Oncology, Baylor College of Medicine, Houston, TX 77030 USA
- Hannah Wand
- Center for Inherited Cardiovascular Disease, Stanford Health Care, Stanford, CA 94305 USA
- Bryan Wulf
- Department of Pathology, Stanford University School of Medicine, 300 Pasteur Drive, MSOB x313, Stanford, CA 94305 USA
- Gloria Cheung
- Department of Pathology, Stanford University School of Medicine, 300 Pasteur Drive, MSOB x313, Stanford, CA 94305 USA
- Mark E. Mandell
- Department of Pathology, Stanford University School of Medicine, 300 Pasteur Drive, MSOB x313, Stanford, CA 94305 USA
- Howard Tong
- Department of Pathology, Stanford University School of Medicine, 300 Pasteur Drive, MSOB x313, Stanford, CA 94305 USA
- Shaung Cheng
- Department of Pathology, Stanford University School of Medicine, 300 Pasteur Drive, MSOB x313, Stanford, CA 94305 USA
- Michael A. Iacocca
- Department of Pathology, Stanford University School of Medicine, 300 Pasteur Drive, MSOB x313, Stanford, CA 94305 USA
- Arturo Lopez Pineda
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA 94305 USA
- Alice B. Popejoy
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA 94305 USA
- Karen Dalton
- Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305 USA
- Jimmy Zhen
- Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305 USA
- Lawrence Babb
- Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142 USA
- Marina DiStefano
- Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142 USA
- Julianne M. O’Daniel
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599 USA
- Kristy Lee
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599 USA
- Erin R. Riggs
- Autism & Developmental Medicine Institute, Geisinger Health System, Lewisburg, PA 17837 USA
- Diane B. Zastrow
- Sutter Health, Mountain View, CA 94040 USA
- Jessica L. Mester
- GeneDx Inc., Gaithersburg, MD 20877 USA
- Deborah I. Ritter
- Department of Pediatrics/Hematology-Oncology, Baylor College of Medicine, Houston, TX 77030 USA
- Ronak Y. Patel
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030 USA
- Sai Lakshmi Subramanian
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030 USA
- Aleksander Milosavljevic
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030 USA
- Jonathan S. Berg
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599 USA
- Heidi L. Rehm
- Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142 USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114 USA
- Sharon E. Plon
- Department of Pediatrics/Hematology-Oncology, Baylor College of Medicine, Houston, TX 77030 USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030 USA
- J. Michael Cherry
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305 USA
- Carlos D. Bustamante
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA 94305 USA; Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305 USA
- Helio A. Costa
- Department of Pathology, Stanford University School of Medicine, 300 Pasteur Drive, MSOB x313, Stanford, CA 94305 USA; Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA 94305 USA
5
Zastrow DB, Baudet H, Shen W, Thomas A, Si Y, Weaver MA, Lager AM, Liu J, Mangels R, Dwight SS, Wright MW, Dobrowolski SF, Eilbeck K, Enns GM, Feigenbaum A, Lichter-Konecki U, Lyon E, Pasquali M, Watson M, Blau N, Steiner RD, Craigen WJ, Mao R. Unique aspects of sequence variant interpretation for inborn errors of metabolism (IEM): The ClinGen IEM Working Group and the Phenylalanine Hydroxylase Gene. Hum Mutat 2018; 39:1569-1580. [PMID: 30311390 DOI: 10.1002/humu.23649] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 05/04/2018] [Revised: 08/28/2018] [Accepted: 09/06/2018] [Indexed: 11/09/2022]
Abstract
The ClinGen Inborn Errors of Metabolism Working Group was tasked with creating a comprehensive, standardized knowledge base of genes and variants for metabolic diseases. Phenylalanine hydroxylase (PAH) deficiency was chosen to pilot development of the Working Group's standards and guidelines. A PAH variant curation expert panel (VCEP) was created to facilitate this process. Following ACMG-AMP variant interpretation guidelines, we present the development of these standards in the context of PAH variant curation and interpretation. Existing ACMG-AMP rules were adjusted based on disease (6) or strength (5) or both (2). Disease adjustments include allele frequency thresholds, functional assay thresholds, and phenotype-specific guidelines. Our validation of PAH-specific variant interpretation guidelines is presented using 85 variants. The PAH VCEP interpretations were concordant with existing interpretations in ClinVar for 69 variants (81%). Development of biocurator tools and standards is also described. Using the PAH-specific ACMG-AMP guidelines, 714 PAH variants have been curated and will be submitted to ClinVar. We also discuss strategies and challenges in applying ACMG-AMP guidelines to autosomal recessive metabolic disease, and the curation of variants in these genes.
Affiliation(s)
- Diane B Zastrow
- Palo Alto Medical Foundation, Palo Alto, California; Stanford University, Stanford, California
- Heather Baudet
- University of North Carolina, Chapel Hill, North Carolina
- Wei Shen
- ARUP Laboratories, Salt Lake City, Utah; University of Utah, Salt Lake City, Utah
- Amanda Thomas
- Department of Pathology and Cell Biology, Columbia University Irving Medical Center, New York, New York
- Yue Si
- GeneDx, Gaithersburg, Maryland
- Meredith A Weaver
- American College of Medical Genetics and Genomics, Bethesda, Maryland
- Angela M Lager
- Section of Hematology/Oncology, Department of Medicine, University of Chicago, Chicago, Illinois
- Jixia Liu
- Marshfield Clinic Research Institute, Marshfield, Wisconsin
- Annette Feigenbaum
- Rady Children's Hospital and University of California, San Diego, California
- Uta Lichter-Konecki
- Children's Hospital of Pittsburgh of UPMC, University of Pittsburgh, Pittsburgh, Pennsylvania
- Elaine Lyon
- ARUP Laboratories, Salt Lake City, Utah; University of Utah, Salt Lake City, Utah
- Marzia Pasquali
- ARUP Laboratories, Salt Lake City, Utah; University of Utah, Salt Lake City, Utah
- Michael Watson
- American College of Medical Genetics and Genomics, Bethesda, Maryland
- Nenad Blau
- Dietmar-Hopp Metabolic Center, University Children's Hospital, Department of General Pediatrics, Heidelberg, Germany
- Robert D Steiner
- Marshfield Clinic Research Institute, Marshfield, Wisconsin; University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin
- Rong Mao
- ARUP Laboratories, Salt Lake City, Utah; University of Utah, Salt Lake City, Utah
6
Zastrow DB, Baudet H, Shen W, Thomas A, Si Y, Weaver MA, Lager AM, Liu J, Mangels R, Dwight SS, Wright MW, Dobrowolski SF, Eilbeck K, Enns GM, Feigenbaum A, Lichter‐Konecki U, Lyon E, Pasquali M, Watson M, Blau N, Steiner RD, Craigen WJ, Mao R. Cover Image, Volume 39, Issue 11. Hum Mutat 2018. [DOI: 10.1002/humu.23662] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
7
Strande NT, Riggs ER, Buchanan AH, Ceyhan-Birsoy O, DiStefano M, Dwight SS, Goldstein J, Ghosh R, Seifert BA, Sneddon TP, Wright MW, Milko LV, Cherry JM, Giovanni MA, Murray MF, O’Daniel JM, Ramos EM, Santani AB, Scott AF, Plon SE, Rehm HL, Martin CL, Berg JS. Evaluating the Clinical Validity of Gene-Disease Associations: An Evidence-Based Framework Developed by the Clinical Genome Resource. Am J Hum Genet 2017; 100:895-906. [PMID: 28552198 DOI: 10.1016/j.ajhg.2017.04.015] [Citation(s) in RCA: 325] [Impact Index Per Article: 46.4] [Received: 02/22/2017] [Accepted: 04/26/2017] [Indexed: 10/19/2022]
Abstract
With advances in genomic sequencing technology, the number of reported gene-disease relationships has rapidly expanded. However, the evidence supporting these claims varies widely, confounding accurate evaluation of genomic variation in a clinical setting. Despite the critical need to differentiate clinically valid relationships from less well-substantiated relationships, standard guidelines for such evaluation do not currently exist. The NIH-funded Clinical Genome Resource (ClinGen) has developed a framework to define and evaluate the clinical validity of gene-disease pairs across a variety of Mendelian disorders. In this manuscript we describe a proposed framework to evaluate relevant genetic and experimental evidence supporting or contradicting a gene-disease relationship and the subsequent validation of this framework using a set of representative gene-disease pairs. The framework provides a semiquantitative measurement for the strength of evidence of a gene-disease relationship that correlates to a qualitative classification: "Definitive," "Strong," "Moderate," "Limited," "No Reported Evidence," or "Conflicting Evidence." Within the ClinGen structure, classifications derived with this framework are reviewed and confirmed or adjusted based on clinical expertise of appropriate disease experts. Detailed guidance for utilizing this framework and access to the curation interface is available on our website. This evidence-based, systematic method to assess the strength of gene-disease relationships will facilitate more knowledgeable utilization of genomic variants in clinical and research settings.
8
Meldal BHM, Forner-Martinez O, Costanzo MC, Dana J, Demeter J, Dumousseau M, Dwight SS, Gaulton A, Licata L, Melidoni AN, Ricard-Blum S, Roechert B, Skrzypek MS, Tiwari M, Velankar S, Wong ED, Hermjakob H, Orchard S. The complex portal--an encyclopaedia of macromolecular complexes. Nucleic Acids Res 2014; 43:D479-84. [PMID: 25313161 PMCID: PMC4384031 DOI: 10.1093/nar/gku975] [Citation(s) in RCA: 77] [Impact Index Per Article: 7.7] [Indexed: 01/25/2023]
Abstract
The IntAct molecular interaction database has created a new, free, open-source, manually curated resource, the Complex Portal (www.ebi.ac.uk/intact/complex), through which protein complexes from major model organisms are being collated and made available for search, viewing and download. It has been built in close collaboration with other bioinformatics services and populated with data from ChEMBL, MatrixDB, PDBe, Reactome and UniProtKB. Each entry contains information about the participating molecules (including small molecules and nucleic acids), their stoichiometry, topology and structural assembly. Complexes are annotated with details about their function, properties and complex-specific Gene Ontology (GO) terms. Consistent nomenclature is used throughout the resource with systematic names, recommended names and a list of synonyms all provided. The use of the Evidence Code Ontology allows us to indicate for which entries direct experimental evidence is available or if the complex has been inferred based on homology or orthology. The data are searchable using standard identifiers, such as UniProt, ChEBI and GO IDs, protein, gene and complex names or synonyms. This reference resource will be maintained and grow to encompass an increasing number of organisms. Input from groups and individuals with specific areas of expertise is welcome.
Affiliation(s)
- Birgit H M Meldal
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Oscar Forner-Martinez
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Maria C Costanzo
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305-5477, USA
- Jose Dana
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Janos Demeter
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305-5477, USA
- Marine Dumousseau
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Selina S Dwight
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305-5477, USA
- Anna Gaulton
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Luana Licata
- Department of Biology, University of Rome, Tor Vergata, Rome 00133, Italy
- Anna N Melidoni
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Sylvie Ricard-Blum
- UMR 5086 CNRS, Université Lyon1, Institut de Biologie et Chimie des Protéines, 7 passage du Vercors, 69367 Lyon Cedex 07, France
- Bernd Roechert
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, Geneva, Switzerland
- Marek S Skrzypek
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305-5477, USA
- Manu Tiwari
- Stammzellbiologie, Institut für Anatomie und Zellbiologie, GZMB Universitätsmedizin Göttingen, Ernst-Caspari-Haus, Justus-von-Liebig-Weg 11, 37077 Göttingen, Germany
- Sameer Velankar
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Edith D Wong
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305-5477, USA
- Henning Hermjakob
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Sandra Orchard
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
9
Engel SR, Dietrich FS, Fisk DG, Binkley G, Balakrishnan R, Costanzo MC, Dwight SS, Hitz BC, Karra K, Nash RS, Weng S, Wong ED, Lloyd P, Skrzypek MS, Miyasato SR, Simison M, Cherry JM. The reference genome sequence of Saccharomyces cerevisiae: then and now. G3 (Bethesda) 2014; 4:389-98. [PMID: 24374639 PMCID: PMC3962479 DOI: 10.1534/g3.113.008995] [Citation(s) in RCA: 247] [Impact Index Per Article: 24.7] [Received: 10/11/2013] [Accepted: 12/21/2013] [Indexed: 11/18/2022]
Abstract
The genome of the budding yeast Saccharomyces cerevisiae was the first completely sequenced from a eukaryote. It was released in 1996 as the work of a worldwide effort of hundreds of researchers. In the time since, the yeast genome has been intensively studied by geneticists, molecular biologists, and computational scientists all over the world. Maintenance and annotation of the genome sequence have long been provided by the Saccharomyces Genome Database, one of the original model organism databases. To deepen our understanding of the eukaryotic genome, the S. cerevisiae strain S288C reference genome sequence was updated recently in its first major update since 1996. The new version, called "S288C 2010," was determined from a single yeast colony using modern sequencing technologies and serves as the anchor for further innovations in yeast genomic science.
Affiliation(s)
- Stacia R. Engel
- Department of Genetics, Stanford University, Stanford, California 94305
- Fred S. Dietrich
- Department of Molecular Genetics and Microbiology, Duke University, Durham, North Carolina 27710
- Dianna G. Fisk
- Department of Genetics, Stanford University, Stanford, California 94305
- Gail Binkley
- Department of Genetics, Stanford University, Stanford, California 94305
- Rama Balakrishnan
- Department of Genetics, Stanford University, Stanford, California 94305
- Maria C. Costanzo
- Department of Genetics, Stanford University, Stanford, California 94305
- Selina S. Dwight
- Department of Genetics, Stanford University, Stanford, California 94305
- Benjamin C. Hitz
- Department of Genetics, Stanford University, Stanford, California 94305
- Kalpana Karra
- Department of Genetics, Stanford University, Stanford, California 94305
- Robert S. Nash
- Department of Genetics, Stanford University, Stanford, California 94305
- Shuai Weng
- Department of Genetics, Stanford University, Stanford, California 94305
- Edith D. Wong
- Department of Genetics, Stanford University, Stanford, California 94305
- Paul Lloyd
- Department of Genetics, Stanford University, Stanford, California 94305
- Marek S. Skrzypek
- Department of Genetics, Stanford University, Stanford, California 94305
- Matt Simison
- Department of Genetics, Stanford University, Stanford, California 94305
- J. Michael Cherry
- Department of Genetics, Stanford University, Stanford, California 94305
10
Cherry JM, Hong EL, Amundsen C, Balakrishnan R, Binkley G, Chan ET, Christie KR, Costanzo MC, Dwight SS, Engel SR, Fisk DG, Hirschman JE, Hitz BC, Karra K, Krieger CJ, Miyasato SR, Nash RS, Park J, Skrzypek MS, Simison M, Weng S, Wong ED. Saccharomyces Genome Database: the genomics resource of budding yeast. Nucleic Acids Res 2011; 40:D700-5. [PMID: 22110037 PMCID: PMC3245034 DOI: 10.1093/nar/gkr1029] [Citation(s) in RCA: 1240] [Impact Index Per Article: 95.4] [Indexed: 11/14/2022]
Abstract
The Saccharomyces Genome Database (SGD, http://www.yeastgenome.org) is the community resource for the budding yeast Saccharomyces cerevisiae. The SGD project provides the highest-quality manually curated information from peer-reviewed literature. The experimental results reported in the literature are extracted and integrated within a well-developed database. These data are combined with quality high-throughput results and provided through Locus Summary pages, a powerful query engine and rich genome browser. The acquisition, integration and retrieval of these data allow SGD to facilitate experimental design and analysis by providing an encyclopedia of the yeast genome, its chromosomal features, their functions and interactions. Public access to these data is provided to researchers and educators via web pages designed for optimal ease of use.
Affiliation(s)
- J Michael Cherry
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA.
|
11
|
Engel SR, Balakrishnan R, Binkley G, Christie KR, Costanzo MC, Dwight SS, Fisk DG, Hirschman JE, Hitz BC, Hong EL, Krieger CJ, Livstone MS, Miyasato SR, Nash R, Oughtred R, Park J, Skrzypek MS, Weng S, Wong ED, Dolinski K, Botstein D, Cherry JM. Saccharomyces Genome Database provides mutant phenotype data. Nucleic Acids Res 2009; 38:D433-6. [PMID: 19906697 PMCID: PMC2808950 DOI: 10.1093/nar/gkp917] [Citation(s) in RCA: 95] [Impact Index Per Article: 6.3] [Indexed: 12/19/2022] Open
Abstract
The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is a scientific database for the molecular biology and genetics of the yeast Saccharomyces cerevisiae, which is commonly known as baker's or budding yeast. The information in SGD includes functional annotations, mapping and sequence information, protein domains and structure, expression data, mutant phenotypes, physical and genetic interactions and the primary literature from which these data are derived. Here we describe how published phenotypes and genetic interaction data are annotated and displayed in SGD.
Affiliation(s)
- Stacia R Engel
- Department of Genetics, Stanford University, Stanford, CA, USA
|
12
|
Hong EL, Balakrishnan R, Dong Q, Christie KR, Park J, Binkley G, Costanzo MC, Dwight SS, Engel SR, Fisk DG, Hirschman JE, Hitz BC, Krieger CJ, Livstone MS, Miyasato SR, Nash RS, Oughtred R, Skrzypek MS, Weng S, Wong ED, Zhu KK, Dolinski K, Botstein D, Cherry JM. Gene Ontology annotations at SGD: new data sources and annotation methods. Nucleic Acids Res 2007; 36:D577-81. [PMID: 17982175 PMCID: PMC2238894 DOI: 10.1093/nar/gkm909] [Citation(s) in RCA: 198] [Impact Index Per Article: 11.6] [Indexed: 11/30/2022] Open
Abstract
The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) collects and organizes biological information about the chromosomal features and gene products of the budding yeast Saccharomyces cerevisiae. Although published data from traditional experimental methods are the primary sources of evidence supporting Gene Ontology (GO) annotations for a gene product, high-throughput experiments and computational predictions can also provide valuable insights in the absence of an extensive body of literature. Therefore, GO annotations available at SGD now include high-throughput data as well as computational predictions provided by the GO Annotation Project (GOA UniProt; http://www.ebi.ac.uk/GOA/). Because the annotation method used to assign GO annotations varies by data source, GO resources at SGD have been modified to distinguish data sources and annotation methods. In addition to providing information for genes that have not been experimentally characterized, GO annotations from independent sources can be compared to those made by SGD to help keep the literature-based GO annotations current.
Affiliation(s)
- Eurie L Hong
- Department of Genetics, Stanford University, Stanford, CA, USA
|
13
|
Nash R, Weng S, Hitz B, Balakrishnan R, Christie KR, Costanzo MC, Dwight SS, Engel SR, Fisk DG, Hirschman JE, Hong EL, Livstone MS, Oughtred R, Park J, Skrzypek M, Theesfeld CL, Binkley G, Dong Q, Lane C, Miyasato S, Sethuraman A, Schroeder M, Dolinski K, Botstein D, Cherry JM. Expanded protein information at SGD: new pages and proteome browser. Nucleic Acids Res 2006; 35:D468-71. [PMID: 17142221 PMCID: PMC1669759 DOI: 10.1093/nar/gkl931] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.6] [Indexed: 11/25/2022] Open
Abstract
The recent explosion in protein data generated from both directed small-scale studies and large-scale proteomics efforts has greatly expanded the quantity of available protein information and has prompted the Saccharomyces Genome Database (SGD) to enhance the depth and accessibility of protein annotations. In particular, we have expanded ongoing efforts to improve the integration of experimental information and sequence-based predictions and have redesigned the protein information web pages. A key feature of this redesign is the development of a GBrowse-derived interactive Proteome Browser customized to improve the visualization of sequence-based protein information. This Proteome Browser has enabled SGD to unify the display of hidden Markov model (HMM) domains, protein family HMMs, motifs, transmembrane regions, signal peptides, hydropathy plots and profile hits using several popular prediction algorithms. In addition, a physico-chemical properties page has been introduced to provide easy access to basic protein information. Improvements to the layout of the Protein Information page and integration of the Proteome Browser will facilitate the ongoing expansion of sequence-specific experimental information captured in SGD, including post-translational modifications and other user-defined annotations. Finally, SGD continues to improve upon the availability of genetic and physical interaction data in an ongoing collaboration with BioGRID by providing direct access to more than 82 000 manually-curated interactions.
Affiliation(s)
- Michael S. Livstone
- Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Washington Road, Princeton, NJ 08544, USA
- Rose Oughtred
- Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Washington Road, Princeton, NJ 08544, USA
- Mark Schroeder
- Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Washington Road, Princeton, NJ 08544, USA
- Kara Dolinski
- Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Washington Road, Princeton, NJ 08544, USA
- David Botstein
- Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Washington Road, Princeton, NJ 08544, USA
- J. Michael Cherry
- To whom correspondence should be addressed. Tel: +1 650 723 7541; Fax: +1 650 725 1534
|
14
|
Hirschman JE, Balakrishnan R, Christie KR, Costanzo MC, Dwight SS, Engel SR, Fisk DG, Hong EL, Livstone MS, Nash R, Park J, Oughtred R, Skrzypek M, Starr B, Theesfeld CL, Williams J, Andrada R, Binkley G, Dong Q, Lane C, Miyasato S, Sethuraman A, Schroeder M, Thanawala MK, Weng S, Dolinski K, Botstein D, Cherry JM. Genome Snapshot: a new resource at the Saccharomyces Genome Database (SGD) presenting an overview of the Saccharomyces cerevisiae genome. Nucleic Acids Res 2006; 34:D442-5. [PMID: 16381907 PMCID: PMC1347479 DOI: 10.1093/nar/gkj117] [Citation(s) in RCA: 82] [Impact Index Per Article: 4.6] [Indexed: 11/27/2022] Open
Abstract
Sequencing and annotation of the entire Saccharomyces cerevisiae genome has made it possible to gain a genome-wide perspective on yeast genes and gene products. To make this information available on an ongoing basis, the Saccharomyces Genome Database (SGD) has created the Genome Snapshot. The Genome Snapshot summarizes the current state of knowledge about the genes and chromosomal features of S.cerevisiae. The information is organized into two categories: (i) number of each type of chromosomal feature annotated in the genome and (ii) number and distribution of genes annotated to Gene Ontology terms. Detailed lists are accessible through SGD's Advanced Search tool, and all the data presented on this page are available from the SGD ftp site.
Affiliation(s)
- Michael S. Livstone
- Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Washington Road, Princeton, NJ 08544, USA
- Rose Oughtred
- Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Washington Road, Princeton, NJ 08544, USA
- Mark Schroeder
- Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Washington Road, Princeton, NJ 08544, USA
- Kara Dolinski
- Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Washington Road, Princeton, NJ 08544, USA
- David Botstein
- Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Washington Road, Princeton, NJ 08544, USA
- J. Michael Cherry
- To whom correspondence should be addressed. Tel: +1 650 723 7541; Fax: +1 650 725 1534
|
15
|
Balakrishnan R, Christie KR, Costanzo MC, Dolinski K, Dwight SS, Engel SR, Fisk DG, Hirschman JE, Hong EL, Nash R, Oughtred R, Skrzypek M, Theesfeld CL, Binkley G, Dong Q, Lane C, Sethuraman A, Weng S, Botstein D, Cherry JM. Fungal BLAST and Model Organism BLASTP Best Hits: new comparison resources at the Saccharomyces Genome Database (SGD). Nucleic Acids Res 2005; 33:D374-7. [PMID: 15608219 PMCID: PMC539977 DOI: 10.1093/nar/gki023] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.3] [Indexed: 11/30/2022] Open
Abstract
The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) is a scientific database of gene, protein and genomic information for the yeast Saccharomyces cerevisiae. SGD has recently developed two new resources that facilitate nucleotide and protein sequence comparisons between S.cerevisiae and other organisms. The Fungal BLAST tool provides directed searches against all fungal nucleotide and protein sequences available from GenBank, divided into categories according to organism, status of completeness and annotation, and source. The Model Organism BLASTP Best Hits resource displays, for each S.cerevisiae protein, the single most similar protein from several model organisms and presents links to the database pages of those proteins, facilitating access to curated information about potential orthologs of yeast proteins.
Affiliation(s)
- Rama Balakrishnan
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA 94305-5120, USA
|
16
|
Dwight SS, Balakrishnan R, Christie KR, Costanzo MC, Dolinski K, Engel SR, Feierbach B, Fisk DG, Hirschman J, Hong EL, Issel-Tarver L, Nash RS, Sethuraman A, Starr B, Theesfeld CL, Andrada R, Binkley G, Dong Q, Lane C, Schroeder M, Weng S, Botstein D, Cherry JM. Saccharomyces genome database: underlying principles and organisation. Brief Bioinform 2004; 5:9-22. [PMID: 15153302 PMCID: PMC3037832 DOI: 10.1093/bib/5.1.9] [Citation(s) in RCA: 73] [Impact Index Per Article: 3.7] [Indexed: 11/15/2022] Open
Abstract
A scientific database can be a powerful tool for biologists in an era where large-scale genomic analysis, combined with smaller-scale scientific results, provides new insights into the roles of genes and their products in the cell. However, the collection and assimilation of data is, in itself, not enough to make a database useful. The data must be incorporated into the database and presented to the user in an intuitive and biologically significant manner. Most importantly, this presentation must be driven by the user's point of view; that is, from a biological perspective. The success of a scientific database can therefore be measured by the response of its users - statistically, by usage numbers and, in a less quantifiable way, by its relationship with the community it serves and its ability to serve as a model for similar projects. Since its inception ten years ago, the Saccharomyces Genome Database (SGD) has seen a dramatic increase in its usage, has developed and maintained a positive working relationship with the yeast research community, and has served as a template for at least one other database. The success of SGD, as measured by these criteria, is due in large part to philosophies that have guided its mission and organisation since it was established in 1993. This paper aims to detail these philosophies and how they shape the organisation and presentation of the database.
Affiliation(s)
- Selina S Dwight
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA 94305-5120, USA
|
17
|
Harris MA, Clark J, Ireland A, Lomax J, Ashburner M, Foulger R, Eilbeck K, Lewis S, Marshall B, Mungall C, Richter J, Rubin GM, Blake JA, Bult C, Dolan M, Drabkin H, Eppig JT, Hill DP, Ni L, Ringwald M, Balakrishnan R, Cherry JM, Christie KR, Costanzo MC, Dwight SS, Engel S, Fisk DG, Hirschman JE, Hong EL, Nash RS, Sethuraman A, Theesfeld CL, Botstein D, Dolinski K, Feierbach B, Berardini T, Mundodi S, Rhee SY, Apweiler R, Barrell D, Camon E, Dimmer E, Lee V, Chisholm R, Gaudet P, Kibbe W, Kishore R, Schwarz EM, Sternberg P, Gwinn M, Hannick L, Wortman J, Berriman M, Wood V, de la Cruz N, Tonellato P, Jaiswal P, Seigfried T, White R. The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res 2004; 32:D258-61. [PMID: 14681407 PMCID: PMC308770 DOI: 10.1093/nar/gkh036] [Citation(s) in RCA: 2541] [Impact Index Per Article: 127.1] [Indexed: 11/13/2022] Open
Abstract
The Gene Ontology (GO) project (http://www.geneontology.org/) provides structured, controlled vocabularies and classifications that cover several domains of molecular and cellular biology and are freely available for community use in the annotation of genes, gene products and sequences. Many model organism databases and genome annotation groups use the GO and contribute their annotation sets to the GO resource. The GO database integrates the vocabularies and contributed annotations and provides full access to this information in several formats. Members of the GO Consortium continually work collectively, involving outside experts as needed, to expand and update the GO vocabularies. The GO Web resource also provides access to extensive documentation about the GO project and links to applications that use GO data for functional analyses.
|
18
|
Christie KR, Weng S, Balakrishnan R, Costanzo MC, Dolinski K, Dwight SS, Engel SR, Feierbach B, Fisk DG, Hirschman JE, Hong EL, Issel-Tarver L, Nash R, Sethuraman A, Starr B, Theesfeld CL, Andrada R, Binkley G, Dong Q, Lane C, Schroeder M, Botstein D, Cherry JM. Saccharomyces Genome Database (SGD) provides tools to identify and analyze sequences from Saccharomyces cerevisiae and related sequences from other organisms. Nucleic Acids Res 2004; 32:D311-4. [PMID: 14681421 PMCID: PMC308767 DOI: 10.1093/nar/gkh033] [Citation(s) in RCA: 202] [Impact Index Per Article: 10.1] [Indexed: 11/13/2022] Open
Abstract
The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/), a scientific database of the molecular biology and genetics of the yeast Saccharomyces cerevisiae, has recently developed several new resources that allow the comparison and integration of information on a genome-wide scale, enabling the user not only to find detailed information about individual genes, but also to make connections across groups of genes with common features and across different species. The Fungal Alignment Viewer displays alignments of sequences from multiple fungal genomes, while the Sequence Similarity Query tool displays PSI-BLAST alignments of each S.cerevisiae protein with similar proteins from any species whose sequences are contained in the non-redundant (nr) protein data set at NCBI. The Yeast Biochemical Pathways tool integrates groups of genes by their common roles in metabolism and displays the metabolic pathways in a graphical form. Finally, the Find Chromosomal Features search interface provides a versatile tool for querying multiple types of information in SGD.
Affiliation(s)
- Karen R Christie
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA 94305-5120, USA
|
19
|
Abstract
The Gene Ontology (GO) Consortium has produced a controlled vocabulary for annotation of gene function that is used in many organism-specific gene annotation databases. This allows the prediction of gene function based on patterns of annotation. For example, if annotations for two attributes tend to occur together in a database, then a gene holding one attribute is likely to hold the other as well. We modeled the relationships among GO attributes with decision trees and Bayesian networks, using the annotations in the Saccharomyces Genome Database (SGD) and in FlyBase as training data. We tested the models using cross-validation, and we manually assessed 100 gene-attribute associations that were predicted by the models but that were not present in the SGD or FlyBase databases. Of the 100 manually assessed associations, 41 were judged to be true, and another 42 were judged to be plausible.
Affiliation(s)
- Oliver D King
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, Massachusetts 02115, USA
|
20
|
Weng S, Dong Q, Balakrishnan R, Christie K, Costanzo M, Dolinski K, Dwight SS, Engel S, Fisk DG, Hong E, Issel-Tarver L, Sethuraman A, Theesfeld C, Andrada R, Binkley G, Lane C, Schroeder M, Botstein D, Cherry JM. Saccharomyces Genome Database (SGD) provides biochemical and structural information for budding yeast proteins. Nucleic Acids Res 2003; 31:216-8. [PMID: 12519985 PMCID: PMC165501 DOI: 10.1093/nar/gkg054] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.4] [Indexed: 11/12/2022] Open
Abstract
The Saccharomyces Genome Database (SGD: http://genome-www.stanford.edu/Saccharomyces/) has recently developed new resources to provide more complete information about proteins from the budding yeast Saccharomyces cerevisiae. The PDB Homologs page provides structural information from the Protein Data Bank (PDB) about yeast proteins and/or their homologs. SGD has also created a resource that utilizes the eMOTIF database for motif information about a given protein. A third new resource is the Protein Information page, which contains protein physical and chemical properties, such as molecular weight and hydropathicity scores, predicted from the translated ORF sequence.
Affiliation(s)
- Shuai Weng
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA 94305-5120, USA
|
21
|
Issel-Tarver L, Christie KR, Dolinski K, Andrada R, Balakrishnan R, Ball CA, Binkley G, Dong S, Dwight SS, Fisk DG, Harris M, Schroeder M, Sethuraman A, Tse K, Weng S, Botstein D, Cherry JM. Saccharomyces Genome Database. Methods Enzymol 2002; 350:329-46. [PMID: 12073322 DOI: 10.1016/s0076-6879(02)50972-1] [Citation(s) in RCA: 87] [Impact Index Per Article: 4.0] [Indexed: 12/12/2022]
Affiliation(s)
- Laurie Issel-Tarver
- Department of Genetics, Stanford University, Stanford, California 94305, USA
|
22
|
Dwight SS, Harris MA, Dolinski K, Ball CA, Binkley G, Christie KR, Fisk DG, Issel-Tarver L, Schroeder M, Sherlock G, Sethuraman A, Weng S, Botstein D, Cherry JM. Saccharomyces Genome Database (SGD) provides secondary gene annotation using the Gene Ontology (GO). Nucleic Acids Res 2002; 30:69-72. [PMID: 11752257 PMCID: PMC99086 DOI: 10.1093/nar/30.1.69] [Citation(s) in RCA: 272] [Impact Index Per Article: 12.4] [Indexed: 11/14/2022] Open
Abstract
The Saccharomyces Genome Database (SGD) resources, ranging from genetic and physical maps to genome-wide analysis tools, reflect the scientific progress in identifying genes and their functions over the last decade. As emphasis shifts from identification of the genes to identification of the role of their gene products in the cell, SGD seeks to provide its users with annotations that will allow relationships to be made between gene products, both within Saccharomyces cerevisiae and across species. To this end, SGD is annotating genes to the Gene Ontology (GO), a structured representation of biological knowledge that can be shared across species. The GO consists of three separate ontologies describing molecular function, biological process and cellular component. The goal is to use published information to associate each characterized S.cerevisiae gene product with one or more GO terms from each of the three ontologies. To be useful, this must be done in a manner that allows accurate associations based on experimental evidence, modifications to GO when necessary, and careful documentation of the annotations through evidence codes for given citations. Reaching this goal is an ongoing process at SGD. For information on the current progress of GO annotations at SGD and other participating databases, as well as a description of each of the three ontologies, please visit the GO Consortium page at http://www.geneontology.org. SGD gene associations to GO can be found by visiting our site at http://genome-www.stanford.edu/Saccharomyces/.
Affiliation(s)
- Selina S Dwight
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA 94305-5120, USA
|
23
|
Ball CA, Jin H, Sherlock G, Weng S, Matese JC, Andrada R, Binkley G, Dolinski K, Dwight SS, Harris MA, Issel-Tarver L, Schroeder M, Botstein D, Cherry JM. Saccharomyces Genome Database provides tools to survey gene expression and functional analysis data. Nucleic Acids Res 2001; 29:80-1. [PMID: 11125055 PMCID: PMC29796 DOI: 10.1093/nar/29.1.80] [Citation(s) in RCA: 54] [Impact Index Per Article: 2.3] [Indexed: 11/13/2022] Open
Abstract
Upon the completion of the Saccharomyces cerevisiae genomic sequence in 1996 [Goffeau,A. et al. (1997) Nature, 387, 5], several creative and ambitious projects have been initiated to explore the functions of gene products or gene expression on a genome-wide scale. To help researchers take advantage of these projects, the Saccharomyces Genome Database (SGD) has created two new tools, Function Junction and Expression Connection. Together, the tools form a central resource for querying multiple large-scale analysis projects for data about individual genes. Function Junction provides information from diverse projects that shed light on the role a gene product plays in the cell, while Expression Connection delivers information produced by the ever-increasing number of microarray projects. WWW access to SGD is available at genome-www.stanford.edu/Saccharomyces/.
Affiliation(s)
- C A Ball
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA 94305-5120, USA
|
24
|
Sherlock G, Hernandez-Boussard T, Kasarskis A, Binkley G, Matese JC, Dwight SS, Kaloper M, Weng S, Jin H, Ball CA, Eisen MB, Spellman PT, Brown PO, Botstein D, Cherry JM. The Stanford Microarray Database. Nucleic Acids Res 2001; 29:152-5. [PMID: 11125075 PMCID: PMC29818 DOI: 10.1093/nar/29.1.152] [Citation(s) in RCA: 342] [Impact Index Per Article: 14.9] [Indexed: 11/14/2022] Open
Abstract
The Stanford Microarray Database (SMD) stores raw and normalized data from microarray experiments, and provides web interfaces for researchers to retrieve, analyze and visualize their data. The two immediate goals for SMD are to serve as a storage site for microarray data from ongoing research at Stanford University, and to facilitate the public dissemination of that data once published, or released by the researcher. Of paramount importance is the connection of microarray data with the biological data that pertains to the DNA deposited on the microarray (genes, clones etc.). SMD makes use of many public resources to connect expression information to the relevant biology, including SGD [Ball,C.A., Dolinski,K., Dwight,S.S., Harris,M.A., Issel-Tarver,L., Kasarskis,A., Scafe,C.R., Sherlock,G., Binkley,G., Jin,H. et al. (2000) Nucleic Acids Res., 28, 77-80], YPD and WormPD [Costanzo,M.C., Hogan,J.D., Cusick,M.E., Davis,B.P., Fancher,A.M., Hodges,P.E., Kondu,P., Lengieza,C., Lew-Smith,J.E., Lingner,C. et al. (2000) Nucleic Acids Res., 28, 73-76], Unigene [Wheeler,D.L., Chappey,C., Lash,A.E., Leipe,D.D., Madden,T.L., Schuler,G.D., Tatusova,T.A. and Rapp,B.A. (2000) Nucleic Acids Res., 28, 10-14], dbEST [Boguski,M.S., Lowe,T.M. and Tolstoshev,C.M. (1993) Nature Genet., 4, 332-333] and SWISS-PROT [Bairoch,A. and Apweiler,R. (2000) Nucleic Acids Res., 28, 45-48] and can be accessed at http://genome-www.stanford.edu/microarray.
Affiliation(s)
- G Sherlock
- Department of Genetics, Center for Clinical Sciences Research, 269 Campus Drive, Room 2255b, Stanford University, Stanford, CA 94305-5163, USA.
|
25
|
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 2000. [PMID: 10802651 DOI: 10.1038/75556.gene] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.5] [Indexed: 05/10/2023]
Abstract
Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.
Affiliation(s)
- M Ashburner
- Department of Genetics, Stanford University School of Medicine, California, USA.
|
26
|
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 2000; 25:25-9. [PMID: 10802651 PMCID: PMC3037419 DOI: 10.1038/75556] [Citation(s) in RCA: 26075] [Impact Index Per Article: 1086.5] [Indexed: 11/10/2022]
Abstract
Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.
Affiliation(s)
- M Ashburner
- Department of Genetics, Stanford University School of Medicine, California, USA.
|
27
|
Ball CA, Dolinski K, Dwight SS, Harris MA, Issel-Tarver L, Kasarskis A, Scafe CR, Sherlock G, Binkley G, Jin H, Kaloper M, Orr SD, Schroeder M, Weng S, Zhu Y, Botstein D, Cherry JM. Integrating functional genomic information into the Saccharomyces genome database. Nucleic Acids Res 2000; 28:77-80. [PMID: 10592186 PMCID: PMC102447 DOI: 10.1093/nar/28.1.77] [Citation(s) in RCA: 78] [Impact Index Per Article: 3.3] [Received: 10/01/1999] [Revised: 10/07/1999] [Accepted: 10/07/1999] [Indexed: 11/14/2022] Open
Abstract
The Saccharomyces Genome Database (SGD) stores and organizes information about the nearly 6200 genes in the yeast genome. The information is organized around the 'locus page' and directs users to the detailed information they seek. SGD is endeavoring to integrate the existing information about yeast genes with the large volume of data generated by functional analyses that are beginning to appear in the literature and on web sites. New features will include searches of systematic analyses and Gene Summary Paragraphs that succinctly review the literature for each gene. In addition to current information, such as gene product and phenotype descriptions, the new locus page will also describe a gene product's cellular process, function and localization using a controlled vocabulary developed in collaboration with two other model organism databases. We describe these developments in SGD through the newly reorganized locus page. The SGD is accessible via the WWW at http://genome-www.stanford.edu/Saccharomyces/
Affiliation(s)
- C A Ball
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA 94305-5120, USA
|
28
|
Chervitz SA, Hester ET, Ball CA, Dolinski K, Dwight SS, Harris MA, Juvik G, Malekian A, Roberts S, Roe T, Scafe C, Schroeder M, Sherlock G, Weng S, Zhu Y, Cherry JM, Botstein D. Using the Saccharomyces Genome Database (SGD) for analysis of protein similarities and structure. Nucleic Acids Res 1999; 27:74-8. [PMID: 9847146 PMCID: PMC148101 DOI: 10.1093/nar/27.1.74] [Citation(s) in RCA: 55] [Impact Index Per Article: 2.2] [Indexed: 11/14/2022] Open
Abstract
The Saccharomyces Genome Database (SGD) collects and organizes information about the molecular biology and genetics of the yeast Saccharomyces cerevisiae. The latest protein structure and comparison tools available at SGD are presented here. With the completion of the yeast sequence and the Caenorhabditis elegans sequence soon to follow, comparison of proteins from complete eukaryotic proteomes will be an extremely powerful way to learn more about a particular protein's structure, its function, and its relationships with other proteins. SGD can be accessed through the World Wide Web at http://genome-www.stanford.edu/Saccharomyces/
Affiliation(s)
- S A Chervitz
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
|
29
|
Chervitz SA, Aravind L, Sherlock G, Ball CA, Koonin EV, Dwight SS, Harris MA, Dolinski K, Mohr S, Smith T, Weng S, Cherry JM, Botstein D. Comparison of the complete protein sets of worm and yeast: orthology and divergence. Science 1998; 282:2022-8. [PMID: 9851918 PMCID: PMC3057080 DOI: 10.1126/science.282.5396.2022] [Citation(s) in RCA: 309] [Impact Index Per Article: 11.9] [Indexed: 11/02/2022]
Abstract
Comparative analysis of predicted protein sequences encoded by the genomes of Caenorhabditis elegans and Saccharomyces cerevisiae suggests that most of the core biological functions are carried out by orthologous proteins (proteins of different species that can be traced back to a common ancestor) that occur in comparable numbers. The specialized processes of signal transduction and regulatory control that are unique to the multicellular worm appear to use novel proteins, many of which re-use conserved domains. Major expansion of the number of some of these domains seen in the worm may have contributed to the advent of multicellularity. The proteins conserved in yeast and worm are likely to have orthologs throughout eukaryotes; in contrast, the proteins unique to the worm may well define metazoans.
Affiliation(s)
- S A Chervitz
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305-5120, USA
|
30
|
Cherry JM, Adler C, Ball C, Chervitz SA, Dwight SS, Hester ET, Jia Y, Juvik G, Roe T, Schroeder M, Weng S, Botstein D. SGD: Saccharomyces Genome Database. Nucleic Acids Res 1998; 26:73-9. [PMID: 9399804 PMCID: PMC147204 DOI: 10.1093/nar/26.1.73] [Citation(s) in RCA: 655] [Impact Index Per Article: 25.2] [Indexed: 02/05/2023] Open
Abstract
The Saccharomyces Genome Database (SGD) provides Internet access to the complete Saccharomyces cerevisiae genomic sequence, its genes and their products, the phenotypes of its mutants, and the literature supporting these data. The amount of information and the number of features provided by SGD have increased greatly following the release of the S. cerevisiae genomic sequence, which is currently the only complete sequence of a eukaryotic genome. SGD aids researchers by providing not only basic information, but also tools such as sequence similarity searching that lead to detailed information about features of the genome and relationships between genes. SGD presents information using a variety of user-friendly, dynamically created graphical displays illustrating physical, genetic and sequence feature maps. SGD can be accessed via the World Wide Web at http://genome-www.stanford.edu/Saccharomyces/
Affiliation(s)
- J M Cherry
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA.
|