1
Sternberg PW, Van Auken K, Wang Q, Wright A, Yook K, Zarowiecki M, Arnaboldi V, Becerra A, Brown S, Cain S, Chan J, Chen WJ, Cho J, Davis P, Diamantakis S, Dyer S, Grigoriadis D, Grove CA, Harris T, Howe K, Kishore R, Lee R, Longden I, Luypaert M, Müller HM, Nuin P, Quinton-Tulloch M, Raciti D, Schedl T, Schindelman G, Stein L. WormBase 2024: status and transitioning to Alliance infrastructure. Genetics 2024; 227:iyae050. [PMID: 38573366 PMCID: PMC11075546 DOI: 10.1093/genetics/iyae050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 03/19/2024] [Accepted: 03/20/2024] [Indexed: 04/05/2024]
Abstract
WormBase has been the major repository and knowledgebase of information about the genome and genetics of Caenorhabditis elegans and other nematodes of experimental interest for over 2 decades. We have 3 goals: to keep current with the fast-paced C. elegans research, to provide better integration with other resources, and to be sustainable. Here, we discuss the current state of WormBase as well as progress and plans for moving core WormBase infrastructure to the Alliance of Genome Resources (the Alliance). As an Alliance member, WormBase will continue to interact with the C. elegans community, develop new features as needed, and curate key information from the literature and large-scale projects.
Affiliation(s)
- Paul W Sternberg
  - Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Kimberly Van Auken
  - Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Qinghua Wang
  - Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Adam Wright
  - Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Karen Yook
  - Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Magdalena Zarowiecki
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Valerio Arnaboldi
  - Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Andrés Becerra
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Stephanie Brown
  - School of Infection and Immunity, University of Glasgow, Glasgow G12 8TA, UK
- Scott Cain
  - Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Juancarlos Chan
  - Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Wen J Chen
  - Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Jaehyoung Cho
  - Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Paul Davis
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Stavros Diamantakis
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Sarah Dyer
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Christian A Grove
  - Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Todd Harris
  - Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Kevin Howe
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Ranjana Kishore
  - Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Raymond Lee
  - Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Ian Longden
  - Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Manuel Luypaert
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Hans-Michael Müller
  - Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Paulo Nuin
  - Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Mark Quinton-Tulloch
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Daniela Raciti
  - Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Tim Schedl
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
- Gary Schindelman
  - Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
- Lincoln Stein
  - Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
2
Fischer A, Niklowitz P, Menke T, Döring F. Coenzyme Q regulates the expression of essential genes of the pathogen- and xenobiotic-associated defense pathway in C. elegans. J Clin Biochem Nutr 2015; 57:171-7. [PMID: 26566301 PMCID: PMC4639588 DOI: 10.3164/jcbn.15-46] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/31/2015] [Accepted: 05/01/2015] [Indexed: 11/22/2022]
Abstract
Coenzyme Q (CoQ) is necessary for mitochondrial energy production and modulates the expression of genes that are important for inflammatory processes, growth and detoxification reactions. A cellular surveillance-activated detoxification and defenses (cSADDs) pathway has been recently identified in C. elegans. The down-regulation of the components of the cSADDs pathway initiates an aversion behavior of the nematode. Here we hypothesized that CoQ regulates genes of the cSADDs pathway. To verify this we generated CoQ-deficient worms ("CoQ-free") and performed whole-genome expression profiling. We found about 30% (120 genes) of the cSADDs pathway genes were differentially regulated under CoQ-deficient condition. Remarkably, 83% of these genes were down-regulated. The majority of the CoQ-sensitive cSADDs pathway genes encode for proteins involved in larval development (enrichment score (ES) = 38.0, p = 5.0E(-37)), aminoacyl-tRNA biosynthesis, proteasome function (ES 8.2, p = 5.9E(-31)) and mitochondria function (ES 3.4, p = 1.7E(-5)). 67% (80 genes) of these genes are categorized as lethal. Thus it is shown for the first time that CoQ regulates a substantial number of essential genes that function in the evolutionary conserved cellular surveillance-activated detoxification and defenses pathway in C. elegans.
Affiliation(s)
- Alexandra Fischer
  - Institute of Human Nutrition and Food Science, Division of Molecular Prevention, Christian-Albrechts-University of Kiel, Heinrich-Hecht-Platz 10, 24118 Kiel, Germany
- Petra Niklowitz
  - Children's Hospital of Datteln, Witten/Herdecke University, Dr.-Friedrich-Steiner Str. 5, 45711 Datteln, Germany
- Thomas Menke
  - Children's Hospital of Datteln, Witten/Herdecke University, Dr.-Friedrich-Steiner Str. 5, 45711 Datteln, Germany
- Frank Döring
  - Institute of Human Nutrition and Food Science, Division of Molecular Prevention, Christian-Albrechts-University of Kiel, Heinrich-Hecht-Platz 10, 24118 Kiel, Germany
3
Cantacessi C, Hofmann A, Campbell BE, Gasser RB. Impact of next-generation technologies on exploring socioeconomically important parasites and developing new interventions. Methods Mol Biol 2015; 1247:437-474. [PMID: 25399114 DOI: 10.1007/978-1-4939-2004-4_31] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 06/04/2023]
Abstract
High-throughput molecular and computer technologies have become instrumental for systems biological explorations of pathogens, including parasites. For instance, investigations of the transcriptomes of different developmental stages of parasitic nematodes give insights into gene expression, regulation and function in a parasite, which is a significant step to understanding their biology, as well as interactions with their host(s) and disease. This chapter (1) gives a background on some key parasitic nematodes of socioeconomic importance, (2) describes sequencing and bioinformatic technologies for large-scale studies of the transcriptomes and genomes of these parasites, (3) provides some recent examples of applications and (4) emphasizes the prospects of fundamental biological explorations of parasites using these technologies for the development of new interventions to combat parasitic diseases.
Affiliation(s)
- Cinzia Cantacessi
  - Department of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, VIC, 3010, Australia
4
Hashimshony T, Feder M, Levin M, Hall BK, Yanai I. Spatiotemporal transcriptomics reveals the evolutionary history of the endoderm germ layer. Nature 2014; 519:219-22. [PMID: 25487147 PMCID: PMC4359913 DOI: 10.1038/nature13996] [Citation(s) in RCA: 123] [Impact Index Per Article: 12.3] [Received: 07/02/2014] [Accepted: 10/23/2014] [Indexed: 01/19/2023]
Abstract
The concept of germ layers has been one of the foremost organizing principles in developmental biology, classification, systematics and evolution for 150 years (refs 1 - 3). Of the three germ layers, the mesoderm is found in bilaterian animals but is absent in species in the phyla Cnidaria and Ctenophora, which has been taken as evidence that the mesoderm was the final germ layer to evolve. The origin of the ectoderm and endoderm germ layers, however, remains unclear, with models supporting the antecedence of each as well as a simultaneous origin. Here we determine the temporal and spatial components of gene expression spanning embryonic development for all Caenorhabditis elegans genes and use it to determine the evolutionary ages of the germ layers. The gene expression program of the mesoderm is induced after those of the ectoderm and endoderm, thus making it the last germ layer both to evolve and to develop. Strikingly, the C. elegans endoderm and ectoderm expression programs do not co-induce; rather the endoderm activates earlier, and this is also observed in the expression of endoderm orthologues during the embryology of the frog Xenopus tropicalis, the sea anemone Nematostella vectensis and the sponge Amphimedon queenslandica. Querying the phylogenetic ages of specifically expressed genes reveals that the endoderm comprises older genes. Taken together, we propose that the endoderm program dates back to the origin of multicellularity, whereas the ectoderm originated as a secondary germ layer freed from ancestral feeding functions.
Affiliation(s)
- Tamar Hashimshony
  - Department of Biology, Technion - Israel Institute of Technology, Haifa 32000, Israel
- Martin Feder
  - Department of Biology, Technion - Israel Institute of Technology, Haifa 32000, Israel
- Michal Levin
  - Department of Biology, Technion - Israel Institute of Technology, Haifa 32000, Israel
- Brian K Hall
  - Department of Biology, Dalhousie University, Halifax, Nova Scotia B3H 4J1, Canada
- Itai Yanai
  - Department of Biology, Technion - Israel Institute of Technology, Haifa 32000, Israel
5
Potential conservation of circadian clock proteins in the phylum Nematoda as revealed by bioinformatic searches. PLoS One 2014; 9:e112871. [PMID: 25396739 PMCID: PMC4232591 DOI: 10.1371/journal.pone.0112871] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 08/08/2014] [Accepted: 10/15/2014] [Indexed: 11/19/2022]
Abstract
Although several circadian rhythms have been described in C. elegans, its molecular clock remains elusive. In this work we employed a novel bioinformatic approach, applying probabilistic methodologies, to search for circadian clock proteins of several of the best studied circadian model organisms of different taxa (Mus musculus, Drosophila melanogaster, Neurospora crassa, Arabidopsis thaliana and Synechococcus elongatus) in the proteomes of C. elegans and other members of the phylum Nematoda. With this approach we found that the Nematoda contain proteins most related to the core and accessory proteins of the insect and mammalian clocks, which provide new insights into the nematode clock and the evolution of the circadian system.
6
McCormick MA, Kennedy BK. Genome-scale studies of aging: challenges and opportunities. Curr Genomics 2013; 13:500-7. [PMID: 23633910 PMCID: PMC3468883 DOI: 10.2174/138920212803251454] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 05/06/2012] [Revised: 06/08/2012] [Accepted: 07/25/2012] [Indexed: 12/21/2022]
Abstract
Whole-genome studies involving a phenotype of interest are increasingly prevalent, in part due to a dramatic increase in speed at which many high throughput technologies can be performed coupled to simultaneous decreases in cost. This type of genome-scale methodology has been applied to the phenotype of lifespan, as well as to whole-transcriptome changes during the aging process or in mutants affecting aging. The value of high throughput discovery-based science in this field is clearly evident, but will it yield a true systems-level understanding of the aging process? Here we review some of this work to date, focusing on recent findings and the unanswered puzzles to which they point. In this context, we also discuss recent technological advances and some of the likely future directions that they portend.
7
Heizer E, Zarlenga DS, Rosa B, Gao X, Gasser RB, De Graef J, Geldhof P, Mitreva M. Transcriptome analyses reveal protein and domain families that delineate stage-related development in the economically important parasitic nematodes, Ostertagia ostertagi and Cooperia oncophora. BMC Genomics 2013; 14:118. [PMID: 23432754 PMCID: PMC3599158 DOI: 10.1186/1471-2164-14-118] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 07/11/2012] [Accepted: 02/11/2013] [Indexed: 12/21/2022]
Abstract
Background: Cooperia oncophora and Ostertagia ostertagi are among the most important gastrointestinal nematodes of cattle worldwide. The economic losses caused by these parasites are on the order of hundreds of millions of dollars per year. Conventional treatment of these parasites is through anthelmintic drugs; however, as resistance to anthelmintics increases, overall effectiveness has begun decreasing. New methods of control and alternative drug targets are necessary. In-depth analysis of transcriptomic data can help provide these targets. Results: The assembly of 8.7 million and 11 million sequences from C. oncophora and O. ostertagi, respectively, resulted in 29,900 and 34,792 transcripts. Among these, 69% and 73% of the predicted peptides encoded by C. oncophora and O. ostertagi had homologues in other nematodes. Approximately 21% and 24% were constitutively expressed in both species, respectively; however, the numbers of transcripts that were stage specific were much smaller (~1% of the transcripts expressed in a stage). Approximately 21% of the transcripts in C. oncophora and 22% in O. ostertagi were up-regulated in a particular stage. Functional molecular signatures were detected for 46% and 35% of the transcripts in C. oncophora and O. ostertagi, respectively. More in-depth examinations of the most prevalent domains led to knowledge of gene expression changes between the free-living (egg, L1, L2 and L3 sheathed) and parasitic (L3 exsheathed, L4, and adult) stages. Domains previously implicated in growth and development such as chromo domains and the MADF domain tended to dominate in the free-living stages. In contrast, domains potentially involved in feeding such as the zinc finger and CAP domains dominated in the parasitic stages. Pathway analyses showed significant associations between life-cycle stages and peptides involved in energy metabolism in O. ostertagi whereas metabolism of cofactors and vitamins were specifically up-regulated in the parasitic stages of C. oncophora. Substantial differences were observed also between Gene Ontology terms associated with free-living and parasitic stages. Conclusions: This study characterized transcriptomes from multiple life stages from both C. oncophora and O. ostertagi. These data represent an important resource for studying these parasites. The results of this study show distinct differences in the genes involved in the free-living and parasitic life cycle stages. The data produced will enable better annotation of the upcoming genome sequences and will allow future comparative analyses of the biology, evolution and adaptation to parasitism in nematodes.
Affiliation(s)
- Esley Heizer
  - The Genome Institute, Washington University School of Medicine, St. Louis, MO 63108, USA
8
Xin X, Gfeller D, Cheng J, Tonikian R, Sun L, Guo A, Lopez L, Pavlenco A, Akintobi A, Zhang Y, Rual JF, Currell B, Seshagiri S, Hao T, Yang X, Shen YA, Salehi-Ashtiani K, Li J, Cheng AT, Bouamalay D, Lugari A, Hill DE, Grimes ML, Drubin DG, Grant BD, Vidal M, Boone C, Sidhu SS, Bader GD. SH3 interactome conserves general function over specific form. Mol Syst Biol 2013; 9:652. [PMID: 23549480 PMCID: PMC3658277 DOI: 10.1038/msb.2013.9] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.5] [Received: 09/27/2012] [Accepted: 02/20/2013] [Indexed: 12/20/2022]
Abstract
Src homology 3 (SH3) domains bind peptides to mediate protein-protein interactions that assemble and regulate dynamic biological processes. We surveyed the repertoire of SH3 binding specificity using peptide phage display in a metazoan, the worm Caenorhabditis elegans, and discovered that it structurally mirrors that of the budding yeast Saccharomyces cerevisiae. We then mapped the worm SH3 interactome using stringent yeast two-hybrid and compared it with the equivalent map for yeast. We found that the worm SH3 interactome resembles the analogous yeast network because it is significantly enriched for proteins with roles in endocytosis. Nevertheless, orthologous SH3 domain-mediated interactions are highly rewired. Our results suggest a model of network evolution where general function of the SH3 domain network is conserved over its specific form.
Affiliation(s)
- Xiaofeng Xin
  - The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- David Gfeller
  - The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- Jackie Cheng
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA, USA
- Raffi Tonikian
  - The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Lin Sun
  - Department of Molecular Biology and Biochemistry, Rutgers University, Piscataway, NJ, USA
- Ailan Guo
  - Cell Signaling Technology, Danvers, MA, USA
- Lianet Lopez
  - The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- Alevtina Pavlenco
  - The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- Adenrele Akintobi
  - Department of Molecular Biology and Biochemistry, Rutgers University, Piscataway, NJ, USA
- Yingnan Zhang
  - Department of Early Discovery Biochemistry, Genentech, South San Francisco, CA, USA
- Jean-François Rual
  - Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
- Bridget Currell
  - Department of Molecular Biology, Genentech, South San Francisco, CA, USA
- Tong Hao
  - Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
- Xinping Yang
  - Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
- Yun A Shen
  - Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
- Kourosh Salehi-Ashtiani
  - Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
- Jingjing Li
  - The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Aaron T Cheng
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA, USA
- Dryden Bouamalay
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA, USA
- Adrien Lugari
  - IMR Laboratory, UPR 3243, Institut de Microbiologie de la Méditerranée, CNRS and Aix-Marseille Université, Marseille Cedex 20, France
- David E Hill
  - Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
- Mark L Grimes
  - Division of Biological Sciences, Center for Structural and Functional Neuroscience, The University of Montana, Missoula, MT, USA
- David G Drubin
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA, USA
- Barth D Grant
  - Department of Molecular Biology and Biochemistry, Rutgers University, Piscataway, NJ, USA
- Marc Vidal
  - Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
- Charles Boone
  - The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Sachdev S Sidhu
  - The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Gary D Bader
  - The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
  - Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
9
Couillault C, Fourquet P, Pophillat M, Ewbank JJ. A UPR-independent infection-specific role for a BiP/GRP78 protein in the control of antimicrobial peptide expression in C. elegans epidermis. Virulence 2012; 3:299-308. [PMID: 22546897 PMCID: PMC3442842 DOI: 10.4161/viru.20384] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 12/12/2022]
Abstract
The nematode C. elegans responds to infection by the fungus Drechmeria coniospora with a rapid increase in the expression of antimicrobial peptide genes. To investigate further the molecular basis of this innate immune response, we took a two-dimensional difference in-gel electrophoresis (2D-DIGE) approach to characterize the changes in host protein that accompany infection. We identified a total of 68 proteins from differentially represented spots and their corresponding genes. Through class testing, we identified functional categories that were enriched in our proteomic data set. One of these was “protein processing in endoplasmic reticulum,” pointing to a potential link between innate immunity and endoplasmic reticulum function. This class included HSP-3, a chaperone of the BiP/GRP78 family known to act coordinately in the endoplasmic reticulum with its paralog HSP-4 to regulate the unfolded protein response (UPR). Other studies have shown that infection of C. elegans can provoke a UPR. We observed, however, that in adult C. elegans infection with D. coniospora did not induce a UPR, and conversely, triggering a UPR did not lead to an increase in expression of the well-characterized antimicrobial peptide gene nlp-29. On the other hand, we demonstrated a specific role for hsp-3 in the regulation of nlp-29 after infection that is not shared with hsp-4. Epistasis analysis allowed us to place hsp-3 genetically between the Tribbles-like kinase gene nipi-3 and the protein kinase C delta gene tpa-1. The precise function of hsp-3 has yet to be determined, but these results uncover a hitherto unsuspected link between a BiP/GRP78 family protein and innate immune signaling.
Affiliation(s)
- Carole Couillault
  - Centre d'Immunologie de Marseille-Luminy, Aix-Marseille Université, Marseille, France
10
Cantacessi C, Campbell BE, Gasser RB. Key strongylid nematodes of animals — Impact of next-generation transcriptomics on systems biology and biotechnology. Biotechnol Adv 2012; 30:469-88. [DOI: 10.1016/j.biotechadv.2011.08.016] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Received: 07/19/2011] [Revised: 08/09/2011] [Accepted: 08/19/2011] [Indexed: 10/17/2022]
11
Cantacessi C, Campbell BE, Jex AR, Young ND, Hall RS, Ranganathan S, Gasser RB. Bioinformatics meets parasitology. Parasite Immunol 2012; 34:265-75. [DOI: 10.1111/j.1365-3024.2011.01304.x] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Indexed: 01/25/2023]
12
Vallin E, Gallagher J, Granger L, Martin E, Belougne J, Maurizio J, Duverger Y, Scaglione S, Borrel C, Cortier E, Abouzid K, Carre-Pierrat M, Gieseler K, Ségalat L, Kuwabara PE, Ewbank JJ. A genome-wide collection of Mos1 transposon insertion mutants for the C. elegans research community. PLoS One 2012; 7:e30482. [PMID: 22347378 PMCID: PMC3275553 DOI: 10.1371/journal.pone.0030482] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.8] [Received: 09/23/2011] [Accepted: 12/16/2011] [Indexed: 11/24/2022]
Abstract
Methods that use homologous recombination to engineer the genome of C. elegans commonly use strains carrying specific insertions of the heterologous transposon Mos1. A large collection of known Mos1 insertion alleles would therefore be of general interest to the C. elegans research community. We describe here the optimization of a semi-automated methodology for the construction of a substantial collection of Mos1 insertion mutant strains. At peak production, more than 5,000 strains were generated per month. These strains were then subject to molecular analysis, and more than 13,300 Mos1 insertions characterized. In addition to targeting directly more than 4,700 genes, these alleles represent the potential starting point for the engineered deletion of essentially all C. elegans genes and the modification of more than 40% of them. This collection of mutants, generated under the auspices of the European NEMAGENETAG consortium, is publicly available and represents an important research resource.
Collapse
Affiliation(s)
- Elodie Vallin
- Centre de Génétique et de Physiologie Moléculaires et Cellulaires, CNRS UMR 5534, Campus de la Doua, Villeurbanne, France
- Université Claude Bernard Lyon 1, Villeurbanne, France
| | - Joseph Gallagher
- School of Biochemistry, University of Bristol, Bristol, United Kingdom
| | - Laure Granger
- Centre de Génétique et de Physiologie Moléculaires et Cellulaires, CNRS UMR 5534, Campus de la Doua, Villeurbanne, France
- Université Claude Bernard Lyon 1, Villeurbanne, France
| | - Edwige Martin
- Centre de Génétique et de Physiologie Moléculaires et Cellulaires, CNRS UMR 5534, Campus de la Doua, Villeurbanne, France
- Université Claude Bernard Lyon 1, Villeurbanne, France
| | - Jérôme Belougne
- Centre d'Immunologie de Marseille-Luminy, Aix-Marseille University, Marseille, France
- INSERM, U1104, Marseille, France
- CNRS, UMR7280, Marseille, France
| | - Julien Maurizio
- Centre d'Immunologie de Marseille-Luminy, Aix-Marseille University, Marseille, France
- INSERM, U1104, Marseille, France
- CNRS, UMR7280, Marseille, France
| | - Yohann Duverger
- Centre d'Immunologie de Marseille-Luminy, Aix-Marseille University, Marseille, France
- INSERM, U1104, Marseille, France
- CNRS, UMR7280, Marseille, France
| | - Sarah Scaglione
- Centre d'Immunologie de Marseille-Luminy, Aix-Marseille University, Marseille, France
- INSERM, U1104, Marseille, France
- CNRS, UMR7280, Marseille, France
| | - Caroline Borrel
- Centre de Génétique et de Physiologie Moléculaires et Cellulaires, CNRS UMR 5534, Campus de la Doua, Villeurbanne, France
- Université Claude Bernard Lyon 1, Villeurbanne, France
| | - Elisabeth Cortier
- Centre de Génétique et de Physiologie Moléculaires et Cellulaires, CNRS UMR 5534, Campus de la Doua, Villeurbanne, France
- Université Claude Bernard Lyon 1, Villeurbanne, France
| | - Karima Abouzid
- Centre de Génétique et de Physiologie Moléculaires et Cellulaires, CNRS UMR 5534, Campus de la Doua, Villeurbanne, France
- Université Claude Bernard Lyon 1, Villeurbanne, France
| | - Maité Carre-Pierrat
- Centre de Génétique et de Physiologie Moléculaires et Cellulaires, CNRS UMR 5534, Campus de la Doua, Villeurbanne, France
- Université Claude Bernard Lyon 1, Villeurbanne, France
- Plateforme “Biologie de Caenorhabditis elegans”, CNRS UMS3421, Campus de la Doua, Villeurbanne, France
| | - Kathrin Gieseler
- Centre de Génétique et de Physiologie Moléculaires et Cellulaires, CNRS UMR 5534, Campus de la Doua, Villeurbanne, France
- Université Claude Bernard Lyon 1, Villeurbanne, France
| | - Laurent Ségalat
- Centre de Génétique et de Physiologie Moléculaires et Cellulaires, CNRS UMR 5534, Campus de la Doua, Villeurbanne, France
- Université Claude Bernard Lyon 1, Villeurbanne, France
| | | | - Jonathan J. Ewbank
- Centre d'Immunologie de Marseille-Luminy, Aix-Marseille University, Marseille, France
- INSERM, U1104, Marseille, France
- CNRS, UMR7280, Marseille, France
|
13
|
Gasser RB, Cantacessi C. Heartworm genomics: unprecedented opportunities for fundamental molecular insights and new intervention strategies. Top Companion Anim Med 2012; 26:193-9. [PMID: 22152607 DOI: 10.1053/j.tcam.2011.09.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/11/2022]
Abstract
Vector-borne diseases, including canine heartworm disease (CHWD), are of major socioeconomic and canine health importance worldwide. Although many studies have provided insights into CHWD, to date there has been limited study of fundamental molecular aspects of Dirofilaria immitis itself, its relationship with the canine host, its vectors, as well as the potential of drug resistance to emerge, using advanced -omic technologies. This article takes a prospective view of the benefits that advanced -omics technologies will have toward understanding D. immitis and CHWD. Tackling key biological questions using these technologies will provide a "systems biology" context and could lead to radically new intervention and management strategies against heartworm.
Affiliation(s)
- Robin B Gasser
- Faculty of Veterinary Science, The University of Melbourne, Parkville, Victoria 3010, Australia.
|
14
|
Haas BJ, Zeng Q, Pearson MD, Cuomo CA, Wortman JR. Approaches to Fungal Genome Annotation. Mycology 2011; 2:118-141. [PMID: 22059117 PMCID: PMC3207268 DOI: 10.1080/21501203.2011.606851] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.0] [Indexed: 01/08/2023] Open
Abstract
Fungal genome annotation is the starting point for analysis of genome content. This generally involves the application of diverse methods to identify features on a genome assembly such as protein-coding and non-coding genes, repeats and transposable elements, and pseudogenes. Here we describe tools and methods leveraged for eukaryotic genome annotation with a focus on the annotation of fungal nuclear and mitochondrial genomes. We highlight the application of the latest technologies and tools to improve the quality of predicted gene sets. The Broad Institute eukaryotic genome annotation pipeline is described as one example of how such methods and tools are integrated into a sequencing center's production genome annotation environment.
Affiliation(s)
- Brian J Haas
- Genome Sequencing and Analysis Program, Broad Institute, 7 Cambridge Center, Cambridge, MA 02142, U.S.A
|
15
|
Engelmann I, Griffon A, Tichit L, Montañana-Sanchis F, Wang G, Reinke V, Waterston RH, Hillier LW, Ewbank JJ. A comprehensive analysis of gene expression changes provoked by bacterial and fungal infection in C. elegans. PLoS One 2011; 6:e19055. [PMID: 21602919 PMCID: PMC3094335 DOI: 10.1371/journal.pone.0019055] [Citation(s) in RCA: 136] [Impact Index Per Article: 10.5] [Received: 01/19/2011] [Accepted: 03/17/2011] [Indexed: 12/16/2022] Open
Abstract
While Caenorhabditis elegans specifically responds to infection by the up-regulation of certain genes, distinct pathogens trigger the expression of a common set of genes. We applied new methods to conduct a comprehensive and comparative study of the transcriptional response of C. elegans to bacterial and fungal infection. Using tiling arrays and/or RNA-sequencing, we have characterized the genome-wide transcriptional changes that underlie the host's response to infection by three bacterial (Serratia marcescens, Enterococcus faecalis and Photorhabdus luminescens) and two fungal pathogens (Drechmeria coniospora and Harposporium sp.). We developed a flexible tool, the WormBase Converter (available at http://wormbasemanager.sourceforge.net/), to allow cross-study comparisons. The new data sets provided more extensive lists of differentially regulated genes than previous studies. Annotation analysis confirmed that genes commonly up-regulated by bacterial infections are related to stress responses. We found substantial overlaps between the genes regulated upon intestinal infection by the bacterial pathogens and Harposporium, and between those regulated by Harposporium and D. coniospora, which infects the epidermis. Among the fungus-regulated genes, there was a significant bias towards genes that are evolving rapidly and potentially encode small proteins. The results obtained using new methods reveal that the response to infection in C. elegans is determined by the nature of the pathogen, the site of infection and the physiological imbalance provoked by infection. They form the basis for future functional dissection of innate immune signaling. Finally, we also propose alternative methods to identify differentially regulated genes that take into account the greater variability in lowly expressed genes.
Affiliation(s)
- Ilka Engelmann
- Centre d'Immunologie de Marseille-Luminy, Université de la Méditerranée, Marseille, France
- INSERM, U631, Marseille, France
- CNRS, UMR6102, Marseille, France
| | - Aurélien Griffon
- Centre d'Immunologie de Marseille-Luminy, Université de la Méditerranée, Marseille, France
- INSERM, U631, Marseille, France
- CNRS, UMR6102, Marseille, France
| | | | - Frédéric Montañana-Sanchis
- Centre d'Immunologie de Marseille-Luminy, Université de la Méditerranée, Marseille, France
- INSERM, U631, Marseille, France
- CNRS, UMR6102, Marseille, France
| | - Guilin Wang
- Department of Genetics, Yale University School of Medicine, New Haven, Connecticut, United States of America
| | - Valerie Reinke
- Department of Genetics, Yale University School of Medicine, New Haven, Connecticut, United States of America
| | - Robert H. Waterston
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, United States of America
| | - LaDeana W. Hillier
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, United States of America
| | - Jonathan J. Ewbank
- Centre d'Immunologie de Marseille-Luminy, Université de la Méditerranée, Marseille, France
- INSERM, U631, Marseille, France
- CNRS, UMR6102, Marseille, France
|
16
|
Gasser RB, Cantacessi C, Campbell BE, Hofmann A, Otranto D. Major prospects for exploring canine vector borne diseases and novel intervention methods using 'omic technologies. Parasit Vectors 2011; 4:53. [PMID: 21489242 PMCID: PMC3095997 DOI: 10.1186/1756-3305-4-53] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 01/24/2011] [Accepted: 04/13/2011] [Indexed: 11/26/2022] Open
Abstract
Canine vector-borne diseases (CVBDs) are of major socioeconomic importance worldwide. Although many studies have provided insights into CVBDs, there has been limited exploration of fundamental molecular aspects of most pathogens, their vectors, pathogen-host relationships and disease and drug resistance using advanced, 'omic technologies. The aim of the present article is to take a prospective view of the impact that next-generation, 'omics technologies could have, with an emphasis on describing the principles of transcriptomic/genomic sequencing as well as bioinformatic technologies and their implications in both fundamental and applied areas of CVBD research. Tackling key biological questions employing these technologies will provide a 'systems biology' context and could lead to radically new intervention and management strategies against CVBDs.
Affiliation(s)
- Robin B Gasser
- Department of Veterinary Science, The University of Melbourne, 250 Princes Highway, Werribee, Victoria 3030, Australia
| | - Cinzia Cantacessi
- Department of Veterinary Science, The University of Melbourne, 250 Princes Highway, Werribee, Victoria 3030, Australia
| | - Bronwyn E Campbell
- Department of Veterinary Science, The University of Melbourne, 250 Princes Highway, Werribee, Victoria 3030, Australia
| | - Andreas Hofmann
- Structural Chemistry Program, Eskitis Institute for Cell & Molecular Therapies, Griffith University, Brisbane, Queensland, Australia
| | - Domenico Otranto
- Dipartimento di Sanità Pubblica e Zootecnia, Facoltà di Medicina Veterinaria, Università di Bari, Str. prov. le per Casamassima Km 3, 70010, Valenzano, Bari, Italy
|
17
|
Desalermos A, Muhammed M, Glavis-Bloom J, Mylonakis E. Using C. elegans for antimicrobial drug discovery. Expert Opin Drug Discov 2011; 6:645-652. [PMID: 21686092 DOI: 10.1517/17460441.2011.573781] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Indexed: 11/05/2022]
Abstract
INTRODUCTION: The number of microorganism strains with resistance to known antimicrobials is increasing. Therefore, there is a high demand for new, non-toxic and efficient antimicrobial agents. Research with the microscopic nematode Caenorhabditis elegans can address this high demand for the discovery of new antimicrobial compounds. In particular, C. elegans can be used as a model host for in vivo drug discovery through high-throughput screens of chemical libraries. AREAS COVERED: This review introduces the use of substitute model hosts and especially C. elegans in the study of microbial pathogenesis. The authors also highlight recently published literature on the role of C. elegans in drug discovery and outline its use as a promising host with unique advantages in the discovery of new antimicrobial drugs. EXPERT OPINION: C. elegans can be used, as a model host, to research many diseases, including fungal infections and Alzheimer's disease. In addition, high-throughput techniques, for screening chemical libraries, can also be facilitated. Nevertheless, C. elegans and mammals have significant differences that limit both the use of the nematode in research and the degree to which results can be interpreted. That being said, the use of C. elegans in drug discovery still holds promise and the field continues to grow, with attempts to improve the methodology already underway.
Affiliation(s)
- Athanasios Desalermos
- Division of Infectious Diseases, Massachusetts General Hospital and Harvard Medical School, Boston, MA, 02114, USA
|
18
|
Abstract
Vaccination is one of the greatest triumphs of modern medicine, yet we remain largely ignorant of the mechanisms by which successful vaccines stimulate protective immunity. Two recent advances are beginning to illuminate such mechanisms: realization of the pivotal role of the innate immune system in sensing microbes and stimulating adaptive immunity, and advances in systems biology. Recent studies have used systems biology approaches to obtain a global picture of the immune responses to vaccination in humans. This has enabled the identification of early innate signatures that predict the immunogenicity of vaccines, and identification of potentially novel mechanisms of immune regulation. Here, we review these advances and critically examine the potential opportunities and challenges posed by systems biology in vaccine development.
|
19
|
Abstract
A genome browser is software that allows users to visualize DNA, protein, or other sequence features within the context of a reference sequence, such as a chromosome or contig. The Generic Genome Browser (GBrowse) is an open-source browser developed as part of the Generic Model Organism Database project (Stein et al., 2002). GBrowse can be configured to display genomic sequence features for any organism and is the browser used for the model organisms Drosophila melanogaster (Grumbling and Strelets, 2006) and Caenorhabditis elegans (Schwarz et al., 2006), among others. The software package can be downloaded from the Web and run on a Windows, Mac OS X, or Unix-type system. Version 1.64, as described in the original protocol, was released in November 2005, but the software is under active development and new versions are released about every six months. This update includes instructions on updating existing data sources with new files from NCBI.
Affiliation(s)
- Maureen J Donlin
- Department of Biochemistry and Molecular Biology and Department of Molecular Microbiology and Immunology, Saint Louis University School of Medicine, St. Louis, Missouri, USA
|
20
|
Armstrong KR, Chamberlin HM. Coordinate regulation of gene expression in the C. elegans excretory cell by the POU domain protein CEH-6. Mol Genet Genomics 2009; 283:73-87. [PMID: 19921263 DOI: 10.1007/s00438-009-0497-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 05/14/2009] [Accepted: 10/23/2009] [Indexed: 11/24/2022]
Abstract
Excretory renal organs are critical in animals for osmoregulation and the elimination of waste. Renal organs across a range of species exhibit cellular and molecular similarities. For example, class III POU-homeodomain transcription factors are expressed in the renal organs of many invertebrates and vertebrates. However, the functional role for these factors is not well characterized. To better understand the role of class III POU-homeodomain proteins in animal excretory systems, we have characterized a set of genes expressed in the Caenorhabditis elegans excretory cell, and determined their regulation by the POU-III transcription factor CEH-6. Our molecular and biochemical studies show that CEH-6 regulates a subset of genes expressed in the excretory cell. Additionally, we find that the CEH-6-dependent genes share two molecular features: they contain at least one octamer regulatory element and they encode for transport and channel proteins. This work suggests that a role for POU-III factors in renal organs is to coordinate the expression of a set of functionally related genes.
Affiliation(s)
- Kristin R Armstrong
- Department of Molecular Genetics, Ohio State University, 938 Biological Sciences Building, 484 W. 12th Avenue, Columbus, OH 43210, USA
|
21
|
Oliveira RP, Porter Abate J, Dilks K, Landis J, Ashraf J, Murphy CT, Blackwell TK. Condition-adapted stress and longevity gene regulation by Caenorhabditis elegans SKN-1/Nrf. Aging Cell 2009; 8:524-41. [PMID: 19575768 DOI: 10.1111/j.1474-9726.2009.00501.x] [Citation(s) in RCA: 266] [Impact Index Per Article: 17.7] [Indexed: 11/28/2022] Open
Abstract
Studies in model organisms have identified regulatory processes that profoundly influence aging, many of which modulate resistance against environmental or metabolic stresses. In Caenorhabditis elegans, the transcription regulator SKN-1 is important for oxidative stress resistance and acts in multiple longevity pathways. SKN-1 is the ortholog of mammalian Nrf proteins, which induce Phase 2 detoxification genes in response to stress. Phase 2 enzymes defend against oxygen radicals and conjugate electrophiles that are produced by Phase 1 detoxification enzymes, which metabolize lipophilic compounds. Here, we have used expression profiling to identify genes and processes that are regulated by SKN-1 under normal and stress-response conditions. Under nonstressed conditions SKN-1 upregulates numerous genes involved in detoxification, cellular repair, and other functions, and downregulates a set of genes that reduce stress resistance and lifespan. Many of these genes appear to be direct SKN-1 targets, based upon presence of predicted SKN-binding sites in their promoters. The metalloid sodium arsenite induces skn-1-dependent activation of certain detoxification gene groups, including some that were not SKN-1-upregulated under normal conditions. An organic peroxide also triggers induction of a discrete Phase 2 gene set, but additionally stimulates a broad SKN-1-independent response. We conclude that under normal conditions SKN-1 has a wide range of functions in detoxification and other processes, including modulating mechanisms that reduce lifespan. In response to stress, SKN-1 and other regulators tailor transcription programs to meet the challenge at hand. Our findings reveal striking complexity in SKN-1 functions and the regulation of systemic detoxification defenses.
Affiliation(s)
- Riva P Oliveira
- Section on Developmental and Stem Cell Biology, Joslin Diabetes Center, Department of Pathology, Harvard Medical School, Harvard Stem Cell Institute, One Joslin Place, Boston, MA 02215, USA
|
22
|
Schattner P. Genomics made easier: An introductory tutorial to genome datamining. Genomics 2009; 93:187-95. [DOI: 10.1016/j.ygeno.2008.10.009] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 08/21/2008] [Revised: 10/13/2008] [Accepted: 10/29/2008] [Indexed: 11/28/2022]
|
23
|
Wong HM, Huppert JL. Stable G-quadruplexes are found outside nucleosome-bound regions. Mol Biosyst 2009; 5:1713-9. [DOI: 10.1039/b905848f] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.2] [Indexed: 01/05/2023]
|
24
|
Wichadakul D, McDermott J, Samudrala R. Prediction and integration of regulatory and protein-protein interactions. Methods Mol Biol 2009; 541:101-43. [PMID: 19381527 DOI: 10.1007/978-1-59745-243-4_6] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 01/18/2023]
Abstract
Knowledge of transcriptional regulatory interactions (TRIs) is essential for exploring functional genomics and systems biology in any organism. While several results from genome-wide analysis of transcriptional regulatory networks are available, they are limited to model organisms such as yeast (1) and worm (2). Beyond these networks, experiments on TRIs study only individual genes and proteins of specific interest. In this chapter, we present a method for the integration of various data sets to predict TRIs for 54 organisms in the Bioverse (3). We describe how to compile and handle various formats and identifiers of data sets from different sources and how to predict TRIs using a homology-based approach, utilizing the compiled data sets. Integrated data sets include experimentally verified TRIs, binding sites of transcription factors, promoter sequences, protein subcellular localization, and protein families. Predicted TRIs expand the networks of gene regulation for a large number of organisms. The integration of experimentally verified and predicted TRIs with other known protein-protein interactions (PPIs) gives insight into specific pathways, network motifs, and the topological dynamics of an integrated network with gene expression under different conditions, essential for exploring functional genomics and systems biology.
|
25
|
Gordon PMK, Soliman MA, Bose P, Trinh Q, Sensen CW, Riabowol K. Interspecies data mining to predict novel ING-protein interactions in human. BMC Genomics 2008; 9:426. [PMID: 18801192 PMCID: PMC2565686 DOI: 10.1186/1471-2164-9-426] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 08/09/2008] [Accepted: 09/18/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The INhibitor of Growth (ING) family of type II tumor suppressors (ING1-ING5) is involved in many cellular processes such as cell aging, apoptosis, DNA repair and tumorigenesis. To expand our understanding of the proteins with which the ING proteins interact, we designed a method that did not depend upon large-scale proteomics-based methods, since they may fail to highlight transient or relatively weak interactions. Here we test a cross-species (yeast, fly, and human) bioinformatics-based approach to identify potential human ING-interacting proteins with higher probability and accuracy than approaches based on screens in a single species. RESULTS: We confirm the validity of this screen and show that ING1 interacts specifically with three of the three proteins tested: p38MAPK, MEKK4 and RAD50. These novel ING-interacting proteins further link ING proteins to cell stress and DNA damage signaling, providing previously unknown upstream links to DNA damage response pathways in which ING1 participates. The bioinformatics approach we describe can be used to create an interaction prediction list for any human proteins with yeast homolog(s). CONCLUSION: None of the validated interactions were predicted by the conventional protein-protein interaction tools we tested. Validation of our approach by traditional laboratory techniques shows that we can extract value from the voluminous weak interaction data already elucidated in yeast and fly databases. We therefore propose that the weak (low signal to noise ratio) data from large-scale interaction datasets are currently underutilized.
Affiliation(s)
- Paul MK Gordon
- Department of Biochemistry & Molecular Biology, Faculty of Medicine, University of Calgary, Calgary, Alberta, Canada
| | - Mohamed A Soliman
- Department of Biochemistry & Molecular Biology, Faculty of Medicine, University of Calgary, Calgary, Alberta, Canada
- Department of Oncology, Faculty of Medicine, University of Calgary, Calgary, Alberta, Canada
- Department of Biochemistry, Faculty of Pharmacy, Cairo University, Cairo, Egypt
| | - Pinaki Bose
- Department of Biochemistry & Molecular Biology, Faculty of Medicine, University of Calgary, Calgary, Alberta, Canada
- Department of Oncology, Faculty of Medicine, University of Calgary, Calgary, Alberta, Canada
| | - Quang Trinh
- Department of Biochemistry & Molecular Biology, Faculty of Medicine, University of Calgary, Calgary, Alberta, Canada
| | - Christoph W Sensen
- Department of Biochemistry & Molecular Biology, Faculty of Medicine, University of Calgary, Calgary, Alberta, Canada
| | - Karl Riabowol
- Department of Biochemistry & Molecular Biology, Faculty of Medicine, University of Calgary, Calgary, Alberta, Canada
- Department of Oncology, Faculty of Medicine, University of Calgary, Calgary, Alberta, Canada
|
26
|
O'Connor BD, Day A, Cain S, Arnaiz O, Sperling L, Stein LD. GMODWeb: a web framework for the Generic Model Organism Database. Genome Biol 2008; 9:R102. [PMID: 18570664 PMCID: PMC2481422 DOI: 10.1186/gb-2008-9-6-r102] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.9] [Received: 12/17/2007] [Revised: 04/12/2008] [Accepted: 06/20/2008] [Indexed: 11/18/2022] Open
Abstract
GMODWeb is a software framework designed to speed the development of websites for model organism databases. The Generic Model Organism Database (GMOD) initiative provides species-agnostic data models and software tools for representing curated model organism data. Here we describe GMODWeb, a GMOD project designed to speed the development of model organism database (MOD) websites. Sites created with GMODWeb provide integration with other GMOD tools and allow users to browse and search through a variety of data types. GMODWeb was built using the open source Turnkey web framework and is available from .
Affiliation(s)
- Brian D O'Connor
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, California, USA.
|
27
|
Abstract
A genome browser is software that allows users to visualize DNA, protein, or other sequence features within the context of a reference sequence, such as a chromosome or contig. The Generic Genome Browser (GBrowse) is an open-source browser developed as part of the Generic Model Organism Database project (Stein et al., 2002). GBrowse can be configured to display genomic sequence features for any organism and is the browser used for the model organisms Drosophila melanogaster (Grumbling and Strelets, 2006) and Caenorhabditis elegans (Schwarz et al., 2006), among others. The software package can be downloaded from the web and run on a Windows, Mac OS X, or Unix-type system. Version 1.64, as described in this protocol, was released in November 2005, but the software is under active development and new versions are released about every six months.
Affiliation(s)
- Maureen J Donlin
- Department of Biochemistry and Molecular Biology, Saint Louis University School of Medicine, St. Louis, Missouri, USA
|
28
|
In silico analysis of expressed sequence tags from Trichostrongylus vitrinus (Nematoda): comparison of the automated ESTExplorer workflow platform with conventional database searches. BMC Bioinformatics 2008; 9 Suppl 1:S10. [PMID: 18315841 PMCID: PMC2259411 DOI: 10.1186/1471-2105-9-s1-s10] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 11/30/2022] Open
Abstract
Background: The analysis of expressed sequence tags (EST) offers a rapid and cost effective approach to elucidate the transcriptome of an organism, but requires several computational methods for assembly and annotation. Researchers frequently analyse each step manually, which is laborious and time consuming. We have recently developed ESTExplorer, a semi-automated computational workflow system, in order to achieve the rapid analysis of EST datasets. In this study, we evaluated EST data analysis for the parasitic nematode Trichostrongylus vitrinus (order Strongylida) using ESTExplorer, compared with database matching alone. Results: We functionally annotated 1776 ESTs obtained via suppressive-subtractive hybridisation from T. vitrinus, an important parasitic trichostrongylid of small ruminants. Cluster and comparative genomic analyses of the transcripts using ESTExplorer indicated that 290 (41%) sequences had homologues in Caenorhabditis elegans, 329 (42%) in parasitic nematodes, 202 (28%) in organisms other than nematodes, and 218 (31%) had no significant match to any sequence in the current databases. Of the C. elegans homologues, 90 were associated with 'non-wildtype' double-stranded RNA interference (RNAi) phenotypes, including embryonic lethality, maternal sterility, sterile progeny, larval arrest and slow growth. We could functionally classify 267 (38%) sequences using the Gene Ontologies (GO) and establish pathway associations for 230 (33%) sequences using the Kyoto Encyclopedia of Genes and Genomes (KEGG). Further examination of this EST dataset revealed a number of signalling molecules, proteases, protease inhibitors, enzymes, ion channels and immune-related genes. In addition, we identified 40 putative secreted proteins that could represent potential candidates for developing novel anthelmintics or vaccines. We further compared the automated EST sequence annotations, using ESTExplorer, with database search results for individual T. vitrinus ESTs. ESTExplorer reliably and rapidly annotated 301 ESTs, with pathway and GO information, eliminating 60 low quality hits from database searches. Conclusion: We evaluated the efficacy of ESTExplorer in analysing EST data, and demonstrate that computational tools can be used to accelerate the process of gene discovery in EST sequencing projects. The present study has elucidated sets of relatively conserved and potentially novel genes for biological investigation, and the annotated EST set provides further insight into the molecular biology of T. vitrinus, towards the identification of novel drug targets.
|
29
|
Jacob J, Mitreva M, Vanholme B, Gheysen G. Exploring the transcriptome of the burrowing nematode Radopholus similis. Mol Genet Genomics 2008; 280:1-17. [PMID: 18386064 DOI: 10.1007/s00438-008-0340-7] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.6] [Received: 12/10/2007] [Accepted: 03/19/2008] [Indexed: 01/03/2023]
Abstract
Radopholus similis is an important nematode pest on fruit crops in the tropics. Unraveling the transcriptome of this migratory plant-parasitic nematode can provide insight into the parasitism process and lead to more efficient control measures. For the first high throughput molecular characterization of this devastating nematode, 5,853 expressed sequence tags from a mixed stage population were generated. Adding 1,154 tags from the EST division of GenBank for subsequent analysis resulted in a total of 7,007 ESTs, which represent approximately 3,200 genes. The mean G + C content of the nucleotides at the third codon position (GC3%) was calculated to be as high as 64.8%, the highest for nematodes reported to date. BLAST searches resulted in about 70% of the clustered ESTs having homology to (DNA and protein) sequences from the GenBank database, whereas one-third of them did not match any known sequence. Roughly 40% of these latter sequences are predicted to be coding, representing putative novel protein coding genes. Functional annotation of the sequences by GO annotation revealed the abundance of genes involved in reproduction and development, which reflects the nematode population biology. Genes with a role in the parasitism process are identified, as well as genes essential for nematode survival, providing information useful for parasite control. No evidence was found for the presence of trans-spliced leader sequences commonly occurring in nematodes, despite the use of various approaches. In conclusion, we found three different sources for the EST sequences: the majority has a nuclear origin, approximately 1% of the EST sequences are derived from the mitochondrial transcriptome, and interestingly, 1% of the tags are with high probability derived from Wolbachia, providing the first molecular indication for the presence of this endosymbiont in a plant-parasitic nematode.
Affiliation(s)
- Joachim Jacob
- Department of Molecular Biotechnology, Faculty of Bioscience Engineering, Ghent University, Coupure links 653, 9000 Ghent, Belgium.
|
30
|
Identification of direct target genes using joint sequence and expression likelihood with application to DAF-16. PLoS One 2008; 3:e1821. [PMID: 18350157 PMCID: PMC2266795 DOI: 10.1371/journal.pone.0001821] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 10/10/2007] [Accepted: 01/31/2008] [Indexed: 12/11/2022] Open
Abstract
A major challenge in the post-genome era is to reconstruct regulatory networks from the biological knowledge accumulated to date. The development of tools for identifying direct target genes of transcription factors (TFs) is critical to this endeavor. Given a set of microarray experiments, a probabilistic model called TRANSMODIS has been developed which can infer the direct targets of a TF by integrating sequence motif, gene expression and ChIP-chip data. The performance of TRANSMODIS was first validated on a set of transcription factor perturbation experiments (TFPEs) involving Pho4p, a well studied TF in Saccharomyces cerevisiae. TRANSMODIS removed elements of arbitrariness in the manual target gene selection process and produced results that concur with one's intuition. TRANSMODIS was further validated on a genome-wide scale by comparing it with two other methods in Saccharomyces cerevisiae. The usefulness of TRANSMODIS was then demonstrated by applying it to the identification of direct targets of DAF-16, a critical TF regulating ageing in Caenorhabditis elegans. We found that 189 genes were tightly regulated by DAF-16. In addition, DAF-16 has differential preference for motifs when acting as an activator or repressor, which awaits experimental verification. TRANSMODIS is computationally efficient and robust, making it a useful probabilistic framework for finding immediate targets.
31
Bai X, Grewal PS, Hogenhout SA, Adams BJ, Ciche TA, Gaugler R, Sternberg PW. Expressed sequence tag analysis of gene representation in insect parasitic nematode Heterorhabditis bacteriophora. J Parasitol 2008; 93:1343-9. [PMID: 18314678 DOI: 10.1645/ge-1246.1] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Indexed: 12/16/2022]
Abstract
We compared Heterorhabditis bacteriophora GPS11 expressed sequence tags (ESTs) to the ESTs of animal-parasitic, human-parasitic, plant-parasitic, and free-living nematodes. We identified 127 previously undescribed ESTs of which 119 had homologs in ESTs and 8 had homologs in proteins of free-living nematodes. These ESTs were assigned putative functions in transcription, signal transduction, cell cycle control, metabolism, information processing, and cellular processes, thereby providing better insight into H. bacteriophora metabolism, sex determination, and signal transduction. We also identified 36 H. bacteriophora ESTs that had significant similarities to ESTs of parasitic nematodes, but not to ESTs or proteins of free-living nematode species. Among these are the ESTs encoding a centrin, an ankyrin repeat-containing protein, and a nuclear hormone receptor. Our analysis also revealed that parasitic nematode-specific ESTs in this H. bacteriophora data set had more homologs in animal-parasitic nematodes than those parasitizing humans or plants.
Affiliation(s)
- Xiaodong Bai
- Department of Entomology, The Ohio State University-OARDC, Wooster, Ohio 44691, USA
32
Bamps S, Hope IA. Large-scale gene expression pattern analysis, in situ, in Caenorhabditis elegans. Brief Funct Genomic Proteomic 2008; 7:175-83. [DOI: 10.1093/bfgp/eln013] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Indexed: 11/14/2022]
33
Pairing of competitive and topologically distinct regulatory modules enhances patterned gene expression. Mol Syst Biol 2008; 4:163. [PMID: 18277379 PMCID: PMC2267734 DOI: 10.1038/msb.2008.6] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.7] [Received: 08/31/2007] [Accepted: 01/08/2008] [Indexed: 11/08/2022]
Abstract
Biological networks are inherently modular, yet little is known about how modules are assembled to enable coordinated and complex functions. We used RNAi and time series, whole-genome microarray analyses to systematically perturb and characterize components of a Caenorhabditis elegans lineage-specific transcriptional regulatory network. These data are supported by selected reporter gene analyses and comprehensive yeast one-hybrid and promoter sequence analyses. Based on these results, we define and characterize two modules composed of muscle- and epidermal-specifying transcription factors that function together within a single cell lineage to robustly specify multiple cell types. The expression of these two modules, although positively regulated by a common factor, is reliably segregated among daughter cells. Our analyses indicate that these modules repress each other, and we propose that this cross-inhibition coupled with their relative time of induction function to enhance the initial asymmetry in their expression patterns, thus leading to the observed invariant gene expression patterns and cell lineage. The coupling of asynchronous and topologically distinct modules may be a general principle of module assembly that functions to potentiate genetic switches.
34
Dieterich C, Sommer RJ. A Caenorhabditis motif compendium for studying transcriptional gene regulation. BMC Genomics 2008; 9:30. [PMID: 18215260 PMCID: PMC2248174 DOI: 10.1186/1471-2164-9-30] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 08/06/2007] [Accepted: 01/23/2008] [Indexed: 11/10/2022]
Abstract
BACKGROUND Controlling gene expression is fundamental to biological complexity. The nematode Caenorhabditis elegans is an important model for studying principles of gene regulation in multi-cellular organisms. A comprehensive parts list of putative regulatory motifs was still missing for this model system. In this study, we compile a set of putative regulatory motifs by combining evidence from conservation and expression data. DESCRIPTION We present an unbiased comparative approach to a regulatory motif compendium for Caenorhabditis species. This involves the assembly of a new nematode genome, whole genome alignments and assessment of conserved k-mer counts. Candidate motifs are selected from a set of 9,500 randomly picked genes by three different motif discovery strategies. Motif candidates have to pass a conservation enrichment filter. Motif degeneracy and length are optimized. Retained motif descriptions are evaluated by expression data using a non-parametric test, which assesses expression changes due to the presence/absence of individual motifs. Finally, we also provide condition-specific motif ensembles by conditional tree analysis. CONCLUSION The nematode genomes align surprisingly well despite high neutral substitution rates. Our pipeline delivers motif sets by three alternative strategies. Each set contains fewer than 400 motifs, which are significantly conserved and correlated with 214 out of 270 tested gene expression conditions. This motif compendium is an entry point to comprehensive studies on nematode gene regulation. The website (http://corg.eb.tuebingen.mpg.de/CMC) has extensive query capabilities, supplements this article and supports the experimental list.
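The abstract above does not say which non-parametric test was used to relate motif presence to expression change. As one possible illustration only, the sketch below uses a two-sided permutation test on the difference in mean expression change between genes with and without a motif. The function name, the toy expression values and the choice of a permutation test are all assumptions made for this example, not details taken from the paper.

```python
import random
from statistics import mean


def permutation_p_value(with_motif, without_motif, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference in mean expression change.

    Hypothetical helper for illustration; not the test used by the authors.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(with_motif) - mean(without_motif))
    pooled = list(with_motif) + list(without_motif)
    n_with = len(with_motif)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign the motif labels
        shuffled = abs(mean(pooled[:n_with]) - mean(pooled[n_with:]))
        if shuffled >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Made-up log-fold changes for genes that carry a motif and genes that do not.
with_motif = [2.1, 1.8, 2.5, 2.2, 1.9, 2.4]
without_motif = [0.1, -0.2, 0.3, 0.0, 0.2, -0.1, 0.1, 0.4]
p = permutation_p_value(with_motif, without_motif)
print(f"permutation p-value: {p:.4f}")
```

A permutation test makes no distributional assumption about the expression values, which is the property the abstract highlights; a rank-based test would be another reasonable choice.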
Affiliation(s)
- Christoph Dieterich
- Department of Evolutionary Biology, Max Planck Institute for Developmental Biology, Spemannstrasse 35 - 37, Tübingen, Germany.
35
Strange K. Revisiting the Krogh Principle in the post-genome era: Caenorhabditis elegans as a model system for integrative physiology research. J Exp Biol 2008; 210:1622-31. [PMID: 17449828 DOI: 10.1242/jeb.000125] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 01/15/2023]
Abstract
Molecular biology drove a powerful reductionist or 'molecule-centric' approach to biological research in the last half of the 20th century. Reductionism is the attempt to explain complex phenomena by defining the functional properties of the individual components that comprise multi-component systems. Systems biology has emerged in the post-genome era as the successor to reductionism. In my opinion, systems biology and physiology are synonymous. Both disciplines seek to understand multi-component processes or 'systems' and the underlying pathways of information flow from an organism's genes up through increasingly complex levels of organization. The physiologist and Nobel laureate August Krogh believed that there is an ideal organism in which almost every physiological problem could be studied most readily (the 'Krogh Principle'). If an investigator's goal were to define a physiological process from the level of genes to the whole animal, the optimal model organism for him/her to utilize would be one that is genetically and molecularly tractable. In other words, an organism in which forward and reverse genetic analyses could be carried out readily, rapidly and economically. Non-mammalian model organisms such as Escherichia coli, Saccharomyces, Caenorhabditis elegans, Drosophila, zebrafish and the plant Arabidopsis are cornerstones of systems biology research. The nematode C. elegans provides a particularly striking example of the experimental utility of non-mammalian model organisms. The aim of this paper is to illustrate how genetic, functional genomic, molecular and physiological methods can be combined in C. elegans to develop a systems biological understanding of fundamental physiological processes common to all animals. I present examples of the experimental tools available for the study of C. elegans and discuss how we have used them to gain new insights into osmotic stress signaling in animal cells.
Affiliation(s)
- Kevin Strange
- Departments of Anesthesiology, Molecular Physiology and Biophysics, and Pharmacology, Vanderbilt University Medical Center, Nashville, TN 37232, USA.
36
37
Chatr-Aryamontri A, Zanzoni A, Ceol A, Cesareni G. Searching the protein interaction space through the MINT database. Methods Mol Biol 2008; 484:305-317. [PMID: 18592188 DOI: 10.1007/978-1-59745-398-1_20] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 05/26/2023]
Abstract
Many fundamental processes involve protein-protein interactions. Recent advances in technology make it possible to perform large-scale, genome-wide interaction mapping experiments that result in an always increasing amount of data. Protein-protein interaction databases are thus becoming a major resource for investigating biological networks and pathways. In this chapter we describe the Molecular INTeraction database (MINT). The MINT database aims at storing, in a structured format, information about protein-protein interactions (PPIs) by extracting experimental details from work published in peer-reviewed journals.
38
Miska EA, Alvarez-Saavedra E, Abbott AL, Lau NC, Hellman AB, McGonagle SM, Bartel DP, Ambros VR, Horvitz HR. Most Caenorhabditis elegans microRNAs are individually not essential for development or viability. PLoS Genet 2007; 3:e215. [PMID: 18085825 PMCID: PMC2134938 DOI: 10.1371/journal.pgen.0030215] [Citation(s) in RCA: 376] [Impact Index Per Article: 22.1] [Received: 08/30/2007] [Accepted: 10/12/2007] [Indexed: 12/19/2022]
Abstract
MicroRNAs (miRNAs), a large class of short noncoding RNAs found in many plants and animals, often act to post-transcriptionally inhibit gene expression. We report the generation of deletion mutations in 87 miRNA genes in Caenorhabditis elegans, expanding the number of mutated miRNA genes to 95, or 83% of known C. elegans miRNAs. We find that the majority of miRNAs are not essential for the viability or development of C. elegans, and mutations in most miRNA genes do not result in grossly abnormal phenotypes. These observations are consistent with the hypothesis that there is significant functional redundancy among miRNAs or among gene pathways regulated by miRNAs. This study represents the first comprehensive genetic analysis of miRNA function in any organism and provides a unique, permanent resource for the systematic study of miRNAs.
Affiliation(s)
- Eric A Miska
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Ezequiel Alvarez-Saavedra
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Allison L Abbott
- Department of Genetics, Dartmouth Medical School, Hanover, New Hampshire, United States of America
- Nelson C Lau
- Whitehead Institute for Biomedical Research, Cambridge, Massachusetts, United States of America
- Andrew B Hellman
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Shannon M McGonagle
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- David P Bartel
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Whitehead Institute for Biomedical Research, Cambridge, Massachusetts, United States of America
- Victor R Ambros
- Department of Genetics, Dartmouth Medical School, Hanover, New Hampshire, United States of America
- H. Robert Horvitz
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
39
Bauer Huang SL, Saheki Y, VanHoven MK, Torayama I, Ishihara T, Katsura I, van der Linden A, Sengupta P, Bargmann CI. Left-right olfactory asymmetry results from antagonistic functions of voltage-activated calcium channels and the Raw repeat protein OLRN-1 in C. elegans. Neural Dev 2007; 2:24. [PMID: 17986337 PMCID: PMC2213652 DOI: 10.1186/1749-8104-2-24] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.2] [Received: 07/23/2007] [Accepted: 11/06/2007] [Indexed: 11/10/2022]
Abstract
BACKGROUND The left and right AWC olfactory neurons in Caenorhabditis elegans differ in their functions and in their expression of chemosensory receptor genes; in each animal, one AWC randomly takes on one identity, designated AWCOFF, and the contralateral AWC becomes AWCON. Signaling between AWC neurons induces left-right asymmetry through a gap junction network and a claudin-related protein, which inhibit a calcium-regulated MAP kinase pathway in the neuron that becomes AWCON. RESULTS We show here that the asymmetry gene olrn-1 acts downstream of the gap junction and claudin genes to inhibit the calcium-MAP kinase pathway in AWCON. OLRN-1, a protein with potential membrane-association domains, is related to the Drosophila Raw protein, a negative regulator of JNK mitogen-activated protein (MAP) kinase signaling. olrn-1 opposes the action of two voltage-activated calcium channel homologs, unc-2 (CaV2) and egl-19 (CaV1), which act together to stimulate the calcium/calmodulin-dependent kinase CaMKII and the MAP kinase pathway. Calcium channel activity is essential in AWCOFF, and the two AWC neurons coordinate left-right asymmetry using signals from the calcium channels and signals from olrn-1. CONCLUSION olrn-1 and voltage-activated calcium channels are mediators and targets of AWC signaling that act at the transition between a multicellular signaling network and cell-autonomous execution of the decision. We suggest that the asymmetry decision in AWC results from the intercellular coupling of voltage-regulated channels, whose cross-regulation generates distinct calcium signals in the left and right AWC neurons. The interpretation of these signals by the kinase cascade initiates the sustained difference between the two cells.
Affiliation(s)
- Sarah L Bauer Huang
- Howard Hughes Medical Institute and Rockefeller University, New York, NY 10065, USA.
40
Eki T, Ishihara T, Katsura I, Hanaoka F. A genome-wide survey and systematic RNAi-based characterization of helicase-like genes in Caenorhabditis elegans. DNA Res 2007; 14:183-99. [PMID: 17921522 PMCID: PMC2533593 DOI: 10.1093/dnares/dsm016] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 11/29/2022]
Abstract
Helicase-like proteins play a crucial role in nucleic acid- and chromatin-mediated reactions. In this study, we identified 134 helicase-like proteins in the nematode Caenorhabditis elegans and classified the proteins into 10 known subfamilies and a group of orphan genes on the basis of sequence similarity. We characterized loss-of-function phenotypes in RNA interference (RNAi)-treated animals for helicase family members, using the RNAi feeding method, and found several previously unreported phenotypes. Fifty-one (39.5%) of 129 genes tested showed development- or growth-defect phenotypes, and many of these genes were putative nematode homologs of essential genes in a unicellular eukaryote, budding yeast, suggesting conservation of these essential proteins in both species. Comparative analyses between these species identified evolutionarily diverged nematode proteins as well as conserved family members. Chromosome mapping of the nematode genes revealed 10 pairs of putative duplicated genes and clusters of C. elegans-specific SNF2-like genes and Helitrons. Analyses of transcriptional profile data revealed a predominantly oogenesis- and germline-enriched expression of many helicase-like genes. Finally, we identified the D2005.5(drh-3) gene in an RNAi-based screen for genes involved in resistance to X-ray irradiation. Analysis of DRH-3 will clarify the potentially novel mechanism by which it protects against X-ray-induced damage in C. elegans.
Affiliation(s)
- Toshihiko Eki
- Division of Life Science and Biotechnology, Department of Ecological Engineering, Toyohashi University of Technology, Toyohashi, Aichi, Japan.
41
Ciche TA, Sternberg PW. Postembryonic RNAi in Heterorhabditis bacteriophora: a nematode insect parasite and host for insect pathogenic symbionts. BMC Dev Biol 2007; 7:101. [PMID: 17803822 PMCID: PMC2014770 DOI: 10.1186/1471-213x-7-101] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.4] [Received: 02/14/2007] [Accepted: 09/05/2007] [Indexed: 12/15/2022]
Abstract
Background Heterorhabditis bacteriophora is applied throughout the world for the biological control of insects and is an animal model to study interspecies interactions, e.g. mutualism, parasitism and vector-borne disease. H. bacteriophora nematodes are mutually associated with the insect pathogen, Photorhabdus luminescens. The developmentally arrested infective juvenile (IJ) stage nematode (vector) specifically transmits Photorhabdus luminescens bacteria (pathogen) in its gut mucosa to the haemocoel of insects (host). The nematode vector and pathogen alone are not known to cause insect disease. RNA interference is an excellent reverse genetic tool to study gene function in C. elegans, and it would be useful in H. bacteriophora to exploit the H. bacteriophora genome project, currently in progress. Results Soaking L1 stage H. bacteriophora with seven dsRNAs of genes whose C. elegans orthologs had severe RNAi phenotypes resulted in highly penetrant and obvious developmental and reproductive abnormalities. The efficacy of postembryonic double-stranded RNA interference (RNAi) was evident by abnormal gonad morphology and sterility of adult H. bacteriophora and C. elegans, presumably due to defects in germ cell proliferation and gonad development. The penetrance of RNAi phenotypes in H. bacteriophora was high for five genes (87–100%; Hba-cct-2, Hba-daf-21, Hba-icd-1, Hba-nol-5, and Hba-W01G7.3) and moderate for two genes (usually 30–50%; Hba-rack-1 and Hba-arf-1). RNAi of three additional C. elegans orthologs for which RNAi phenotypes were not previously detected in C. elegans also did not result in any apparent phenotypes in H. bacteriophora. Specific and severe reduction in transcript levels in RNAi-treated L1s was determined by quantitative real-time RT-PCR. These results suggest that postembryonic RNAi by soaking is potent and specific. Conclusion Although RNAi is conserved in animals and plants, RNAi using long dsRNA is not. These results demonstrate that RNAi can be used effectively in H. bacteriophora and can be applied for analyses of nematode genes involved in symbiosis and parasitism. It is likely that RNAi will be an important tool for functional genomics utilizing the high-quality draft H. bacteriophora genome sequence.
Affiliation(s)
- Todd A Ciche
- Department of Microbiology and Molecular Genetics, Michigan State University, 2215 Biomedical Physical Sciences Building, East Lansing, MI 48824, USA
- Paul W Sternberg
- Mail Code 156-29, California Institute of Technology, 1200 E. California Blvd., Pasadena, CA 91125, USA
42
Schmutz C, Stevens J, Spang A. Functions of the novel RhoGAP proteins RGA-3 and RGA-4 in the germ line and in the early embryo of C. elegans. Development 2007; 134:3495-505. [PMID: 17728351 DOI: 10.1242/dev.000802] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.9] [Indexed: 12/19/2022]
Abstract
We have identified two redundant GTPase activating proteins (GAPs) - RGA-3 and RGA-4 - that regulate Rho GTPase function at the plasma membrane in early Caenorhabditis elegans embryos. Knockdown of both RhoGAPs resulted in extensive membrane ruffling, furrowing and pronounced pseudo-cleavages. In addition, the non-muscle myosin NMY-2 and RHO-1 accumulated on the cortex at sites of ruffling. RGA-3 and RGA-4 are GAPs for RHO-1, but most probably not for CDC-42, because only RHO-1 was epistatic to the two GAPs, and the GAPs had no obvious influence on CDC-42 function. Furthermore, knockdown of either the RHO-1 effector, LET-502, or the exchange factor for RHO-1, ECT-2, alleviated the membrane-ruffling phenotype caused by simultaneous knockdown of both RGA-3 and RGA-4 [rga-3/4 (RNAi)]. GFP::PAR-6 and GFP::PAR-2 were localized at the anterior and posterior part of the early C. elegans embryo, respectively, showing that rga-3/4 (RNAi) did not interfere with polarity establishment. Most importantly, upon simultaneous knockdown of RGA-3, RGA-4 and the third RhoGAP present in the early embryo, CYK-4, NMY-2 spread over the entire cortex and GFP::PAR-2 localization at the posterior cortex was greatly diminished. These results indicate that the functions of CYK-4 are temporally and spatially distinct from RGA-3 and RGA-4 (RGA-3/4). RGA-3/4 and CYK-4 also play different roles in controlling LET-502 activation in the germ line, because rga-3/4 (RNAi), but not cyk-4 (RNAi), aggravated the let-502(sb106) phenotype. We propose that RGA-3/4 and CYK-4 control with which effector molecules RHO-1 interacts at particular sites at the cortex in the zygote and in the germ line.
Affiliation(s)
- Cornelia Schmutz
- Biozentrum, University of Basel, Klingelbergstrasse 70, CH-4056 Basel, Switzerland
43
Affiliation(s)
- Dmitrij Frishman
- Department of Genome Oriented Bioinformatics, Technische Universität München, Wissenschaftszentrum Weihenstephan, 85350 Freising, Germany
44
Coghlan A, Durbin R. Genomix: a method for combining gene-finders' predictions, which uses evolutionary conservation of sequence and intron-exon structure. Bioinformatics 2007; 23:1468-75. [PMID: 17483502 PMCID: PMC2880447 DOI: 10.1093/bioinformatics/btm133] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 12/22/2022]
Abstract
MOTIVATION Correct gene predictions are crucial for most analyses of genomes. However, in the absence of transcript data, gene prediction is still challenging. One way to improve gene-finding accuracy in such genomes is to combine the exons predicted by several gene-finders, so that gene-finders that make uncorrelated errors can correct each other. RESULTS We present a method for combining gene-finders called Genomix. Genomix selects the predicted exons that are best conserved within and/or between species in terms of sequence and intron-exon structure, and combines them into a gene structure. Genomix was used to combine predictions from four gene-finders for Caenorhabditis elegans, by selecting the predicted exons that are best conserved with C. briggsae and C. remanei. On a set of approximately 1500 confirmed C. elegans genes, Genomix increased the exon-level specificity by 10.1% and sensitivity by 2.7% compared to the best input gene-finder. AVAILABILITY Scripts and Supplementary Material can be found at http://www.sanger.ac.uk/Software/analysis/genomix
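The abstract reports exon-level specificity and sensitivity without defining them. The sketch below assumes a convention common in gene-prediction evaluation: a predicted exon counts as correct only if both boundaries match a confirmed exon exactly, sensitivity is the fraction of confirmed exons that were predicted, and specificity is the fraction of predicted exons that are confirmed. The tuple layout, function name and coordinates are hypothetical and are not taken from the Genomix scripts.

```python
from typing import Iterable, Set, Tuple

# Hypothetical exon record: (sequence id, start, end, strand).
Exon = Tuple[str, int, int, str]


def exon_level_accuracy(predicted: Iterable[Exon],
                        confirmed: Iterable[Exon]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) under exact-boundary matching."""
    pred: Set[Exon] = set(predicted)
    conf: Set[Exon] = set(confirmed)
    true_pos = len(pred & conf)  # exons with both boundaries correct
    sensitivity = true_pos / len(conf) if conf else 0.0
    specificity = true_pos / len(pred) if pred else 0.0
    return sensitivity, specificity


# Toy data: four confirmed exons; three predictions, one with a wrong end.
confirmed = [("chrI", 100, 200, "+"), ("chrI", 300, 400, "+"),
             ("chrI", 500, 650, "+"), ("chrI", 700, 800, "+")]
predicted = [("chrI", 100, 200, "+"), ("chrI", 300, 410, "+"),
             ("chrI", 500, 650, "+")]
sn, sp = exon_level_accuracy(predicted, confirmed)
print(f"sensitivity={sn:.2f} specificity={sp:.2f}")  # sensitivity=0.50 specificity=0.67
```

Under this definition, a combiner that discards poorly conserved exons can raise specificity without losing much sensitivity, which is the direction of the improvement the abstract describes.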
Affiliation(s)
- Avril Coghlan
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.
45
Winston WM, Sutherlin M, Wright AJ, Feinberg EH, Hunter CP. Caenorhabditis elegans SID-2 is required for environmental RNA interference. Proc Natl Acad Sci U S A 2007; 104:10565-70. [PMID: 17563372 PMCID: PMC1965553 DOI: 10.1073/pnas.0611282104] [Citation(s) in RCA: 200] [Impact Index Per Article: 11.8] [Indexed: 12/13/2022]
Abstract
In plants and in the nematode Caenorhabditis elegans, an RNAi signal can trigger gene silencing in cells distant from the site where silencing is initiated. In plants, this signal is known to be a form of dsRNA, and the signal is most likely a form of dsRNA in C. elegans as well. Furthermore, in C. elegans, dsRNA present in the environment or expressed in ingested bacteria is sufficient to trigger RNAi (environmental RNAi). Ingestion and soaking delivery of dsRNA has also been described for other invertebrates. Here we report the identification and characterization of SID-2, an intestinal luminal transmembrane protein required for environmental RNAi in C. elegans. SID-2, when expressed in the environmental RNAi defective species Caenorhabditis briggsae, confers environmental RNAi.
Affiliation(s)
- William M. Winston
- Department of Molecular and Cellular Biology, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138
- Marie Sutherlin
- Department of Molecular and Cellular Biology, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138
- Amanda J. Wright
- Department of Molecular and Cellular Biology, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138
- Evan H. Feinberg
- Department of Molecular and Cellular Biology, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138
- Craig P. Hunter
- Department of Molecular and Cellular Biology, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138
46
Rätsch G, Sonnenburg S, Srinivasan J, Witte H, Müller KR, Sommer RJ, Schölkopf B. Improving the Caenorhabditis elegans genome annotation using machine learning. PLoS Comput Biol 2007; 3:e20. [PMID: 17319737 PMCID: PMC1808025 DOI: 10.1371/journal.pcbi.0030020] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.0] [Received: 02/02/2006] [Accepted: 12/20/2006] [Indexed: 11/19/2022]
Abstract
For modern biology, precise genome annotations are of prime importance, as they allow the accurate definition of genic regions. We employ state-of-the-art machine learning methods to assay and improve the accuracy of the genome annotation of the nematode Caenorhabditis elegans. The proposed machine learning system is trained to recognize exons and introns on the unspliced mRNA, utilizing recent advances in support vector machines and label sequence learning. In 87% (coding and untranslated regions) and 95% (coding regions only) of all genes tested in several out-of-sample evaluations, our method correctly identified all exons and introns. Notably, only 37% and 50%, respectively, of the presently unconfirmed genes in the C. elegans genome annotation agree with our predictions, thus we hypothesize that a sizable fraction of those genes are not correctly annotated. A retrospective evaluation of the WormBase WS120 annotation of C. elegans reveals that splice form predictions on unconfirmed genes in WS120 are inaccurate in about 18% of the considered cases, while our predictions deviate from the truth only in 10%-13%. We experimentally analyzed 20 controversial genes on which our system and the annotation disagree, confirming the superiority of our predictions. While our method correctly predicted 75% of those cases, the standard annotation was never completely correct. The accuracy of our system is further corroborated by a comparison with two other recently proposed systems that can be used for splice form prediction: SNAP and ExonHunter. We conclude that the genome annotation of C. elegans and other organisms can be greatly enhanced using modern machine learning technology.
Affiliation(s)
- Gunnar Rätsch
- Friedrich Miescher Laboratory, Max Planck Society, Tübingen, Germany.
47
Thomas PD, Mi H, Lewis S. Ontology annotation: mapping genomic regions to biological function. Curr Opin Chem Biol 2007; 11:4-11. [PMID: 17208035 DOI: 10.1016/j.cbpa.2006.11.039] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.0] [Received: 11/06/2006] [Accepted: 11/29/2006] [Indexed: 10/23/2022]
Abstract
With numerous whole genomes now in hand, and experimental data about genes and biological pathways on the increase, a systems approach to biological research is becoming essential. Ontologies provide a formal representation of knowledge that is amenable to computational as well as human analysis, an obvious underpinning of systems biology. Mapping function to gene products in the genome consists of two, somewhat intertwined enterprises: ontology building and ontology annotation. Ontology building is the formal representation of a domain of knowledge; ontology annotation is association of specific genomic regions (which we refer to simply as 'genes', including genes and their regulatory elements and products such as proteins and functional RNAs) to parts of the ontology. We consider two complementary representations of gene function: the Gene Ontology (GO) and pathway ontologies. GO represents function from the gene's eye view, in relation to a large and growing context of biological knowledge at all levels. Pathway ontologies represent function from the point of view of biochemical reactions and interactions, which are ordered into networks and causal cascades. The more mature GO provides an example of ontology annotation: how conclusions from the scientific literature and from evolutionary relationships are converted into formal statements about gene function. Annotations are made using a variety of different types of evidence, which can be used to estimate the relative reliability of different annotations.
Affiliation(s)
- Paul D Thomas
- Evolutionary Systems Biology Group, Artificial Intelligence Center, SRI International, Menlo Park, CA 94025, USA.
48
EDGEdb: a transcription factor-DNA interaction database for the analysis of C. elegans differential gene expression. BMC Genomics 2007; 8:21. [PMID: 17233892 PMCID: PMC1790901 DOI: 10.1186/1471-2164-8-21] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.8] [Received: 11/10/2006] [Accepted: 01/18/2007] [Indexed: 11/21/2022] Open
Abstract
BACKGROUND Transcription regulatory networks are composed of protein-DNA interactions between transcription factors and their target genes. A long-term goal in genome biology is to map protein-DNA interaction networks of all regulatory regions in a genome of interest. Both transcription factor- and gene-centered methods can be used to systematically identify such interactions. We use high-throughput yeast one-hybrid assays as a gene-centered method to identify protein-DNA interactions between regulatory sequences (e.g. gene promoters) and transcription factors in the nematode Caenorhabditis elegans. We have already mapped several hundred protein-DNA interactions and analyzed the transcriptional consequences of some by examining differential gene expression of targets in the presence or absence of an upstream regulator. The rapidly increasing amount of protein-DNA interaction data at a genome scale requires a database that facilitates efficient data storage, retrieval and integration. DESCRIPTION Here, we report the implementation of a C. elegans differential gene expression database (EDGEdb). This database enables the storage and retrieval of protein-DNA interactions and other data that relate to differential gene expression. Specifically, EDGEdb contains: i) sequence information of regulatory elements, including gene promoters, ii) sequence information of all 934 predicted transcription factors, their DNA binding domains, and, where available, their dimerization partners and consensus DNA binding sites, iii) protein-DNA interactions between regulatory elements and transcription factors, and iv) expression patterns conferred by regulatory elements, and how such patterns are affected by interacting transcription factors. CONCLUSION EDGEdb provides a protein-DNA and protein-protein interaction resource for C. elegans transcription factors and a framework for similar databases for other organisms. The database is available at .
|
49
|
Stajich JE, Dietrich FS, Roy SW. Comparative genomic analysis of fungal genomes reveals intron-rich ancestors. Genome Biol 2007; 8:R223. [PMID: 17949488 PMCID: PMC2246297 DOI: 10.1186/gb-2007-8-10-r223] [Citation(s) in RCA: 99] [Impact Index Per Article: 5.8] [Received: 12/19/2006] [Revised: 10/12/2007] [Accepted: 10/19/2007] [Indexed: 11/17/2022] Open
Abstract
BACKGROUND Eukaryotic protein-coding genes are interrupted by spliceosomal introns, which are removed from transcripts before protein translation. Many facets of spliceosomal intron evolution, including age, mechanisms of origins, the role of natural selection, and the causes of the vast differences in intron number between eukaryotic species, remain debated. Genome sequencing and comparative analysis has made possible whole genome analysis of intron evolution to address these questions. RESULTS We analyzed intron positions in 1,161 sets of orthologous genes across 25 eukaryotic species. We find strong support for an intron-rich fungus-animal ancestor, with more than four introns per kilobase, comparable to the highest known modern intron densities. Indeed, the fungus-animal ancestor is estimated to have had more introns than any of the extant fungi in this study. Thus, subsequent fungal evolution has been characterized by widespread and recurrent intron loss occurring in all fungal clades. These results reconcile three previously proposed methods for estimation of ancestral intron number, which previously gave very different estimates of ancestral intron number for eight eukaryotic species, as well as a fourth more recent method. We do not find a clear inverse correspondence between rates of intron loss and gain, contrary to the predictions of selection-based proposals for interspecific differences in intron number. CONCLUSION Our results underscore the high intron density of eukaryotic ancestors and the widespread importance of intron loss through eukaryotic evolution.
Affiliation(s)
- Jason E Stajich
- Department of Molecular Genetics and Microbiology, Center for Genome Technology, Institute for Genome Science and Policy, Duke University, Durham, NC 27710, USA
- Miller Institute for Basic Research and Department of Plant and Microbial Biology, 111 Koshland Hall #3102, University of California, Berkeley, CA 94720-3102, USA
- Fred S Dietrich
- Department of Molecular Genetics and Microbiology, Center for Genome Technology, Institute for Genome Science and Policy, Duke University, Durham, NC 27710, USA
- Scott W Roy
- Department of Molecular Genetics and Microbiology, Center for Genome Technology, Institute for Genome Science and Policy, Duke University, Durham, NC 27710, USA
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
|
50
|
Pruitt KD, Tatusova T, Maglott DR. NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res 2007. [PMID: 17130148 DOI: 10.1093/nar/gkl84] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/10/2023] Open
Abstract
NCBI's reference sequence (RefSeq) database (http://www.ncbi.nlm.nih.gov/RefSeq/) is a curated non-redundant collection of sequences representing genomes, transcripts and proteins. The database includes 3774 organisms spanning prokaryotes, eukaryotes and viruses, and has records for 2,879,860 proteins (RefSeq release 19). RefSeq records integrate information from multiple sources, when additional data are available from those sources and therefore represent a current description of the sequence and its features. Annotations include coding regions, conserved domains, tRNAs, sequence tagged sites (STS), variation, references, gene and protein product names, and database cross-references. Sequence is reviewed and features are added using a combined approach of collaboration and other input from the scientific community, prediction, propagation from GenBank and curation by NCBI staff. The format of all RefSeq records is validated, and an increasing number of tests are being applied to evaluate the quality of sequence and annotation, especially in the context of complete genomic sequence.
Affiliation(s)
- Kim D Pruitt
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Rm 6An.12J, 45 Center Drive, Bethesda, MD 20892-6510, USA.
|