151
Arhondakis S, Clay O, Bernardi G. Compositional properties of human cDNA libraries: practical implications. FEBS Lett 2006; 580:5772-8. [PMID: 17022979 DOI: 10.1016/j.febslet.2006.09.034] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Received: 06/30/2006] [Revised: 09/12/2006] [Accepted: 09/19/2006] [Indexed: 01/28/2023]
Abstract
The strikingly wide and bimodal gene distribution exhibited by the human genome has prompted us to study the correlations between EST-counts (expression levels) and base composition of genes, especially since existing data are contradictory. Here we investigate how cDNA library preparation affects the GC distributions of ESTs and/or genes found in the library, and address consequences for expression studies. We observe that strongly anomalous GC distributions often indicate experimental biases or deficits during their preparation. We propose the use of compositional distributions of raw ESTs from a cDNA library, and/or of the genes they represent, as a simple and effective tool for quality control.
Affiliation(s)
- Stilianos Arhondakis
  - Laboratory of Molecular Evolution, Stazione Zoologica Anton Dohrn, 80121 Naples, Italy
152
McCarthy FM, Wang N, Magee GB, Nanduri B, Lawrence ML, Camon EB, Barrell DG, Hill DP, Dolan ME, Williams WP, Luthe DS, Bridges SM, Burgess SC. AgBase: a functional genomics resource for agriculture. BMC Genomics 2006; 7:229. [PMID: 16961921 PMCID: PMC1618847 DOI: 10.1186/1471-2164-7-229] [Citation(s) in RCA: 204] [Impact Index Per Article: 10.7] [Received: 04/18/2006] [Accepted: 09/08/2006] [Indexed: 11/16/2022] Open
Abstract
BACKGROUND: Many agricultural species and their pathogens have sequenced genomes and more are in progress. Agricultural species provide food, fiber, xenotransplant tissues, biopharmaceuticals and biomedical models. Moreover, many agricultural microorganisms are human zoonoses. However, systems biology from functional genomics data is hindered in agricultural species because agricultural genome sequences have relatively poor structural and functional annotation and agricultural research communities are smaller with limited funding compared to many model organism communities.
DESCRIPTION: To facilitate systems biology in these traditionally agricultural species we have established "AgBase", a curated, web-accessible, public resource (http://www.agbase.msstate.edu) for structural and functional annotation of agricultural genomes. The AgBase database includes a suite of computational tools to use GO annotations. We use standardized nomenclature following the Human Genome Organization Gene Nomenclature guidelines and are currently functionally annotating chicken, cow and sheep gene products using the Gene Ontology (GO). The computational tools we have developed accept and batch process data derived from different public databases (with different accession codes), return all existing GO annotations, provide a list of products without GO annotation, identify potential orthologs, model functional genomics data using GO and assist proteomics analysis of ESTs and EST assemblies. Our journal database helps prevent redundant manual GO curation. We encourage and publicly acknowledge GO annotations from researchers and provide a service for researchers interested in GO and analysis of functional genomics data.
CONCLUSION: The AgBase database is the first database dedicated to functional genomics and systems biology analysis for agriculturally important species and their pathogens. We use experimental data to improve structural annotation of genomes and to functionally characterize gene products. AgBase is also directly relevant for researchers in fields as diverse as agricultural production, cancer biology, biopharmaceuticals, human health and evolutionary biology. Moreover, the experimental methods and bioinformatics tools we provide are widely applicable to many other species including model organisms.
Affiliation(s)
- Fiona M McCarthy
  - Department of Basic Sciences, College of Veterinary Medicine, Mississippi State University, P.O. Box 1600, Mississippi State, MS 39762, USA
  - Institute for Digital Biology, Mississippi State University, MS 39762, USA
- Nan Wang
  - Department of Computer Science and Engineering, Bagley College of Engineering, P.O. Box 9637, Mississippi State University, MS 39762, USA
  - Institute for Digital Biology, Mississippi State University, MS 39762, USA
- G Bryce Magee
  - Department of Computer Science and Engineering, Bagley College of Engineering, P.O. Box 9637, Mississippi State University, MS 39762, USA
  - Institute for Digital Biology, Mississippi State University, MS 39762, USA
- Bindu Nanduri
  - Department of Basic Sciences, College of Veterinary Medicine, Mississippi State University, P.O. Box 1600, Mississippi State, MS 39762, USA
  - Institute for Digital Biology, Mississippi State University, MS 39762, USA
- Mark L Lawrence
  - Department of Basic Sciences, College of Veterinary Medicine, Mississippi State University, P.O. Box 1600, Mississippi State, MS 39762, USA
- Evelyn B Camon
  - European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Daniel G Barrell
  - European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- David P Hill
  - Mouse Genome Informatics, The Jackson Laboratory 600 Main Street, Bar Harbor, ME 04609, USA
- Mary E Dolan
  - Mouse Genome Informatics, The Jackson Laboratory 600 Main Street, Bar Harbor, ME 04609, USA
- W Paul Williams
  - USDA ARS Corn Host Plant Resistance Research Unit, Box 5367, Mississippi State University, MS 39762, USA
- Dawn S Luthe
  - Department of Biochemistry and Molecular Biology, P.O. Box 9650, Mississippi State University, MS 39762, USA
  - Institute for Digital Biology, Mississippi State University, MS 39762, USA
- Susan M Bridges
  - Department of Computer Science and Engineering, Bagley College of Engineering, P.O. Box 9637, Mississippi State University, MS 39762, USA
  - Institute for Digital Biology, Mississippi State University, MS 39762, USA
- Shane C Burgess
  - Department of Basic Sciences, College of Veterinary Medicine, Mississippi State University, P.O. Box 1600, Mississippi State, MS 39762, USA
  - Institute for Digital Biology, Mississippi State University, MS 39762, USA
153
Fredslund J, Madsen LH, Hougaard BK, Sandal N, Stougaard J, Bertioli D, Schauser L. GeMprospector--online design of cross-species genetic marker candidates in legumes and grasses. Nucleic Acids Res 2006; 34:W670-5. [PMID: 16845095 PMCID: PMC1538858 DOI: 10.1093/nar/gkl201] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Indexed: 11/23/2022] Open
Abstract
The web program GeMprospector (URL: ) allows users to automatically design large sets of cross-species genetic marker candidates targeting either legumes or grasses. The user uploads a collection of ESTs from one or more legume or grass species, and they are compared with a database of clusters of homologous EST and genomic sequences from other legumes or grasses, respectively. Multiple sequence alignments between submitted ESTs and their homologues in the appropriate database form the basis of automated PCR primer design in conserved exons such that each primer set amplifies an intron. The only user input is a collection of ESTs, not necessarily from more than one species, and GeMprospector can boost the potential of such an EST collection by combining it with a large database to produce cross-species genetic marker candidates for legumes or grasses.
Affiliation(s)
- Jakob Fredslund
  - Bioinformatics Research Centre, University of Aarhus, Høegh-Guldbergsgade 10, 8000 Aarhus C, Denmark.
154
Fredslund J, Madsen LH, Hougaard BK, Nielsen AM, Bertioli D, Sandal N, Stougaard J, Schauser L. A general pipeline for the development of anchor markers for comparative genomics in plants. BMC Genomics 2006; 7:207. [PMID: 16907970 PMCID: PMC1570147 DOI: 10.1186/1471-2164-7-207] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.2] [Received: 03/11/2006] [Accepted: 08/14/2006] [Indexed: 12/19/2022] Open
Abstract
Background: Complete or near-complete genomic sequence information is presently only available for a few plant species representing a large phylogenetic diversity among plants. In order to effectively transfer this information to species lacking sequence information, comparative genomic tools need to be developed. Molecular markers permitting cross-species mapping along co-linear genomic regions are central to comparative genomics. These "anchor" markers, defining unique loci in genetic linkage maps of multiple species, are gene-based and possess a number of features that make them relatively sparse. To identify potential anchor marker sequences more efficiently, we have established an automated bioinformatic pipeline that combines multi-species Expressed Sequence Tags (EST) and genome sequence data.
Results: Taking advantage of sequence data from related species, the pipeline identifies evolutionarily conserved sequences that are likely to define unique orthologous loci in most species of the same phylogenetic clade. The key features are the identification of evolutionarily conserved sequences followed by automated design of intron-flanking Polymerase Chain Reaction (PCR) primer pairs. Polymorphisms can subsequently be identified by size- or sequence variation of PCR products, amplified from mapping parents or populations. We illustrate our procedure in legumes and grasses and exemplify its application in legumes, where model plant studies and the genome- and EST-sequence data available have a potential impact on the breeding of crop species and on our understanding of the evolution of this large and diverse family.
Conclusion: We provide a database of 459 candidate anchor loci which have the potential to serve as map anchors in more than 18,000 legume species, a number of which are of agricultural importance. For grasses, the database contains 1335 candidate anchor loci. Based on this database, we have evaluated 76 candidate anchor loci with respect to marker development in legume species with no sequence information available, demonstrating the validity of this approach.
Affiliation(s)
- Jakob Fredslund
  - Bioinformatics Research Center, University of Aarhus, Høegh-Guldbergs Gade 10, Building 090, DK-8000 Århus C, Denmark
- Lene H Madsen
  - Laboratory of Gene Expression, Department of Molecular Biology, University of Aarhus, Gustav Wieds Vej 10, DK-8000 Århus C, Denmark
- Birgit K Hougaard
  - Laboratory of Gene Expression, Department of Molecular Biology, University of Aarhus, Gustav Wieds Vej 10, DK-8000 Århus C, Denmark
- Anna Marie Nielsen
  - Laboratory of Gene Expression, Department of Molecular Biology, University of Aarhus, Gustav Wieds Vej 10, DK-8000 Århus C, Denmark
- David Bertioli
  - Universidade Catolica de Brasil – UCB, Programa de Pós-Graduação em Biotecnologia Genômica, Campus II – SGAN Quadra 916, Módulo B, Av. W5 Norte, Brasília – DF, CEP: 70790-160, Brazil
- Niels Sandal
  - Laboratory of Gene Expression, Department of Molecular Biology, University of Aarhus, Gustav Wieds Vej 10, DK-8000 Århus C, Denmark
- Jens Stougaard
  - Laboratory of Gene Expression, Department of Molecular Biology, University of Aarhus, Gustav Wieds Vej 10, DK-8000 Århus C, Denmark
- Leif Schauser
  - Bioinformatics Research Center, University of Aarhus, Høegh-Guldbergs Gade 10, Building 090, DK-8000 Århus C, Denmark
155
Hohnjec N, Henckel K, Bekel T, Gouzy J, Dondrup M, Goesmann A, Küster H. Transcriptional snapshots provide insights into the molecular basis of arbuscular mycorrhiza in the model legume Medicago truncatula. Funct Plant Biol 2006; 33:737-748. [PMID: 32689284 DOI: 10.1071/fp06079] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 04/05/2006] [Accepted: 06/15/2006] [Indexed: 06/11/2023]
Abstract
The arbuscular mycorrhizal (AM) association between terrestrial plants and soil fungi of the phylum Glomeromycota is the most widespread beneficial plant-microbe interaction on earth. In the course of the symbiosis, fungal hyphae colonise plant roots and supply limiting nutrients, in particular phosphorus, in exchange for carbon compounds. Owing to the obligate biotrophy of mycorrhizal fungi and the lack of genetic systems to study them, targeted molecular studies on AM symbioses proved to be difficult. With the emergence of plant genomics and the selection of suitable models, an application of untargeted expression profiling experiments became possible. In the model legume Medicago truncatula, high-throughput expressed sequence tag (EST)-sequencing in conjunction with in silico and experimental transcriptome profiling provided transcriptional snapshots that together defined the global genetic program activated during AM. Owing to an asynchronous development of the symbiosis, several hundred genes found to be activated during the symbiosis cannot be easily correlated with symbiotic structures, but the expression of selected genes has been extended to the cellular level to correlate gene expression with specific stages of AM development. These approaches identified marker genes for the AM symbiosis and provided the first insights into the molecular basis of gene expression regulation during AM.
Affiliation(s)
- Natalija Hohnjec
  - Institute for Genome Research, Center for Biotechnology (CeBiTec), Bielefeld University, D-33594 Bielefeld, Germany
- Kolja Henckel
  - Bioinformatics Resource Facility, Center for Biotechnology (CeBiTec), Bielefeld University, D-33594 Bielefeld, Germany
- Thomas Bekel
  - Bioinformatics Resource Facility, Center for Biotechnology (CeBiTec), Bielefeld University, D-33594 Bielefeld, Germany
- Jerome Gouzy
  - Laboratoire des Interactions Plantes Micro-organismes LIPM, Chemin de Borde-Rouge-Auzeville, BP 52627, 31326 Castanet Tolosan, Cedex, France
- Michael Dondrup
  - International Graduate School in Bioinformatics and Genome Research, Center for Biotechnology (CeBiTec), Bielefeld University, D-33594 Bielefeld, Germany
- Alexander Goesmann
  - Bioinformatics Resource Facility, Center for Biotechnology (CeBiTec), Bielefeld University, D-33594 Bielefeld, Germany
- Helge Küster
  - Institute for Genome Research, Center for Biotechnology (CeBiTec), Bielefeld University, D-33594 Bielefeld, Germany
156
Juhn J, James AA. oskar gene expression in the vector mosquitoes, Anopheles gambiae and Aedes aegypti. Insect Mol Biol 2006; 15:363-72. [PMID: 16756555 DOI: 10.1111/j.1365-2583.2006.00655.x] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.1] [Indexed: 05/10/2023]
Abstract
A disease control strategy based on the introduction into mosquito populations of a gene conferring a pathogen-refractory phenotype is currently under investigation. This population replacement approach requires a drive system that will quickly spread and fix antipathogen effector genes in target populations. Modified transposable elements containing the control sequences of developmentally regulated genes may provide the basis for a gene drive system that regulates gene mobilization in a sex- and stage-restrictive manner. Screening of a Drosophila melanogaster database for genes whose products localize exclusively in the future germ cells during early embryonic development resulted in the identification of several candidate genes. The regulatory sequences of these genes could be used to drive transposition. Mosquito orthologous genes of oskar were identified based on sequence homology and characterized further. The tissue- and sex-specific expression profiles and hybridizations in situ show that oskar orthologous transcripts in Anopheles gambiae and Aedes aegypti accumulate in developing oocytes of adult females and localize to the posterior poles of early embryos. These characteristics potentiate the use of the regulatory sequences of mosquito oskar genes for the control of modified transposable elements.
Affiliation(s)
- J Juhn
  - Department of Molecular Biology & Biochemistry, University of California, Irvine, CA 92697-3900, USA
157
Dana AN, Hillenmeyer ME, Lobo NF, Kern MK, Romans PA, Collins FH. Differential gene expression in abdomens of the malaria vector mosquito, Anopheles gambiae, after sugar feeding, blood feeding and Plasmodium berghei infection. BMC Genomics 2006; 7:119. [PMID: 16712725 PMCID: PMC1508153 DOI: 10.1186/1471-2164-7-119] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Received: 10/17/2005] [Accepted: 05/19/2006] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Large-scale sequencing of cDNA libraries can provide profiles of genes expressed in an organism under defined biological and environmental circumstances. We have analyzed sequences of 4541 Expressed Sequence Tags (ESTs) from 3 different cDNA libraries created from abdomens from Plasmodium infection-susceptible adult female Anopheles gambiae. These libraries were made from sugar fed (S), rat blood fed (RB), and P. berghei-infected (IRB) mosquitoes at 30 hours after the blood meal, when most parasites would be transforming ookinetes or very early oocysts.
RESULTS: The S, RB and IRB libraries contained 1727, 1145 and 1669 high quality ESTs, respectively, averaging 455 nucleotides (nt) in length. They assembled into 1975 consensus sequences (567 contigs and 1408 singletons). Functional annotation was performed to assign probable molecular functions to the gene products and to identify the biological processes in which they function. Genes represented at high frequency in one or more of the libraries were subjected to digital Northern analysis, and the expression results for 5 of them were verified by qRT-PCR.
CONCLUSION: 13% of the 1965 ESTs showing identity to the A. gambiae genome sequence represent novel genes. These, together with untranslated regions (UTR) present on many of the ESTs, will inform further genome annotation. We have identified 23 genes encoding products likely to be involved in regulating the cellular oxidative environment and 25 insect immunity genes. We also identified 25 genes as being up or down regulated following blood feeding and/or feeding with P. berghei infected blood relative to their expression levels in sugar fed females.
Affiliation(s)
- Ali N Dana
  - Center for Tropical Disease Research and Training, Department of Biological Sciences, University of Notre Dame, Notre Dame, IN 46556, USA
- Neil F Lobo
  - Center for Tropical Disease Research and Training, Department of Biological Sciences, University of Notre Dame, Notre Dame, IN 46556, USA
- Marcia K Kern
  - Center for Tropical Disease Research and Training, Department of Biological Sciences, University of Notre Dame, Notre Dame, IN 46556, USA
- Patricia A Romans
  - Department of Zoology, University of Toronto, Toronto, ON M5S 3G5, Canada
- Frank H Collins
  - Center for Tropical Disease Research and Training, Department of Biological Sciences, University of Notre Dame, Notre Dame, IN 46556, USA
158
Siviero F, Rezende-Teixeira P, Andrade A, Machado-Santelli GM, Santelli RV. Analysis of expressed sequence tags from Rhynchosciara americana salivary glands. Insect Mol Biol 2006; 15:109-18. [PMID: 16640721 DOI: 10.1111/j.1365-2583.2006.00616.x] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 05/08/2023]
Abstract
The dipteran Rhynchosciara americana (Sciaridae) is an important model organism in polyteny and gene amplification research, but so far only a limited amount of data on the DNA sequences and molecular biology of this species is available. To better understand the biological meaning of DNA puffs, we set out to generate EST sequences from a DNA library constructed from salivary glands. After categorization by gene ontology terms, the ESTs were used to construct an 'electronic Northern' that gives a general view of the metabolic status of the salivary gland during an important phase of larval development: the spinning of the communal cocoon. During this phase, the last polytene DNA replication cycle occurs concomitantly with the amplification of specific loci related to protein secretion.
Affiliation(s)
- F Siviero
  - Departamento de Biologia Celular e Desenvolvimento, Instituto de Ciências Biomédicas, Universidade de São Paulo, São Paulo, Brazil.
159
Smith J, Speed D, Hocking PM, Talbot RT, Degen WGJ, Schijns VEJC, Glass EJ, Burt DW. Development of a chicken 5 K microarray targeted towards immune function. BMC Genomics 2006; 7:49. [PMID: 16533398 PMCID: PMC1440325 DOI: 10.1186/1471-2164-7-49] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.4] [Received: 11/17/2005] [Accepted: 03/13/2006] [Indexed: 11/17/2022] Open
Abstract
BACKGROUND: The development of microarray resources for the chicken is an important step in being able to profile gene expression changes occurring in birds in response to different challenges and stimuli. The creation of an immune-related array is highly valuable in determining the host immune response in relation to infection with a wide variety of bacterial and viral diseases.
RESULTS: Here we report the development of chicken immune-related cDNA libraries and the subsequent construction of a microarray containing 5190 elements (in duplicate). Clones on the array originate from tissues known to contain high levels of cells related to the immune system, namely bursa, Peyer's patch, thymus and spleen. Represented on the array are genes that are known to cluster with existing chicken ESTs as well as genes that are unique to our libraries. Some of these genes have no known homologies and represent novel genes in the chicken collection. A series of reference genes (i.e., genes of known immune function) are also present on the array. Functional annotation data is also provided for as many of the genes on the array as possible.
CONCLUSION: Six new chicken immune cDNA libraries have been created and nearly 10,000 sequences submitted to GenBank [GenBank: AM063043-AM071350; AM071520-AM072286; AM075249-AM075607]. A 5 K immune-related array has been developed from these libraries. Individual clones and arrays are available from the ARK-Genomics resource centre.
Affiliation(s)
- Jacqueline Smith
  - Division of Genetics and Genomics, Roslin Institute, Roslin (Edinburgh), Midlothian, EH25 9PS, UK
- David Speed
  - Division of Genetics and Genomics, Roslin Institute, Roslin (Edinburgh), Midlothian, EH25 9PS, UK
- Paul M Hocking
  - Division of Genetics and Genomics, Roslin Institute, Roslin (Edinburgh), Midlothian, EH25 9PS, UK
- Richard T Talbot
  - Ark-Genomics, Roslin Institute, Roslin (Edinburgh), Midlothian, EH25 9PS, UK
- Winfried GJ Degen
  - Intervet International B.V., Dept. of Vaccine Technology and Immunology R&D, P.O. Box 31, 5830 AA, Boxmeer, The Netherlands
- Virgil EJC Schijns
  - Intervet International B.V., Dept. of Vaccine Technology and Immunology R&D, P.O. Box 31, 5830 AA, Boxmeer, The Netherlands
- Elizabeth J Glass
  - Division of Genetics and Genomics, Roslin Institute, Roslin (Edinburgh), Midlothian, EH25 9PS, UK
- David W Burt
  - Division of Genetics and Genomics, Roslin Institute, Roslin (Edinburgh), Midlothian, EH25 9PS, UK
160
Shafer P, Lin DM, Yona G. EST2Prot: mapping EST sequences to proteins. BMC Genomics 2006; 7:41. [PMID: 16515706 PMCID: PMC1456965 DOI: 10.1186/1471-2164-7-41] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 09/26/2005] [Accepted: 03/04/2006] [Indexed: 11/12/2022] Open
Abstract
Background: EST libraries are used in various biological studies, from microarray experiments to proteomic and genetic screens. These libraries usually contain many uncharacterized ESTs that are typically ignored since they cannot be mapped to known genes. Consequently, new discoveries are possibly overlooked.
Results: We describe a system (EST2Prot) that uses multiple elements to map EST sequences to their corresponding protein products. EST2Prot uses UniGene clusters, substring analysis, information about protein coding regions in existing DNA sequences and protein database searches to detect protein products related to a query EST sequence. Gene Ontology terms, Swiss-Prot keywords, and protein similarity data are used to map the ESTs to functional descriptors.
Conclusion: EST2Prot extends and significantly enriches the popular UniGene mapping by utilizing multiple relations between known biological entities. It produces a mapping between ESTs and proteins in real-time through a simple web-interface. The system is part of the Biozon database and is accessible at .
Affiliation(s)
- Paul Shafer
  - Department of Computer Science, Cornell University, Ithaca, NY, USA
- David M Lin
  - Department of Biomedical Sciences, Cornell University, Ithaca, NY, USA
- Golan Yona
  - Department of Computer Science, Cornell University, Ithaca, NY, USA
161
Scholz B, Kultima K, Mattsson A, Axelsson J, Brunström B, Halldin K, Stigson M, Dencker L. Sex-dependent gene expression in early brain development of chicken embryos. BMC Neurosci 2006; 7:12. [PMID: 16480516 PMCID: PMC1386693 DOI: 10.1186/1471-2202-7-12] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.5] [Received: 06/27/2005] [Accepted: 02/15/2006] [Indexed: 11/29/2022] Open
Abstract
BACKGROUND: Differentiation of the brain during development leads to sexually dimorphic adult reproductive behavior and other neural sex dimorphisms. Genetic mechanisms independent of steroid hormones produced by the gonads have recently been suggested to partly explain these dimorphisms.
RESULTS: Using cDNA microarrays and real-time PCR we found gene expression differences between the male and female embryonic brain (or whole head) that may be independent of morphological differentiation of the gonads. Genes located on the sex chromosomes (ZZ in males and ZW in females) were common among the differentially expressed genes, several of which (WPKCI-8, HINT, MHM non-coding RNA) have previously been implicated in avian sex determination. A majority of the identified genes were more highly expressed in males. Three of these genes (CDK7, CCNH and BTF2-P44) encode subunits of the transcription factor IIH complex, indicating a role for this complex in neuronal differentiation.
CONCLUSION: This study provides novel insights into sexually dimorphic gene expression in the embryonic chicken brain and its possible involvement in sex differentiation of the nervous system in birds.
Affiliation(s)
- Birger Scholz
  - Department of Pharmaceutical Biosciences, Division of Toxicology, The Biomedical Center, Husargatan 3, Box 594, SE-75124 Uppsala, and Centre for Reproductive Biology in Uppsala, Uppsala University, Sweden
- Kim Kultima
  - Department of Pharmaceutical Biosciences, Division of Toxicology, The Biomedical Center, Husargatan 3, Box 594, SE-75124 Uppsala, and Centre for Reproductive Biology in Uppsala, Uppsala University, Sweden
- Anna Mattsson
  - Department of Environmental Toxicology, Uppsala University, Norbyvägen 18A, SE-75236 Uppsala, and Centre for Reproductive Biology in Uppsala, Uppsala University, Sweden
- Jeanette Axelsson
  - Department of Environmental Toxicology, Uppsala University, Norbyvägen 18A, SE-75236 Uppsala, and Centre for Reproductive Biology in Uppsala, Uppsala University, Sweden
- Björn Brunström
  - Department of Environmental Toxicology, Uppsala University, Norbyvägen 18A, SE-75236 Uppsala, and Centre for Reproductive Biology in Uppsala, Uppsala University, Sweden
- Krister Halldin
  - Institute of Environmental Medicine, Karolinska Institutet, P.O. Box 210, SE-171 77 Stockholm, Sweden
- Michael Stigson
  - Department of Pharmaceutical Biosciences, Division of Toxicology, The Biomedical Center, Husargatan 3, Box 594, SE-75124 Uppsala, and Centre for Reproductive Biology in Uppsala, Uppsala University, Sweden
- Lennart Dencker
  - Department of Pharmaceutical Biosciences, Division of Toxicology, The Biomedical Center, Husargatan 3, Box 594, SE-75124 Uppsala, and Centre for Reproductive Biology in Uppsala, Uppsala University, Sweden
162
Jiang Z, Wu XL, Michal JJ, McNamara JP. Pattern profiling and mapping of the fat body transcriptome in Drosophila melanogaster. Obes Res 2006; 13:1898-904. [PMID: 16339120 DOI: 10.1038/oby.2005.233] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Indexed: 11/10/2022]
Abstract
In Drosophila, the fat body is a collective name for the masses and sheets of adipose tissue that are distributed throughout the fly body. Thus far, >386,000 Drosophila expressed sequence tags (ESTs) have been deposited in the GenBank database, including 10,443 derived from fat body in flies (data accessed on October 7, 2004). The objective of this study was to map the transcriptome of the fat body in flies and thus provide genomics and bioinformatics tools for developing a Drosophila model for addressing the genetic complexity of obesity in humans. The gene-EST Basic Local Alignment Search Tool (BLAST) matches revealed that these ESTs could represent 12,188 coding genes in the Drosophila genome. Among them, at least 2,261 are expressed in the fat body, including 41 identified as preferentially expressed genes with logarithm of odds >3.0. Self-organizing map analysis revealed a cluster of 290 genes favorably expressed in the fat body compared with genes expressed in five other tissues. Mapping of the fat body transcriptome identified a 1.7-Mb domain on 3L containing 35 genes that were expressed at a much higher level than in other tissues (transcript density factor = 1.0 to 2.3).
Affiliation(s)
- Zhihua Jiang
  - Department of Animal Sciences, Washington State University, Pullman, WA 99164-6351, USA.
163
Lohar DP, Sharopova N, Endre G, Peñuela S, Samac D, Town C, Silverstein KAT, VandenBosch KA. Transcript analysis of early nodulation events in Medicago truncatula. Plant Physiol 2006; 140:221-34. [PMID: 16377745 PMCID: PMC1326046 DOI: 10.1104/pp.105.070326] [Citation(s) in RCA: 180] [Impact Index Per Article: 9.5] [Received: 08/24/2005] [Revised: 11/03/2005] [Accepted: 11/09/2005] [Indexed: 05/05/2023]
Abstract
Within the first 72 h of the interaction between rhizobia and their host plants, nodule primordium induction and infection occur. We predicted that transcription profiling of early stages of the symbiosis between Medicago truncatula roots and Sinorhizobium meliloti would identify regulated plant genes that likely condition key events in nodule initiation. Therefore, using a microarray with about 6,000 cDNAs, we compared transcripts from inoculated and uninoculated roots corresponding to defined stages between 1 and 72 h post inoculation (hpi). Hundreds of genes of both known and unknown function were significantly regulated at these time points. Four stages of the interaction were recognized based on gene expression profiles, and potential marker genes for these stages were identified. Some genes that were regulated differentially during stages I (1 hpi) and II (6-12 hpi) of the interaction belong to families encoding proteins involved in calcium transport and binding, reactive oxygen metabolism, and cytoskeleton and cell wall functions. Genes involved in cell proliferation were found to be up-regulated during stages III (24-48 hpi) and IV (72 hpi). Many genes that are homologs of defense response genes were up-regulated during stage I but down-regulated later, likely facilitating infection thread progression into the root cortex. Additionally, genes putatively involved in signal transduction and transcriptional regulation were found to be differentially regulated in the inoculated roots at each time point. The findings shed light on the complexity of coordinated gene regulation and will be useful for continued dissection of the early steps in symbiosis.
|
164
|
Sampedro J, Carey RE, Cosgrove DJ. Genome histories clarify evolution of the expansin superfamily: new insights from the poplar genome and pine ESTs. J Plant Res 2006; 119:11-21. [PMID: 16411016 DOI: 10.1007/s10265-005-0253-z] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.5] [Received: 09/30/2005] [Accepted: 11/29/2005] [Indexed: 05/06/2023]
Abstract
Expansins comprise a superfamily of plant cell wall-loosening proteins that has been divided into four distinct families, EXPA, EXPB, EXLA and EXLB. In a recent analysis of Arabidopsis thaliana and Oryza sativa expansins, we proposed a further subdivision of the families into 17 clades, representing independent lineages in the last common ancestor of monocots and eudicots. This division was based on both traditional sequence-based phylogenetic trees and on position-based trees, in which genomic locations and dated segmental duplications were used to reconstruct gene phylogeny. In this article we review recent work concerning the patterns of expansin evolution in angiosperms and include additional insights gained from the genome of a second eudicot species, Populus trichocarpa, which includes at least 36 expansin genes. All of the previously proposed monocot-eudicot orthologous groups, but no additional ones, are represented in this species. The results also confirm that all of these clades are truly independent lineages. Furthermore, we have used position-based phylogeny to clarify the history of clades EXPA-II and EXPA-IV. Most of the growth of the expansin superfamily in the poplar lineage is likely due to a recent polyploidy event. Finally, some monocot-eudicot clades are shown to have diverged before the separation of the angiosperm and gymnosperm lineages.
Affiliation(s)
- Javier Sampedro
- Department of Biology, Pennsylvania State University, 208 Mueller Lab, University Park, PA 16802, USA
|
165
|
America AHP, Cordewener JHG, van Geffen MHA, Lommen A, Vissers JPC, Bino RJ, Hall RD. Alignment and statistical difference analysis of complex peptide data sets generated by multidimensional LC-MS. Proteomics 2006; 6:641-53. [PMID: 16372275 DOI: 10.1002/pmic.200500034] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.1] [Indexed: 11/09/2022]
Abstract
A method for high-resolution proteomics analyses of complex protein mixtures is presented using multidimensional HPLC coupled to MS (MDLC-MS). The method was applied to identify proteins that are differentially expressed during fruit ripening of tomato. Protein extracts from red and green tomato fruits were digested by trypsin. The resulting highly complex peptide mixtures were separated by strong cation exchange chromatography (SCX), and subsequently analyzed by RP nano-LC coupled to quadrupole-TOF MS. For detailed quantitative comparison, triplicate RP-LC-MS runs were performed for each SCX fraction. The resulting data sets were analyzed using MetAlign software for noise and data reduction, multiple alignment and statistical variance analysis. For each RP-LC-MS chromatogram, up to 7000 mass components were detected. Peak intensity data were compared by multivariate and statistical analysis. This revealed a clear separation between the green and red tomato samples, and a clear separation of the different SCX fractions. MS/MS spectra were collected using the data-dependent acquisition mode from a selected set of differentially detected peptide masses, enabling the identification of proteins that were differentially expressed during ripening of tomato fruits. Our approach is a highly sensitive method to analyze proteins in complex mixtures without the need of isotope labeling.
|
166
|
Pfister KK, Shah PR, Hummerich H, Russ A, Cotton J, Annuar AA, King SM, Fisher EMC. Genetic analysis of the cytoplasmic dynein subunit families. PLoS Genet 2006; 2:e1. [PMID: 16440056 PMCID: PMC1331979 DOI: 10.1371/journal.pgen.0020001] [Citation(s) in RCA: 216] [Impact Index Per Article: 11.4] [Indexed: 12/18/2022] Open
Abstract
Cytoplasmic dyneins, the principal microtubule minus-end-directed motor proteins of the cell, are involved in many essential cellular processes. The major form of this enzyme is a complex of at least six protein subunits, and in mammals all but one of the subunits are encoded by at least two genes. Here we review current knowledge concerning the subunits, their interactions, and their functional roles as derived from biochemical and genetic analyses. We also carried out extensive database searches to look for new genes and to clarify anomalies in the databases. Our analysis documents evolutionary relationships among the dynein subunits of mammals and other model organisms, and sheds new light on the role of this diverse group of proteins, highlighting the existence of two cytoplasmic dynein complexes with distinct cellular roles.
Affiliation(s)
- K Kevin Pfister
- Department of Cell Biology, School of Medicine, University of Virginia, Charlottesville, Virginia, USA.
|
167
|
Abstract
Alternative splicing and gene duplication are two major sources of proteomic function diversity. Here, we study the evolutionary trend of alternative splicing after gene duplication by analyzing the alternative splicing differences between duplicate genes. We observed that duplicate genes have fewer alternative splice (AS) forms than single-copy genes, and that a negative correlation exists between the mean number of AS forms and the gene family size. Interestingly, we found that the loss of alternative splicing in duplicate genes may occur shortly after the gene duplication. These results support the subfunctionization model of alternative splicing in the early stage after gene duplication. Further analysis of the alternative splicing distribution in human duplicate pairs showed the asymmetric evolution of alternative splicing after gene duplications; i.e., the AS forms between duplicates may differ dramatically. We therefore conclude that alternative splicing and gene duplication may not evolve independently. In the early stage after gene duplication, young duplicates may take over a certain amount of protein function diversity that previously was carried out by the alternative splicing mechanism. In the late stage, the gain and loss of alternative splicing seem to be independent between duplicates.
Affiliation(s)
- Zhixi Su
- James D. Watson Institute of Genome Sciences, Zhejiang University, Hangzhou 310008, China
|
168
|
Zolman BK, Monroe-Augustus M, Silva ID, Bartel B. Identification and functional characterization of Arabidopsis PEROXIN4 and the interacting protein PEROXIN22. Plant Cell 2005; 17:3422-35. [PMID: 16272432 PMCID: PMC1315379 DOI: 10.1105/tpc.105.035691] [Citation(s) in RCA: 100] [Impact Index Per Article: 5.0] [Indexed: 05/05/2023]
Abstract
Peroxins are genetically defined as proteins necessary for peroxisome biogenesis. By screening for reduced response to indole-3-butyric acid, which is metabolized to active auxin in peroxisomes, we isolated an Arabidopsis thaliana peroxin4 (pex4) mutant. This mutant displays sucrose-dependent seedling development and reduced lateral root production, characteristics of plant peroxisome malfunction. We used yeast two-hybrid analysis to determine that PEX4, an apparent ubiquitin-conjugating enzyme, interacts with a previously unidentified Arabidopsis protein, PEX22. A pex4 pex22 double mutant enhanced pex4 defects, confirming that PEX22 is a peroxin. Expression of both Arabidopsis genes together complemented yeast pex4 or pex22 mutant defects, whereas expression of either gene individually failed to rescue the corresponding yeast mutant. Therefore, it is likely that the Arabidopsis proteins can function similarly to the yeast PEX4-PEX22 complex, with PEX4 ubiquitinating substrates and PEX22 tethering PEX4 to the peroxisome. However, the severe sucrose dependence of the pex4 pex22 mutant is not accompanied by correspondingly strong defects in peroxisomal matrix protein import, suggesting that this peroxin pair may have novel plant targets in addition to those important in fungi. Isocitrate lyase is stabilized in pex4 pex22, indicating that PEX4 and PEX22 may be important during the remodeling of peroxisome matrix contents as glyoxysomes transition to leaf peroxisomes.
Affiliation(s)
- Bethany K Zolman
- Department of Biochemistry and Cell Biology, Rice University, Houston, Texas 77005, USA
|
169
|
Yamasaki C, Koyanagi KO, Fujii Y, Itoh T, Barrero R, Tamura T, Yamaguchi-Kabata Y, Tanino M, Takeda JI, Fukuchi S, Miyazaki S, Nomura N, Sugano S, Imanishi T, Gojobori T. Investigation of protein functions through data-mining on integrated human transcriptome database, H-Invitational database (H-InvDB). Gene 2005; 364:99-107. [PMID: 16185827 DOI: 10.1016/j.gene.2005.05.036] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 02/07/2005] [Revised: 04/06/2005] [Accepted: 05/30/2005] [Indexed: 10/25/2022]
Abstract
H-Invitational Database (H-InvDB) is a human transcriptome database, containing integrative annotation of 41,118 full-length cDNA clones originated from 21,037 loci. H-InvDB is a product of the H-Invitational project, an international collaboration to systematically and functionally validate human genes by analysis of a unique set of high quality full-length cDNA clones using automatic annotation and human curation under unified criteria. Here, 19,574 proteins encoded by these cDNAs were classified into 11,709 function-known and 7865 function-unknown hypothetical proteins by similarity with protein databases and motif prediction (InterProScan). The proportion of "hypothetical proteins" in H-InvDB was as high as 40.4%. In this study, we thus conducted data-mining in H-InvDB with the aim of assigning advanced functional annotations to those hypothetical proteins. First, by data-mining in the H-InvDB version of GTOP, we identified 337 SCOP domains within 7865 H-Inv hypothetical proteins. Second, by data-mining of predicted subcellular localization by SOSUI and TMHMM in H-InvDB, we found 1032 transmembrane proteins within H-Inv hypothetical proteins. These results clearly demonstrate that structural prediction is effective for functional annotation of proteins with unknown functions. All the data in H-InvDB are shown in two main views, the cDNA view and the Locus view, and five auxiliary databases with web-based viewers: DiseaseInfo Viewer, H-ANGEL, Clustering Viewer, G-integra and TOPO Viewer; the data also are provided as flat files and XML files. The data consist of descriptions of their gene structures, novel alternative splicing isoforms, functional RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein 3D structure, mapping of SNPs and microsatellite repeat motifs in relation with orphan diseases, gene expression profiling, and comparisons with mouse full-length cDNAs in the context of molecular evolution. 
This unique integrative platform for conducting in silico data-mining represents a substantial contribution to resources required for the exploration of human biology and pathology.
Affiliation(s)
- Chisato Yamasaki
- Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, AIST Waterfront Bio-IT Research Building, 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan
|
170
|
Coppel RL, Black CG. Parasite genomes. Int J Parasitol 2005; 35:465-79. [PMID: 15826640 DOI: 10.1016/j.ijpara.2005.01.010] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Received: 01/24/2005] [Revised: 02/24/2005] [Accepted: 02/24/2005] [Indexed: 01/01/2023]
Abstract
The availability of genome sequences and the associated transcriptome and proteome mapping projects has revolutionised research in the field of parasitology. As more parasite species are sequenced, comparative and phylogenetic comparisons are improving the quality of gene prediction and annotation. Genome sequences of parasites are also providing important data sets for understanding parasite biology and identifying new vaccine candidates and drug targets. We review some of the preliminary conclusions from examination of parasite genome sequences and discuss some of the bioinformatics approaches taken in this analysis.
Affiliation(s)
- Ross L Coppel
- Department of Microbiology and the Victorian Bioinformatics Consortium, Monash University, Melbourne, Vic. 3800, Australia.
|
171
|
Pavy N, Paule C, Parsons L, Crow JA, Morency MJ, Cooke J, Johnson JE, Noumen E, Guillet-Claude C, Butterfield Y, Barber S, Yang G, Liu J, Stott J, Kirkpatrick R, Siddiqui A, Holt R, Marra M, Seguin A, Retzel E, Bousquet J, MacKay J. Generation, annotation, analysis and database integration of 16,500 white spruce EST clusters. BMC Genomics 2005; 6:144. [PMID: 16236172 PMCID: PMC1277824 DOI: 10.1186/1471-2164-6-144] [Citation(s) in RCA: 96] [Impact Index Per Article: 4.8] [Received: 08/17/2005] [Accepted: 10/19/2005] [Indexed: 12/02/2022] Open
Abstract
Background The sequencing and analysis of ESTs is for now the only practical approach for large-scale gene discovery and annotation in conifers because their very large genomes are unlikely to be sequenced in the near future. Our objective was to produce extensive collections of ESTs and cDNA clones to support manufacture of cDNA microarrays and gene discovery in white spruce (Picea glauca [Moench] Voss). Results We produced 16 cDNA libraries from different tissues and a variety of treatments, and partially sequenced 50,000 cDNA clones. High quality 3' and 5' reads were assembled into 16,578 consensus sequences, 45% of which represented full length inserts. Consensus sequences derived from 5' and 3' reads of the same cDNA clone were linked to define 14,471 transcripts. A large proportion (84%) of the spruce sequences matched a pine sequence, but only 68% of the spruce transcripts had homologs in Arabidopsis or rice. Nearly all the sequences that matched the Populus trichocarpa genome (the only sequenced tree genome) also matched rice or Arabidopsis genomes. We used several sequence similarity search approaches for assignment of putative functions, including blast searches against general and specialized databases (transcription factors, cell wall related proteins), Gene Ontology term assignation and Hidden Markov Model searches against PFAM protein families and domains. In total, 70% of the spruce transcripts displayed matches to proteins of known or unknown function in the Uniref100 database (blastx e-value < 1e-10). We identified multigenic families that appeared larger in spruce than in the Arabidopsis or rice genomes. Detailed analysis of translationally controlled tumour proteins and S-adenosylmethionine synthetase families confirmed a twofold size difference. Sequences and annotations were organized in a dedicated database, SpruceDB. 
Several search tools were developed to mine the data either based on their occurrence in the cDNA libraries or on functional annotations. Conclusion This report illustrates specific approaches for large-scale gene discovery and annotation in an organism that is very distantly related to any of the fully sequenced genomes. The ArboreaSet sequences and cDNA clones represent a valuable resource for investigations ranging from plant comparative genomics to applied conifer genetics.
Affiliation(s)
- Nathalie Pavy
- ARBOREA and Canada Research Chair in Forest Genomics, Pavillon Charles-Eugène-Marchand, Université Laval, Ste.Foy, Québec G1K 7P4, Canada
- Charles Paule
- Center for Computational Genomics and Bioinformatics, University of Minnesota, 420 Delaware St. S.E., MMC 43, Minneapolis, MN 55455, USA
- Lee Parsons
- Center for Computational Genomics and Bioinformatics, University of Minnesota, 420 Delaware St. S.E., MMC 43, Minneapolis, MN 55455, USA
- John A Crow
- Center for Computational Genomics and Bioinformatics, University of Minnesota, 420 Delaware St. S.E., MMC 43, Minneapolis, MN 55455, USA
- Marie-Josee Morency
- Laurentian Forestry Center (Canadian Forestry Service), Natural Resources Canada, 1055 rue du PEPS, Québec, Québec, G1V 4C7, Canada
- Janice Cooke
- ARBOREA and Canada Research Chair in Forest Genomics, Pavillon Charles-Eugène-Marchand, Université Laval, Ste.Foy, Québec G1K 7P4, Canada
- Department of Biological Sciences, University of Alberta, Edmonton, Alberta, T6G 2E9, Canada
- James E Johnson
- Center for Computational Genomics and Bioinformatics, University of Minnesota, 420 Delaware St. S.E., MMC 43, Minneapolis, MN 55455, USA
- Etienne Noumen
- ARBOREA and Canada Research Chair in Forest Genomics, Pavillon Charles-Eugène-Marchand, Université Laval, Ste.Foy, Québec G1K 7P4, Canada
- Carine Guillet-Claude
- ARBOREA and Canada Research Chair in Forest Genomics, Pavillon Charles-Eugène-Marchand, Université Laval, Ste.Foy, Québec G1K 7P4, Canada
- Yaron Butterfield
- Genome Sciences Center, BC Cancer Agency, 675 West 10th Avenue, Vancouver, BC, V5Z 1L3, Canada
- Sarah Barber
- Genome Sciences Center, BC Cancer Agency, 675 West 10th Avenue, Vancouver, BC, V5Z 1L3, Canada
- George Yang
- Genome Sciences Center, BC Cancer Agency, 675 West 10th Avenue, Vancouver, BC, V5Z 1L3, Canada
- Jerry Liu
- Genome Sciences Center, BC Cancer Agency, 675 West 10th Avenue, Vancouver, BC, V5Z 1L3, Canada
- Jeff Stott
- Genome Sciences Center, BC Cancer Agency, 675 West 10th Avenue, Vancouver, BC, V5Z 1L3, Canada
- Robert Kirkpatrick
- Genome Sciences Center, BC Cancer Agency, 675 West 10th Avenue, Vancouver, BC, V5Z 1L3, Canada
- Asim Siddiqui
- Genome Sciences Center, BC Cancer Agency, 675 West 10th Avenue, Vancouver, BC, V5Z 1L3, Canada
- Robert Holt
- Genome Sciences Center, BC Cancer Agency, 675 West 10th Avenue, Vancouver, BC, V5Z 1L3, Canada
- Marco Marra
- Genome Sciences Center, BC Cancer Agency, 675 West 10th Avenue, Vancouver, BC, V5Z 1L3, Canada
- Armand Seguin
- Laurentian Forestry Center (Canadian Forestry Service), Natural Resources Canada, 1055 rue du PEPS, Québec, Québec, G1V 4C7, Canada
- Ernest Retzel
- Center for Computational Genomics and Bioinformatics, University of Minnesota, 420 Delaware St. S.E., MMC 43, Minneapolis, MN 55455, USA
- Jean Bousquet
- ARBOREA and Canada Research Chair in Forest Genomics, Pavillon Charles-Eugène-Marchand, Université Laval, Ste.Foy, Québec G1K 7P4, Canada
- John MacKay
- ARBOREA and Canada Research Chair in Forest Genomics, Pavillon Charles-Eugène-Marchand, Université Laval, Ste.Foy, Québec G1K 7P4, Canada
|
172
|
Tsai J, Sultana R, Lee Y, Pertea G, Karamycheva S, Antonescu V, Cho J, Parvizi B, Cheung F, Quackenbush J. RESOURCERER: a database for annotating and linking microarray resources within and across species. Genome Biol 2005; 2:SOFTWARE0002. [PMID: 16173164 PMCID: PMC138985 DOI: 10.1186/gb-2001-2-11-software0002] [Citation(s) in RCA: 86] [Impact Index Per Article: 4.3] [Indexed: 11/16/2022] Open
Abstract
Microarray expression analysis is providing unprecedented data on gene expression in humans and mammalian model systems. Although such studies provide a tremendous resource for understanding human disease states, one of the significant challenges is cross-referencing the data derived from different species, across diverse expression analysis platforms, in order to properly derive inferences regarding gene expression and disease state. To address this problem, we have developed RESOURCERER, a microarray-resource annotation and cross-reference database built using the analysis of expressed sequence tags (ESTs) and gene sequences provided by the TIGR Gene Index (TGI) and TIGR Orthologous Gene Alignment (TOGA) databases [now called Eukaryotic Gene Orthologs (EGO)].
Affiliation(s)
- Jennifer Tsai
- The Institute for Genomic Research, Rockville, MD 20850, USA
- Razvan Sultana
- The Institute for Genomic Research, Rockville, MD 20850, USA
- Yudan Lee
- The Institute for Genomic Research, Rockville, MD 20850, USA
- Geo Pertea
- The Institute for Genomic Research, Rockville, MD 20850, USA
- Jennifer Cho
- The Institute for Genomic Research, Rockville, MD 20850, USA
- Babak Parvizi
- The Institute for Genomic Research, Rockville, MD 20850, USA
- Foo Cheung
- The Institute for Genomic Research, Rockville, MD 20850, USA
|
173
|
Zhou XW, Kafsack BFC, Cole RN, Beckett P, Shen RF, Carruthers VB. The opportunistic pathogen Toxoplasma gondii deploys a diverse legion of invasion and survival proteins. J Biol Chem 2005; 280:34233-44. [PMID: 16002397 PMCID: PMC1360232 DOI: 10.1074/jbc.m504160200] [Citation(s) in RCA: 94] [Impact Index Per Article: 4.7] [Indexed: 11/06/2022] Open
Abstract
Host cell invasion is an essential step during infection by Toxoplasma gondii, an intracellular protozoan that causes the severe opportunistic disease toxoplasmosis in humans. Recent evidence strongly suggests that proteins discharged from Toxoplasma apical secretory organelles (micronemes, dense granules, and rhoptries) play key roles in host cell invasion and survival during infection. However, to date, only a limited number of secretory proteins have been discovered, and the full spectrum of effector molecules involved in parasite invasion and survival remains unknown. To address these issues, we analyzed a large cohort of freely released Toxoplasma secretory proteins by using two complementary methodologies, two-dimensional electrophoresis/mass spectrometry and liquid chromatography/electrospray ionization-tandem mass spectrometry (MudPIT, shotgun proteomics). Visualization of Toxoplasma secretory products by two-dimensional electrophoresis revealed approximately 100 spots, most of which were successfully identified by protein microsequencing or matrix-assisted laser desorption ionization-mass spectrometry analysis. Many proteins were present in multiple species suggesting they are subjected to substantial post-translational modification. Shotgun proteomic analysis of the secretory fraction revealed several additional products, including novel putative adhesive proteins, proteases, and hypothetical secretory proteins similar to products expressed by other related parasites including Plasmodium, the etiologic agent of malaria. A subset of novel proteins were re-expressed as fusions to yellow fluorescent protein, and this initial screen revealed shared and distinct localizations within secretory compartments of T. gondii tachyzoites. These findings provided a uniquely broad view of Toxoplasma secretory proteins that participate in parasite survival and pathogenesis during infection.
Affiliation(s)
- Xing W Zhou
- The W. Harry Feinstone Department of Molecular Microbiology and Immunology, The Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland 21205, USA
|
174
|
da Silva FG, Iandolino A, Al-Kayal F, Bohlmann MC, Cushman MA, Lim H, Ergul A, Figueroa R, Kabuloglu EK, Osborne C, Rowe J, Tattersall E, Leslie A, Xu J, Baek J, Cramer GR, Cushman JC, Cook DR. Characterizing the grape transcriptome. Analysis of expressed sequence tags from multiple Vitis species and development of a compendium of gene expression during berry development. Plant Physiol 2005; 139:574-97. [PMID: 16219919 PMCID: PMC1255978 DOI: 10.1104/pp.105.065748] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.2] [Received: 05/13/2005] [Revised: 07/28/2005] [Accepted: 08/04/2005] [Indexed: 05/04/2023]
Abstract
We report the analysis and annotation of 146,075 expressed sequence tags from Vitis species. The majority of these sequences were derived from different cultivars of Vitis vinifera, comprising an estimated 25,746 unique contig and singleton sequences that survey transcription in various tissues and developmental stages and during biotic and abiotic stress. Putatively homologous proteins were identified for over 17,752 of the transcripts, with 1,962 transcripts further subdivided into one or more Gene Ontology categories. A simple structured vocabulary, with modules for plant genotype, plant development, and stress, was developed to describe the relationship between individual expressed sequence tags and cDNA libraries; the resulting vocabulary provides query terms to facilitate data mining within the context of a relational database. As a measure of the extent to which characterized metabolic pathways were encompassed by the data set, we searched for homologs of the enzymes leading from glycolysis, through the oxidative/nonoxidative pentose phosphate pathway, and into the general phenylpropanoid pathway. Homologs were identified for 65 of these 77 enzymes, with 86% of enzymatic steps represented by paralogous genes. Differentially expressed transcripts were identified by means of a stringent believability index cutoff of ≥98.4%. Correlation analysis and two-dimensional hierarchical clustering grouped these transcripts according to similarity of expression. In the broadest analysis, 665 differentially expressed transcripts were identified across 29 cDNA libraries, representing a range of developmental and stress conditions. The groupings revealed expected associations between plant developmental stages and tissue types, with the notable exception of abiotic stress treatments. 
A more focused analysis of flower and berry development identified 87 differentially expressed transcripts and provides the basis for a compendium that relates gene expression and annotation to previously characterized aspects of berry development and physiology. Comparison with published results for select genes, as well as correlation analysis between independent data sets, suggests that the inferred in silico patterns of expression are likely to be an accurate representation of transcript abundance for the conditions surveyed. Thus, the combined data set reveals the in silico expression patterns for hundreds of genes in V. vinifera, the majority of which have not been previously studied within this species.
|
175
|
Brown DW, Cheung F, Proctor RH, Butchko RAE, Zheng L, Lee Y, Utterback T, Smith S, Feldblyum T, Glenn AE, Plattner RD, Kendra DF, Town CD, Whitelaw CA. Comparative analysis of 87,000 expressed sequence tags from the fumonisin-producing fungus Fusarium verticillioides. Fungal Genet Biol 2005; 42:848-61. [PMID: 16099185 DOI: 10.1016/j.fgb.2005.06.001] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.0] [Received: 01/06/2005] [Revised: 05/20/2005] [Accepted: 06/06/2005] [Indexed: 11/25/2022]
Abstract
Fusarium verticillioides (teleomorph Gibberella moniliformis) is a pathogen of maize worldwide and produces fumonisins, a family of mycotoxins that have been associated with several animal diseases as well as cancer in humans. In this study, we sought to identify fungal genes that affect fumonisin production and/or the plant-fungal interaction. We generated over 87,000 expressed sequence tags from nine different cDNA libraries that correspond to 11,119 unique sequences and are estimated to represent 80% of the genomic complement of genes. A comparative analysis of the libraries showed that all 15 genes in the fumonisin gene cluster were differentially expressed. In addition, nine candidate fumonisin regulatory genes and a number of genes that may play a role in plant-fungal interaction were identified. Analysis of over 700 FUM gene transcripts from five different libraries provided evidence for transcripts with unspliced introns and spliced introns with alternative 3' splice sites. The abundance of the alternative splice forms and the frequency with which they were found for genes involved in the biosynthesis of a single family of metabolites as well as their differential expression suggest they may have a biological function. Finally, analysis of an EST that aligns to genomic sequence between FUM12 and FUM13 provided evidence for a previously unidentified gene (FUM20) in the FUM gene cluster.
Affiliation(s)
- Daren W Brown
- Mycotoxin Research Unit, U.S. Department of Agriculture-ARS, Peoria, IL 61604, USA.
|
176
|
Kuang H, Wei F, Marano MR, Wirtz U, Wang X, Liu J, Shum WP, Zaborsky J, Tallon LJ, Rensink W, Lobst S, Zhang P, Tornqvist CE, Tek A, Bamberg J, Helgeson J, Fry W, You F, Luo MC, Jiang J, Robin Buell C, Baker B. The R1 resistance gene cluster contains three groups of independently evolving, type I R1 homologues and shows substantial structural variation among haplotypes of Solanum demissum. Plant J 2005; 44:37-51. [PMID: 16167894 DOI: 10.1111/j.1365-313x.2005.02506.x] [Citation(s) in RCA: 63] [Impact Index Per Article: 3.2] [Indexed: 05/04/2023]
Abstract
Cultivated and wild potatoes contain a major disease-resistance cluster on the short arm of chromosome V, including the R1 resistance (R) gene against potato late blight. To explore the functional and evolutionary significance of clustering in the generation of novel disease-resistance genes, we constructed three approximately 1 Mb physical maps in the R1 gene region, one for each of the three genomes (haplotypes) of allohexaploid Solanum demissum, the wild potato progenitor of the R1 locus. Totals of 691, 919 and 559 kb were sequenced for each haplotype, and three distinct resistance-gene families were identified, one homologous to the potato R1 gene and two others homologous to either the Prf or the Bs4 R-gene of tomato. The regions with R1 homologues are highly divergent among the three haplotypes, in contrast to the conserved flanking non-resistance gene regions. The R1 locus shows dramatic variation in overall length and R1 homologue number among the three haplotypes. Sequence comparisons of the R1 homologues show that they form three distinct clades in a distance tree. Frequent sequence exchanges were detected among R1 homologues within each clade, but not among those in different clades. These frequent sequence exchanges homogenized the intron sequences of homologues within each clade, but did not homogenize the coding sequences. Our results suggest that the R1 homologues represent three independent groups of fast-evolving type I resistance genes, characterized by chimeric structures resulting from frequent sequence exchanges among group members. Such genes were first identified among clustered RGC2 genes in lettuce, where they were distinguished from slow-evolving type II R-genes. Our findings at the R1 locus in S. demissum may indicate that a common or similar mechanism underlies the previously reported differentiation of type I and type II R-genes and the differentiation of type I R-genes into distinct groups, identified here.
Affiliation(s)
- Hanhui Kuang
- Plant Gene Expression Center, USDA-ARS and Department of Plant and Microbial Biology, University of California, Berkeley, 800 Buchanan Street, Albany, CA 94710, USA
|
177
|
The Rice Chromosomes 11 and 12 Sequencing Consortia*. The sequence of rice chromosomes 11 and 12, rich in disease resistance genes and recent gene duplications. BMC Biol 2005; 3:20. [PMID: 16188032 PMCID: PMC1261165 DOI: 10.1186/1741-7007-3-20] [Citation(s) in RCA: 130] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/11/2005] [Accepted: 09/27/2005] [Indexed: 01/13/2023] Open
Abstract
Background Rice is an important staple food and, with the smallest cereal genome, serves as a reference species for studies on the evolution of cereals and other grasses. Therefore, decoding its entire genome will be a prerequisite for applied and basic research on this species and all other cereals. Results We have determined and analyzed the complete sequences of two of its chromosomes, 11 and 12, which total 55.9 Mb (14.3% of the entire genome length), based on a set of overlapping clones. A total of 5,993 non-transposable element related genes are present on these chromosomes. Among them are 289 disease resistance-like and 28 defense-response genes, a higher proportion of these categories than on any other rice chromosome. A three-Mb segment on both chromosomes resulted from a duplication 7.7 million years ago (mya), the most recent large-scale duplication in the rice genome. Paralogous gene copies within this segmental duplication can be aligned with genomic assemblies from sorghum and maize. Although these gene copies are preserved on both chromosomes, their expression patterns have diverged. When the gene order of rice chromosomes 11 and 12 was compared to wheat gene loci, significant synteny between these orthologous regions was detected, illustrating the presence of conserved genes alternating with recently evolved genes. Conclusion Because the resistance and defense response genes, enriched on these chromosomes relative to the whole genome, also occur in clusters, they provide a preferred target for breeding durable disease resistance in rice and the isolation of their allelic variants. The recent duplication of a large chromosomal segment coupled with the high density of disease resistance gene clusters makes this the most recently evolved part of the rice genome. Based on syntenic alignments of these chromosomes, rice chromosomes 11 and 12 do not appear to have resulted from a single whole-genome duplication event as previously suggested.
|
178
|
Sczyrba A, Beckstette M, Brivanlou AH, Giegerich R, Altmann CR. XenDB: full length cDNA prediction and cross species mapping in Xenopus laevis. BMC Genomics 2005; 6:123. [PMID: 16162280 PMCID: PMC1261260 DOI: 10.1186/1471-2164-6-123] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2005] [Accepted: 09/14/2005] [Indexed: 11/23/2022] Open
Abstract
Background Research using the model system Xenopus laevis has provided critical insights into the mechanisms of early vertebrate development and cell biology. Large scale sequencing efforts have provided an increasingly important resource for researchers. To take full advantage of the available sequence, we have analyzed 350,468 Xenopus laevis Expressed Sequence Tags (ESTs) both to identify full length protein encoding sequences and to develop a unique database system to support comparative approaches between X. laevis and other model systems. Description Using a suffix array based clustering approach, we have identified 25,971 clusters and 40,877 singleton sequences. Generation of a consensus sequence for each cluster resulted in 31,353 tentative contig and 4,801 singleton sequences. Using both BLASTX and FASTY comparison to five model organisms and the NR protein database, more than 15,000 sequences are predicted to encode full length proteins and these have been matched to publicly available IMAGE clones when available. Each sequence has been compared to the KOG database and ~67% of the sequences have been assigned a putative functional category. Based on sequence homology to mouse and human, putative GO annotations have been determined. Conclusion The results of the analysis have been stored in a publicly available database, XenDB. A unique capability of the database is the ability to batch upload cross species queries to identify potential Xenopus homologues and their associated full length clones. Examples are provided including mapping of microarray results and application of 'in silico' analysis. The ability to quickly translate the results of various species into 'Xenopus-centric' information should greatly enhance comparative embryological approaches. Supplementary material can be found at .
Affiliation(s)
- Alexander Sczyrba
- AG Praktische Informatik, Technische Fakultät, Universität Bielefeld, D-33594 Bielefeld, Germany
| | - Michael Beckstette
- AG Praktische Informatik, Technische Fakultät, Universität Bielefeld, D-33594 Bielefeld, Germany
| | - Ali H Brivanlou
- The Rockefeller University, Laboratory of Molecular Vertebrate Embryology, 1230 York Avenue, New York, NY 10021, USA
| | - Robert Giegerich
- AG Praktische Informatik, Technische Fakultät, Universität Bielefeld, D-33594 Bielefeld, Germany
| | - Curtis R Altmann
- FSU College of Medicine, Department of Biomedical Sciences, 1269 W. Call Street, Tallahassee, FL 32306, USA
|
179
|
Rensink WA, Lee Y, Liu J, Iobst S, Ouyang S, Buell CR. Comparative analyses of six solanaceous transcriptomes reveal a high degree of sequence conservation and species-specific transcripts. BMC Genomics 2005; 6:124. [PMID: 16162286 PMCID: PMC1249569 DOI: 10.1186/1471-2164-6-124] [Citation(s) in RCA: 69] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/13/2005] [Accepted: 09/14/2005] [Indexed: 11/14/2022] Open
Abstract
Background The Solanaceae is a family of closely related species with diverse phenotypes that have been exploited for agronomic purposes. Previous studies involving a small number of genes suggested sequence conservation across the Solanaceae. The availability of large collections of Expressed Sequence Tags (ESTs) for the Solanaceae now provides the opportunity to assess sequence conservation and divergence on a genomic scale. Results All available ESTs and Expressed Transcripts (ETs), 449,224 sequences for six Solanaceae species (potato, tomato, pepper, petunia, tobacco and Nicotiana benthamiana), were clustered and assembled into gene indices. Examination of gene ontologies revealed that the transcripts within the gene indices encode a similar suite of biological processes. Although the ESTs and ETs were derived from a variety of tissues, 55–81% of the sequences had significant similarity at the nucleotide level with sequences among the six species. Putative orthologs could be identified for 28–58% of the sequences. This high degree of sequence conservation was supported by expression profiling using heterologous hybridizations to potato cDNA arrays that showed similar expression patterns in mature leaves for all six solanaceous species. 16–19% of the transcripts within the six Solanaceae gene indices did not have matches among Solanaceae, Arabidopsis, rice or 21 other plant gene indices. Conclusion Results from this genome scale analysis confirmed a high level of sequence conservation at the nucleotide level of the coding sequence among Solanaceae. Additionally, the results indicated that part of the Solanaceae transcriptome is likely to be unique for each species.
Affiliation(s)
- Willem Albert Rensink
- The Institute for Genomic Research, 9712 Medical Center Dr., Rockville MD, 20850, USA
| | - Yuandan Lee
- The Institute for Genomic Research, 9712 Medical Center Dr., Rockville MD, 20850, USA
| | - Jia Liu
- The Institute for Genomic Research, 9712 Medical Center Dr., Rockville MD, 20850, USA
| | - Stacy Iobst
- The Institute for Genomic Research, 9712 Medical Center Dr., Rockville MD, 20850, USA
| | - Shu Ouyang
- The Institute for Genomic Research, 9712 Medical Center Dr., Rockville MD, 20850, USA
| | - C Robin Buell
- The Institute for Genomic Research, 9712 Medical Center Dr., Rockville MD, 20850, USA
|
180
|
Machado JG, Hyland KA, Dvorak CMT, Murtaugh MP. Gene expression profiling of jejunal Peyer’s patches in juvenile and adult pigs. Mamm Genome 2005; 16:599-612. [PMID: 16180142 DOI: 10.1007/s00335-005-0008-0] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/02/2005] [Accepted: 05/06/2005] [Indexed: 11/27/2022]
Abstract
Peyer's patches are organized lymphoid tissues of the small intestine that play a critical role in disease resistance and oral tolerance. Peyer's patches in the jejunum contain lymphocytes, dendritic cells, macrophages, villous epithelium, and specialized follicle-associated epithelium. Little is known about the mechanisms and processes by which cells of the Peyer's patches discriminate food nutrients and commensal microflora from pathogenic microbiota. We hypothesize that the jejunal Peyer's patches express genes that mediate and regulate their essential functions. Expression patterns of approximately 2600 cDNAs from a porcine Peyer's patch subtracted library were examined by microarray profiling. Individual mRNAs of interest were further examined by quantitative RT-PCR. Innate immunity-associated genes, including complement 3 and lysozyme, and the genes for epithelial chloride channel and trappin 1 were highly expressed by jejunal Peyer's patch in both juvenile and adult pigs. The growth- and apoptosis-associated genes CIDE-B, GW112, and PSP/Reg I (pancreatic stone protein or regenerating gene) were differentially expressed in juvenile pig Peyer's patches. Many sequences which were highly expressed in jejunal Peyer's patches have previously been described with functions in epithelial cells. Animal-to-animal variation in basal jejunal Peyer's patch gene expression was considerable and reflects the dynamic physiological environment of the gut in addition to genetic, epigenetic, and microbiological variation in the small intestine.
Affiliation(s)
- Juliana G Machado
- Department of Veterinary and Biomedical Sciences, University of Minnesota, 1971 Commonwealth Avenue, St. Paul, Minnesota, 55108, USA
|
181
|
Buell CR, Yuan Q, Ouyang S, Liu J, Zhu W, Wang A, Maiti R, Haas B, Wortman J, Pertea M, Jones KM, Kim M, Overton L, Tsitrin T, Fadrosh D, Bera J, Weaver B, Jin S, Johri S, Reardon M, Webb K, Hill J, Moffat K, Tallon L, Van Aken S, Lewis M, Utterback T, Feldblyum T, Zismann V, Iobst S, Hsiao J, de Vazeille AR, Salzberg SL, White O, Fraser C, Yu Y, Kim H, Rambo T, Currie J, Collura K, Kernodle-Thompson S, Wei F, Kudrna K, Ammiraju JSS, Luo M, Goicoechea JL, Wing RA, Henry D, Oates R, Palmer M, Pries G, Saski C, Simmons J, Soderlund C, Nelson W, de la Bastide M, Spiegel L, Nascimento L, Huang E, Preston R, Zutavern T, Palmer L, O'Shaughnessy A, Dike S, McCombie WR, Minx P, Cordum H, Wilson R, Jin W, Lee HR, Jiang J, Jackson S. Sequence, annotation, and analysis of synteny between rice chromosome 3 and diverged grass species. Genome Res 2005; 15:1284-91. [PMID: 16109971 PMCID: PMC1199543 DOI: 10.1101/gr.3869505] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/10/2023]
Abstract
Rice (Oryza sativa L.) chromosome 3 is evolutionarily conserved across the cultivated cereals and shares large blocks of synteny with maize and sorghum, which diverged from rice more than 50 million years ago. To begin to completely understand this chromosome, we sequenced, finished, and annotated 36.1 Mb (approximately 97%) from O. sativa subsp. japonica cv Nipponbare. Annotation features of the chromosome include 5915 genes, of which 913 are related to transposable elements. A putative function could be assigned to 3064 genes, with another 757 genes annotated as expressed, leaving 2094 that encode hypothetical proteins. Similarity searches against the proteome of Arabidopsis thaliana revealed putative homologs for 67% of the chromosome 3 proteins. Further searches of a nonredundant amino acid database, the Pfam domain database, plant Expressed Sequence Tags, and genomic assemblies from sorghum and maize revealed only 853 nontransposable element related proteins from chromosome 3 that lacked similarity to other known sequences. Interestingly, 426 of these have a paralog within the rice genome. A comparative physical map of the wild progenitor species, Oryza nivara, with japonica chromosome 3 revealed a high degree of sequence identity and synteny between these two species, which diverged approximately 10,000 years ago. Although no major rearrangements were detected, the deduced size of the O. nivara chromosome 3 was 21% smaller than that of japonica. Synteny between rice and other cereals using an integrated maize physical map and wheat genetic map was strikingly high, further supporting the use of rice and, in particular, chromosome 3, as a model for comparative studies among the cereals.
Affiliation(s)
- C Robin Buell
- The Institute for Genomic Research, Rockville, Maryland 20850, USA.
|
182
|
Mudge J, Cannon SB, Kalo P, Oldroyd GED, Roe BA, Town CD, Young ND. Highly syntenic regions in the genomes of soybean, Medicago truncatula, and Arabidopsis thaliana. BMC PLANT BIOLOGY 2005; 5:15. [PMID: 16102170 PMCID: PMC1201151 DOI: 10.1186/1471-2229-5-15] [Citation(s) in RCA: 56] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 03/31/2005] [Accepted: 08/15/2005] [Indexed: 05/04/2023]
Abstract
BACKGROUND Recent genome sequencing enables mega-base scale comparisons between related genomes. Comparisons between animals, plants, fungi, and bacteria demonstrate extensive synteny tempered by rearrangements. Within the legume plant family, glimpses of synteny have also been observed. Characterizing syntenic relationships in legumes is important in transferring knowledge from model legumes to crops that are important sources of protein, fixed nitrogen, and health-promoting compounds. RESULTS We have uncovered two large soybean regions exhibiting synteny with M. truncatula and with a network of segmentally duplicated regions in Arabidopsis. In all, syntenic regions comprise over 500 predicted genes spanning 3 Mb. Up to 75% of soybean genes are colinear with M. truncatula, including one region in which 33 of 35 soybean predicted genes with database support are colinear to M. truncatula. In some regions, 60% of soybean genes share colinearity with a network of A. thaliana duplications. One region is especially interesting because this 500 kbp segment of soybean is syntenic to two paralogous regions in M. truncatula on different chromosomes. Phylogenetic analysis of individual genes within these regions demonstrates that one is orthologous to the soybean region, with which it also shows substantially denser synteny and significantly lower levels of synonymous nucleotide substitutions. The other M. truncatula region is inferred to be paralogous, presumably resulting from a duplication event preceding speciation. CONCLUSION The presence of well-defined M. truncatula segments showing orthologous and paralogous relationships with soybean allows us to explore the evolution of contiguous genomic regions in the context of ancient genome duplication and speciation events.
Affiliation(s)
- Joann Mudge
- Dept of Plant Pathology, 495 Borlaug Hall, University of Minnesota, St. Paul, MN 55108 USA
| | - Steven B Cannon
- Dept of Plant Pathology, 495 Borlaug Hall, University of Minnesota, St. Paul, MN 55108 USA
| | - Peter Kalo
- Dept. of Disease and Stress Biology, John Innes Centre, Norwich Research Park, Colney Norwich, NR4 7UH, UK
| | - Giles ED Oldroyd
- Dept. of Disease and Stress Biology, John Innes Centre, Norwich Research Park, Colney Norwich, NR4 7UH, UK
| | - Bruce A Roe
- The Advanced Center for Genome Technology (ACGT), Stephenson Research & Technology Center, University of Oklahoma, Norman OK 73019 USA
| | - Christopher D Town
- The Institute for Genomic Research (TIGR), 9712 Medical Center Drive, Rockville, MD 20850 USA
| | - Nevin D Young
- Dept of Plant Pathology, 495 Borlaug Hall, University of Minnesota, St. Paul, MN 55108 USA
|
183
|
Krasnov A, Koskinen H, Afanasyev S, Mölsä H. Transcribed Tc1-like transposons in salmonid fish. BMC Genomics 2005; 6:107. [PMID: 16095544 PMCID: PMC1192797 DOI: 10.1186/1471-2164-6-107] [Citation(s) in RCA: 38] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2005] [Accepted: 08/12/2005] [Indexed: 01/17/2023] Open
Abstract
BACKGROUND Mobile genetic elements comprise a substantial fraction of vertebrate genomes. These genes are considered to be deleterious, and in vertebrates they are usually inactive. High throughput sequencing of salmonid fish cDNA libraries has revealed a large number of transposons, which remain transcribed despite inactivation of translation. This article reports on the structure and potential role of these genes. RESULTS A search of ESTs showed the ratio of transcribed transposons in salmonid fish (i.e., 0.5% of all unique cDNA sequences) to be 2.4-32 times greater than in other vertebrate species, and 68% of these genes belonged to the Tc1-family of DNA transposons. A phylogenetic analysis of reading frames indicates repeated transposition of distantly related genes into the fish genome over protracted intervals of evolutionary time. Several copies of two new DNA transposons were cloned. These copies showed relatively little divergence (11.4% and 1.9%). The latter gene was transcribed at a high level in rainbow trout tissues, and was present in genomes of many phylogenetically remote fish species. A comparison of synonymous and non-synonymous divergence revealed remnants of divergent evolution in the younger gene, while the older gene evolved in a neutral mode. Judging from a 1.2 Mb fragment of genomic DNA, the salmonid genome contains approximately 10^5 Tc1-like sequences, the major fraction of which is not transcribed. Our microarray studies showed that transcription of rainbow trout transposons is activated by external stimuli, such as toxicity, stress and bacterial antigens. The expression profiles of Tc1-like transposons gave a strong correlation (r^2 = 0.63-0.88) with a group of genes implicated in defense response, signal transduction and regulation of transcription. CONCLUSION Salmonid genomes contain a large quantity of transcribed mobile genetic elements. Divergent or neutral evolution within genomes and lateral transmission can account for the diversity and sustained persistence of Tc1-like transposons in lower vertebrates. A small fraction of transposons remains transcribed, and their transcription is enhanced by responses to acute conditions.
Affiliation(s)
- Aleksei Krasnov
- Institute of Applied Biotechnology, University of Kuopio, P.O.B. 1627, FIN-70211 Kuopio, Finland
| | - Heikki Koskinen
- Institute of Applied Biotechnology, University of Kuopio, P.O.B. 1627, FIN-70211 Kuopio, Finland
| | - Sergey Afanasyev
- Sechenov Institute of Evolutionary Physiology and Biochemistry, M.Toreza av. 44, Petersburg, 194223, Russia
| | - Hannu Mölsä
- Institute of Applied Biotechnology, University of Kuopio, P.O.B. 1627, FIN-70211 Kuopio, Finland
|
184
|
Fredslund J, Schauser L, Madsen LH, Sandal N, Stougaard J. PriFi: using a multiple alignment of related sequences to find primers for amplification of homologs. Nucleic Acids Res 2005; 33:W516-20. [PMID: 15980525 PMCID: PMC1160186 DOI: 10.1093/nar/gki425] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open
Abstract
Using a comparative approach, the web program PriFi (http://cgi-www.daimi.au.dk/cgi-chili/PriFi/main) designs pairs of primers useful for PCR amplification of genomic DNA in species where prior sequence information is not available. The program works with an alignment of DNA sequences from phylogenetically related species and outputs a list of possibly degenerate primer pairs fulfilling a number of criteria, such that the primers have a maximal probability of amplifying orthologous sequences in other phylogenetically related species. Operating on a genome-wide scale, PriFi automates the first steps of a procedure for developing general markers serving as common anchor loci across species. To accommodate users with special preferences, configuration settings and criteria can be customized.
Affiliation(s)
- Jakob Fredslund
- Bioinformatics Research Center, University of Aarhus, Hoegh-Guldbergsgade 10, 8000 Aarhus C, Denmark.
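The idea of deriving "possibly degenerate" primers from an alignment can be illustrated with a minimal Python sketch. This is not PriFi's actual algorithm, which the abstract does not specify; the function names `degenerate_consensus` and `degeneracy` are hypothetical, and the sketch assumes a gap-free window of aligned sequences containing only A, C, G and T. Each alignment column is collapsed into the IUPAC nucleotide code covering the bases observed in it.

```python
# Illustrative sketch only: NOT PriFi's algorithm. Assumes gap-free,
# equal-length aligned sequences containing only A, C, G and T.

# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases covered.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def degenerate_consensus(aligned):
    """Collapse each alignment column into a single IUPAC code."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    columns = zip(*(seq.upper() for seq in aligned))
    return "".join(IUPAC[frozenset(column)] for column in columns)


def degeneracy(primer):
    """Count the distinct plain sequences a degenerate primer represents."""
    bases_covered = {code: len(bases) for bases, code in IUPAC.items()}
    count = 1
    for code in primer:
        count *= bases_covered[code]
    return count


if __name__ == "__main__":
    window = ["ACGT", "ACAT", "ATGT"]  # hypothetical aligned window
    primer = degenerate_consensus(window)
    print(primer, degeneracy(primer))
```

A lower degeneracy means fewer distinct oligonucleotides in the primer mix, which is one plausible kind of criterion a primer-design tool might weigh; the criteria PriFi actually applies are described in the paper, not here.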
|
185
|
Abstract
Expressed sequence tag (EST) data are a major contributor to the known plant sequence space. Organization of the data into non-redundant clusters representing tentative unique genes provides snapshots of the gene repertoires of a species. This chapter reviews availability of sequences and sequence analysis results and describes several resources and tools that should facilitate broad-based utilization of EST data for gene structure annotation, gene discovery, and comparative genomics.
Affiliation(s)
- Qunfeng Dong
- Department of Genetics, Development and Cell Biology, Iowa State University, Ames, Iowa 50011-3260, USA
|
186
|
Bersoult A, Camut S, Perhald A, Kereszt A, Kiss GB, Cullimore JV. Expression of the Medicago truncatula DMI2 gene suggests roles of the symbiotic nodulation receptor kinase in nodules and during early nodule development. MOLECULAR PLANT-MICROBE INTERACTIONS : MPMI 2005; 18:869-76. [PMID: 16134899 DOI: 10.1094/mpmi-18-0869] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]
Abstract
The Medicago truncatula DMI2 gene encodes a receptor-like kinase required for establishing root endosymbioses. The DMI2 gene was shown to be expressed much more highly in roots and nodules than in leaves and stems. In roots, its expression was not altered by nitrogen starvation or treatment with lipochitooligosaccharidic Nod factors. Moreover, the DMI2 mRNA abundance in roots of the nfp, dmi1, dmi3, nsp1, nsp2, and hcl symbiotic mutants was similar to the wild type, whereas lower levels in some dmi2 mutants could be explained by regulation by the nonsense-mediated decay RNA surveillance mechanism. Using pDMI2::GUS fusions, the expression of DMI2 in roots appeared to be localized primarily in the cortical and epidermal cells of the younger, lateral roots and was not observed in the root apices. Following inoculation with Sinorhizobium meliloti, the DMI2 gene was induced in the nodule primordia, before penetration by the infection threads. No increased expression was seen in lateral-root primordia. In nodules, expression was observed primarily in a few cell layers of the pre-infection zone. These results are consistent with the DMI2 gene mediating Nod factor perception and transduction leading to rhizobial infection, not only in root epidermal cells but also during nodule development.
Affiliation(s)
- Anne Bersoult
- Laboratoire des Interactions Plantes-Microorganismes, CNRS-INRA, BP52627, 31326 Castanet-Tolosan Cedex, France
|
187
|
Wang Z, Triezenberg SJ, Thomashow MF, Stockinger EJ. Multiple hydrophobic motifs in Arabidopsis CBF1 COOH-terminus provide functional redundancy in trans-activation. PLANT MOLECULAR BIOLOGY 2005; 58:543-59. [PMID: 16021338 DOI: 10.1007/s11103-005-6760-4] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/22/2005] [Accepted: 05/02/2005] [Indexed: 05/03/2023]
Abstract
The Arabidopsis CBF proteins activate expression of a set of genes whose upstream regulatory sequences typically harbor one or more copies of the CRT/DRE low temperature cis-acting DNA regulatory element. Using domain swap experiments in both yeast and Arabidopsis we show that the NH2-terminal 115 amino acids direct CBF1 to target genes and the COOH-terminal 98 amino acids function in trans-activation. Mutational analysis through the COOH-terminus using truncation and alanine-substitution mutants in yeast revealed four motifs that contribute positively towards activation. Overexpression of mutants in plants supported this conclusion and also indicated that disruption of a single motif did not seriously compromise activity unless combined with the disruption of a second. These motifs consist of clusters of hydrophobic residues which are delimited from one another by short stretches of Asp, Glu, Pro and other residues favoring the formation of loops. This structural pattern is conserved across plant taxa as revealed through alignment of Arabidopsis CBF1 with homologous sequences from a diverse array of plant species. Overexpression in plants of the CBF1 COOH-terminus as a fusion with the yeast GAL4 DNA binding domain also resulted in severe stunting of growth, a phenotype which was alleviated if the activation domain was rendered ineffective. Taken together these results suggest that high level overexpression of an active CBF activation domain compromises plant growth.
Affiliation(s)
- Zhibin Wang
- Department of Horticulture and Crop Science, The Ohio State University/OARDC, Wooster, OH 44691, USA
|
188
|
Jiang Z, Wu XL, Garcia MD, Griffin KB, Michal JJ, Ott TL, Gaskins CT, Wright RW. Comparative gene-based in silico analysis of transcriptomes in different bovine tissues and (or) organs. Genome 2005; 47:1164-72. [PMID: 15644975 DOI: 10.1139/g04-084] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
Abstract
A gene-based approach was used to annotate 322,168 cattle expressed sequence tags (ESTs) based on human genes in order to census the transcriptomes, analyze their expression similarities, and identify genes preferentially expressed in different bovine tissues and (or) organs. Of the 34,157 human coding genes used in a standalone BLAST search, 14,928 could be matched with provisional orthologous sequences in a total of 230,135 bovine ESTs. The remaining 92,033 bovine ESTs were estimated to represent an additional 5970 genes in cattle. On average, approximately 8600 genes were estimated to be expressed in a single tissue and (or) organ and 13,000 in a pooled tissue library. On the basis of the estimated numbers of genes, no more than 3% of genes would be missed when approximately 34,000 ESTs were sequenced from a single tissue and (or) organ library and approximately 40,000 ESTs from a pooled source, respectively. Cluster analyses of the gene expression patterns among 12 single tissues and (or) organs in cattle revealed that their expression similarities would depend on physiological functions. In addition, a total of 1502 genes were identified as preferentially expressed genes in these 12 single tissues and (or) organs with LOD (logarithm of the odds, base 10) ≥ 3.0. Therefore, our study provides some insights for further investigating the developmental and functional relations of various tissues and organs in mammals.
Affiliation(s)
- Zhihua Jiang
- Department of Animal Sciences, Washington State University, Pullman, WA 99164, USA.
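The EST and gene counts quoted in this abstract can be cross-checked with a few lines of Python. The sketch below only restates numbers from the abstract; the variable and function names are hypothetical, and summing the matched genes with the additional estimated genes into a single total is an assumption made here for illustration, not a figure reported by the authors.

```python
# Consistency check on the counts quoted in the abstract.
# All names are hypothetical; the combined gene total is a derived
# illustration, not a number stated by the authors.

TOTAL_ESTS = 322_168      # cattle ESTs annotated
MATCHED_ESTS = 230_135    # ESTs matched to human coding genes
MATCHED_GENES = 14_928    # human genes with provisional bovine orthologs
EXTRA_GENES = 5_970       # genes estimated from the unmatched ESTs


def unmatched_ests():
    """ESTs left over after matching against human coding genes."""
    return TOTAL_ESTS - MATCHED_ESTS


def matched_fraction():
    """Share of all ESTs that matched a human coding gene."""
    return MATCHED_ESTS / TOTAL_ESTS


def combined_gene_estimate():
    """Matched plus additional estimated genes (assumed to be additive)."""
    return MATCHED_GENES + EXTRA_GENES


if __name__ == "__main__":
    print(unmatched_ests())               # agrees with the 92,033 in the abstract
    print(round(matched_fraction(), 3))
    print(combined_gene_estimate())
```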
|
189
|
Sharov AA, Dudekula DB, Ko MSH. Genome-wide assembly and analysis of alternative transcripts in mouse. Genome Res 2005; 15:748-54. [PMID: 15867436 PMCID: PMC1088304 DOI: 10.1101/gr.3269805] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
Abstract
To build a mouse gene index with the most comprehensive coverage of alternative transcription/splicing (ATS), we developed an algorithm and a fully automated computational pipeline for transcript assembly from expressed sequences aligned to the genome. We identified 191,946 genomic loci, which included 27,497 protein-coding genes and 11,906 additional gene candidates (e.g., nonprotein-coding, but multiexon). Comparison of the resulting gene index with TIGR, UniGene, DoTS, and ESTGenes databases revealed that it had a greater number of transcripts, a greater average number of exons and introns with proper splicing sites per gene, and longer ORFs. The 27,497 protein-coding genes had 77,138 transcripts, i.e., 2.8 transcripts per gene on average. Close examination of transcripts led to a combinatorial table of 23 types of ATS units, only nine of which were previously described, i.e., 14 types of alternative splicing, seven types of alternative starts, and two types of alternative termination. Of the 20,323 multiexon protein-coding genes with proper splice sites, 47%, 18%, and 14% had alternative splicings, alternative starts, and alternative terminations, respectively. The gene index with the comprehensive ATS will provide a useful platform for analyzing the nature and mechanism of ATS, as well as for designing accurate exon-based DNA microarrays. The sequence data from this study have been submitted to GenBank under accession numbers: CK329321-CK334090; CF891695-CF906652; CF906741-CF916750; CK334091-CK347104; CK387035-CK393993; CN660032-CN690720; CN690721-CN725493.
Affiliation(s)
- Alexei A Sharov
- Developmental Genomics and Aging Section, Laboratory of Genetics, National Institute on Aging, National Institutes of Health, Baltimore, MD 21224, USA
|
190
|
Weng JK, Tanurdzic M, Chapple C. Functional analysis and comparative genomics of expressed sequence tags from the lycophyte Selaginella moellendorffii. BMC Genomics 2005; 6:85. [PMID: 15938755 PMCID: PMC1184070 DOI: 10.1186/1471-2164-6-85] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.8] [Received: 03/05/2005] [Accepted: 06/06/2005] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The lycophyte Selaginella moellendorffii is a member of one of the oldest lineages of vascular plants on Earth. Fossil records show that the lycophyte clade arose 400 million years ago, 150-200 million years earlier than angiosperms, a group of plants that includes the well-studied flowering plant Arabidopsis thaliana. S. moellendorffii has a genome size of approximately 100 Mbp, as small as or smaller than that of A. thaliana. S. moellendorffii has the potential to provide significant comparative information to better understand the evolution of vascular plants. RESULTS We sequenced 2181 Expressed Sequence Tags (ESTs) from a S. moellendorffii cDNA library. One thousand three hundred and one non-redundant sequences were assembled, containing 291 contigs and 1010 singletons. Approximately 75% of the ESTs matched proteins in the non-redundant protein database. Among 1301 clusters, 343 were categorized according to Gene Ontology (GO) hierarchy and were compared to the GO mapping of A. thaliana tentative consensus sequences. We compared S. moellendorffii ESTs to the A. thaliana and Physcomitrella patens EST databases, using the tBLASTX algorithm. Approximately 60% of the ESTs exhibited similarity with both A. thaliana and P. patens ESTs, whereas 13% and 1% of the ESTs had exclusive similarity with A. thaliana and P. patens ESTs, respectively. A substantial proportion of the ESTs (26%) had no match with A. thaliana or P. patens ESTs. CONCLUSION We discovered 1301 putative unigenes in S. moellendorffii. These results give an initial insight into its transcriptome that will aid in the study of the S. moellendorffii genome in the near future.
Affiliation(s)
- Jing-Ke Weng
- Department of Biochemistry, Purdue University, West Lafayette, IN 47907, USA
- Milos Tanurdzic
- Department of Botany and Plant Pathology, Purdue University, West Lafayette, IN 47907, USA
- current address, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Clint Chapple
- Department of Biochemistry, Purdue University, West Lafayette, IN 47907, USA
|
191
|
Guerrero FD, Miller RJ, Rousseau ME, Sunkara S, Quackenbush J, Lee Y, Nene V. BmiGI: a database of cDNAs expressed in Boophilus microplus, the tropical/southern cattle tick. Insect Biochemistry and Molecular Biology 2005; 35:585-595. [PMID: 15857764 DOI: 10.1016/j.ibmb.2005.01.020] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.4] [Received: 11/19/2004] [Revised: 01/26/2005] [Accepted: 01/26/2005] [Indexed: 05/24/2023]
Abstract
We used an expressed sequence tag approach to initiate a study of the genome of the southern cattle tick, Boophilus microplus. A normalized cDNA library was synthesized from pooled RNA purified from tick larvae which had been subjected to different treatments, including acaricide exposure, heat shock, cold shock, host odor, and infection with Babesia bovis. For the acaricide exposure experiments, we used several strains of ticks, which varied in their levels of susceptibility to pyrethroid, organophosphate and amitraz. We also included RNA purified from samples of eggs, nymphs and adult ticks and dissected tick organs. Plasmid DNA was prepared from 11,520 cDNA clones and both 5' and 3' sequencing performed on each clone. The sequence data was used to search public protein databases and a B. microplus gene index was constructed, consisting of 8270 unique sequences whose associated putative functional assignments, when available, can be viewed at the TIGR website (http://www.tigr.org/tdb/tgi). A number of novel sequences were identified which possessed significant sequence similarity to genes, which might be involved in resistance to acaricides.
Affiliation(s)
- F D Guerrero
- USDA-ARS, Knipling Bushland US Livestock Insect Research Laboratory, 2700 Fredericksburg Road, Kerrville, TX 78028, USA.
|
192
|
Wilson HL, Aich P, Roche FM, Jalal S, Hodgson PD, Brinkman FSL, Potter A, Babiuk LA, Griebel PJ. Molecular analyses of disease pathogenesis: application of bovine microarrays. Vet Immunol Immunopathol 2005; 105:277-87. [PMID: 15808306 PMCID: PMC7112672 DOI: 10.1016/j.vetimm.2005.02.015] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 12/17/2022]
Abstract
The molecular analysis of disease pathogenesis in cattle has been limited by the lack of availability of tools to analyze both host and pathogen responses. These limitations are disappearing with the advent of methodologies such as microarrays that facilitate rapid characterization of global gene expression at the level of individual cells and tissues. The present review focuses on the use of microarray technologies to investigate the functional pathogenomics of infectious disease in cattle. We discuss a number of unique issues that must be addressed when designing both in vitro and in vivo model systems to analyze host responses to a specific pathogen. Furthermore, comparative functional genomic strategies are discussed that can be used to address questions regarding host responses that are either common to a variety of pathogens or unique to individual pathogens. These strategies can also be applied to investigations of cell signaling pathways and the analyses of innate immune responses. Microarray analyses of both host and pathogen responses hold substantial promise for the generation of databases that can be used in the future to address a wide variety of questions. A critical component limiting these comparative analyses will be the quality of the databases and the complete functional annotation of the bovine genome. These limitations are discussed with an indication of future developments that will accelerate the validation of data generated when completing a molecular characterization of disease pathogenesis in cattle.
Affiliation(s)
- Heather L Wilson
- Vaccine and Infectious Disease Organization, University of Saskatchewan, Saskatoon, Sask., Canada S7N 5E3
|
193
|
Silverstein KAT, Graham MA, Paape TD, VandenBosch KA. Genome organization of more than 300 defensin-like genes in Arabidopsis. Plant Physiology 2005; 138:600-10. [PMID: 15955924 PMCID: PMC1150381 DOI: 10.1104/pp.105.060079] [Citation(s) in RCA: 188] [Impact Index Per Article: 9.4] [Indexed: 05/03/2023]
Abstract
Defensins represent an ancient and diverse set of small, cysteine-rich, antimicrobial peptides in mammals, insects, and plants. According to published accounts, most species' genomes contain 15 to 50 defensins. Starting with a set of largely nodule-specific defensin-like sequences (DEFLs) from the model legume Medicago truncatula, we built motif models to search the near-complete Arabidopsis (Arabidopsis thaliana) genome. We identified 317 DEFLs, yet 80% were unannotated at The Arabidopsis Information Resource and had no prior evidence of expression. We demonstrate that many of these DEFL genes are clustered in the Arabidopsis genome and that individual clusters have evolved from successive rounds of gene duplication and divergent or purifying selection. Sequencing reverse transcription-PCR products from five DEFL clusters confirmed our gene predictions and verified expression. For four of the largest clusters of DEFLs, we present the first evidence of expression, most frequently in floral tissues. To determine the abundance of DEFLs in other plant families, we used our motif models to search The Institute for Genomic Research's gene indices and identified approximately 1,100 DEFLs. These expressed DEFLs were found mostly in reproductive tissues, consistent with our reverse transcription-PCR results. Sequence-based clustering of all identified DEFLs revealed separate tissue- or taxon-specific subgroups. Previously, we and others showed that more than 300 DEFL genes were expressed in M. truncatula nodules, organs not present in most plants. We have used this information to annotate the Arabidopsis genome and now provide evidence of a large DEFL superfamily present in expressed tissues of all sequenced plants.
Affiliation(s)
- Kevin A T Silverstein
- Department of Plant Biology, University of Minnesota, St. Paul, Minnesota 55108, USA
|
194
|
Djebbari A, Karamycheva S, Howe E, Quackenbush J. MeSHer: identifying biological concepts in microarray assays based on PubMed references and MeSH terms. Bioinformatics 2005; 21:3324-6. [PMID: 15919728 DOI: 10.1093/bioinformatics/bti503] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022] Open
Abstract
MeSHer uses a simple statistical approach to identify biological concepts in the form of Medical Subject Headings (MeSH terms) obtained from the PubMed database that are significantly overrepresented within the identified gene set relative to those associated with the overall collection of genes on the underlying DNA microarray platform. As a demonstration, we apply this approach to gene lists acquired from a published study of the effects of angiotensin II (Ang II) treatment on cardiac gene expression and demonstrate that this approach can aid in the interpretation of the resulting 'significant' gene set. AVAILABILITY The software is available at http://www.tm4.org. SUPPLEMENTARY INFORMATION Results from the analysis of significant genes from the published Ang II study.
Affiliation(s)
- Amira Djebbari
- The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, MD 20850, USA
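The MeSHer abstract above does not name the statistical test it uses. As a minimal sketch, assuming a one-sided hypergeometric test (a common choice for overrepresentation analysis, not confirmed as MeSHer's method), the idea of comparing term frequencies in a selected gene set against the whole microarray platform might look like this. The function names and the input layout are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which are annotated with the term of interest."""
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def overrepresented_terms(gene_terms, selected, alpha=0.05):
    """gene_terms: dict mapping every gene on the platform to its set of terms.
    selected: genes in the 'significant' list.
    Returns (term, count_in_selected, count_on_platform, p_value) tuples,
    sorted by p-value. No multiple-testing correction is applied here."""
    selected = set(selected) & set(gene_terms)  # ignore genes not on the platform
    N, n = len(gene_terms), len(selected)
    background, foreground = {}, {}
    for gene, terms in gene_terms.items():
        for term in terms:
            background[term] = background.get(term, 0) + 1
            if gene in selected:
                foreground[term] = foreground.get(term, 0) + 1
    results = []
    for term, k in foreground.items():
        p = hypergeom_upper_tail(k, n, background[term], N)
        if p < alpha:
            results.append((term, k, background[term], p))
    return sorted(results, key=lambda r: r[3])
```

In practice many terms are tested at once, so a correction for multiple comparisons would usually be layered on top of the raw p-values; that step is left out of this sketch.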
|
195
|
Gharbi K, Ferguson MM, Danzmann RG. Characterization of Na, K-ATPase genes in Atlantic salmon (Salmo salar) and comparative genomic organization with rainbow trout (Oncorhynchus mykiss). Mol Genet Genomics 2005; 273:474-83. [PMID: 15883826 DOI: 10.1007/s00438-005-1135-8] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.0] [Received: 07/30/2004] [Accepted: 02/28/2005] [Indexed: 10/25/2022]
Abstract
A combination of molecular and in silico approaches was employed to assemble a survey of Na, K-ATPase genes contained in the ancestrally tetraploid genome of the Atlantic salmon (Salmo salar). Molecular characterization of genomic clones coding for the alpha subunit revealed two single genes (alpha1a and alpha2) and two pairs of presumably homeologous genes (alpha1b/i-ii and alpha1c/i-ii). Each of the six genes showed high sequence similarity to isoforms previously isolated from rainbow trout and extensive structural differences relative to putative orthologs in the human genome. In silico analysis of expressed sequence tag (EST) collections indicated that at least five alpha (alpha1a, alpha1b, alpha1c, alpha2, and alpha3) and four beta (beta1a, beta1b, beta2, and beta3b) subunit isoforms are expressed in Atlantic salmon. Meiotic linkage analysis further showed that Na, K-ATPase genes are dispersed throughout the salmon genome, with the exception of two multigene clusters on linkage groups AS-22 and AS-28. Duplicate gene copies for the isoform alpha1b were assigned to linkage groups with multiple homeologous anchors (AS-22 and AS-23), while beta2 duplicates suggested a new homeologous affinity between AS-05 and AS-28. In addition, the comparison of linkage arrangements with rainbow trout also showed that the genomic organization of Na, K-ATPase genes is consistent with the evolutionary conservation of syntenic chromosome regions between these species.
Affiliation(s)
- Karim Gharbi
- Department of Integrative Biology, University of Guelph, Guelph, ON, N1G 2W1 Canada
|
196
|
Kim N, Shin S, Lee S. ECgene: genome-based EST clustering and gene modeling for alternative splicing. Genome Res 2005; 15:566-76. [PMID: 15805497 PMCID: PMC1074371 DOI: 10.1101/gr.3030405] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.1] [Indexed: 12/28/2022]
Abstract
With the availability of the human genome map and fast algorithms for sequence alignment, genome-based EST clustering became a viable method for gene modeling. We developed a novel gene-modeling method, ECgene (Gene modeling by EST Clustering), which combines genome-based EST clustering and the transcript assembly procedure in a coherent and consistent fashion. Specifically, ECgene takes alternative splicing events into consideration. The position of splice sites (i.e., exon-intron boundaries) in the genome map is utilized as the critical information in the whole procedure. Sequences that share any splice sites are grouped together to define an EST cluster in a manner similar to that of the genome-based version of the UniGene algorithm. Transcript assembly is achieved using graph theory that represents the exon connectivity in each cluster as a directed acyclic graph (DAG). Distinct paths along exons correspond to possible gene models encompassing all alternative splicing events. EST sequences in each cluster are subclustered further according to the compatibility with gene structure of each splice variant, and they can be regarded as clone evidence for the corresponding isoform. The reliability of each isoform is assessed from the nature of cluster members and from the minimum number of clones required to reconstruct all exons in the transcript.
Affiliation(s)
- Namshin Kim
- Division of Molecular Life Sciences, Ewha Womans University, Seoul 120-750, Korea
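The two steps named in the ECgene abstract above, grouping sequences that share any splice site and reading gene models off as paths through a directed acyclic graph of exons, can be illustrated with a small sketch. This is not the ECgene implementation: the input layout, the function names, and the use of union-find for the grouping are assumptions made for illustration only.

```python
from collections import defaultdict


def cluster_by_splice_sites(est_sites):
    """est_sites: dict mapping an EST id to its set of splice-site coordinates,
    e.g. ("chr1", 200). ESTs sharing any site end up in the same cluster."""
    parent = {est: est for est in est_sites}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # splice site -> first EST observed with that site
    for est, sites in est_sites.items():
        for site in sites:
            if site in first_seen:
                parent[find(est)] = find(first_seen[site])
            else:
                first_seen[site] = est

    clusters = defaultdict(set)
    for est in est_sites:
        clusters[find(est)].add(est)
    return sorted(sorted(members) for members in clusters.values())


def enumerate_isoforms(exon_edges):
    """exon_edges: (upstream_exon, downstream_exon) pairs forming a DAG.
    Returns every path from an exon with no predecessor to an exon with
    no successor; each path is one candidate gene model."""
    successors = defaultdict(set)
    nodes, has_predecessor = set(), set()
    for upstream, downstream in exon_edges:
        successors[upstream].add(downstream)
        has_predecessor.add(downstream)
        nodes.update((upstream, downstream))

    paths = []

    def walk(node, path):
        path = path + [node]
        if not successors.get(node):
            paths.append(path)
            return
        for nxt in sorted(successors[node]):
            walk(nxt, path)

    for start in sorted(nodes - has_predecessor):
        walk(start, [])
    return paths
```

With three exons where the middle one can be skipped, `enumerate_isoforms([("E1", "E2"), ("E2", "E3"), ("E1", "E3")])` yields two paths, one per splice variant. The number of paths can grow quickly with many alternative exons, which is one reason the abstract's reliability assessment of each isoform matters.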
|
197
|
Lee YH, Moon IJ, Hur B, Park JH, Han KH, Uhm SY, Kim YJ, Kang KJ, Park JW, Seu YB, Kim YH, Park JG. Gene knockdown by large circular antisense for high-throughput functional genomics. Nat Biotechnol 2005; 23:591-9. [PMID: 15867911 DOI: 10.1038/nbt1089] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 12/15/2004] [Accepted: 03/14/2005] [Indexed: 11/08/2022]
Abstract
Single-stranded genomic DNA of recombinant M13 phages was tested as an antisense molecule and examined for its usefulness in high-throughput functional genomics. cDNA fragments of various genes (TNF-alpha, c-myc, c-myb, cdk2 and cdk4) were independently cloned into phagemid vectors. Using the life cycle of M13 bacteriophages, large circular (LC)-molecules, antisense to their respective genes, were prepared from the culture supernatant of bacterial transformants. LC-antisense molecules exhibited enhanced stability, target specificity and no need for target-site searches. High-throughput functional genomics was then attempted with an LC-antisense library, which was generated by using a phagemid vector that incorporated a unidirectional subtracted cDNA library derived from liver cancer tissue. We identified 56 genes involved in the growth of these cells. These results indicate that an antisense sequence as a part of single-stranded LC-genomic DNA of recombinant M13 phages exhibits effective antisense activity, and may have potential for high-throughput functional genomics.
Affiliation(s)
- Yun-Han Lee
- WelGENE Inc., 71B 4L, Development Sector 2-3, Sungseo Industrial Park, Dalseogu, Daegu, 704-230, South Korea
|
198
|
Yuan Q, Ouyang S, Wang A, Zhu W, Maiti R, Lin H, Hamilton J, Haas B, Sultana R, Cheung F, Wortman J, Buell CR. The Institute for Genomic Research Osa1 rice genome annotation database. Plant Physiology 2005; 138:18-26. [PMID: 15888674 PMCID: PMC1104156 DOI: 10.1104/pp.104.059063] [Citation(s) in RCA: 157] [Impact Index Per Article: 7.9] [Indexed: 05/02/2023]
Abstract
We have developed a rice (Oryza sativa) genome annotation database (Osa1) that provides structural and functional annotation for this emerging model species. Using the sequence of O. sativa subsp. japonica cv Nipponbare from the International Rice Genome Sequencing Project, pseudomolecules, or virtual contigs, of the 12 rice chromosomes were constructed. Our most recent release, version 3, represents our third build of the pseudomolecules and is composed of 98% finished sequence. Genes were identified using a series of computational methods developed for Arabidopsis (Arabidopsis thaliana) that were modified for use with the rice genome. In release 3 of our annotation, we identified 57,915 genes, of which 14,196 are related to transposable elements. Of these 43,719 non-transposable element-related genes, 18,545 (42.4%) were annotated with a putative function, 5,777 (13.2%) were annotated as encoding an expressed protein with no known function, and the remaining 19,397 (44.4%) were annotated as encoding a hypothetical protein. Multiple splice forms (5,873) were detected for 2,538 genes, resulting in a total of 61,250 gene models in the rice genome. We incorporated experimental evidence into 18,252 gene models to improve the quality of the structural annotation. A series of functional data types has been annotated for the rice genome that includes alignment with genetic markers, assignment of gene ontologies, identification of flanking sequence tags, alignment with homologs from related species, and syntenic mapping with other cereal species. All structural and functional annotation data are available through interactive search and display windows as well as through download of flat files. To integrate the data with other genome projects, the annotation data are available through a Distributed Annotation System and a Genome Browser. All data can be obtained through the project Web pages at http://rice.tigr.org.
Affiliation(s)
- Qiaoping Yuan
- The Institute for Genomic Research, Rockville, Maryland 20850, USA
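The gene counts quoted in the Osa1 abstract above can be cross-checked with a few lines of arithmetic. The sketch below uses only figures from the abstract; the function and key names are hypothetical, and the way the splice forms are combined with the gene count is an assumed reading, chosen because it reproduces the stated total of 61,250 gene models.

```python
def osa1_release3_summary():
    """Recompute the derived figures in the Osa1 release 3 abstract."""
    total_genes = 57_915
    te_related = 14_196
    non_te = total_genes - te_related  # genes not related to transposable elements

    categories = {
        "putative function": 18_545,
        "expressed, no known function": 5_777,
        "hypothetical": 19_397,
    }
    shares = {name: round(100 * n / non_te, 1) for name, n in categories.items()}

    # Assumed reading: the 2,538 alternatively spliced genes are represented
    # by 5,873 splice forms in place of one model each.
    splice_forms = 5_873
    genes_with_splice_forms = 2_538
    gene_models = total_genes - genes_with_splice_forms + splice_forms

    return {
        "non_te_genes": non_te,
        "category_total": sum(categories.values()),
        "shares_percent": shares,
        "gene_models": gene_models,
    }
```

Under these assumptions the three annotation categories sum exactly to the non-transposable-element gene count, and the percentages round to the values given in the abstract.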
|
199
|
Jantasuriyarat C, Gowda M, Haller K, Hatfield J, Lu G, Stahlberg E, Zhou B, Li H, Kim H, Yu Y, Dean RA, Wing RA, Soderlund C, Wang GL. Large-scale identification of expressed sequence tags involved in rice and rice blast fungus interaction. Plant Physiology 2005; 138:105-15. [PMID: 15888683 PMCID: PMC1104166 DOI: 10.1104/pp.104.055624] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.0] [Indexed: 05/02/2023]
Abstract
To better understand the molecular basis of the defense response against the rice blast fungus (Magnaporthe grisea), a large-scale expressed sequence tag (EST) sequencing approach was used to identify genes involved in the early infection stages in rice (Oryza sativa). Six cDNA libraries were constructed using infected leaf tissues harvested from 6 conditions: resistant, partially resistant, and susceptible reactions at both 6 and 24 h after inoculation. Two additional libraries were constructed using uninoculated leaves and leaves from the lesion mimic mutant spl11. A total of 68,920 ESTs were generated from 8 libraries. Clustering and assembly analyses resulted in 13,570 unique sequences from 10,934 contigs and 2,636 singletons. Gene function classification showed that 42% of the ESTs were predicted to have putative gene function. Comparison of the pathogen-challenged libraries with the uninoculated control library revealed an increase in the percentage of genes in the functional categories of defense and signal transduction mechanisms and cell cycle control, cell division, and chromosome partitioning. In addition, hierarchical clustering analysis grouped the eight libraries based on their disease reactions. A total of 7,748 new and unique ESTs were identified from our collection compared with the KOME full-length cDNA collection. Interestingly, we found that rice ESTs are more closely related to sorghum (Sorghum bicolor) ESTs than to barley (Hordeum vulgare), wheat (Triticum aestivum), and maize (Zea mays) ESTs. The large cataloged collection of rice ESTs in this study provides a solid foundation for further characterization of the rice defense response and is a useful public genomic resource for rice functional genomics studies.
|
200
|
Gene expression profiling of potato responses to cold, heat, and salt stress. Funct Integr Genomics 2005. [PMID: 15856349 DOI: 10.1007/s10142-005-0141-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 09/29/2022]
Abstract
In order to identify genes involved in abiotic stress responses in potato, seedlings were grown under controlled conditions and subjected to cold (4 °C), heat (35 °C), or salt (100 mM NaCl) stress for up to 27 h. Using an approximately 12,000 clone potato cDNA microarray, expression profiles were captured at three time points following initiation of the stress (3, 9, and 27 h) from two different tissues, roots and leaves. A total of 3,314 clones could be identified as significantly up- or down-regulated in response to at least one stress condition. The genes represented by these clones encode transcription factors, signal transduction factors, and heat-shock proteins which have been associated with abiotic stress responses in Arabidopsis and rice, suggesting similar response pathways function in potato. These stress-regulated clones could be separated into either stress-specific or shared-response clones, suggesting the existence of general response pathways as well as more stress-specific pathways. In addition, we identified expression profiles which are indicative of the type of stress applied to the plants.
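The experimental layout and the split into stress-specific and shared-response clones described in the abstract above can be sketched as follows. The abstract does not say how the split was computed, so the rule used here (a clone is stress-specific if it is significant under exactly one stress, shared if under two or more) is an assumption, and the function names and clone identifiers are hypothetical.

```python
from itertools import product

STRESSES = ("cold", "heat", "salt")
TIME_POINTS_H = (3, 9, 27)
TISSUES = ("roots", "leaves")


def design():
    """All stress / time point / tissue combinations named in the abstract."""
    return list(product(STRESSES, TIME_POINTS_H, TISSUES))


def classify_clones(significant_in):
    """significant_in: dict mapping a clone id to the set of stresses in which
    it was called significantly regulated. Clones with an empty set are ignored.
    Returns (stress_specific, shared), where stress_specific maps each stress
    to its clones and shared is the set of clones responding to 2+ stresses."""
    stress_specific = {stress: set() for stress in STRESSES}
    shared = set()
    for clone, stresses in significant_in.items():
        if len(stresses) == 1:
            stress_specific[next(iter(stresses))].add(clone)
        elif len(stresses) > 1:
            shared.add(clone)
    return stress_specific, shared
```

The design enumerates 3 stresses at 3 time points in 2 tissues; how time points and tissues were pooled when calling a clone significant for a given stress is not stated in the abstract and is left to the caller here.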
|