51
Mucher E, Jayr L, Rossignol F, Amiot F, Gidrol X, Barrey E. Gene expression profiling in equine muscle tissues using mouse cDNA microarrays. Equine Vet J 2010:359-64. [PMID: 17402448 DOI: 10.1111/j.2042-3306.2006.tb05569.x] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 11/30/2022]
Abstract
REASONS FOR PERFORMING STUDY Progress could be achieved by using microarrays to understand metabolic adaptations and disorders in equine muscle in response to exercise. OBJECTIVES To test the feasibility of using mouse cDNA microarrays to analyse gene expression profiles in normal equine muscles. METHODS Muscle biopsies of the dorsal gluteus medius and longissimus lumborum were taken from 4 healthy Standardbreds. Total RNA was extracted from the muscle samples. The concentration and quality of RNA were measured before and after amplification. Gene expression profiles were measured using mouse cDNA microarrays including 15,264 unique genes representing about 11,000 documented genes. Three hybridisation tests were performed to check interspecificity and reproducibility and to compare gene expression in these muscles. For each test, a dye-swap hybridisation with Cy3 and Cy5 fluoromarkers was done and the gene list was filtered according to the signal level. RESULTS According to the specificity test, the mouse cDNA microarrays were correctly hybridised by equine muscle cDNA. All positive control genes (GAPDH, HPRT and beta-Actin) and no negative control gene (yeast, plant) hybridised. The reproducibility test demonstrated a good linearity between the duplicate hybridisations: 99.99% of the significantly expressed genes had an expression ratio between 1.4 and 1/1.4 = 0.71. These limits can be considered as the thresholds to qualify a gene as up-regulated (ratio >1.4) or down-regulated (ratio <0.71). In the muscle comparison test between gluteus medius vs. longissimus lumborum, 63 genes were found up-regulated and 8 genes down-regulated. The range of gene expression ratios in the gluteus medius was 0.61-8.31 x the longissimus lumborum. This list of modulated genes was classified by function using a gene ontology database. CONCLUSION Mouse microarrays could be used to hybridise equine RNA extracted from muscle tissues. For many genes there are large sequence identities that allowed interspecific cDNA hybridisation. The sensitivity of the method allowed quantification of up- and down-regulated genes after applying appropriate filters. POTENTIAL RELEVANCE Expression profiling could be used to explore the muscle metabolism changes related to exercise, training, pathology and illegal medication in horses.
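The ratio thresholds stated in the abstract above (up-regulated above 1.4, down-regulated below 1/1.4 ≈ 0.71) can be illustrated with a minimal sketch. The function name `classify_ratio` and its return labels are hypothetical and chosen only for illustration; this is not the authors' analysis pipeline.

```python
def classify_ratio(ratio: float, threshold: float = 1.4) -> str:
    """Label an expression ratio using a symmetric fold-change threshold.

    The default threshold of 1.4 is the value quoted in the abstract;
    the lower bound is its reciprocal (1/1.4, about 0.71).
    """
    if ratio <= 0:
        raise ValueError("expression ratio must be positive")
    if ratio > threshold:
        return "up-regulated"
    if ratio < 1.0 / threshold:
        return "down-regulated"
    return "unchanged"


# The two extremes of the range reported in the abstract (0.61 to 8.31).
print(classify_ratio(8.31))
print(classify_ratio(0.61))
print(classify_ratio(1.0))
```

Using the reciprocal for the lower bound keeps the rule symmetric on a log scale, so a dye-swapped measurement of the same gene receives the mirrored label.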
Affiliation(s)
- E Mucher
- INRA, Laboratoire d'Etude de la Physiologie de l'Exercice, Genopole, Evry, France
52
Amin R, Jesmin, Jamil H, Hossain MA. In silico analysis of human Telomerase Reverse Transcriptase (hTERT) gene: identification of a distant homolog of Melanoma Antigen Family Gene (MAGE). Cancer Inform 2009; 7:171-81. [PMID: 20011463 PMCID: PMC2791492 DOI: 10.4137/cin.s3392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Melanoma antigen family (MAGE) genes are widely expressed in various tumor types but silent in normal cells except germ-line cells lacking human leukocyte antigen (HLA) expression. Over 25 MAGE genes, containing either single or multiple exons, have been identified in different tissues; most are located at Xq28 on the human X chromosome and some on chromosomes 3 and 15. This in silico study predicted the genes at the hTERT location and identified a distant relative of the MAGE genes located on chromosome 5. The study identified a single exon encoding a polypeptide of ~850 residues sharing ~30% homology with Macfa-MAGE E1 and hMAGE-E1. A dbEST search with the predicted transcript matches 5′ and 3′ flanking ESTs. The predicted protein showed sequence homology within the MAGE homology domain 2 (MHD2). UCSC genome annotation of a CpG island around the coding region reveals that this gene could be silenced by methylation. The Affymetrix all-exon track indicates that the gene could be expressed in different tissues, particularly in cancer cells, as they widely undergo a genome-wide demethylation process.
Affiliation(s)
- Ruhul Amin
- Department of Microbiology, University of Dhaka, Dhaka 1000, Bangladesh.
53
Ho MCW, Johnsen H, Goetz SE, Schiller BJ, Bae E, Tran DA, Shur AS, Allen JM, Rau C, Bender W, Fisher WW, Celniker SE, Drewell RA. Functional evolution of cis-regulatory modules at a homeotic gene in Drosophila. PLoS Genet 2009; 5:e1000709. [PMID: 19893611 PMCID: PMC2763271 DOI: 10.1371/journal.pgen.1000709] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.4] [Received: 06/30/2009] [Accepted: 10/05/2009] [Indexed: 11/19/2022]
Abstract
It is a long-held belief in evolutionary biology that the rate of molecular evolution for a given DNA sequence is inversely related to the level of functional constraint. This belief holds true for the protein-coding homeotic (Hox) genes originally discovered in Drosophila melanogaster. Expression of the Hox genes in Drosophila embryos is essential for body patterning and is controlled by an extensive array of cis-regulatory modules (CRMs). How the regulatory modules functionally evolve in different species is not clear. A comparison of the CRMs for the Abdominal-B gene from different Drosophila species reveals relatively low levels of overall sequence conservation. However, embryonic enhancer CRMs from other Drosophila species direct transgenic reporter gene expression in the same spatial and temporal patterns during development as their D. melanogaster orthologs. Bioinformatic analysis reveals the presence of short conserved sequences within defined CRMs, representing gap and pair-rule transcription factor binding sites. One predicted binding site for the gap transcription factor KRUPPEL in the IAB5 CRM was found to be altered in Superabdominal (Sab) mutations. In Sab mutant flies, the third abdominal segment is transformed into a copy of the fifth abdominal segment. A model for KRUPPEL-mediated repression at this binding site is presented. These findings challenge our current understanding of the relationship between sequence evolution at the molecular level and functional activity of a CRM. While the overall sequence conservation at Drosophila CRMs is not distinctive from neighboring genomic regions, functionally critical transcription factor binding sites within embryonic enhancer CRMs are highly conserved. These results have implications for understanding mechanisms of gene expression during embryonic development, enhancer function, and the molecular evolution of eukaryotic regulatory modules.
Affiliation(s)
- Margaret C. W. Ho
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
- Holly Johnsen
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
- Sara E. Goetz
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
- Benjamin J. Schiller
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
- Esther Bae
- College of Osteopathic Medicine of the Pacific, Western University of Health Sciences, Pomona, California, United States of America
- Diana A. Tran
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
- Andrey S. Shur
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
- John M. Allen
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
- Christoph Rau
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
- Welcome Bender
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, Massachusetts, United States of America
- William W. Fisher
- Berkeley Drosophila Genome Project, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
- Susan E. Celniker
- Berkeley Drosophila Genome Project, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
- Robert A. Drewell
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
54
Salse J, Abrouk M, Murat F, Quraishi UM, Feuillet C. Improved criteria and comparative genomics tool provide new insights into grass paleogenomics. Brief Bioinform 2009; 10:619-30. [DOI: 10.1093/bib/bbp037] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.5] [Indexed: 11/13/2022]
55
Piontkivska H, Yang MQ, Larkin DM, Lewin HA, Reecy J, Elnitski L. Cross-species mapping of bidirectional promoters enables prediction of unannotated 5' UTRs and identification of species-specific transcripts. BMC Genomics 2009; 10:189. [PMID: 19393065 PMCID: PMC2688522 DOI: 10.1186/1471-2164-10-189] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Received: 12/06/2008] [Accepted: 04/24/2009] [Indexed: 11/10/2022]
Abstract
Background Bidirectional promoters are shared regulatory regions that influence the expression of two oppositely oriented genes. This type of regulatory architecture is found more frequently than expected by chance in the human genome, yet many specifics underlying the regulatory design are unknown. Given that the function of most orthologous genes is similar across species, we hypothesized that the architecture and regulation of bidirectional promoters might also be similar across species, representing a core regulatory structure and enabling annotation of these regions in additional mammalian genomes. Results By mapping the intergenic distances of genes in human, chimpanzee, bovine, murine, and rat, we show an enrichment for pairs of genes equal to or less than 1,000 bp between their adjacent 5' ends ("head-to-head") compared to pairs of genes that fall in the same orientation ("head-to-tail") or whose 3' ends are side-by-side ("tail-to-tail"). A representative set of 1,369 human bidirectional promoters was mapped to orthologous sequences in other mammals. We confirmed predictions for 5' UTRs in nine of ten manual picks in bovine based on comparison to the orthologous human promoter set and in six of seven predictions in human based on comparison to the bovine dataset. The two predictions that did not have orthology as bidirectional promoters in the other species resulted from unique events that initiated transcription in the opposite direction in only those species. We found evidence supporting the independent emergence of bidirectional promoters from the family of five RecQ helicase genes, which gained their bidirectional promoters and partner genes independently rather than through a duplication process. Furthermore, by expanding our comparisons from pairwise to multispecies analyses we developed a map representing a core set of bidirectional promoters in mammals. 
Conclusion We show that the orthologous positions of bidirectional promoters provide a reliable guide to directly annotate over one thousand regulatory regions in sequences of mammalian genomes, while also serving as a useful tool to predict 5' UTR positions and identify genes that are novel to a single species.
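The three gene-pair arrangements named in the abstract above (head-to-head, head-to-tail, tail-to-tail) and the 1,000 bp cutoff can be sketched as follows. This is a simplified illustration that assumes two adjacent, non-overlapping genes on the same chromosome; the `Gene` record and `classify_pair` function are hypothetical names, not the authors' mapping software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # leftmost genomic coordinate
    end: int     # rightmost genomic coordinate
    strand: str  # "+" or "-"


def classify_pair(a: Gene, b: Gene, max_gap: int = 1000):
    """Return (arrangement, intergenic gap, is_bidirectional_candidate)."""
    left, right = sorted((a, b), key=lambda g: g.start)
    gap = right.start - left.end  # intergenic distance in bp
    if left.strand == "-" and right.strand == "+":
        # The 5' ends face each other, so the gap separates the two 5' ends.
        arrangement = "head-to-head"
    elif left.strand == "+" and right.strand == "-":
        # The 3' ends are side by side.
        arrangement = "tail-to-tail"
    else:
        # Both genes are transcribed in the same orientation.
        arrangement = "head-to-tail"
    candidate = arrangement == "head-to-head" and 0 <= gap <= max_gap
    return arrangement, gap, candidate


# Made-up coordinates, for illustration only.
print(classify_pair(Gene("geneA", 100, 500, "-"), Gene("geneB", 1200, 2000, "+")))
```

Sorting the pair by start coordinate first means the strand test does not depend on the order in which the two genes are passed in.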
Affiliation(s)
- Helen Piontkivska
- Department of Biological Sciences, Kent State University, Kent, Ohio 44242, USA.
56
Peterson BK, Hare EE, Iyer VN, Storage S, Conner L, Papaj DR, Kurashima R, Jang E, Eisen MB. Big genomes facilitate the comparative identification of regulatory elements. PLoS One 2009; 4:e4688. [PMID: 19259274 PMCID: PMC2650094 DOI: 10.1371/journal.pone.0004688] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.2] [Received: 01/03/2009] [Accepted: 01/08/2009] [Indexed: 01/08/2023]
Abstract
The identification of regulatory sequences in animal genomes remains a significant challenge. Comparative genomic methods that use patterns of evolutionary conservation to identify non-coding sequences with regulatory function have yielded many new vertebrate enhancers. However, these methods have not contributed significantly to the identification of regulatory sequences in sequenced invertebrate taxa. We demonstrate here that this differential success, which is often attributed to fundamental differences in the nature of vertebrate and invertebrate regulatory sequences, is instead primarily a product of the relatively small size of sequenced invertebrate genomes. We sequenced and compared loci involved in early embryonic patterning from four species of true fruit flies (family Tephritidae) that have genomes four to six times larger than those of Drosophila melanogaster. Unlike in Drosophila, where virtually all non-coding DNA is highly conserved, blocks of conserved non-coding sequence in tephritids are flanked by large stretches of poorly conserved sequence, similar to what is observed in vertebrate genomes. We tested the activities of nine conserved non-coding sequences flanking the even-skipped gene of the tephritid Ceratitis capitata in transgenic D. melanogaster embryos, six of which drove patterns that recapitulate those of known D. melanogaster enhancers. In contrast, none of the three non-conserved tephritid non-coding sequences that we tested drove expression in D. melanogaster embryos. Based on the landscape of non-coding conservation in tephritids, and our initial success in using conservation in tephritids to identify D. melanogaster regulatory sequences, we suggest that comparison of tephritid genomes may provide a systematic means to annotate the non-coding portion of the D. melanogaster genome. We also propose that large genomes be given more consideration in the selection of species for comparative genomics projects, to provide increased power to detect functional non-coding DNAs and to provide a less biased view of the evolution and function of animal genomes.
Affiliation(s)
- Brant K. Peterson
- Department of Molecular and Cell Biology, University of California, Berkeley, California, United States of America
- Genomics Division, Ernest Orlando Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
- Emily E. Hare
- Department of Molecular and Cell Biology, University of California, Berkeley, California, United States of America
- Genomics Division, Ernest Orlando Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
- Venky N. Iyer
- Department of Molecular and Cell Biology, University of California, Berkeley, California, United States of America
- Steven Storage
- Department of Molecular and Cell Biology, University of California, Berkeley, California, United States of America
- Laura Conner
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, Arizona, United States of America
- Daniel R. Papaj
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, Arizona, United States of America
- Rick Kurashima
- Pacific Basin Agricultural Research Center, United States Department of Agriculture, Hilo, Hawaii, United States of America
- Eric Jang
- Pacific Basin Agricultural Research Center, United States Department of Agriculture, Hilo, Hawaii, United States of America
- Michael B. Eisen
- Department of Molecular and Cell Biology, University of California, Berkeley, California, United States of America
- Genomics Division, Ernest Orlando Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
- Howard Hughes Medical Institute, University of California, Berkeley, California, United States of America
- California Institute of Quantitative Biosciences, University of California, Berkeley, California, United States of America
- Center for Integrative Genomics, University of California, Berkeley, California, United States of America
57
Bahar B, Sweeney T. Mapping of the transcription start site (TSS) and identification of SNPs in the bovine neuropeptide Y (NPY) gene. BMC Genet 2008; 9:91. [PMID: 19105820 PMCID: PMC2657160 DOI: 10.1186/1471-2156-9-91] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 08/01/2008] [Accepted: 12/23/2008] [Indexed: 11/10/2022]
Abstract
BACKGROUND Neuropeptide Y is a key neurotransmitter of the central nervous system which plays a vital role in the feed energy homeostasis in mammals. Mutations in the regulatory and coding regions of the bovine NPY gene can potentially affect the neuronal regulation of appetite and feeding behaviour in cattle. The objectives of this experiment were to: a) fully characterize the bovine NPY gene transcript and b) identify the SNP diversity in both coding and non-coding regions of the NPY gene in a panel of Bos taurus and B. indicus cattle. RESULTS The bovine NPY gene consists of four exons (99, 188, 81 and 195 nucleotides) and three introns. The promoter region of the NPY gene consists of TATA and GC boxes which are separated from the transcription start site (TSS) by 29 and ~100 nt, respectively. Analyses of the tissue specific expression of the bovine NPY gene revealed the presence of highly abundant NPY gene transcripts in the arcuate nucleus, cerebral and cerebellar regions of the bovine brain. We identified a total of 59 SNPs in the 8.4 kb region of the bovine NPY gene. Seven out of nine total SNPs in the promoter region affect binding of putative transcription factors. A high level of nucleotide diversity was evident in the promoter regions (2.84 × 10⁻³) compared to the exonic (1.44 × 10⁻³), intronic (1.30 × 10⁻³) and 3' untranslated (1.26 × 10⁻³) regions. CONCLUSION The SNPs identified in different regions of the bovine NPY gene may serve as a basis for understanding the regulation of the expression of the bovine NPY gene under a variety of physiological conditions and identification of genotypes with high feed energy conversion efficiency.
Affiliation(s)
- Bojlul Bahar
- Cell and Molecular Biology Lab, School of Agriculture, Food Science & Veterinary Medicine, Veterinary Science Centre, University College Dublin, Belfield, Dublin 4, Ireland.
58
Abstract
The strategic importance of the genome sequence of the gray, short-tailed opossum, Monodelphis domestica, accrues from both the unique phylogenetic position of metatherian (marsupial) mammals and the fundamental biologic characteristics of metatherians that distinguish them from other mammalian species. Metatherian and eutherian (placental) mammals are more closely related to one another than to other vertebrate groups, and owing to this close relationship they share fundamentally similar genetic structures and molecular processes. However, during their long evolutionary separation these alternative mammals have developed distinctive anatomical, physiologic, and genetic features that hold tremendous potential for examining relationships between the molecular structures of mammalian genomes and the functional attributes of their components. Comparative analyses using the opossum genome have already provided a wealth of new evidence regarding the importance of noncoding elements in the evolution of mammalian genomes, the role of transposable elements in driving genomic innovation, and the relationships between recombination rate, nucleotide composition, and the genomic distributions of repetitive elements. The genome sequence is also beginning to enlarge our understanding of the evolution and function of the vertebrate immune system, and it provides an alternative model for investigating mechanisms of genomic imprinting. Equally important, availability of the genome sequence is fostering the development of new research tools for physical and functional genomic analyses of M. domestica that are expanding its versatility as an experimental system for a broad range of research applications in basic biology and biomedically oriented research.
59
Antonellis A, Huynh JL, Lee-Lin SQ, Vinton RM, Renaud G, Loftus SK, Elliot G, Wolfsberg TG, Green ED, McCallion AS, Pavan WJ. Identification of neural crest and glial enhancers at the mouse Sox10 locus through transgenesis in zebrafish. PLoS Genet 2008; 4:e1000174. [PMID: 18773071 PMCID: PMC2518861 DOI: 10.1371/journal.pgen.1000174] [Citation(s) in RCA: 89] [Impact Index Per Article: 5.6] [Received: 05/21/2008] [Accepted: 07/17/2008] [Indexed: 11/18/2022]
Abstract
Sox10 is a dynamically regulated transcription factor gene that is essential for the development of neural crest-derived and oligodendroglial populations. Developmental genes often require multiple regulatory sequences that integrate discrete and overlapping functions to coordinate their expression. To identify Sox10 cis-regulatory elements, we integrated multiple model systems, including cell-based screens and transposon-mediated transgenesis in zebrafish, to scrutinize mammalian conserved, noncoding genomic segments at the mouse Sox10 locus. We demonstrate that eight of 11 Sox10 genomic elements direct reporter gene expression in transgenic zebrafish similar to patterns observed in transgenic mice, despite an absence of observable sequence conservation between mice and zebrafish. Multiple segments direct expression in overlapping populations of neural crest derivatives and glial cells, ranging from pan-Sox10 and pan-neural crest regulatory control to the modulation of expression in subpopulations of Sox10-expressing cells, including developing melanocytes and Schwann cells. Several sequences demonstrate overlapping spatial control, yet direct expression in incompletely overlapping developmental intervals. We were able to partially explain neural crest expression patterns by the presence of head-to-head SoxE family binding sites within two of the elements. Moreover, we were able to use this transcription factor binding site signature to identify the corresponding zebrafish enhancers in the absence of overall sequence homology. We demonstrate the utility of zebrafish transgenesis as a high-fidelity surrogate in the dissection of mammalian gene regulation, especially those with dynamically controlled developmental expression.
Affiliation(s)
- Anthony Antonellis
- Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Jimmy L. Huynh
- McKusick–Nathans Institute of Genetic Medicine, Department of Molecular and Comparative Pathobiology, The Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- Shih-Queen Lee-Lin
- Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Ryan M. Vinton
- McKusick–Nathans Institute of Genetic Medicine, Department of Molecular and Comparative Pathobiology, The Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- Gabriel Renaud
- Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Stacie K. Loftus
- Genetic Disease Research Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Gene Elliot
- Genetic Disease Research Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Tyra G. Wolfsberg
- Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Eric D. Green
- Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Andrew S. McCallion
- McKusick–Nathans Institute of Genetic Medicine, Department of Molecular and Comparative Pathobiology, The Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- William J. Pavan
- Genetic Disease Research Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
60
Chiang RA, Sali A, Babbitt PC. Evolutionarily conserved substrate substructures for automated annotation of enzyme superfamilies. PLoS Comput Biol 2008; 4:e1000142. [PMID: 18670595 PMCID: PMC2453236 DOI: 10.1371/journal.pcbi.1000142] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 12/18/2007] [Accepted: 06/24/2008] [Indexed: 11/19/2022]
Abstract
The evolution of enzymes affects how well a species can adapt to new environmental conditions. During enzyme evolution, certain aspects of molecular function are conserved while other aspects can vary. Aspects of function that are more difficult to change or that need to be reused in multiple contexts are often conserved, while those that vary may indicate functions that are more easily changed or that are no longer required. In analogy to the study of conservation patterns in enzyme sequences and structures, we have examined the patterns of conservation and variation in enzyme function by analyzing graph isomorphisms among enzyme substrates of a large number of enzyme superfamilies. This systematic analysis of substrate substructures establishes the conservation patterns that typify individual superfamilies. Specifically, we determined the chemical substructures that are conserved among all known substrates of a superfamily and the substructures that are reacting in these substrates and then examined the relationship between the two. Across the 42 superfamilies that were analyzed, substantial variation was found in how much of the conserved substructure is reacting, suggesting that superfamilies may not be easily grouped into discrete and separable categories. Instead, our results suggest that many superfamilies may need to be treated individually for analyses of evolution, function prediction, and guiding enzyme engineering strategies. Annotating superfamilies with these conserved and reacting substructure patterns provides information that is orthogonal to information provided by studies of conservation in superfamily sequences and structures, thereby improving the precision with which we can predict the functions of enzymes of unknown function and direct studies in enzyme engineering. 
Because the method is automated, it is suitable for large-scale characterization and comparison of fundamental functional capabilities of both characterized and uncharacterized enzyme superfamilies.
Affiliation(s)
- Ranyee A. Chiang
- Departments of Biopharmaceutical Sciences and Pharmaceutical Chemistry and California Institute for Quantitative Biosciences, University of California at San Francisco, San Francisco, California, United States of America
- Andrej Sali
- Departments of Biopharmaceutical Sciences and Pharmaceutical Chemistry and California Institute for Quantitative Biosciences, University of California at San Francisco, San Francisco, California, United States of America
- Patricia C. Babbitt
- Departments of Biopharmaceutical Sciences and Pharmaceutical Chemistry and California Institute for Quantitative Biosciences, University of California at San Francisco, San Francisco, California, United States of America
61
Yang Y, Gupta V, Ho LL, Zhou B, Fan Q, Zhu Z, Zhang W, Lai ZC. Both upstream and downstream intergenic regions are critical for the mob as tumor suppressor gene activity in Drosophila. FEBS Lett 2008; 582:1766-70. [PMID: 18472003 DOI: 10.1016/j.febslet.2008.04.050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2008] [Revised: 04/17/2008] [Accepted: 04/25/2008] [Indexed: 11/25/2022]
Abstract
The Drosophila mats gene plays a critical role in growth control. Using molecular genetic approaches we investigated how mats is regulated in development. A 2236-bp genomic sequence that contains entire mats including upstream and downstream intergenic regions can rescue mats mutant phenotypes, indicating that regulatory elements necessary for proper mats expression are mostly retained. However, constructs without the upstream or downstream intergenic region failed to rescue mats mutants, demonstrating the functional importance of these sequences. Moreover, mats expression is reduced in mats(e17), a mats allele with over one-third of the downstream intergenic region deleted. Consistent with a model that the downstream intergenic region is critical for mats activity, this sequence contains evolutionarily conserved elements and has enhancer activities.
Affiliation(s)
- Yongfei Yang
- Center of Genetics and Developmental Biology, College of Life Sciences, Peking University, Beijing 100871, China
62
Abstract
Synteny is the preserved order of genes between related species. To detect syntenic regions one usually first applies sequence comparison methods to the genomic sequences of the considered species. Sequence similarities detected in this way often require manual curation to finally reveal the (in many cases hidden) syntenies. The open source software SynBrowse provides a convenient interface to visualize syntenies on a genomic scale. SynBrowse is based on the well-known GBrowse software. In this chapter, we describe the basic concepts of SynBrowse and show how to apply it to different kinds of data sets. Our exposition includes the description of software pipelines for the complete process of synteny detection: (1) applying standard software to compute sequence similarities, (2) parsing, combining, and storing detected similarities in a standard database, (3) installing, configuring, and using SynBrowse. The complete set of programs making up these pipelines as well as the data sets used are available on the SynBrowse homepage.
63
Levy S, Sutton G, Ng PC, Feuk L, Halpern AL, Walenz BP, Axelrod N, Huang J, Kirkness EF, Denisov G, Lin Y, MacDonald JR, Pang AWC, Shago M, Stockwell TB, Tsiamouri A, Bafna V, Bansal V, Kravitz SA, Busam DA, Beeson KY, McIntosh TC, Remington KA, Abril JF, Gill J, Borman J, Rogers YH, Frazier ME, Scherer SW, Strausberg RL, Venter JC. The diploid genome sequence of an individual human. PLoS Biol 2008; 5:e254. [PMID: 17803354 PMCID: PMC1964779 DOI: 10.1371/journal.pbio.0050254] [Citation(s) in RCA: 1119] [Impact Index Per Article: 69.9] [Received: 05/09/2007] [Accepted: 07/30/2007] [Indexed: 01/20/2023]
Abstract
Presented here is a genome sequence of an individual human. It was produced from approximately 32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2-206 bp), 292,102 heterozygous insertion/deletion events (indels)(1-571 bp), 559,473 homozygous indels (1-82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor, however they involve 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.
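The 22% figure quoted in the abstract can be checked from the itemized counts. The short sketch below is only an illustrative arithmetic check, not part of the original study; it assumes that "all events" means the five enumerated variant classes, because the abstract gives no counts for segmental duplications or copy number variation regions. The variable names are made up for this example.

```python
# Variant counts as quoted in the abstract (Levy et al.).
# Assumption: segmental duplications and CNV regions are left out,
# since the abstract does not give numbers for them.
variant_counts = {
    "SNP": 3_213_401,
    "block substitution": 53_823,
    "heterozygous indel": 292_102,
    "homozygous indel": 559_473,
    "inversion": 90,
}

total_events = sum(variant_counts.values())
non_snp_events = total_events - variant_counts["SNP"]
non_snp_share = non_snp_events / total_events

print(f"total events:   {total_events:,}")
print(f"non-SNP events: {non_snp_events:,}")
print(f"non-SNP share:  {non_snp_share:.1%}")
```

Under this assumption the total is consistent with "more than 4.1 million DNA variants", and the non-SNP share rounds to the reported 22%.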
Affiliation(s)
- Samuel Levy
- J. Craig Venter Institute, Rockville, Maryland, USA.
|
64
|
Mackiewicz M, Paigen B, Naidoo N, Pack AI. Analysis of the QTL for sleep homeostasis in mice: Homer1a is a likely candidate. Physiol Genomics 2008; 33:91-9. [DOI: 10.1152/physiolgenomics.00189.2007] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.3] [Indexed: 01/22/2023] Open
Abstract
Electroencephalographic oscillations in the frequency range of 0.5–4 Hz, characteristic of slow-wave sleep (SWS), are often referred to as the delta oscillation or delta power. Delta power reflects sleep intensity and correlates with the homeostatic response to sleep loss. A published survey of inbred strains of mice demonstrated that the time course of accumulation of delta power varied among inbred strains, and the segregation of the rebound of delta power in BxD recombinant inbred strains identified a genomic region on chromosome 13 referred to as the delta power in SWS (or Dps1). The quantitative trait locus (QTL) contains genes that modify the accumulation of delta power after sleep deprivation. Here, we narrow the QTL using interval-specific haplotype analysis and present a comprehensive annotation of the remaining genes in the Dps1 region with sequence comparisons to identify polymorphisms within the coding and regulatory regions. We established the expression pattern of selected genes located in the Dps1 interval in sleep and wakefulness in B6 and D2 parental strains. Taken together, these steps reduced the number of potential candidate genes that may underlie the accumulation of delta power after sleep deprivation and explain the Dps1 QTL. The strongest candidate gene is Homer1a, which is supported by expression differences between sleep and wakefulness and the SNP polymorphism in the upstream regulatory regions.
Affiliation(s)
- M. Mackiewicz
- Division of Sleep Medicine/Department of Medicine and Center for Sleep and Respiratory Neurobiology, University of Pennsylvania School of Medicine, Translational Research Laboratories, Philadelphia, Pennsylvania
- B. Paigen
- The Jackson Laboratory, Bar Harbor, Maine
- N. Naidoo
- Division of Sleep Medicine/Department of Medicine and Center for Sleep and Respiratory Neurobiology, University of Pennsylvania School of Medicine, Translational Research Laboratories, Philadelphia, Pennsylvania
- A. I. Pack
- Division of Sleep Medicine/Department of Medicine and Center for Sleep and Respiratory Neurobiology, University of Pennsylvania School of Medicine, Translational Research Laboratories, Philadelphia, Pennsylvania
|
65
|
Expression of the endocannabinoid system in fibroblasts and myofascial tissues. J Bodyw Mov Ther 2008; 12:169-82. [PMID: 19083670 DOI: 10.1016/j.jbmt.2008.01.004] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.0] [Received: 11/07/2007] [Revised: 12/29/2007] [Accepted: 01/08/2008] [Indexed: 12/17/2022]
Abstract
The endocannabinoid (eCB) system, like the better-known endorphin system, consists of cell membrane receptors, endogenous ligands and ligand-metabolizing enzymes. Two cannabinoid receptors are known: CB(1) is principally located in the nervous system, whereas CB(2) is primarily associated with the immune system. Two eCB ligands, anandamide (AEA) and 2-arachidonoylglycerol (2-AG), are mimicked by cannabis plant compounds. The first purpose of this paper was to review the eCB system in detail, highlighting aspects of interest to bodyworkers, especially eCB modulation of pain and inflammation. Evidence suggests the eCB system may help resolve myofascial trigger points and relieve symptoms of fibromyalgia. However, expression of the eCB system in myofascial tissues has not been established. The second purpose of this paper was to investigate the eCB system in fibroblasts and other fascia-related cells. The investigation used a bioinformatics approach, obtaining microarray data via the GEO database (www.ncbi.nlm.nih.gov/geo/). GEO data mining revealed that fibroblasts, myofibroblasts, chondrocytes and synoviocytes expressed CB(1), CB(2) and eCB ligand-metabolizing enzymes. Fibroblast CB(1) levels nearly equalled levels expressed by adipocytes. CB(1) levels upregulated after exposure to inflammatory cytokines and equiaxial stretching of fibroblasts. The eCB system affects fibroblast remodeling through lipid rafts associated with focal adhesions and dampens cartilage destruction by decreasing fibroblast-secreted metalloproteinase enzymes. In conclusion, the eCB system helps shape biodynamic embryological development, diminishes nociception and pain, reduces inflammation in myofascial tissues and plays a role in fascial reorganization. Practitioners wield several tools that upregulate eCB activity, including myofascial manipulation, diet and lifestyle modifications, and pharmaceutical approaches.
|
66
|
Gévry N, Schoonjans K, Guay F, Murphy BD. Cholesterol supply and SREBPs modulate transcription of the Niemann-Pick C-1 gene in steroidogenic tissues. J Lipid Res 2008; 49:1024-33. [PMID: 18272928 DOI: 10.1194/jlr.m700554-jlr200] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.8] [Indexed: 11/20/2022] Open
Abstract
We tested whether sterol-regulatory element binding proteins (SREBPs) mediate sterol-regulated transactivation of the Niemann-Pick C-1 (NPC-1) gene. Loading granulosa cells with 22- or 25-hydroxycholesterol decreased NPC-1 mRNA, whereas culturing in cholesterol-depleted medium or inhibition of cholesterol biosynthesis increased NPC-1 promoter activity and NPC-1 mRNA abundance. Cotransfection of SREBP1a, SREBP1c, and SREBP2 and the NPC-1 promoter-luciferase reporter into granulosa cell lines increased the transcriptional activity of porcine, human, and mouse NPC-1 promoters. Deletion analysis of the 5' flanking region of the pig NPC-1 gene demonstrated significant promoter activity between fragments -934 and -636 bp upstream from the transcription initiation site. Sequence analysis revealed three sterol-regulatory elements (SREs) clustered between -558 and -650 bp. Each site, along with E-box sequences, bound recombinant SREBP in electromobility shift assays. Mutation of all three sites attenuated the SREBP induction of promoter activity. Chromatin immunoprecipitation (ChIP) assays revealed that cholesterol depletion enriched the association of both SREBP and acetylated histone H3 with the NPC-1 promoter fragment containing the three SREs. ChIP analysis confirmed that SREBP's association with SRE and the E-box was enriched in cells cultured in cholesterol-depleted medium. We conclude that NPC-1 is sterol-regulated, achieved by SREBP acting via SRE and the E-box sequences.
Affiliation(s)
- Nicolas Gévry
- Centre de Recherche en Reproduction Animale, Faculté de Médecine Vétérinaire, Université de Montréal, St. Hyacinthe, Quebec, Canada J2S 7C6
|
67
|
Salse J, Bolot S, Throude M, Jouffe V, Piegu B, Quraishi UM, Calcagno T, Cooke R, Delseny M, Feuillet C. Identification and characterization of shared duplications between rice and wheat provide new insight into grass genome evolution. Plant Cell 2008; 20:11-24. [PMID: 18178768 PMCID: PMC2254919 DOI: 10.1105/tpc.107.056309] [Citation(s) in RCA: 258] [Impact Index Per Article: 16.1] [Received: 10/15/2007] [Revised: 11/21/2007] [Accepted: 12/12/2007] [Indexed: 05/17/2023]
Abstract
The grass family comprises the most important cereal crops and is a good system for studying, with comparative genomics, mechanisms of evolution, speciation, and domestication. Here, we identified and characterized the evolution of shared duplications in the rice (Oryza sativa) and wheat (Triticum aestivum) genomes by comparing 42,654 rice gene sequences with 6426 mapped wheat ESTs using improved sequence alignment criteria and statistical analysis. Intraspecific comparisons identified 29 interchromosomal duplications covering 72% of the rice genome and 10 duplication blocks covering 67.5% of the wheat genome. Using the same methodology, we assessed orthologous relationships between the two genomes and detected 13 blocks of colinearity that represent 83.1 and 90.4% of the rice and wheat genomes, respectively. Integration of the intraspecific duplications data with colinearity relationships revealed seven duplicated segments conserved at orthologous positions. A detailed analysis of the length, composition, and divergence time of these duplications and comparisons with sorghum (Sorghum bicolor) and maize (Zea mays) indicated common and lineage-specific patterns of conservation between the different genomes. This allowed us to propose a model in which the grass genomes have evolved from a common ancestor with a basic number of five chromosomes through a series of whole genome and segmental duplications, chromosome fusions, and translocations.
Affiliation(s)
- Jérôme Salse
- Institut National de la Recherche Agronomique/Université Blaise Pascal Unité Mixte de Recherche 1095, Amélioration et Santé des Plantes, 63100 Clermont-Ferrand, France
|
68
|
Lee AP, Yang Y, Brenner S, Venkatesh B. TFCONES: a database of vertebrate transcription factor-encoding genes and their associated conserved noncoding elements. BMC Genomics 2007; 8:441. [PMID: 18045502 PMCID: PMC2148067 DOI: 10.1186/1471-2164-8-441] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Received: 07/06/2007] [Accepted: 11/29/2007] [Indexed: 02/04/2023] Open
Abstract
Background Transcription factors (TFs) regulate gene transcription and play pivotal roles in various biological processes such as development, cell cycle progression, cell differentiation and tumor suppression. Identifying cis-regulatory elements associated with TF-encoding genes is a crucial step in understanding gene regulatory networks. To this end, we have used a comparative genomics approach to identify putative cis-regulatory elements associated with TF-encoding genes in vertebrates. Description We have created a database named TFCONES (Transcription Factor Genes & Associated COnserved Noncoding ElementS), which contains all human, mouse and fugu TF-encoding genes and conserved noncoding elements (CNEs) associated with them. The CNEs were identified by gene-by-gene alignments of orthologous TF-encoding gene loci using MLAGAN. We also predicted putative transcription factor binding sites within the CNEs. A significant proportion of human-fugu CNEs contain experimentally defined binding sites for transcriptional activators and repressors, indicating that a majority of the CNEs may function as transcriptional regulatory elements. The TF-encoding genes that are involved in nervous system development are generally enriched for human-fugu CNEs. Users can retrieve TF-encoding genes and their associated CNEs by conducting a keyword search or by selecting a family of DNA-binding proteins. Conclusion The conserved noncoding elements identified in TFCONES represent a catalog of highly prioritized putative cis-regulatory elements of TF-encoding genes and are candidates for functional assay.
Affiliation(s)
- Alison P Lee
- Institute of Molecular and Cell Biology, 61 Biopolis Drive, Singapore 138673, Singapore.
|
69
|
Buckley NJ. Analysis of transcription, chromatin dynamics and epigenetic changes in neural genes. Prog Neurobiol 2007; 83:195-210. [PMID: 17884276 DOI: 10.1016/j.pneurobio.2007.07.004] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 11/01/2006] [Revised: 06/14/2007] [Accepted: 07/18/2007] [Indexed: 01/08/2023]
Abstract
The ways in which gene transcription is investigated have undergone radical change since the turn of the millennium. Piece-meal approaches focussed upon model genes have increasingly been complemented by genome-wide approaches that allow interrogation of multiple cohorts of genes or even entire genomes. This sea change has been founded upon the increasing availability of whole genome sequences and the attendant evolution of microarray based discovery platforms. Collectively, these approaches are being used to build a global and dynamic perspective of transcription factor occupancy, co-factor recruitment and epigenetic signature. As yet, few of these approaches have been applied to the study of neuronal gene transcription, but this is set to change. Here, I review these key developments and point to their potential application to the study of transcriptional and epigenetic changes in neurons in health and disease.
Affiliation(s)
- Noel J Buckley
- King's College London, Department of Neuroscience, Institute of Psychiatry, Centre for the Cellular Basis of Behaviour, CCBB/CCIB, Room 1-045, 125 Coldharbour Lane, London SE5 9NU, UK.
|
70
|
Abstract
Multi-sequence alignments of large genomic regions are at the core of many computational genome-annotation approaches aimed at identifying coding regions, RNA genes, regulatory regions, and other functional features. Such alignments also underlie many genome-evolution studies. Here we review recent computational advances in the area of multi-sequence alignment, focusing on methods suitable for aligning whole vertebrate genomes. We introduce the key algorithmic ideas in use today, and identify publicly available resources for computing, accessing, and visualizing genomic alignments. Finally, we describe the latest alignment-based approaches to identify and characterize various types of functional sequences. Key areas of research are identified and directions for future improvements are suggested.
Affiliation(s)
- Mathieu Blanchette
- McGill Centre for Bioinformatics, McGill University, Montreal, Quebec, Canada.
|
71
|
Tirosh I, Bilu Y, Barkai N. Comparative biology: beyond sequence analysis. Curr Opin Biotechnol 2007; 18:371-7. [PMID: 17693073 DOI: 10.1016/j.copbio.2007.07.003] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.3] [Received: 04/19/2007] [Accepted: 07/12/2007] [Indexed: 12/18/2022]
Abstract
Comparative analysis is a fundamental tool in biology. Conservation among species greatly assists the detection and characterization of functional elements, whereas inter-species differences are probably the best indicators of biological adaptation. Traditionally, comparative approaches were applied to the analysis of genomic sequences. With the growing availability of functional genomic data, comparative paradigms are now being extended also to the study of other functional attributes, most notably the gene expression. Here we review recent works applying comparative analysis to large-scale gene expression datasets and discuss the central principles and challenges of such approaches.
Affiliation(s)
- Itay Tirosh
- Department of Molecular Genetics, Weizmann Institute of Science, 76100 Rehovot, Israel
|
72
|
King DC, Taylor J, Zhang Y, Cheng Y, Lawson HA, Martin J, Chiaromonte F, Miller W, Hardison RC. Finding cis-regulatory elements using comparative genomics: some lessons from ENCODE data. Genome Res 2007; 17:775-86. [PMID: 17567996 PMCID: PMC1891337 DOI: 10.1101/gr.5592107] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.0] [Indexed: 01/31/2023]
Abstract
Identification of functional genomic regions using interspecies comparison will be most effective when the full span of relationships between genomic function and evolutionary constraint are utilized. We find that sets of putative transcriptional regulatory sequences, defined by ENCODE experimental data, have a wide span of evolutionary histories, ranging from stringent constraint shown by deep phylogenetic comparisons to recent selection on lineage-specific elements. This diversity of evolutionary histories can be captured, at least in part, by the suite of available comparative genomics tools, especially after correction for regional differences in the neutral substitution rate. Putative transcriptional regulatory regions show alignability in different clades, and the genes associated with them are enriched for distinct functions. Some of the putative regulatory regions show evidence for recent selection, including a primate-specific, distal promoter that may play a novel role in regulation.
Affiliation(s)
- David C. King
- Center for Comparative Genomics and Bioinformatics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- James Taylor
- Center for Comparative Genomics and Bioinformatics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Department of Computer Science and Engineering, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Ying Zhang
- Center for Comparative Genomics and Bioinformatics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Yong Cheng
- Center for Comparative Genomics and Bioinformatics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Heather A. Lawson
- Center for Comparative Genomics and Bioinformatics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Department of Anthropology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Joel Martin
- Center for Comparative Genomics and Bioinformatics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Francesca Chiaromonte
- Center for Comparative Genomics and Bioinformatics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Department of Statistics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Webb Miller
- Center for Comparative Genomics and Bioinformatics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Department of Computer Science and Engineering, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Department of Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Ross C. Hardison
- Center for Comparative Genomics and Bioinformatics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Corresponding author. Fax: (814) 863-7024
|
73
|
Bocciardi R, Giorda R, Buttgereit J, Gimelli S, Divizia MT, Beri S, Garofalo S, Tavella S, Lerone M, Zuffardi O, Bader M, Ravazzolo R, Gimelli G. Overexpression of the C-type natriuretic peptide (CNP) is associated with overgrowth and bone anomalies in an individual with balanced t(2;7) translocation. Hum Mutat 2007; 28:724-31. [PMID: 17373680 DOI: 10.1002/humu.20511] [Citation(s) in RCA: 104] [Impact Index Per Article: 6.1] [Indexed: 12/16/2022]
Abstract
Longitudinal bone growth is determined by the process of endochondral ossification in the cartilaginous growth plate, which is located at both ends of vertebrae and long bones and involves many systemic hormones and local regulators. We report the molecular characterization of a de novo balanced t(2;7)(q37.1;q21.3) translocation in a young female with Marfanoid habitus and skeletal anomalies. The translocation was characterized by fluorescence in situ hybridization (FISH) and checked for other abnormalities by array-comparative genomic hybridization (CGH); finally, the breakpoints were cloned, sequenced, and compared. Biochemical dosage was applied to study the possible mechanisms that may cause the proposita's phenotype. The breakpoint on chromosome 2 disrupts the hypothetical gene MGC42174 (HUGO-approved symbol DIS3L2) and is located in the proximity of the NPPC gene coding for C-type natriuretic peptide (CNP), a molecule that regulates endochondral bone growth. CNP plasma concentration was doubled in the proband compared to five normal controls, while NPPC was substantially overexpressed in her fibroblasts. A transgenic mouse generated to target NPPC overexpression in bone showed a phenotype highly reminiscent of the patient's phenotype. The breakpoint on chromosome 7 is localized proximally at about 75 kb from the COL1A2 gene. The COL1A2 allele on the derivative chromosome was strongly underexpressed in fibroblasts, but total collagen was not significantly different from controls. Several lines of evidence support the conclusion that the proband's abnormal phenotype is associated with C-type natriuretic peptide overexpression.
Affiliation(s)
- Renata Bocciardi
- Laboratory of Molecular Genetics, G. Gaslini Institute, Genova, Italy
|
74
|
Non-coding sequence retrieval system for comparative genomic analysis of gene regulatory elements. BMC Bioinformatics 2007; 8:94. [PMID: 17362514 PMCID: PMC1838437 DOI: 10.1186/1471-2105-8-94] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 09/14/2006] [Accepted: 03/15/2007] [Indexed: 11/30/2022] Open
Abstract
Background Completion of the human genome sequence along with other species allows for greater understanding of the biochemical mechanisms and processes that govern healthy as well as diseased states. The large size of the genome sequences has made them difficult to study using traditional methods. Many studies focus on the protein-coding sequences, but not much is known about the function of non-coding regions of the genome. It has been demonstrated that parts of the non-coding region play a critical role as gene regulatory elements. Enhancers that regulate transcription processes have been found in intergenic regions. Furthermore, it is observed that regulatory elements found in non-coding regions are highly conserved across different species. However, the analysis of these regulatory elements is not as straightforward as it may first seem. The development of a centralized resource that allows for the quick and easy retrieval of non-coding sequences from multiple species and is capable of handling multi-gene queries is critical for the analysis of non-coding sequences. Here we describe the development of a web-based non-coding sequence retrieval system. Results This paper presents a Non-Coding Sequences Retrieval System (NCSRS). The NCSRS is a web-based bioinformatics tool that performs fast and convenient retrieval of non-coding and coding sequences from multiple species related to a specific gene or set of genes. This tool has compiled resources from multiple sources into one easy-to-use and convenient web-based interface. With no software installation necessary, the user needs only internet access to use this tool. Conclusion The unique features of this tool will be very helpful for those studying gene regulatory elements that exist in non-coding regions. The web-based application is available online.
|
75
|
Sinha AU, Meller J. Cinteny: flexible analysis and visualization of synteny and genome rearrangements in multiple organisms. BMC Bioinformatics 2007; 8:82. [PMID: 17343765 PMCID: PMC1821339 DOI: 10.1186/1471-2105-8-82] [Citation(s) in RCA: 90] [Impact Index Per Article: 5.3] [Received: 08/31/2006] [Accepted: 03/08/2007] [Indexed: 11/26/2022] Open
Abstract
Background Identifying syntenic regions, i.e., blocks of genes or other markers with evolutionarily conserved order, and quantifying evolutionary relatedness between genomes in terms of chromosomal rearrangements is one of the central goals in comparative genomics. However, the analysis of synteny and the resulting assessment of genome rearrangements are sensitive to the choice of a number of arbitrary parameters that affect the detection of synteny blocks. In particular, the choice of a set of markers and the effect of different aggregation strategies, which enable coarse graining of synteny blocks and exclusion of micro-rearrangements, need to be assessed. Therefore, existing tools and resources that facilitate identification, visualization and analysis of synteny need to be further improved to provide a flexible platform for such analysis, especially in the context of multiple genomes. Results We present a new tool, Cinteny, for fast identification and analysis of synteny with different sets of markers and various levels of coarse graining of syntenic blocks. Using the Hannenhalli-Pevzner approach and its extensions, Cinteny also enables interactive determination of evolutionary relationships between genomes in terms of the number of rearrangements (the reversal distance). In particular, Cinteny provides: i) integration of synteny browsing with assessment of evolutionary distances for multiple genomes; ii) flexibility to adjust the parameters and re-compute the results on-the-fly; iii) ability to work with user provided data, such as orthologous genes, sequence tags or other conserved markers. In addition, Cinteny provides many annotated mammalian, invertebrate and fungal genomes that are pre-loaded and available for analysis online. Conclusion Cinteny allows one to automatically compare multiple genomes and perform sensitivity analysis for synteny block detection and for the subsequent computation of reversal distances. 
Cinteny can also be used to interactively browse syntenic blocks conserved in multiple genomes, to facilitate genome annotation and validation of assemblies for newly sequenced genomes, and to construct and assess phylogenomic trees.
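To make the notions of synteny block and rearrangement count concrete, the following minimal sketch groups a signed marker permutation into collinear runs and counts breakpoints. It is not Cinteny's implementation and does not compute the exact Hannenhalli-Pevzner reversal distance; the function names `synteny_blocks` and `breakpoints` are hypothetical, and the input is assumed to be one genome's markers numbered 1..n in the order and orientation in which they appear in a second genome.

```python
def synteny_blocks(signed_perm):
    """Split a signed marker permutation into maximal collinear runs.

    Two neighbours a, b are collinear when b - a == 1, which covers both
    forward runs (+3, +4) and inverted runs (-4, -3).
    """
    if not signed_perm:
        return []
    blocks = []
    current = [signed_perm[0]]
    for prev, cur in zip(signed_perm, signed_perm[1:]):
        if cur - prev == 1:
            current.append(cur)
        else:
            blocks.append(current)
            current = [cur]
    blocks.append(current)
    return blocks


def breakpoints(signed_perm):
    """Count adjacencies broken relative to the identity order.

    The permutation is framed by 0 and n + 1, the usual convention.
    """
    n = len(signed_perm)
    framed = [0] + list(signed_perm) + [n + 1]
    return sum(1 for a, b in zip(framed, framed[1:]) if b - a != 1)


# Hypothetical example: markers 3 and 4 lie in an inverted segment.
order = [1, 2, -4, -3, 5]
print(synteny_blocks(order))
print(breakpoints(order))
```

Because a single reversal can remove at most two breakpoints, half the breakpoint count (rounded up) gives a simple lower bound on the reversal distance; the exact distance requires the breakpoint-graph analysis of the Hannenhalli-Pevzner theory.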
Affiliation(s)
- Amit U Sinha
- Department of Computer Science, University of Cincinnati, Cincinnati, OH 45221, USA
- Jaroslaw Meller
- Department of Environmental Health, University of Cincinnati College of Medicine, Cincinnati, OH 45267-0056, USA
- Department of Informatics, Nicholas Copernicus University, 87-100 Torun, Poland
|
76
|
Yengi LG, Leung L, Kao J. The Evolving Role of Drug Metabolism in Drug Discovery and Development. Pharm Res 2007; 24:842-58. [PMID: 17333392 DOI: 10.1007/s11095-006-9217-9] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.0] [Received: 05/10/2006] [Accepted: 12/13/2006] [Indexed: 01/16/2023]
Abstract
Drug metabolism in pharmaceutical research has traditionally focused on the well-defined aspects of absorption, distribution, metabolism and excretion, commonly referred to as the ADME properties of a compound, particularly in the areas of metabolite identification, identification of drug metabolizing enzymes (DMEs) and associated metabolic pathways, and reaction mechanisms. This traditional emphasis was in part due to the limited scope of understanding and the unavailability of in vitro and in vivo tools with which to evaluate more complex properties and processes. However, advances over the past decade in separate but related fields such as pharmacogenetics, pharmacogenomics and drug transporters have dramatically shifted the drug metabolism paradigm. For example, knowledge of the genetics and genomics of DMEs allows us to better understand and predict enzyme regulation and its effects on exogenous (pharmacokinetics) and endogenous pathways as well as biochemical processes (pharmacology). Advances in the transporter area have provided unprecedented insights into the role of transporter proteins in absorption, distribution, metabolism and excretion of drugs and their consequences with respect to clinical drug-drug and drug-endogenous substance interactions, toxicity and interindividual variability in pharmacokinetics. It is therefore essential that individuals involved in modern pharmaceutical research embrace a fully integrated approach and understanding of drug metabolism as is currently practiced. The intent of this review is to reexamine drug metabolism with respect to the traditional as well as current practices, with particular emphasis on the critical aspects of integrating chemistry and biology in the interpretation and application of metabolism data in pharmaceutical research.
Affiliation(s)
- Lilian G Yengi
- Drug Metabolism Division, Drug Safety and Metabolism, Wyeth Research, 500 Arcola Road, Collegeville, Pennsylvania 19426, USA.
77
Alkharouf NW, Klink VP, Matthews BF. Identification of Heterodera glycines (soybean cyst nematode [SCN]) cDNA sequences with high identity to those of Caenorhabditis elegans having lethal mutant or RNAi phenotypes. Exp Parasitol 2007; 115:247-58. [PMID: 17052709 DOI: 10.1016/j.exppara.2006.09.009] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.2] [Received: 05/02/2006] [Revised: 09/05/2006] [Accepted: 09/06/2006] [Indexed: 10/24/2022]
Abstract
The soybean cyst nematode (SCN; Heterodera glycines) is a devastating obligate parasite of Glycine max (soybean) causing one billion dollars in losses to the US economy per year and over ten billion dollars in losses worldwide. While much is understood about the pathology of H. glycines, its genome sequence is not well characterized or fully sequenced. We sought to create bioinformatic tools to mine the H. glycines nucleotide database. One way is to use a comparative genomics approach by anchoring our analysis with an organism, like the free-living nematode Caenorhabditis elegans. Unlike H. glycines, the C. elegans genome is fully sequenced and is well characterized with a number of lethal genes identified through experimental methods. We compared an EST database of H. glycines with the C. elegans genome. Our goal was identifying genes that may be essential for H. glycines survival and would serve as an automated pipeline for RNAi studies to both study and control H. glycines. Our analysis yielded a total of nearly 8334 conserved genes between H. glycines and C. elegans. Of these, 1508 have lethal phenotypes/phenocopies in C. elegans. RNAi of a conserved ribosomal gene from H. glycines (Hg-rps-23) yielded dead and dying worms as shown by positive Sytox fluorescence. Endogenous Hg-rps-23 exhibited typical RNA silencing as shown by RT-PCR. However, an unrelated gene Hg-unc-87 did not exhibit RNA silencing in the Hg-rps-23 dsRNA-treated worms, demonstrating the specificity of the silencing.
Affiliation(s)
- Nadim W Alkharouf
- United States Department of Agriculture, ARS, Soybean Genomics and Improvement Laboratory, Beltsville, MD 20705-2350, USA
78
Vandepoele K, Casneuf T, Van de Peer Y. Identification of novel regulatory modules in dicotyledonous plants using expression data and comparative genomics. Genome Biol 2007; 7:R103. [PMID: 17090307 PMCID: PMC1794593 DOI: 10.1186/gb-2006-7-11-r103] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.0] [Received: 06/14/2006] [Revised: 09/15/2006] [Accepted: 11/07/2006] [Indexed: 11/30/2022]
Abstract
A strategy combining classical motif overrepresentation in co-regulated genes with comparative footprinting is applied to identify 80 transcription factor binding sites and 139 regulatory modules in Arabidopsis thaliana. Background: Transcriptional regulation plays an important role in the control of many biological processes. Transcription factor binding sites (TFBSs) are the functional elements that determine transcriptional activity and are organized into separable cis-regulatory modules, each defining the cooperation of several transcription factors required for a specific spatio-temporal expression pattern. Consequently, the discovery of novel TFBSs in promoter sequences is an important step to improve our understanding of gene regulation. Results: Here, we applied a detection strategy that combines features of classic motif overrepresentation approaches in co-regulated genes with general comparative footprinting principles for the identification of biologically relevant regulatory elements and modules in Arabidopsis thaliana, a model system for plant biology. In total, we identified 80 TFBSs and 139 regulatory modules, most of which are novel, and primarily consist of two or three regulatory elements that could be linked to different important biological processes, such as protein biosynthesis, cell cycle control, photosynthesis and embryonic development. Moreover, studying the physical properties of some specific regulatory modules revealed that Arabidopsis promoters have a compact nature, with cooperative TFBSs located in close proximity of each other. Conclusion: These results create a starting point to unravel regulatory networks in plants and to study the regulation of biological processes from a systems biology point of view.
Affiliation(s)
- Klaas Vandepoele
- Department of Plant Systems Biology, Flanders Interuniversity Institute for Biotechnology (VIB), Ghent University, Technologiepark, B-9052 Ghent, Belgium
- Tineke Casneuf
- Department of Plant Systems Biology, Flanders Interuniversity Institute for Biotechnology (VIB), Ghent University, Technologiepark, B-9052 Ghent, Belgium
- Yves Van de Peer
- Department of Plant Systems Biology, Flanders Interuniversity Institute for Biotechnology (VIB), Ghent University, Technologiepark, B-9052 Ghent, Belgium
79
Yan J, Bi W, Lupski JR. Penetrance of craniofacial anomalies in mouse models of Smith-Magenis syndrome is modified by genomic sequence surrounding Rai1: not all null alleles are alike. Am J Hum Genet 2007; 80:518-25. [PMID: 17273973 PMCID: PMC1821110 DOI: 10.1086/512043] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.9] [Received: 07/03/2006] [Accepted: 12/19/2006] [Indexed: 11/03/2022]
Abstract
Craniofacial abnormality is one of the major clinical manifestations of Smith-Magenis syndrome (SMS). Previous analyses in a mixed genetic background of several SMS mouse models--including Df(11)17/+ and Df(11)17-1/+, which have 2-Mb and 590-kb deletions, respectively, and Rai1(-/+)--revealed that the penetrance of the craniofacial phenotype appears to be influenced by deletion size and genetic background. We generated an additional strain with a 1-Mb deletion intermediate in size between the two described above. Remarkably, the penetrance of its craniofacial anomalies in the mixed background was between those of Df(11)17 and Df(11)17-1. We further analyzed the deletion mutations and the Rai1(-/+) allele in a pure C57BL/6 background, to control for nonlinked modifier loci. The penetrance of the craniofacial anomalies was markedly increased for all the strains in comparison with the mixed background. Mice with Df(11)17 and Df(11)17-1 deletions had a similar penetrance, suggesting that penetrance may be less influenced by deletion size, whereas that of Rai1(-/+) mice was significantly lower than that of the deletion strains. We hypothesize that potential trans-regulatory sequence(s) or gene(s) that reside within the 590-kb genomic interval surrounding Rai1 are the major modifying genetic element(s) affecting the craniofacial penetrance. Moreover, we confirmed the influence of genetic background and different deletion sizes on the phenotype. The complicated control of the penetrance for one phenotype in SMS mouse models provides tools to elucidate molecular mechanisms for penetrance and clearly shows that a null allele caused by chromosomal deletion can have different phenotypic consequences than one caused by gene inactivation.
Affiliation(s)
- Jiong Yan
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
80
Hermann BP, Hornbaker KI, Maran RRM, Heckert LL. Distal regulatory elements are required for Fshr expression, in vivo. Mol Cell Endocrinol 2007; 260-262:49-58. [PMID: 17097219 PMCID: PMC1764205 DOI: 10.1016/j.mce.2006.01.017] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 08/19/2005] [Accepted: 01/23/2006] [Indexed: 10/23/2022]
Abstract
The gonadotropin follicle-stimulating hormone (FSH) is required for initiation and maintenance of normal gametogenesis and acts through a specific, cell-surface receptor (Fshr) present only on Sertoli and granulosa cells in the gonads. Despite extensive examination of the transcriptional mechanisms regulating Fshr, the sequences directing its expression to these cells remain unidentified. To establish the minimal region necessary for Fshr expression, we generated transgenic mice carrying a yeast artificial chromosome (YAC) that contained 413 kilobases (kb) of the rat Fshr locus (YAC60). Transgene expression, as determined by RT-PCR, was absent from immature testis and Sertoli cells, limited to germ cells of the adult testis, and never observed in the ovary. While the data is limited to only one transgenic line, it suggests that the 413kb region does not specify the normal spatiotemporal expression pattern of Fshr. Comparative genomics was used to identify potential distal regulatory elements, revealing seven regions of high evolutionary conservation (>80% identity over 100bp or more), six of which were absent from the transgene. Functional examination of the evolutionary conserved regions (ECRs) by transient transfection revealed that all of the ECRs had modest transcriptional activity in Sertoli or myoid cells with two, ECR4 and ECR5, showing differential effects in expressing and non-expressing cells. These data reveal that distal regulatory regions (outside the 413kb in YAC60) are required for appropriate temporal and spatial Fshr expression and implicate the identified ECRs in transcriptional regulation of Fshr.
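The conservation criterion quoted in this abstract (more than 80% identity over 100 bp or more) can be illustrated with a minimal sliding-window sketch. This is not the authors' pipeline: the function name `conserved_windows`, its parameters, and the assumption that the two sequences are already aligned to equal length (with `-` marking gaps) are all hypothetical choices made for illustration.

```python
def conserved_windows(seq_a, seq_b, window=100, min_identity=0.80):
    """Return merged (start, end) regions of a pairwise alignment in which
    every sliding window has identity strictly above min_identity.

    Assumption: seq_a and seq_b are two rows of an existing alignment,
    so they have equal length and use '-' for gap columns.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    n = len(seq_a)
    if n < window:
        return []

    # 1 where the column is an identical, non-gap base; 0 otherwise.
    match = [
        1 if a == b and a != "-" else 0
        for a, b in zip(seq_a.upper(), seq_b.upper())
    ]

    # Slide the window, updating the match count incrementally.
    count = sum(match[:window])
    hits = []
    for start in range(n - window + 1):
        if start > 0:
            count += match[start + window - 1] - match[start - 1]
        if count / window > min_identity:
            hits.append(start)

    # Merge overlapping or adjacent qualifying windows into regions.
    regions = []
    for start in hits:
        end = start + window  # end is exclusive
        if regions and start <= regions[-1][1]:
            regions[-1][1] = end
        else:
            regions.append([start, end])
    return [(s, e) for s, e in regions]


if __name__ == "__main__":
    # Toy alignment: first 100 columns identical, last 100 all mismatched.
    a = "ACGT" * 50
    b = a[:100] + "TGCA" * 25
    print(conserved_windows(a, b))
```

Real analyses of this kind typically start from a genome alignment produced by a dedicated tool; the sketch only shows how a fixed identity threshold and window length translate into candidate regions.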
MESH Headings
- Animals
- Base Sequence
- Chromosome Mapping
- Chromosomes, Artificial, Yeast
- Conserved Sequence
- Evolution, Molecular
- Gene Expression Profiling
- Gene Expression Regulation/genetics
- Humans
- Integrases/metabolism
- Mice
- Mice, Transgenic
- RNA, Messenger/genetics
- RNA, Messenger/metabolism
- Rats
- Receptors, FSH/genetics
- Recombination, Genetic
- Regulatory Sequences, Nucleic Acid/genetics
- Saccharomyces cerevisiae/genetics
- Transcription, Genetic
Affiliation(s)
- Brian P Hermann
- Department of Molecular and Integrative Physiology, University of Kansas Medical Center, 3901 Rainbow Boulevard, Kansas City, KS 66160, USA
81
Cheng JF, Priest JR, Pennacchio LA. Comparative genomics: a tool to functionally annotate human DNA. Methods Mol Biol 2007; 366:229-51. [PMID: 17568128 DOI: 10.1007/978-1-59745-030-0_13] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 12/01/2022]
Abstract
The availability of an increasing number of vertebrate genomes has enabled comparative methods to infer functional sequences based on evolutionary constraint. Although this has proved powerful for gene identification, significant progress has also been made in uncovering gene regulatory sequences such as distant acting transcriptional enhancers. These pursuits have led to the development of a variety of valuable databases and resources that should serve as a routine toolbox for biological discovery.
Affiliation(s)
- Jan-Fang Cheng
- Genomics Division, Lawrence Berkeley National Laboratory, CA, USA
82
Karmaker A, Harris SE, Kwek S. Constructing human transcriptional regulatory subnets from crossgenome comparison and gene expression profile analysis. OMICS: A Journal of Integrative Biology 2007; 11:397-412. [PMID: 18092911 DOI: 10.1089/omi.2007.0028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2023]
Abstract
With the completion of the Human Genome Project (HGP), understanding the complex interaction between trans- and cis-regulatory elements comprehensively and identifying these potential functional elements are fundamental problems in functional genomics. Although many computational approaches have been developed for lower eukaryotes and prokaryotes, most of them often do not generalize to vertebrates. Here, we use a decay function to characterize transcriptional behavior, and analyze correlations on gene expression profiles of human and mouse to construct coregulated gene groups. Using these two closely related species, we perform comparative genome analysis and identify target genes and conserved functional cis-regulatory elements by motif overrepresentation. Moreover, we present experimental evidence (ChIP-Chip) for E2F to support our findings.
Affiliation(s)
- Amitava Karmaker
- Department of Computer Science, University of Texas at San Antonio, San Antonio, TX 78249, USA.
83
Ribich S, Tasic B, Maniatis T. Identification of long-range regulatory elements in the protocadherin-alpha gene cluster. Proc Natl Acad Sci U S A 2006; 103:19719-24. [PMID: 17172445 PMCID: PMC1750919 DOI: 10.1073/pnas.0609445104] [Citation(s) in RCA: 92] [Impact Index Per Article: 5.1] [Indexed: 11/18/2022]
Abstract
The clustered protocadherins (Pcdh) are encoded by three closely linked gene clusters (Pcdh-alpha, -beta, and -gamma) that span nearly 1 million base pairs of DNA. The Pcdh-alpha gene cluster encodes a family of 14 distinct cadherin-like cell surface proteins that are expressed in neurons and are present at synaptic junctions. Individual Pcdh-alpha mRNAs are assembled from one of 14 "variable" (V) exons and three "constant" exons in a process that involves both differential promoter activation and alternative pre-mRNA splicing. In individual neurons, only one (and rarely two) of the Pcdh alpha1-12 promoters is independently and randomly activated on each chromosome. Thus, in most cells, this unusual form of monoallelic expression leads to the expression of two different Pcdh-alpha 1-12 V exons, one from each chromosome. The two remaining V exons in the cluster (Pcdh-alphaC1 and alphaC2) are expressed biallelically in every neuron. The mechanisms that underlie promoter choice and monoallelic expression in the Pcdh-alpha gene cluster are not understood. Here we report the identification of two long-range cis-regulatory elements in the Pcdh-alpha gene cluster, HS5-1 and HS7. We show that HS5-1 is required for maximal levels of expression from the Pcdh alpha1-12 and alphaC1 promoters, but not the Pcdh-alphaC2 promoter. The nearly cluster-wide requirement of the HS5-1 element is consistent with the possibility that the monoallelic expression of Pcdh-alpha V exons is a consequence of competition between individual V exon promoters for the two regulatory elements.
Affiliation(s)
- Scott Ribich
- Department of Molecular and Cellular Biology, Harvard University, 7 Divinity Avenue, Cambridge, MA 02138
- Bosiljka Tasic
- Department of Molecular and Cellular Biology, Harvard University, 7 Divinity Avenue, Cambridge, MA 02138
- Tom Maniatis
- Department of Molecular and Cellular Biology, Harvard University, 7 Divinity Avenue, Cambridge, MA 02138
- To whom correspondence should be addressed.
84
Allende ML, Manzanares M, Tena JJ, Feijóo CG, Gómez-Skarmeta JL. Cracking the genome's second code: enhancer detection by combined phylogenetic footprinting and transgenic fish and frog embryos. Methods 2006; 39:212-9. [PMID: 16806968 DOI: 10.1016/j.ymeth.2005.12.005] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.7] [Received: 09/06/2005] [Accepted: 12/17/2005] [Indexed: 02/05/2023]
Abstract
Genes involved in vertebrate development are unusually enriched for highly conserved non-coding sequence elements. These regions are readily detected in silico, by genome-wide sequence comparisons between different vertebrates, from mammals to fish (phylogenetic footprinting). It follows that sequence conservation must be the result of positive selection for an essential physiological role. An obvious possibility is that these conserved sequences possess regulatory or structural functions important for gene expression and, thus, an in vivo assay becomes necessary. We have developed a rapid testing system using zebrafish and Xenopus laevis embryos that allows us to assign transcriptional regulatory functions to conserved non-coding sequence elements. The sequences are cloned into a vector containing a minimal promoter and the GFP reporter, and are assayed for their putative cis-regulatory activity in zebrafish or Xenopus transgenic experiments. Vectors used include plasmid DNA and the Tol2 transposon system in fish and X. laevis. We have followed this logic to detect and analyze conserved elements in an intergenic region present in the Iroquois (Irx) gene clusters of zebrafish, Xenopus tropicalis, Fugu rubripes and mouse. We have assayed approximately 50 of these conserved elements and shown that the majority behave as modular positive regulatory elements (enhancers) that contribute to specific temporal and spatial domains that are part of the endogenous gene expression pattern. Moreover, comparison of the activity of cognate Irx enhancers from different organisms demonstrates that conservation of sequence is accompanied by in vivo functional conservation across species. Finally, for some of the most conserved elements, we have been able to identify a critical core sequence, essential for correct enhancer function.
Affiliation(s)
- Miguel L Allende
- Millennium Nucleus in Developmental Biology, Facultad de Ciencias, Universidad de Chile, Casilla 653, Santiago, Chile.
85
Jiang RHY, Tyler BM, Govers F. Comparative analysis of Phytophthora genes encoding secreted proteins reveals conserved synteny and lineage-specific gene duplications and deletions. Molecular Plant-Microbe Interactions: MPMI 2006; 19:1311-21. [PMID: 17153915 DOI: 10.1094/mpmi-19-1311] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 05/12/2023]
Abstract
Comparative analysis of two Phytophthora genomes revealed overall colinearity in four genomic regions consisting of a 1.5-Mb sequence of Phytophthora sojae and a 0.9-Mb sequence of P. ramorum. In these regions with conserved synteny, the gene order is largely similar; however, genome rearrangements also have occurred. Deletions and duplications often were found in association with genes encoding secreted proteins, including effectors that are important for interaction with host plants. Among secreted protein genes, different evolutionary patterns were found. Elicitin genes that code for a complex family of highly conserved Phytophthora-specific elicitors show conservation in gene number and order, and often are clustered. In contrast, the race-specific elicitor gene Avr1b-1 appeared to be missing from the region with conserved synteny, as were its five homologs that are scattered over the four genomic regions. Some gene families encoding secreted proteins were found to be expanded in one species compared with the other. This could be the result of either repeated gene duplications in one species or specific deletions in the other. These different evolutionary patterns may shed light on the functions of these secreted proteins in the biology and pathology of the two Phytophthora spp.
Affiliation(s)
- Rays H Y Jiang
- Laboratory of Phytopathology, Plant Sciences Group, Wageningen University, Binnenhaven 5, NL-6709 PD Wageningen, The Netherlands
86
Roh TY, Wei G, Farrell CM, Zhao K. Genome-wide prediction of conserved and nonconserved enhancers by histone acetylation patterns. Genome Res 2006; 17:74-81. [PMID: 17135569 PMCID: PMC1716270 DOI: 10.1101/gr.5767907] [Citation(s) in RCA: 109] [Impact Index Per Article: 6.1] [Indexed: 11/25/2022]
Abstract
Comparative genomic studies have been useful in identifying transcriptional regulatory elements in higher eukaryotic genomes, but many important regulatory elements cannot be detected by such analyses due to evolutionary variations and alignment tool limitations. Therefore, in this study we exploit the highly conserved nature of epigenetic modifications to identify potential transcriptional enhancers. By using a high-resolution genome-wide mapping technique, which combines the chromatin immunoprecipitation and serial analysis of gene expression assays, we have recently determined the distribution of lysine 9/14-diacetylated histone H3 in human T cells. We showed the existence of 46,813 regions with clusters of histone acetylation, termed histone acetylation islands, some of which correspond to known transcriptional regulatory elements. In the present study, we find that 4679 sequences conserved between human and pufferfish coincide with histone acetylation islands, and random sampling shows that 33% (13/39) of these can function as transcriptional enhancers in human Jurkat T cells. In addition, by comparing the human histone acetylation island sequences with mouse genome sequences, we find that despite the conservation of many of these regions between these species, 21,855 of these sequences are not conserved. Furthermore, we demonstrate that about 50% (26/51) of these nonconserved sequences have enhancer activity in Jurkat cells, and that many of the orthologous mouse sequences also have enhancer activity in addition to conserved epigenetic modification patterns in mouse T-cell chromatin. Therefore, by combining epigenetic modification and sequence data, we have established a novel genome-wide method for identifying regulatory elements not discernable by comparative genomics alone.
Affiliation(s)
- Tae-young Roh
- Laboratory of Molecular Immunology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- Gang Wei
- Laboratory of Molecular Immunology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- Catherine M. Farrell
- Laboratory of Molecular Immunology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- Keji Zhao
- Laboratory of Molecular Immunology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- Corresponding author. Fax: (301) 480-0961
87
Mattes WB. Cross-species comparative toxicogenomics as an aid to safety assessment. Expert Opin Drug Metab Toxicol 2006; 2:859-74. [PMID: 17125406 DOI: 10.1517/17425255.2.6.859] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Indexed: 01/28/2023]
Abstract
Cross-species comparative toxicogenomics has the potential for improving the understanding of the different responses of animal models to toxicants at a molecular level. This understanding could then lead to a more accurate extrapolation of the risk posed by these toxicants to humans. Cross-species comparative studies have been carried out at the genomic sequence level and using microarrays to examine changes in global mRNA profiles. However, these studies face considerable bioinformatic challenges in terms of identifying which genes are truly orthologous across species. The resources to analyse such studies, in the context of such orthologues, beg improvement. Finally, the experimental design of such studies needs to be carefully considered to make their results fully interpretable. These issues are discussed, along with the current state of the art in cross-species comparative toxicogenomics, in this review.
88
Wessel J, Schork NJ. Generalized genomic distance-based regression methodology for multilocus association analysis. Am J Hum Genet 2006; 79:792-806. [PMID: 17033957 PMCID: PMC1698575 DOI: 10.1086/508346] [Citation(s) in RCA: 138] [Impact Index Per Article: 7.7] [Received: 03/22/2006] [Accepted: 08/08/2006] [Indexed: 01/29/2023]
Abstract
Large-scale, multilocus genetic association studies require powerful and appropriate statistical-analysis tools that are designed to relate genotype and haplotype information to phenotypes of interest. Many analysis approaches consider relating allelic, haplotypic, or genotypic information to a trait through use of extensions of traditional analysis techniques, such as contingency-table analysis, regression methods, and analysis-of-variance techniques. In this work, we consider a complementary approach that involves the characterization and measurement of the similarity and dissimilarity of the allelic composition of a set of individuals' diploid genomes at multiple loci in the regions of interest. We describe a regression method that can be used to relate variation in the measure of genomic dissimilarity (or "distance") among a set of individuals to variation in their trait values. Weighting factors associated with functional or evolutionary conservation information of the loci can be used in the assessment of similarity. The proposed method is very flexible and is easily extended to complex multilocus-analysis settings involving covariates. In addition, the proposed method actually encompasses both single-locus and haplotype-phylogeny analysis methods, which are two of the most widely used approaches in genetic association analysis. We showcase the method with data described in the literature. Ultimately, our method is appropriate for high-dimensional genomic data and anticipates an era when cost-effective exhaustive DNA sequence data can be obtained for a large number of individuals, over and above genotype information focused on a few well-chosen loci.
Affiliation(s)
- Jennifer Wessel
- Polymorphism Research Laboratory, Department of Psychiatry, Divisions of Epidemiology, Center for Human Genetics and Genomics, University of California at San Diego, La Jolla, CA 92093-0603, USA
89
Elnitski L, Jin VX, Farnham PJ, Jones SJM. Locating mammalian transcription factor binding sites: a survey of computational and experimental techniques. Genome Res 2006; 16:1455-64. [PMID: 17053094 DOI: 10.1101/gr.4140006] [Citation(s) in RCA: 168] [Impact Index Per Article: 9.3] [Indexed: 11/24/2022]
Abstract
Fields such as genomics and systems biology are built on the synergism between computational and experimental techniques. This type of synergism is especially important in accomplishing goals like identifying all functional transcription factor binding sites in vertebrate genomes. Precise detection of these elements is a prerequisite to deciphering the complex regulatory networks that direct tissue specific and lineage specific patterns of gene expression. This review summarizes approaches for in silico, in vitro, and in vivo identification of transcription factor binding sites. A variety of techniques useful for localized- and high-throughput analyses are discussed here, with emphasis on aspects of data generation and verification.
Affiliation(s)
- Laura Elnitski
- Genomic Functional Analysis Section, National Human Genome Research Institute, National Institutes of Health, Rockville, Maryland 20878, USA.
90
Bhatti P, Church DM, Rutter JL, Struewing JP, Sigurdson AJ. Candidate single nucleotide polymorphism selection using publicly available tools: a guide for epidemiologists. Am J Epidemiol 2006; 164:794-804. [PMID: 16923772 DOI: 10.1093/aje/kwj269] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.4] [Indexed: 12/20/2022]
Abstract
Single nucleotide polymorphisms (SNPs) are the most common form of human genetic variation, with millions present in the human genome. Because only 1% might be expected to confer more than modest individual effects in association studies, the selection of predictive candidate variants for complex disease analyses is formidable. Technologic advances in SNP discovery and the ever-changing annotation of the genome have led to massive informational resources that can be difficult to master across disciplines. A simplified guide is needed. Although methods for evaluating nonsynonymous coding SNPs are known, several other publicly available computational tools can be utilized to assess polymorphic variants in noncoding regions. As an example, the authors applied multiple methods to select SNPs in DNA double-strand break repair genes. They chose to evaluate SNPs that occurred among a preexisting set of 57 validated assays and to justify new assay development for 83 potential SNPs in the DNA-dependent protein kinase catalytic subunit. Of the 140 SNPs, the authors eliminated 119 variants with low or neutral predictions. The existing computational methods they used and the semiquantitative relative ranking strategy they developed can be adapted to a priori SNP selection or post hoc evaluation of variants identified in whole genome scans or within haplotype blocks associated with disease. The authors show a "real world" application of some existing bioinformatics tools for use in large epidemiologic studies and genetic analyses. They also reviewed alternative approaches that provide related information.
Affiliation(s)
- Parveen Bhatti
- Radiation Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Department of Health and Human Services, Bethesda, MD 20892-7238, USA
91
Cannon CH, Kua CS, Lobenhofer EK, Hurban P. Capturing genomic signatures of DNA sequence variation using a standard anonymous microarray platform. Nucleic Acids Res 2006; 34:e121. [PMID: 17000641 PMCID: PMC1636412 DOI: 10.1093/nar/gkl478] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/14/2022]
Abstract
Comparative genomics, using the model organism approach, has provided powerful insights into the structure and evolution of whole genomes. Unfortunately, only a small fraction of Earth's biodiversity will have its genome sequenced in the foreseeable future. Most wild organisms have radically different life histories and evolutionary genomics than current model systems. A novel technique is needed to expand comparative genomics to a wider range of organisms. Here, we describe a novel approach using an anonymous DNA microarray platform that gathers genomic samples of sequence variation from any organism. Oligonucleotide probe sequences placed on a custom 44 K array were 25 bp long and designed using a simple set of criteria to maximize their complexity and dispersion in sequence probability space. Using whole genomic samples from three known genomes (mouse, rat and human) and one unknown (Gonystylus bancanus), we demonstrate and validate its power, reliability, transitivity and sensitivity. Using two separate statistical analyses, a large number of genomic ‘indicator’ probes were discovered. The construction of a genomic signature database based upon this technique would allow virtual comparisons, and simple queries could generate optimal subsets of markers to be used in large-scale assays, using simple downstream techniques. Biologists from a wide range of fields, studying almost any organism, could efficiently perform genomic comparisons at potentially any phylogenetic level after performing a small number of standardized DNA microarray hybridizations. Possibilities for refining and expanding the approach are discussed.
Affiliation(s)
- C. H. Cannon
- To whom correspondence should be addressed. Tel: +1 806 742 3993; Fax: +1 806 742 2963;
- C. S. Kua
- 27 Jln. Dato Haji Harun, Taman Tayton View, Kuala Lumpur, Malaysia
- E. K. Lobenhofer
- Paradigm Array Labs, a Service Unit of Icoria Inc., Research Triangle Park, NC 27709, USA
- P. Hurban
- Paradigm Array Labs, a Service Unit of Icoria Inc., Research Triangle Park, NC 27709, USA
|
92
|
Boverhof DR, Burgoon LD, Tashiro C, Sharratt B, Chittim B, Harkema JR, Mendrick DL, Zacharewski TR. Comparative toxicogenomic analysis of the hepatotoxic effects of TCDD in Sprague Dawley rats and C57BL/6 mice. Toxicol Sci 2006; 94:398-416. [PMID: 16960034 DOI: 10.1093/toxsci/kfl100] [Citation(s) in RCA: 147] [Impact Index Per Article: 8.2] [Indexed: 11/12/2022]
Abstract
In an effort to further characterize conserved and species-specific mechanisms of 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD)-mediated toxicity, comparative temporal and dose-response microarray analyses were performed on hepatic tissue from immature, ovariectomized Sprague Dawley rats and C57BL/6 mice. For temporal studies, rats and mice were gavaged with 10 or 30 µg/kg of TCDD, respectively, and sacrificed after 2, 4, 8, 12, 18, 24, 72, or 168 h while dose-response studies were performed at 24 h. Hepatic gene expression profiles were monitored using custom cDNA microarrays containing 8567 (rat) or 13,361 (mouse) cDNA clones. Affymetrix data from male rats treated with 40 µg/kg TCDD were also included to expand the species comparison. In total, 3087 orthologous genes were represented in the cross-species comparison. Comparative analysis identified 33 orthologous genes that were commonly regulated by TCDD as well as 185 rat-specific and 225 mouse-specific responses. Functional annotation using Gene Ontology identified conserved gene responses associated with xenobiotic/chemical stress and amino acid and lipid metabolism. Rat-specific gene expression responses were associated with cellular growth and lipid metabolism while mouse-specific responses were associated with lipid uptake/metabolism and immune responses. The common and species-specific gene expression responses were also consistent with complementary histopathology, clinical chemistry, hepatic lipid analyses, and reports in the literature. These data expand our understanding of TCDD-mediated gene expression responses and indicate that species-specific toxicity may be mediated by differences in gene expression which may help explain the wide range of species sensitivities and will have important implications in risk assessment strategies.
Affiliation(s)
- Darrell R Boverhof
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, USA
|
93
|
GuhaThakurta D. Computational identification of transcriptional regulatory elements in DNA sequence. Nucleic Acids Res 2006; 34:3585-98. [PMID: 16855295 PMCID: PMC1524905 DOI: 10.1093/nar/gkl372] [Citation(s) in RCA: 98] [Impact Index Per Article: 5.4] [Indexed: 01/13/2023]
Abstract
Identification and annotation of all the functional elements in the genome, including genes and the regulatory sequences, is a fundamental challenge in genomics and computational biology. Since regulatory elements are frequently short and variable, their identification and discovery using computational algorithms is difficult. However, significant advances have been made in the computational methods for modeling and detection of DNA regulatory elements. The availability of complete genome sequence from multiple organisms, as well as mRNA profiling and high-throughput experimental methods for mapping protein-binding sites in DNA, have contributed to the development of methods that utilize these auxiliary data to inform the detection of transcriptional regulatory elements. Progress is also being made in the identification of cis-regulatory modules and higher order structures of the regulatory sequences, which is essential to the understanding of transcription regulation in the metazoan genomes. This article reviews the computational approaches for modeling and identification of genomic regulatory elements, with an emphasis on the recent developments, and current challenges.
Affiliation(s)
- Debraj GuhaThakurta
- Research Genetics Division, Rosetta Inpharmatics LLC, Merck & Co., Inc, 401 Terry Avenue North, Seattle, WA 98109, USA.
|
94
|
Dupanloup I, Kaessmann H. Evolutionary simulations to detect functional lineage-specific genes. Bioinformatics 2006; 22:1815-22. [PMID: 16766551 DOI: 10.1093/bioinformatics/btl280] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Supporting the functionality of recent duplicate gene copies is usually difficult, owing to high sequence similarity between duplicate counterparts and shallow phylogenies, which hamper both the statistical and experimental inference. RESULTS: We developed an integrated evolutionary approach to identify functional duplicate gene copies and other lineage-specific genes. By repeatedly simulating neutral evolution, our method estimates the probability that an ORF was selectively conserved and is therefore likely to represent a bona fide coding region. In parallel, our method tests whether the accumulation of non-synonymous substitutions reveals signatures of selective constraint. We show that our approach has high power to identify functional lineage-specific genes using simulated and real data. For example, a coding region of average length (approximately 1400 bp), restricted to hominoids, can be predicted to be functional in approximately 94-100% of cases. Notably, the method may support functionality for instances where classical selection tests based on the ratio of non-synonymous to synonymous substitutions fail to reveal signatures of selection. Our method is available as an automated tool, ReEVOLVER, which will also be useful to systematically detect functional lineage-specific genes of closely related species on a large scale. AVAILABILITY: ReEVOLVER is available at http://www.unil.ch/cig/page7858.html.
Affiliation(s)
- Isabelle Dupanloup
- Center for Integrative Genomics, University of Lausanne CH-1015 Lausanne, Switzerland
|
95
|
Kralik P, Matiasovic J, Horin P. Genetic evidence for the existence of interleukin-23 and for variation in the interleukin-12 and interleukin-12 receptor genes in the horse. Comparative Biochemistry and Physiology. Part D, Genomics & Proteomics 2006; 1:179-186. [PMID: 20483249 DOI: 10.1016/j.cbd.2005.09.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2005] [Revised: 09/11/2005] [Accepted: 09/14/2005] [Indexed: 05/29/2023]
Abstract
Immune loci, characterized by features reflecting their role in defense reactions and consequently related to evolutionary mechanisms, including polymorphisms or association with disease, are suitable candidates for comparative analysis. Interleukin-12 and related cytokines are key molecules regulating natural and specific immune responses. In this study, we analyzed four horse IL12-related genes: IL23p19, IL12Rbeta2, IL12p40, and IL12p35. The genomic nucleotide sequence of the horse IL23 p19 sub-unit encoding gene was determined. The horse IL23p19 gene consists of four exons; its total mRNA length is 1004 bp, with a coding region of 579 bp. The predicted amino acid sequence of the horse IL23p19 sub-unit showed 88.0% sequence identity with the human sequence. A partial genomic sequence highly homologous to human IL12Rbeta2 was retrieved, suggesting the existence of this gene in the horse. Single nucleotide polymorphisms (SNPs) were identified in all four genes analyzed. PCR-RFLP genotyping was developed for selected SNPs. Inter-breed differences in allele and genotype frequencies were observed in IL12p35 SNP 242. The results showed that horse IL12-related genes are comparable to their counterparts in other mammalian species in terms of their structure and their genetic variation.
Affiliation(s)
- Petr Kralik
- Institute of Animal Genetics, Faculty of Veterinary Medicine, Palackého 1/3, CZ-612 42 Brno, Czech Republic
|
96
|
Hadrys T, Punnamoottil B, Pieper M, Kikuta H, Pezeron G, Becker TS, Prince V, Baker R, Rinkwitz S. Conserved co-regulation and promoter sharing of hoxb3a and hoxb4a in zebrafish. Dev Biol 2006; 297:26-43. [PMID: 16860306 DOI: 10.1016/j.ydbio.2006.04.446] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.0] [Received: 09/12/2005] [Revised: 03/16/2006] [Accepted: 04/12/2006] [Indexed: 10/24/2022]
Abstract
The expression of zebrafish hoxb3a and hoxb4a has been found to be mediated through five transcripts, hoxb3a transcripts I-III and hoxb4a transcripts I-II, driven by four promoters. A "master" promoter, located about 2 kb downstream of hoxb5a, controls transcription of a pre-mRNA comprising exon sequences of both genes. This unique gene structure is proposed to provide a novel mechanism to ensure overlapping, tissue-specific expression of both genes in the posterior hindbrain and spinal cord. Transgenic approaches were used to analyze the functions of zebrafish hoxb3a/hoxb4a promoters and enhancer sequences containing regions of homology that were previously identified by comparative genomics. Two neural enhancers were shown to establish specific anterior expression borders within the hindbrain and mediate expression in defined neuronal populations derived from hindbrain rhombomeres (r) 5 to 8, suggesting a late role of the genes in neuronal cell lineage specification. Species comparison showed that the zebrafish hoxb3a r5 and r6 enhancer corresponded to a sequence within the mouse HoxA cluster controlling activity of Hoxa3 in r5 and r6, whereas a homologous region within the HoxB cluster activated Hoxb3 expression but limited to r5. We conclude that the similarity of hoxb3a/Hoxa3 regulatory mechanisms reflects the shared descent of both genes from a single ancestral paralog group 3 gene.
Affiliation(s)
- Thorsten Hadrys
- Department of Physiology and Neuroscience, NYU Medical School, New York, NY 10016, USA
|
97
|
Monsieurs P, Thijs G, Fadda AA, De Keersmaecker SCJ, Vanderleyden J, De Moor B, Marchal K. More robust detection of motifs in coexpressed genes by using phylogenetic information. BMC Bioinformatics 2006; 7:160. [PMID: 16549017 PMCID: PMC1525208 DOI: 10.1186/1471-2105-7-160] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 12/14/2005] [Accepted: 03/20/2006] [Indexed: 11/30/2022]
Abstract
Background: Several motif detection algorithms have been developed to discover overrepresented motifs in sets of coexpressed genes. However, in a noisy gene list, the number of genes containing the motif versus the number lacking the motif might not be sufficiently high to allow detection by classical motif detection tools. To still recover motifs which are not significantly enriched but still present, we developed a procedure in which we use phylogenetic footprinting to first delineate all potential motifs in each gene. Then we mutually compare all detected motifs and identify the ones that are shared by at least a few genes in the data set as potential candidates. Results: We applied our methodology to a compiled test data set containing known regulatory motifs and to two biological data sets derived from genome wide expression studies. By executing four consecutive steps of 1) identifying conserved regions in orthologous intergenic regions, 2) aligning these conserved regions, 3) clustering the conserved regions containing similar regulatory regions followed by extraction of the regulatory motifs and 4) screening the input intergenic sequences with detected regulatory motif models, our methodology proves to be a powerful tool for detecting regulatory motifs when a low signal to noise ratio is present in the input data set. Comparing our results with two other motif detection algorithms points out the robustness of our algorithm. Conclusion: We developed an approach that can reliably identify multiple regulatory motifs lacking a high degree of overrepresentation in a set of coexpressed genes (motifs belonging to sparsely connected hubs in the regulatory network) by exploiting the advantages of using both coexpression and phylogenetic information.
Affiliation(s)
- Pieter Monsieurs
- ESAT-SCD/SISTA, K.U. Leuven, Kasteelpark Arenberg 10, 3001 Leuven-Heverlee, Belgium
- Gert Thijs
- ESAT-SCD/SISTA, K.U. Leuven, Kasteelpark Arenberg 10, 3001 Leuven-Heverlee, Belgium
- Abeer A Fadda
- Centre of Microbial and Plant Genetics, K.U. Leuven, Kasteelpark Arenberg 20, 3001 Leuven-Heverlee, Belgium
- Sigrid CJ De Keersmaecker
- Centre of Microbial and Plant Genetics, K.U. Leuven, Kasteelpark Arenberg 20, 3001 Leuven-Heverlee, Belgium
- Jozef Vanderleyden
- Centre of Microbial and Plant Genetics, K.U. Leuven, Kasteelpark Arenberg 20, 3001 Leuven-Heverlee, Belgium
- Bart De Moor
- ESAT-SCD/SISTA, K.U. Leuven, Kasteelpark Arenberg 10, 3001 Leuven-Heverlee, Belgium
- Kathleen Marchal
- Centre of Microbial and Plant Genetics, K.U. Leuven, Kasteelpark Arenberg 20, 3001 Leuven-Heverlee, Belgium
|
98
|
Hoppe R, Lambert TD, Samollow PB, Breer H, Strotmann J. Evolution of the "OR37" subfamily of olfactory receptors: a cross-species comparison. J Mol Evol 2006; 62:460-72. [PMID: 16547640 DOI: 10.1007/s00239-005-0093-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 04/21/2005] [Accepted: 11/17/2005] [Indexed: 01/09/2023]
Abstract
Genes encoding the olfactory receptors of the "OR37" subfamily of the mouse are characterized by special features including a clustered expression pattern, assembly in two distinct gene clusters, and highly conserved putative promoter motifs. Mining the rat and dog databases revealed that these two species possess highly conserved clusters of OR37 genes at two syntenic genomic loci. In a prototherian mammal, the platypus (Ornithorhynchus anatinus), none of the characteristic OR37 genes were found. Examination of a metatherian mammal, the gray short-tailed opossum (Monodelphis domestica), revealed seven canonical OR37 genes, all phylogenetically related to cluster II genes and also organized similarly to cluster II of eutherian species. In addition, their 5' upstream regions comprised sequence motifs related to the putative regulatory sequences of cluster II genes. Typical cluster I OR37 genes were identified only in the eutherian mammals examined, including the evolutionarily ancient anteater, wherein OR37 genes related to both clusters were present. Together, these results reveal novel information concerning the phylogenetic origin and important evolutionary steps of the mammalian-specific OR37 olfactory receptor family.
Affiliation(s)
- Reiner Hoppe
- Institute of Physiology, University of Hohenheim, Garbenstrasse 30, 70593, Stuttgart, Germany
|
99
|
Lelandais G, Vincens P, Badel-Chagnon A, Vialette S, Jacq C, Hazout S. Comparing gene expression networks in a multi-dimensional space to extract similarities and differences between organisms. Bioinformatics 2006; 22:1359-66. [PMID: 16527831 DOI: 10.1093/bioinformatics/btl087] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Molecular evolution, which is classically assessed by comparison of individual proteins or genes between species, can now be studied by comparing co-expressed functional groups of genes. This approach, which better reflects the functional constraints on the evolution of organisms, can exploit the large amount of data generated by genome-wide expression analyses. However, it requires new methodologies to represent the data in a more accessible way for cross-species comparisons. RESULTS: In this work, we present an approach based on Multi-dimensional Scaling techniques to compare the conformation of two gene expression networks, represented in a multi-dimensional space. The expression networks are optimally superimposed, taking into account two criteria: (1) inter-organism orthologous gene pairs have to be nearby points in the final multi-dimensional space and (2) the distortion of the gene expression networks, the organization of which reflects the similarities between the gene expression measurements, has to be circumscribed. Using this approach, we compared the transcriptional programs that drive sporulation in budding and fission yeasts, extracting some common properties and differences between the two species.
Affiliation(s)
- Gaëlle Lelandais
- Laboratoire de Génétique Moléculaire, CNRS UMR 8541, Ecole Normale Supérieure, 46 rue d'Ulm, 75230 Paris cedex 05, France.
|
100
|
Vitaterna MH, Pinto LH, Takahashi JS. Large-scale mutagenesis and phenotypic screens for the nervous system and behavior in mice. Trends Neurosci 2006; 29:233-40. [PMID: 16519954 PMCID: PMC3761413 DOI: 10.1016/j.tins.2006.02.006] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.7] [Received: 06/29/2005] [Revised: 12/20/2005] [Accepted: 02/17/2006] [Indexed: 11/20/2022]
Abstract
Significant developments have occurred in our understanding of the mammalian genome thanks to informatics, expression profiling and sequencing of the human and rodent genomes. However, although these facets of genomic analysis are being addressed, analysis of in vivo gene function remains a formidable task. Evaluation of the phenotype of mutants provides powerful access to gene function, and this approach is particularly relevant to the nervous system and behavior. Here, we discuss the complementary mouse genetic approaches of gene-driven, targeted mutagenesis and phenotype-driven, chemical mutagenesis. We highlight an NIH-supported large-scale effort to use phenotype-driven mutagenesis screens to identify mouse mutants with neural and behavioral alterations. Such single-gene mutations can then be used for gene identification using positional candidate gene-cloning methods.
Affiliation(s)
- Martha Hotz Vitaterna
- Center for Functional Genomics and Department of Neurobiology and Physiology, Northwestern University, Evanston, IL 60208, USA
|