51
Sun H, Skogerbø G, Wang Z, Liu W, Li Y. Structural relationships between highly conserved elements and genes in vertebrate genomes. PLoS One 2008; 3:e3727. [PMID: 19008958 PMCID: PMC2579482 DOI: 10.1371/journal.pone.0003727] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 06/17/2008] [Accepted: 10/26/2008] [Indexed: 02/03/2023] Open
Abstract
Large numbers of sequence elements have been identified as highly conserved among vertebrate genomes. These highly conserved elements (HCEs) are often located in or around genes that are involved in transcription regulation and early development. They have been shown to be involved in cis-regulatory activities through both in vivo and additional computational studies. We have investigated the structural relationships between such elements and genes in six vertebrate genomes (human, mouse, rat, chicken, zebrafish and tetraodon) and detected several thousand cases of conserved HCE-gene associations, and also cases of HCEs with no common target genes. A few examples underscore the potential significance of our findings about several individual genes. We found that the conserved associations between HCEs and genes are not restricted by the absolute distance between them on the genome. Notably, long-range associations were identified and the molecular functions of the associated genes do not show any particular overrepresentation of the functional categories previously reported. HCEs in close proximity are found to be linked with different sets of genes. The results reflect the highly complex correlation between HCEs and their putative target genes.
Affiliation(s)
- Hong Sun
  - Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
  - Biological Technologies, Wyeth Research, Cambridge, Massachusetts, United States of America
  - Shanghai Center for Bioinformation Technology, Shanghai, China
  - Zhongxin Biotechnology Shanghai Co. Ltd., Shanghai, China
- Geir Skogerbø
  - Bioinformatics Laboratory and National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
- Zhen Wang
  - Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
- Wei Liu
  - Biological Technologies, Wyeth Research, Cambridge, Massachusetts, United States of America
  - * E-mail: (WL); (YL)
- Yixue Li
  - Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
  - Shanghai Center for Bioinformation Technology, Shanghai, China
  - * E-mail: (WL); (YL)
52
Abstract
Ultraconserved elements (UCEs) are sequences that are identical between reference genomes of distantly related species. As they are under negative selection and enriched near or in specific classes of genes, one explanation for their ultraconservation may be their involvement in important functions. Indeed, many UCEs can drive tissue-specific gene expression. We have demonstrated that nonexonic UCEs are depleted among segmental duplications (SDs) and copy number variants (CNVs) and proposed that their ultraconservation may reflect a mechanism of copy counting via comparison. Here, we report that nonexonic UCEs are also depleted among 10 of 11 recent genomewide data sets of human CNVs, including 3 obtained with strategies permitting greater precision in determining the extents of CNVs. We further present observations suggesting that nonexonic UCEs per se may contribute to this depletion and that their apparent dosage sensitivity was in effect when they became fixed in the last common ancestor of mammals, birds, and reptiles, consistent with dosage sensitivity contributing to ultraconservation. Finally, in searching for the mechanism(s) underlying the function of nonexonic UCEs, we have found that they are enriched in TAATTA, which is also the recognition sequence for the homeodomain DNA-binding module, and bounded by a change in A + T frequency.
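The two sequence features mentioned at the end of this abstract, TAATTA content and A + T frequency, are simple to compute for any sequence. The sketch below is only an illustration under assumed conventions: the function names, the overlapping-match counting, and the example sequence are hypothetical and are not taken from the study.

```python
def count_motif(seq: str, motif: str = "TAATTA") -> int:
    """Count occurrences of motif in seq, allowing overlaps (case-insensitive).

    TAATTA is its own reverse complement, so scanning one strand is enough
    for this particular motif.
    """
    seq, motif = seq.upper(), motif.upper()
    last_start = len(seq) - len(motif)
    return sum(1 for i in range(last_start + 1) if seq.startswith(motif, i))


def at_fraction(seq: str) -> float:
    """Return the fraction of bases in seq that are A or T (0.0 if empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("A") + seq.count("T")) / len(seq)


if __name__ == "__main__":
    example = "GGTAATTAATTAGG"  # made-up sequence, for illustration only
    print(count_motif(example), round(at_fraction(example), 3))
```

Comparing `at_fraction` for an element against its flanking sequence would be one rough way to look for a change in A + T frequency at the boundary; a real analysis would also need a background model to call enrichment.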
53
Shakes LA, Malcolm TL, Allen KL, De S, Harewood KR, Chatterjee PK. Context dependent function of APPb enhancer identified using enhancer trap-containing BACs as transgenes in zebrafish. Nucleic Acids Res 2008; 36:6237-48. [PMID: 18832376 PMCID: PMC2577333 DOI: 10.1093/nar/gkn628] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022] Open
Abstract
An enhancer within intron 1 of the amyloid precursor protein gene (APPb) of zebrafish is identified functionally using a novel approach. Bacterial artificial chromosomes (BACs) were retrofitted with enhancer traps, and expressed as transgenes in zebrafish. Expression from both transient assays and stable lines was used for analysis. Although the enhancer was active in specific nonneural cells of the notochord when placed with APPb gene promoter proximal elements, its function was restricted to, and absolutely required for, specific expression in neurons when juxtaposed with additional far-upstream promoter elements of the gene. We demonstrate that expression of green fluorescent protein fluorescence resembling the tissue distribution of APPb mRNA requires both the intron 1 enhancer and approximately 28 kb of DNA upstream of the gene. The results indicate that tissue-specificity of an isolated enhancer may be quite different from that in the context of its own gene. Using this enhancer and upstream sequence, polymorphic variants of APPb can now more closely recapitulate the endogenous pattern and regulation of APPb expression in animal models for Alzheimer's disease. The methodology should help functionally map multiple noncontiguous regulatory elements in BACs with or without gene-coding sequences.
Affiliation(s)
- Leighcraft A Shakes
  - Julius L. Chambers Biomedical/Biotechnology Research Institute, Department of Chemistry, North Carolina Central University, Durham, NC 27707, USA
54
Isolation and characterization of conserved non-coding sequences among rice (Oryza sativa L.) paralogous regions. Mol Genet Genomics 2008; 281:11-8. [DOI: 10.1007/s00438-008-0388-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 07/19/2008] [Accepted: 09/14/2008] [Indexed: 01/07/2023]
55
Abstract
The strategic importance of the genome sequence of the gray, short-tailed opossum, Monodelphis domestica, accrues from both the unique phylogenetic position of metatherian (marsupial) mammals and the fundamental biologic characteristics of metatherians that distinguish them from other mammalian species. Metatherian and eutherian (placental) mammals are more closely related to one another than to other vertebrate groups, and owing to this close relationship they share fundamentally similar genetic structures and molecular processes. However, during their long evolutionary separation these alternative mammals have developed distinctive anatomical, physiologic, and genetic features that hold tremendous potential for examining relationships between the molecular structures of mammalian genomes and the functional attributes of their components. Comparative analyses using the opossum genome have already provided a wealth of new evidence regarding the importance of noncoding elements in the evolution of mammalian genomes, the role of transposable elements in driving genomic innovation, and the relationships between recombination rate, nucleotide composition, and the genomic distributions of repetitive elements. The genome sequence is also beginning to enlarge our understanding of the evolution and function of the vertebrate immune system, and it provides an alternative model for investigating mechanisms of genomic imprinting. Equally important, availability of the genome sequence is fostering the development of new research tools for physical and functional genomic analyses of M. domestica that are expanding its versatility as an experimental system for a broad range of research applications in basic biology and biomedically oriented research.
56
Graham LA, Lougheed SC, Ewart KV, Davies PL. Lateral transfer of a lectin-like antifreeze protein gene in fishes. PLoS One 2008; 3:e2616. [PMID: 18612417 PMCID: PMC2440524 DOI: 10.1371/journal.pone.0002616] [Citation(s) in RCA: 80] [Impact Index Per Article: 4.7] [Received: 03/30/2008] [Accepted: 06/05/2008] [Indexed: 11/18/2022] Open
Abstract
Fishes living in icy seawater are usually protected from freezing by endogenous antifreeze proteins (AFPs) that bind to ice crystals and stop them from growing. The scattered distribution of five highly diverse AFP types across phylogenetically disparate fish species is puzzling. The appearance of radically different AFPs in closely related species has been attributed to the rapid, independent evolution of these proteins in response to natural selection caused by sea level glaciations within the last 20 million years. In at least one instance the same type of simple repetitive AFP has independently originated in two distant species by convergent evolution. But, the isolated occurrence of three very similar type II AFPs in three distantly related species (herring, smelt and sea raven) cannot be explained by this mechanism. These globular, lectin-like AFPs have a unique disulfide-bonding pattern, and share up to 85% identity in their amino acid sequences, with regions of even higher identity in their genes. A thorough search of current databases failed to find a homolog in any other species with greater than 40% amino acid sequence identity. Consistent with this result, genomic Southern blots showed the lectin-like AFP gene was absent from all other fish species tested. The remarkable conservation of both intron and exon sequences, the lack of correlation between evolutionary distance and mutation rate, and the pattern of silent vs non-silent codon changes make it unlikely that the gene for this AFP pre-existed but was lost from most branches of the teleost radiation. We propose instead that lateral gene transfer has resulted in the occurrence of the type II AFPs in herring, smelt and sea raven and allowed these species to survive in an otherwise lethal niche.
Affiliation(s)
- Laurie A. Graham
  - Department of Biochemistry, Queen's University, Kingston, Ontario, Canada
- K. Vanya Ewart
  - NRC Institute for Marine Biosciences, Halifax, Nova Scotia, Canada
- Peter L. Davies
  - Department of Biochemistry, Queen's University, Kingston, Ontario, Canada
  - Department of Biology, Queen's University, Kingston, Ontario, Canada
  - NRC Institute for Marine Biosciences, Halifax, Nova Scotia, Canada
  - * E-mail:
57
Elgar G, Vavouri T. Tuning in to the signals: noncoding sequence conservation in vertebrate genomes. Trends Genet 2008; 24:344-52. [PMID: 18514361 DOI: 10.1016/j.tig.2008.04.005] [Citation(s) in RCA: 129] [Impact Index Per Article: 7.6] [Received: 02/07/2008] [Revised: 04/14/2008] [Accepted: 04/14/2008] [Indexed: 01/25/2023]
58
Kleinjan DA, Bancewicz RM, Gautier P, Dahm R, Schonthaler HB, Damante G, Seawright A, Hever AM, Yeyati PL, van Heyningen V, Coutinho P. Subfunctionalization of duplicated zebrafish pax6 genes by cis-regulatory divergence. PLoS Genet 2008; 4:e29. [PMID: 18282108 PMCID: PMC2242813 DOI: 10.1371/journal.pgen.0040029] [Citation(s) in RCA: 126] [Impact Index Per Article: 7.4] [Received: 06/27/2006] [Accepted: 12/21/2007] [Indexed: 01/22/2023] Open
Abstract
Gene duplication is a major driver of evolutionary divergence. In most vertebrates a single PAX6 gene encodes a transcription factor required for eye, brain, olfactory system, and pancreas development. In zebrafish, following a postulated whole-genome duplication event in an ancestral teleost, duplicates pax6a and pax6b jointly fulfill these roles. Mapping of the homozygously viable eye mutant sunrise identified a homeodomain missense change in pax6b, leading to loss of target binding. The mild phenotype emphasizes role-sharing between the co-orthologues. Meticulous mapping of isolated BACs identified perturbed synteny relationships around the duplicates. This highlights the functional conservation of pax6 downstream (3') control sequences, which in most vertebrates reside within the introns of a ubiquitously expressed neighbour gene, ELP4, whose pax6a-linked exons have been lost in zebrafish. Reporter transgenic studies in both mouse and zebrafish, combined with analysis of vertebrate sequence conservation, reveal loss and retention of specific cis-regulatory elements, correlating strongly with the diverged expression of co-orthologues, and providing clear evidence for evolution by subfunctionalization.
MESH Headings
- Animals
- Animals, Genetically Modified
- Base Sequence
- Chromosomes, Artificial, Bacterial/genetics
- Computational Biology
- DNA Primers/genetics
- Enhancer Elements, Genetic
- Evolution, Molecular
- Eye Abnormalities/embryology
- Eye Abnormalities/genetics
- Eye Proteins/genetics
- Gene Duplication
- Gene Expression Regulation, Developmental
- Genes, Homeobox
- Genes, Reporter
- Genetic Complementation Test
- Genetic Linkage
- Homeodomain Proteins/genetics
- Mice
- Mice, Transgenic
- Models, Genetic
- Molecular Sequence Data
- Mutation, Missense
- PAX6 Transcription Factor
- Paired Box Transcription Factors/genetics
- Phenotype
- Repressor Proteins/genetics
- Sequence Homology, Nucleic Acid
- Zebrafish/abnormalities
- Zebrafish/embryology
- Zebrafish/genetics
- Zebrafish Proteins/genetics
Affiliation(s)
- Dirk A Kleinjan
  - Medical Research Council (MRC) Human Genetics Unit, Western General Hospital, Edinburgh, United Kingdom
- Ruth M Bancewicz
  - Medical Research Council (MRC) Human Genetics Unit, Western General Hospital, Edinburgh, United Kingdom
- Philippe Gautier
  - Medical Research Council (MRC) Human Genetics Unit, Western General Hospital, Edinburgh, United Kingdom
- Ralf Dahm
  - Department of Genetics, Max-Planck Institute for Developmental Biology, Tübingen, Germany
- Helia B Schonthaler
  - Department of Genetics, Max-Planck Institute for Developmental Biology, Tübingen, Germany
- Giuseppe Damante
  - Department of Science and Biomedical Technology, University of Udine, Udine, Italy
- Anne Seawright
  - Medical Research Council (MRC) Human Genetics Unit, Western General Hospital, Edinburgh, United Kingdom
- Ann M Hever
  - Medical Research Council (MRC) Human Genetics Unit, Western General Hospital, Edinburgh, United Kingdom
- Patricia L Yeyati
  - Medical Research Council (MRC) Human Genetics Unit, Western General Hospital, Edinburgh, United Kingdom
- Veronica van Heyningen
  - Medical Research Council (MRC) Human Genetics Unit, Western General Hospital, Edinburgh, United Kingdom
  - * To whom correspondence should be addressed. E-mail:
- Pedro Coutinho
  - Medical Research Council (MRC) Human Genetics Unit, Western General Hospital, Edinburgh, United Kingdom
59
Rose D, Hertel J, Reiche K, Stadler PF, Hackermüller J. NcDNAlign: plausible multiple alignments of non-protein-coding genomic sequences. Genomics 2008; 92:65-74. [PMID: 18511233 DOI: 10.1016/j.ygeno.2008.04.003] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 06/21/2007] [Revised: 04/09/2008] [Accepted: 04/09/2008] [Indexed: 10/22/2022]
Abstract
Genome-wide multiple sequence alignments (MSAs) are a necessary prerequisite for an increasingly diverse collection of comparative genomic approaches. Here we present a versatile method that generates high-quality MSAs for non-protein-coding sequences. The NcDNAlign pipeline combines pairwise BLAST alignments to create initial MSAs, which are then locally improved and trimmed. The program is optimized for speed and hence is particularly well-suited to pilot studies. We demonstrate the practical use of NcDNAlign in three case studies: the search for ncRNAs in gammaproteobacteria and the analysis of conserved noncoding DNA in nematodes and teleost fish, in the latter case focusing on the fate of duplicated ultra-conserved regions. Compared to the currently widely used genome-wide alignment program TBA, our program results in a 20- to 30-fold reduction of CPU time necessary to generate gammaproteobacterial alignments. A showcase application of bacterial ncRNA prediction based on alignments of both algorithms results in similar sensitivity, false discovery rates, and up to 100 putatively novel ncRNA structures. Similar findings hold for our application of NcDNAlign to the identification of ultra-conserved regions in nematodes and teleosts. Both approaches yield conserved sequences of unknown function, result in novel evolutionary insights into conservation patterns among these genomes, and manifest the benefits of an efficient and reliable genome-wide alignment package. The software is available under the GNU Public License at http://www.bioinf.uni-leipzig.de/Software/NcDNAlign/.
Affiliation(s)
- Dominic Rose
  - Bioinformatics Group, Department of Computer Science, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany
60
Udvadia AJ. 3.6 kb genomic sequence from Takifugu capable of promoting axon growth-associated gene expression in developing and regenerating zebrafish neurons. Gene Expr Patterns 2008; 8:382-388. [PMID: 18599366 DOI: 10.1016/j.gep.2008.05.002] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.6] [Received: 02/12/2008] [Revised: 05/15/2008] [Accepted: 05/16/2008] [Indexed: 10/22/2022]
Abstract
Unlike mammals, fish have the capacity for functional adult CNS regeneration, which is due, in part, to their ability to express axon growth-related genes in response to nerve injury. One such axon growth-associated gene is gap43, which is expressed during periods of developmental and regenerative axon growth, but is not expressed in CNS neurons that do not regenerate in adult mammals. We previously demonstrated that cis-regulatory elements of gap43 that are sufficient for developmental expression are not sufficient for regenerative expression in the zebrafish. Here we have identified a 3.6kb genomic sequence from Fugu rubripes that can promote reporter gene expression in the nervous system during both development and regeneration in zebrafish. This compact sequence is advantageous for functional dissection of regions important for axon growth-associated gene expression during development and/or regeneration. In addition, this sequence will also be useful for targeting gene expression to neurons during periods of growth and plasticity.
Affiliation(s)
- Ava J Udvadia
  - University of Wisconsin - Milwaukee, Department of Biological Sciences, Great Lakes WATER Institute UWM-GLWI, 600 E. Greenfield Avenue, Milwaukee, WI 53204, United States.
61
Brinkmeyer-Langford C, Raudsepp T, Gustafson-Seabury A, Chowdhary BP. A BAC contig map over the proximal approximately 3.3 Mb region of horse chromosome 21. Cytogenet Genome Res 2008; 120:164-72. [PMID: 18467843 DOI: 10.1159/000118758] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Accepted: 12/06/2007] [Indexed: 11/19/2022] Open
Abstract
A total of 207 BAC clones containing 155 loci were isolated and arranged into a map of linearly ordered overlapping clones over the proximal part of horse chromosome 21 (ECA21), which corresponds to the proximal half of the short arm of human chromosome 19 (HSA19p) and part of HSA5. The clones form two contigs - each corresponding to the respective human chromosomes - that are estimated to be separated by a gap of approximately 200 kb. Of the 155 markers present in the two contigs, 141 (33 genes and 108 STS) were generated and mapped in this study. The BACs provide a 4-5x coverage of the region and span an estimated length of approximately 3.3 Mb. The region presently contains one mapped marker per 22 kb on average, which represents a major improvement over the previous resolution of one marker per 380 kb obtained through the generation of a dense RH map for this segment. Dual color fluorescence in situ hybridization on metaphase and interphase chromosomes verified the relative order of some of the BACs and helped to orient them accurately in the contigs. Despite having similar gene order and content, the equine region covered by the contigs appears to be distinctly smaller than the corresponding region in human (3.3 Mb vs. 5.5-6 Mb) because the latter harbors a host of repetitive elements and gene families unique to humans/primates. Considering limited representation of the region in the latest version of the horse whole genome sequence EquCab2, the dense map developed in this study will prove useful for the assembly and annotation of the sequence data on ECA21 and will be instrumental in rapid search and isolation of candidate genes for traits mapped to this region.
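The marker-density figures quoted in this abstract can be checked with simple arithmetic. The sketch below assumes that the density is just the estimated span divided by the number of markers; the abstract does not say exactly how the authors computed it, and the function name is hypothetical.

```python
def kb_per_marker(region_mb: float, n_markers: int) -> float:
    """Average spacing in kb, assuming markers are spread over the whole region."""
    return region_mb * 1000.0 / n_markers


# Values taken from the abstract: ~3.3 Mb span, 155 markers, previous map 380 kb/marker.
spacing_kb = kb_per_marker(3.3, 155)
fold_improvement = 380.0 / spacing_kb

print(f"about {spacing_kb:.1f} kb per marker")
print(f"about {fold_improvement:.0f}-fold denser than the previous map")
```

Under this assumption the spacing comes out slightly above 21 kb, in line with the reported figure of roughly one marker per 22 kb.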
Affiliation(s)
- C Brinkmeyer-Langford
  - Department of Veterinary Integrative Biomedical Sciences, College of Veterinary Medicine, Texas A&M University, College Station, TX, USA
62
Multiple transcription start sites for FOXP2 with varying cellular specificities. Gene 2008; 413:42-8. [DOI: 10.1016/j.gene.2008.01.015] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 11/27/2007] [Revised: 01/15/2008] [Accepted: 01/16/2008] [Indexed: 01/22/2023]
63
Hadzhiev Y, Lang M, Ertzer R, Meyer A, Strähle U, Müller F. Functional diversification of sonic hedgehog paralog enhancers identified by phylogenomic reconstruction. Genome Biol 2008; 8:R106. [PMID: 17559649 PMCID: PMC2394741 DOI: 10.1186/gb-2007-8-6-r106] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Received: 01/08/2007] [Revised: 05/09/2007] [Accepted: 06/08/2007] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Cis-regulatory modules of developmental genes are targets of evolutionary changes that underlie the morphologic diversity of animals. Little is known about the 'grammar' of interactions between transcription factors and cis-regulatory modules and therefore about the molecular mechanisms that underlie changes in these modules, particularly after gene and genome duplications. We investigated the ar-C midline enhancer of sonic hedgehog (shh) orthologs and paralogs from distantly related vertebrate lineages, from fish to human, including the basal vertebrate Latimeria menadoensis. RESULTS: We demonstrate that the sonic hedgehog b (tiggy winkle hedgehog; shhb) genes of fishes, the paralogs of sonic hedgehog a (shha), have a modified ar-C enhancer, which specifies a diverged function at the embryonic midline. We have identified several conserved motifs that are indicative of putative transcription factor binding sites by local alignment of ar-C enhancers of numerous vertebrate sequences. To trace the evolutionary changes among paralog enhancers, phylogenomic reconstruction was carried out and lineage-specific motif changes were identified. The relation between motif composition and observed developmental differences was evaluated through transgenic functional analyses. Altering and exchanging motifs between paralog enhancers resulted in reversal of enhancer specificity in the floor plate and notochord. A model reconstructing enhancer divergence during vertebrate evolution was developed. CONCLUSION: Our model suggests that the identified motifs of the ar-C enhancer function as binary switches that are responsible for specific activity between midline tissues, and that these motifs are adjusted during functional diversification of paralogs. The unraveled motif changes can also account for the complex interpretation of activator and repressor input signals within a single enhancer.
Affiliation(s)
- Yavor Hadzhiev
  - Laboratory of Developmental Transcription Regulation, Institute of Toxicology and Genetics, Forschungszentrum Karlsruhe, Karlsruhe D-76021, Germany
  - Laboratory of Developmental Neurobiology and Genetics, Institute of Toxicology and Genetics, Forschungszentrum Karlsruhe, Karlsruhe D-76021, Germany
- Michael Lang
  - Department of Zoology and Evolution biology, Faculty of Biology, University of Konstanz, Konstanz D-78457, Germany
  - Departament de Genètica, Universitat de Barcelona, Av. Diagonal 645, 08028 Barcelona, Spain
- Raymond Ertzer
  - Laboratory of Developmental Neurobiology and Genetics, Institute of Toxicology and Genetics, Forschungszentrum Karlsruhe, Karlsruhe D-76021, Germany
- Axel Meyer
  - Department of Zoology and Evolution biology, Faculty of Biology, University of Konstanz, Konstanz D-78457, Germany
- Uwe Strähle
  - Laboratory of Developmental Neurobiology and Genetics, Institute of Toxicology and Genetics, Forschungszentrum Karlsruhe, Karlsruhe D-76021, Germany
- Ferenc Müller
  - Laboratory of Developmental Transcription Regulation, Institute of Toxicology and Genetics, Forschungszentrum Karlsruhe, Karlsruhe D-76021, Germany
64
Spitz F, Duboule D. Global control regions and regulatory landscapes in vertebrate development and evolution. Advances in Genetics 2008; 61:175-205. [PMID: 18282506 DOI: 10.1016/s0065-2660(07)00006-5] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.2] [Indexed: 12/12/2022]
Abstract
During the course of evolution, many genes that control the development of metazoan body plans were co-opted to exert novel functions, along with the emergence or modification of structures. Gene amplification and/or changes in the cis-regulatory modules responsible for the transcriptional activity of these genes have certainly contributed in a major way to evolution of gene functions. In some cases, these processes led to the formation of groups of adjacent genes that appear to be controlled by both global and shared mechanisms.
Affiliation(s)
- Francois Spitz
  - Developmental Biology Unit, EMBL, 69117 Heidelberg, Germany
65
Nebert DW, Zhang G, Vesell ES. From human genetics and genomics to pharmacogenetics and pharmacogenomics: past lessons, future directions. Drug Metab Rev 2008; 40:187-224. [PMID: 18464043 PMCID: PMC2752627 DOI: 10.1080/03602530801952864] [Citation(s) in RCA: 103] [Impact Index Per Article: 6.1] [Indexed: 02/06/2023]
Abstract
A brief history of human genetics and genomics is provided, comparing recent progress in those fields with that in pharmacogenetics and pharmacogenomics, which are subsets of genetics and genomics, respectively. Sequencing of the entire human genome, the mapping of common haplotypes of single-nucleotide polymorphisms (SNPs), and cost-effective genotyping technologies leading to genome-wide association (GWA) studies have combined convincingly in the past several years to demonstrate the requirements needed to separate true associations from the plethora of false positives. While research in human genetics has moved from monogenic to oligogenic to complex diseases, its pharmacogenetics branch has followed, usually a few years behind. The continuous discoveries, even today, of new surprises about our genome cause us to question reviews declaring that "personalized medicine is almost here" or that "individualized drug therapy will soon be a reality." As summarized herein, numerous reasons exist to show that an "unequivocal genotype" or even an "unequivocal phenotype" is virtually impossible to achieve in current limited-size studies of human populations. This problem (of insufficiently stringent criteria) leads to a decrease in statistical power and, consequently, equivocal interpretation of most genotype-phenotype association studies. It remains unclear whether personalized medicine or individualized drug therapy will ever be achievable by means of DNA testing alone.
Affiliation(s)
- Daniel W Nebert
  - Division of Human Genetics, Department of Pediatrics & Molecular Developmental Biology, Cincinnati, Ohio 45267-0056, USA.
66
67
Woolfe A, Elgar G. Comparative genomics using Fugu reveals insights into regulatory subfunctionalization. Genome Biol 2007; 8:R53. [PMID: 17428329 PMCID: PMC1896008 DOI: 10.1186/gb-2007-8-4-r53] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.1] [Received: 12/01/2006] [Revised: 03/06/2007] [Accepted: 04/11/2007] [Indexed: 02/04/2023] Open
Abstract
Fish-mammal genomic alignments were used to compare over 800 conserved non-coding elements that associate with genes that have undergone fish-specific duplication and retention, revealing a pattern of element retention and loss between paralogs indicative of subfunctionalization. Background: A major mechanism for the preservation of gene duplicates in the genome is thought to be mediated via loss or modification of cis-regulatory subfunctions between paralogs following duplication (a process known as regulatory subfunctionalization). Despite a number of gene expression studies that support this mechanism, no comprehensive analysis of regulatory subfunctionalization has been undertaken at the level of the distal cis-regulatory modules involved. We have exploited fish-mammal genomic alignments to identify and compare more than 800 conserved non-coding elements (CNEs) that associate with genes that have undergone fish-specific duplication and retention. Results: Using the abundance of duplicated genes within the Fugu genome, we selected seven pairs of teleost-specific paralogs involved in early vertebrate development, each containing clusters of CNEs in their vicinity. CNEs present around each Fugu duplicated gene were identified using multiple alignments of orthologous regions between single-copy mammalian orthologs (representing the ancestral locus) and each fish duplicated region in turn. Comparative analysis reveals a pattern of element retention and loss between paralogs indicative of subfunctionalization, the extent of which differs between duplicate pairs. In addition to complete loss of specific regulatory elements, a number of CNEs have been retained in both regions but may be responsible for more subtle levels of subfunctionalization through sequence divergence. Conclusion: Comparative analysis of conserved elements between duplicated genes provides a powerful approach for studying regulatory subfunctionalization at the level of the regulatory elements involved.
Affiliation(s)
- Adam Woolfe
- School of Biological Sciences, Queen Mary, University of London, Mile End Road, London E1 4NS, UK
- Genomic Functional Analysis Section, National Human Genome Research Institute, National Institutes of Health, Rockville, MD 20870, USA
| | - Greg Elgar
- School of Biological Sciences, Queen Mary, University of London, Mile End Road, London E1 4NS, UK
|
68
|
Retelska D, Beaudoing E, Notredame C, Jongeneel CV, Bucher P. Vertebrate conserved non coding DNA regions have a high persistence length and a short persistence time. BMC Genomics 2007; 8:398. [PMID: 17973996 PMCID: PMC2211324 DOI: 10.1186/1471-2164-8-398] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 06/01/2007] [Accepted: 10/31/2007] [Indexed: 12/21/2022] Open
Abstract
Background The comparison of complete genomes has revealed surprisingly large numbers of conserved non-protein-coding (CNC) DNA regions. However, the biological function of CNC remains elusive. CNC differ in two aspects from conserved protein-coding regions. They are not conserved across phylum boundaries, and they do not contain readily detectable sub-domains. Here we characterize the persistence length and time of CNC and conserved protein-coding regions in the vertebrate and insect lineages. Results The persistence length is the length of a genome region over which a certain level of sequence identity is consistently maintained. The persistence time is the evolutionary period during which a conserved region evolves under the same selective constraints. Our main findings are: (i) Insect genomes contain 1.60 times less conserved information than vertebrates; (ii) Vertebrate CNC have a higher persistence length than conserved coding regions or insect CNC; (iii) CNC have shorter persistence times as compared to conserved coding regions in both lineages. Conclusion Higher persistence length of vertebrate CNC indicates that the conserved information in vertebrates and insects is organized in functional elements of different lengths. These findings might be related to the higher morphological complexity of vertebrates and give clues about the structure of active CNC elements. Shorter persistence time might explain the previously puzzling observations of highly conserved CNC within each phylum, and of a lack of conservation between phyla. It suggests that CNC divergence might be a key factor in vertebrate evolution. Further evolutionary studies will help to relate individual CNC to specific developmental processes.
Affiliation(s)
- Dorota Retelska
- Computational Cancer Genomics Group, Swiss Institute of Bioinformatics, Lausanne, Switzerland.
|
69
|
Paparidis Z, Abbasi AA, Malik S, Goode DK, Callaway H, Elgar G, deGraaff E, Lopez-Rios J, Zeller R, Grzeschik KH. Ultraconserved non-coding sequence element controls a subset of spatiotemporal GLI3 expression. Dev Growth Differ 2007; 49:543-53. [PMID: 17661744 DOI: 10.1111/j.1440-169x.2007.00954.x] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.7] [Indexed: 01/30/2023]
Abstract
The zinc-finger transcription factor GLI3 acts during vertebrate development in a combinatorial, context-dependent fashion as a primary transducer of sonic hedgehog (SHH) signaling. In humans, mutations affecting this key regulator of development are associated with GLI3-morphopathies, a group of congenital malformations in which forebrain and limb development are preferentially affected. We show that a non-coding element from intron two of GLI3, ultraconserved in mammals and highly conserved in the pufferfish Fugu, is a transcriptional enhancer. In transient transfection assays, it activates reporter gene transcription in human cell cultures expressing endogenous GLI3 but not in GLI3 negative cells. The identified enhancer element is predicted to contain conserved binding sites for transcription factors crucial for developmental steps in which GLI3 is involved. The regulatory potential of this element is conserved and was used to direct tissue-specific expression of a green fluorescent protein reporter gene in zebrafish embryos and of a beta-galactosidase reporter in transgenic mouse embryos. Time, location, and quantity of reporter gene expression are congruent with part of the pattern previously reported for endogenous GLI3 transcription.
Affiliation(s)
- Zissis Paparidis
- Institute of Human Genetics, Philipps-University, Bahnhofstrasse 7, D35037 Marburg, Germany
|
70
|
Conrad B, Antonarakis SE. Gene Duplication: A Drive for Phenotypic Diversity and Cause of Human Disease. Annu Rev Genomics Hum Genet 2007; 8:17-35. [PMID: 17386002 DOI: 10.1146/annurev.genom.8.021307.110233] [Citation(s) in RCA: 185] [Impact Index Per Article: 10.3] [Indexed: 11/09/2022]
Abstract
Gene duplication is one of the key factors driving genetic innovation, i.e., producing novel genetic variants. Although the contribution of whole-genome and segmental duplications to phenotypic diversity across species is widely appreciated, the phenotypic spectrum and potential pathogenicity of small-scale duplications in individual genomes are less well explored. This review discusses the nature of small-scale duplications and the phenotypes produced by such duplications. Phenotypic variation and disease phenotypes induced by duplications are more diverse and widespread than previously anticipated, and duplications are a major class of disease-related genomic variation. Pathogenic duplications particularly involve dosage-sensitive genes with both similar and dissimilar over- and underexpression phenotypes, and genes encoding proteins with a propensity to aggregate. Phenotypes related to human-specific copy number variation in genes regulating environmental responses and immunity are increasingly recognized. Small genomic duplications containing defense-related genes also contribute to complex common phenotypes.
Affiliation(s)
- Bernard Conrad
- Department of Genetic Medicine & Development, University of Geneva Medical School and Geneva University Hospitals, CH-1211 Geneva 4, Switzerland.
|
71
|
Woolfe A, Goode DK, Cooke J, Callaway H, Smith S, Snell P, McEwen GK, Elgar G. CONDOR: a database resource of developmentally associated conserved non-coding elements. BMC DEVELOPMENTAL BIOLOGY 2007; 7:100. [PMID: 17760977 PMCID: PMC2020477 DOI: 10.1186/1471-213x-7-100] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.0] [Received: 06/16/2007] [Accepted: 08/30/2007] [Indexed: 12/04/2022]
Abstract
Background Comparative genomics is currently one of the most popular approaches to study the regulatory architecture of vertebrate genomes. Fish-mammal genomic comparisons have proved powerful in identifying conserved non-coding elements likely to be distal cis-regulatory modules such as enhancers, silencers or insulators that control the expression of genes involved in the regulation of early development. The scientific community is showing increasing interest in characterizing the function, evolution and language of these sequences. Despite this, there remains little in the way of user-friendly access to a large dataset of such elements in conjunction with the analysis and the visualization tools needed to study them. Description Here we present CONDOR (COnserved Non-coDing Orthologous Regions) available at: . In an interactive and intuitive way the website displays data on > 6800 non-coding elements associated with over 120 early developmental genes and conserved across vertebrates. The database regularly incorporates results of ongoing in vivo zebrafish enhancer assays of the CNEs carried out in-house, which currently number ~100. Included and highlighted within this set are elements derived from duplication events both at the origin of vertebrates and more recently in the teleost lineage, thus providing valuable data for studying the divergence of regulatory roles between paralogs. CONDOR therefore provides a number of tools and facilities to allow scientists to progress in their own studies on the function and evolution of developmental cis-regulation. Conclusion By providing access to data with an approachable graphics interface, the CONDOR database presents a rich resource for further studies into the regulation and evolution of genes involved in early development.
Affiliation(s)
- Adam Woolfe
- School of Biological Sciences, Queen Mary, University of London, Mile End Road, London E1 4NS, UK
- Genomic Functional Analysis Section, National Human Genome Research Institute, National Institutes of Health, Rockville, MD 20870, USA
| | - Debbie K Goode
- School of Biological Sciences, Queen Mary, University of London, Mile End Road, London E1 4NS, UK
| | - Julie Cooke
- School of Biological Sciences, Queen Mary, University of London, Mile End Road, London E1 4NS, UK
| | - Heather Callaway
- School of Biological Sciences, Queen Mary, University of London, Mile End Road, London E1 4NS, UK
| | - Sarah Smith
- School of Biological Sciences, Queen Mary, University of London, Mile End Road, London E1 4NS, UK
| | - Phil Snell
- School of Biological Sciences, Queen Mary, University of London, Mile End Road, London E1 4NS, UK
| | - Gayle K McEwen
- School of Biological Sciences, Queen Mary, University of London, Mile End Road, London E1 4NS, UK
- Genomic Functional Analysis Section, National Human Genome Research Institute, National Institutes of Health, Rockville, MD 20870, USA
| | - Greg Elgar
- School of Biological Sciences, Queen Mary, University of London, Mile End Road, London E1 4NS, UK
|
72
|
Kim SY, Pritchard JK. Adaptive evolution of conserved noncoding elements in mammals. PLoS Genet 2007; 3:1572-86. [PMID: 17845075 PMCID: PMC1971121 DOI: 10.1371/journal.pgen.0030147] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.1] [Received: 02/19/2007] [Accepted: 07/13/2007] [Indexed: 02/07/2023] Open
Abstract
Conserved noncoding elements (CNCs) are an abundant feature of vertebrate genomes. Some CNCs have been shown to act as cis-regulatory modules, but the function of most CNCs remains unclear. To study the evolution of CNCs, we have developed a statistical method called the "shared rates test" to identify CNCs that show significant variation in substitution rates across branches of a phylogenetic tree. We report an application of this method to alignments of 98,910 CNCs from the human, chimpanzee, dog, mouse, and rat genomes. We find that approximately 68% of CNCs evolve according to a null model where, for each CNC, a single parameter models the level of constraint acting throughout the phylogeny linking these five species. The remaining approximately 32% of CNCs show departures from the basic model including speed-ups and slow-downs on particular branches and occasionally multiple rate changes on different branches. We find that a subset of the significant CNCs have evolved significantly faster than the local neutral rate on a particular branch, providing strong evidence for adaptive evolution in these CNCs. The distribution of these signals on the phylogeny suggests that adaptive evolution of CNCs occurs in occasional short bursts of evolution. Our analyses suggest a large set of promising targets for future functional studies of adaptation.
Affiliation(s)
- Su Yeon Kim
- Department of Statistics, The University of Chicago, Chicago, Illinois, United States of America
- * To whom correspondence should be addressed. E-mail: (SYK); (JKP)
| | - Jonathan K Pritchard
- Department of Human Genetics, The University of Chicago, Chicago, Illinois, United States of America
- * To whom correspondence should be addressed. E-mail: (SYK); (JKP)
|
73
|
Bond HM, Mesuraca M, Amodio N, Mega T, Agosti V, Fanello D, Pelaggi D, Bullinger L, Grieco M, Moore MAS, Venuta S, Morrone G. Early hematopoietic zinc finger protein-zinc finger protein 521: a candidate regulator of diverse immature cells. Int J Biochem Cell Biol 2007; 40:848-54. [PMID: 17543573 DOI: 10.1016/j.biocel.2007.04.006] [Citation(s) in RCA: 51] [Impact Index Per Article: 2.8] [Received: 03/31/2007] [Revised: 04/10/2007] [Accepted: 04/11/2007] [Indexed: 12/12/2022]
Abstract
The early hematopoietic zinc finger protein/zinc finger protein 521 (EHZF/ZNF521) is a recently identified, 1131 amino-acid-long nuclear factor that contains 30 zinc fingers distributed in clusters throughout its sequence. A 13-AA motif that binds to components of the nuclear remodelling and histone deacetylation (NuRD) complex and is conserved in several transcriptional co-repressors is located at the amino-terminal end of the molecule. EHZF/ZNF521 expression is high in the most immature cells of the haematopoietic system and declines with differentiation. Its transcript is also abundant in brain, particularly in the cerebellum. Its murine counterpart, Evi3/Zfp521, is enriched in haematopoietic and neural stem cells, in cerebellar granule neuron precursors and in the developing striatum. Enforced expression of EHZF/ZNF521 in haematopoietic progenitors results in their expansion and in inhibition of differentiation. EHZF/ZNF521 is a member of the BMP signalling pathway and an inhibitor of the transcription factor OLF1/EBF1, implicated in the differentiation of neural progenitors and in the specification of the B-cell lineage. EHZF expression is observed in most acute myelogenous leukaemias and is particularly high in those with rearrangements of the MLL gene, where EHZF may contribute to the leukaemic phenotype. EHZF/ZNF521 is also abundant in medulloblastomas and other brain tumours. Taken together, the data available suggest a possible role for this factor in development, stem cell regulation and oncogenesis.
Affiliation(s)
- Heather M Bond
- Laboratory of Molecular Haematopoiesis, Department of Experimental and Clinical Medicine, University of Catanzaro Magna Graecia, 88100 Catanzaro, Italy
|
74
|
Abbasi AA, Paparidis Z, Malik S, Goode DK, Callaway H, Elgar G, Grzeschik KH. Human GLI3 intragenic conserved non-coding sequences are tissue-specific enhancers. PLoS One 2007; 2:e366. [PMID: 17426814 PMCID: PMC1838922 DOI: 10.1371/journal.pone.0000366] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.1] [Received: 02/02/2007] [Accepted: 03/19/2007] [Indexed: 11/19/2022] Open
Abstract
The zinc-finger transcription factor GLI3 is a key regulator of development, acting as a primary transducer of Sonic hedgehog (SHH) signaling in a combinatorial, context-dependent fashion and controlling multiple patterning steps in different tissues/organs. A tight temporal and spatial control of gene expression is indispensable; however, cis-acting sequence elements regulating GLI3 expression have not yet been reported. We show that 11 ancient genomic DNA signatures, conserved from the pufferfish Takifugu (Fugu) rubripes to man, are distributed throughout the introns of human GLI3. They map within larger conserved non-coding elements (CNEs) that are found in the tetrapod lineage. Full-length CNEs transiently transfected into human cell cultures acted as cell-type-specific enhancers of gene transcription. The regulatory potential of these elements is conserved and was exploited to direct tissue-specific expression of a reporter gene in zebrafish embryos. Assays of deletion constructs revealed that the human-Fugu conserved sequences within the GLI3 intronic CNEs were essential but not sufficient for full-scale transcriptional activation. The enhancer activity of the CNEs is determined by a combinatorial effect of a core sequence conserved between human and teleosts (Fugu) and flanking tetrapod-specific sequences, suggesting that successive clustering of sequences with regulatory potential around an ancient, highly conserved nucleus might be a possible mechanism for the evolution of cis-acting regulatory elements.
Affiliation(s)
- Amir Ali Abbasi
- Institute of Human Genetics, Philipps-University, Marburg, Germany
| | - Zissis Paparidis
- Institute of Human Genetics, Philipps-University, Marburg, Germany
| | - Sajid Malik
- Institute of Human Genetics, Philipps-University, Marburg, Germany
| | - Debbie K. Goode
- School of Biological and Chemical Sciences, Queen Mary University of London, London, United Kingdom
| | - Heather Callaway
- School of Biological and Chemical Sciences, Queen Mary University of London, London, United Kingdom
| | - Greg Elgar
- School of Biological and Chemical Sciences, Queen Mary University of London, London, United Kingdom
| | - Karl-Heinz Grzeschik
- Institute of Human Genetics, Philipps-University, Marburg, Germany
- * To whom correspondence should be addressed. E-mail:
|
75
|
Sasaki YTF, Sano M, Kin T, Asai K, Hirose T. Coordinated expression of ncRNAs and HOX mRNAs in the human HOXA locus. Biochem Biophys Res Commun 2007; 357:724-30. [PMID: 17445766 DOI: 10.1016/j.bbrc.2007.03.200] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.6] [Received: 03/16/2007] [Accepted: 03/29/2007] [Indexed: 11/26/2022]
Abstract
In the human HOXA locus a number of ncRNAs are transcribed from the intergenic regions in the opposite direction to HOXA mRNAs. We observed that the genomic organization of genes for the ncRNAs and HOXA proteins is highly conserved between human and mouse. We examined the expression profiles of these ncRNAs and HOXA mRNAs in various human tissues. The expression patterns of ncRNAs in human tissues coincide with those of the adjacent HOXA mRNAs that are collinearly expressed along the anteroposterior axis. This coordinated expression was observed even in transformed tumors and cancer cell lines, suggesting that the expression of ncRNAs is prerequisite for the regulated expression of HOXA genes. HIT18844 ncRNA transcribed from the most upstream position of the HOXA cluster possesses an ultra-conserved short stretch which potentially forms an evolutionarily conserved secondary structure. Our data suggest a critical role for ncRNAs in the regulation of HOXA gene expression.
Affiliation(s)
- Yasnory T F Sasaki
- Functional RNA Research Team, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, 2-42 Aomi, Koutou, Tokyo 135-0064, Japan
|
76
|
Abnizova I, Subhankulova T, Gilks WR. Recent computational approaches to understand gene regulation: mining gene regulation in silico. Curr Genomics 2007; 8:79-91. [PMID: 18660846 PMCID: PMC2435357 DOI: 10.2174/138920207780368150] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 11/30/2006] [Revised: 12/13/2006] [Accepted: 12/15/2006] [Indexed: 01/03/2023] Open
Abstract
This paper reviews recent computational approaches to the understanding of gene regulation in eukaryotes. Cis-regulation of gene expression by the binding of transcription factors is a critical component of cellular physiology. In eukaryotes, a number of transcription factors often work together in a combinatorial fashion to enable cells to respond to a wide spectrum of environmental and developmental signals. Integration of genome sequences and/or Chromatin Immunoprecipitation on chip data with gene-expression data has facilitated in silico discovery of how the combinatorics and positioning of transcription factors binding sites underlie gene activation in a variety of cellular processes. The process of gene regulation is extremely complex and intriguing, therefore all possible points of view and related links should be carefully considered. Here we attempt to collect an inventory, not claiming it to be comprehensive and complete, of related computational biological topics covering gene regulation, which may enlighten the process, and briefly review what is currently occurring in these areas. We will consider the following computational areas:
- gene regulatory network construction;
- evolution of regulatory DNA;
- studies of its structural and statistical informational properties;
- and finally, regulatory RNA.
Affiliation(s)
| | - T Subhankulova
- Wellcome Trust/Cancer Research UK Gurdon Institute of Cancer and Developmental Biology, Cambridge, UK
|
77
|
Kikuta H, Laplante M, Navratilova P, Komisarczuk AZ, Engström PG, Fredman D, Akalin A, Caccamo M, Sealy I, Howe K, Ghislain J, Pezeron G, Mourrain P, Ellingsen S, Oates AC, Thisse C, Thisse B, Foucher I, Adolf B, Geling A, Lenhard B, Becker TS. Genomic regulatory blocks encompass multiple neighboring genes and maintain conserved synteny in vertebrates. Genome Res 2007; 17:545-55. [PMID: 17387144 PMCID: PMC1855176 DOI: 10.1101/gr.6086307] [Citation(s) in RCA: 261] [Impact Index Per Article: 14.5] [Indexed: 12/25/2022]
Abstract
We report evidence for a mechanism for the maintenance of long-range conserved synteny across vertebrate genomes. We found the largest mammal-teleost conserved chromosomal segments to be spanned by highly conserved noncoding elements (HCNEs), their developmental regulatory target genes, and phylogenetically and functionally unrelated "bystander" genes. Bystander genes are not specifically under the control of the regulatory elements that drive the target genes and are expressed in patterns that are different from those of the target genes. Reporter insertions distal to zebrafish developmental regulatory genes pax6.1/2, rx3, id1, and fgf8 and miRNA genes mirn9-1 and mirn9-5 recapitulate the expression patterns of these genes even if located inside or beyond bystander genes, suggesting that the regulatory domain of a developmental regulatory gene can extend into and beyond adjacent transcriptional units. We termed these chromosomal segments genomic regulatory blocks (GRBs). After whole genome duplication in teleosts, GRBs, including HCNEs and target genes, were often maintained in both copies, while bystander genes were typically lost from one GRB, strongly suggesting that evolutionary pressure acts to keep the single-copy GRBs of higher vertebrates intact. We show that loss of bystander genes and other mutational events suffered by duplicated GRBs in teleost genomes permits target gene identification and HCNE/target gene assignment. These findings explain the absence of evolutionary breakpoints from large vertebrate chromosomal segments and will aid in the recognition of position effect mutations within human GRBs.
Affiliation(s)
- Hiroshi Kikuta
- Sars Centre for Marine Molecular Biology, University of Bergen, 5008 Bergen, Norway
| | - Mary Laplante
- Sars Centre for Marine Molecular Biology, University of Bergen, 5008 Bergen, Norway
| | - Pavla Navratilova
- Sars Centre for Marine Molecular Biology, University of Bergen, 5008 Bergen, Norway
| | - Anna Z. Komisarczuk
- Sars Centre for Marine Molecular Biology, University of Bergen, 5008 Bergen, Norway
| | - Pär G. Engström
- Computational Biology Unit, University of Bergen, 5008 Bergen, Norway
- Programme for Genomics and Bioinformatics, Department of Cell and Molecular Biology, Karolinska Institutet, 17177 Stockholm, Sweden
| | - David Fredman
- Computational Biology Unit, University of Bergen, 5008 Bergen, Norway
| | - Altuna Akalin
- Computational Biology Unit, University of Bergen, 5008 Bergen, Norway
| | - Mario Caccamo
- Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, United Kingdom
| | - Ian Sealy
- Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, United Kingdom
| | - Kerstin Howe
- Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, United Kingdom
| | - Julien Ghislain
- Biologie Moléculaire du Développement, INSERM U368, Ecole Normale Supérieure, Paris, 75230 Paris, Cedex 05 France
| | - Guillaume Pezeron
- Biologie Moléculaire du Développement, INSERM U368, Ecole Normale Supérieure, Paris, 75230 Paris, Cedex 05 France
| | - Philippe Mourrain
- Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, United Kingdom
| | - Staale Ellingsen
- Sars Centre for Marine Molecular Biology, University of Bergen, 5008 Bergen, Norway
| | - Andrew C. Oates
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
| | | | - Bernard Thisse
- IGBMC, CNRS/INSERM/ULP, BP10142, 67404 Illkirch, Cedex, France
| | - Isabelle Foucher
- Unité de Génétique des Déficits Sensoriels, Institut Pasteur, F-75724 Paris Cedex 15, France
| | - Birgit Adolf
- Institute of Developmental Genetics, GSF Research Center, 85764 Neuherberg, Germany
| | - Andrea Geling
- Institute of Developmental Genetics, GSF Research Center, 85764 Neuherberg, Germany
| | - Boris Lenhard
- Sars Centre for Marine Molecular Biology, University of Bergen, 5008 Bergen, Norway
- Computational Biology Unit, University of Bergen, 5008 Bergen, Norway
| | - Thomas S. Becker
- Sars Centre for Marine Molecular Biology, University of Bergen, 5008 Bergen, Norway
- Corresponding author. E-mail ; fax 47-55584305
|
78
|
Hadley D, Murphy T, Valladares O, Hannenhalli S, Ungar L, Kim J, Bućan M. Patterns of sequence conservation in presynaptic neural genes. Genome Biol 2007; 7:R105. [PMID: 17096848 PMCID: PMC1794582 DOI: 10.1186/gb-2006-7-11-r105] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 06/22/2006] [Revised: 09/25/2006] [Accepted: 11/10/2006] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The neuronal synapse is a fundamental functional unit in the central nervous system of animals. Because synaptic function is evolutionarily conserved, we reasoned that functional sequences of genes and related genomic elements known to play important roles in neurotransmitter release would also be conserved. RESULTS Evolutionary rate analysis revealed that presynaptic proteins evolve slowly, although some members of large gene families exhibit accelerated evolutionary rates relative to other family members. Comparative sequence analysis of 46 megabases spanning 150 presynaptic genes identified more than 26,000 elements that are highly conserved in eight vertebrate species, as well as a small subset of sequences (6%) that are shared among unrelated presynaptic genes. Analysis of large gene families revealed that upstream and intronic regions of closely related family members are extremely divergent. We also identified 504 exceptionally long conserved elements (≥360 base pairs, ≥80% pair-wise identity between human and other mammals) in intergenic and intronic regions of presynaptic genes. Many of these elements form a highly stable stem-loop RNA structure and consequently are candidates for novel regulatory elements, whereas some conserved noncoding elements are shown to correlate with specific gene expression profiles. The SynapseDB online database integrates these findings and other functional genomic resources for synaptic genes. CONCLUSION Highly conserved elements in nonprotein coding regions of 150 presynaptic genes represent sequences that may be involved in the transcriptional or post-transcriptional regulation of these genes. Furthermore, comparative sequence analysis will facilitate selection of genes and noncoding sequences for future functional studies and analysis of variation studies in neurodevelopmental and psychiatric disorders.
Affiliation(s)
- Dexter Hadley
- Penn Center for Bioinformatics, 423 Guardian Drive, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Genomics and Computational Biology Graduate Group, 423 Guardian Drive, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | - Tara Murphy
- Department of Genetics in the School of Medicine, University of Pennsylvania, 415 Curie Boulevard Philadelphia, Pennsylvania 19104, USA
- UCLA Neuroscience Graduate Office, 695 Young Drive South, Los Angeles, California 90095, USA
| | - Otto Valladares
- Department of Genetics in the School of Medicine, University of Pennsylvania, 415 Curie Boulevard Philadelphia, Pennsylvania 19104, USA
| | - Sridhar Hannenhalli
- Penn Center for Bioinformatics, 423 Guardian Drive, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Genomics and Computational Biology Graduate Group, 423 Guardian Drive, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | - Lyle Ungar
- Penn Center for Bioinformatics, 423 Guardian Drive, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Department of Computer & Information Sciences in School of Engineering and Applied Sciences, 3330 Walnut Street, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | - Junhyong Kim
- Penn Center for Bioinformatics, 423 Guardian Drive, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Department of Computer & Information Sciences in School of Engineering and Applied Sciences, 3330 Walnut Street, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Department of Biology in the School of Arts and Sciences, 433 S University Avenue, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | - Maja Bućan
- Penn Center for Bioinformatics, 423 Guardian Drive, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Department of Genetics in the School of Medicine, University of Pennsylvania, 415 Curie Boulevard Philadelphia, Pennsylvania 19104, USA
|
79
|
Vavouri T, Walter K, Gilks WR, Lehner B, Elgar G. Parallel evolution of conserved non-coding elements that target a common set of developmental regulatory genes from worms to humans. Genome Biol 2007; 8:R15. [PMID: 17274809 PMCID: PMC1852409 DOI: 10.1186/gb-2007-8-2-r15] [Citation(s) in RCA: 82] [Impact Index Per Article: 4.6] [Received: 07/25/2006] [Revised: 10/20/2006] [Accepted: 02/02/2007] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND The human genome contains thousands of non-coding sequences that are often more conserved between vertebrate species than protein-coding exons. These highly conserved non-coding elements (CNEs) are associated with genes that coordinate development, and have been proposed to act as transcriptional enhancers. Despite their extreme sequence conservation in vertebrates, sequences homologous to CNEs have not been identified in invertebrates. RESULTS Here we report that nematode genomes contain an alternative set of CNEs that share sequence characteristics, but not identity, with their vertebrate counterparts. CNEs thus represent a very unusual class of sequences that are extremely conserved within specific animal lineages yet are highly divergent between lineages. Nematode CNEs are also associated with developmental regulatory genes, and include well-characterized enhancers and transcription factor binding sites, supporting the proposed function of CNEs as cis-regulatory elements. Most remarkably, 40 of 156 human CNE-associated genes with invertebrate orthologs are also associated with CNEs in both worms and flies. CONCLUSION A core set of genes that regulate development is associated with CNEs across three animal groups (worms, flies and vertebrates). We propose that these CNEs reflect the parallel evolution of alternative enhancers for a common set of developmental regulatory genes in different animal groups. This 're-wiring' of gene regulatory networks containing key developmental coordinators was probably a driving force during the evolution of animal body plans. CNEs may, therefore, represent the genomic traces of these 'hard-wired' core gene regulatory networks that specify the development of each alternative animal body plan.
Affiliation(s)
- Tanya Vavouri
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- School of Biological and Chemical Sciences, Queen Mary, University of London, London E1 4NS, UK
- Klaudia Walter
- MRC Biostatistics Unit, Institute of Public Health, Cambridge CB2 2SR, UK
- Walter R Gilks
- Department of Statistics, University of Leeds, Leeds LS2 9JT, UK
- Ben Lehner
- EMBL/CRG Systems Biology Unit, Centre for Genomic Regulation (CRG), UPF, C/Dr. Aiguader 88, Barcelona 08003, Spain
- Greg Elgar
- School of Biological and Chemical Sciences, Queen Mary, University of London, London E1 4NS, UK
|
80
|
Sun H, Skogerbø G, Chen R. Conserved distances between vertebrate highly conserved elements. Hum Mol Genet 2006; 15:2911-22. [PMID: 16923797 DOI: 10.1093/hmg/ddl232] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Indexed: 01/30/2023] Open
Abstract
High numbers of sequence elements with very high (>95%) sequence conservation between the human and other vertebrate genomes have been reported and ascribed putative cis-regulatory functions. We have investigated the structural relationships between such elements in mammalian genomes and find that not only their sequences, but also the distances between them are significantly (P < 2.2 × 10⁻¹⁶) more conserved than corresponding distances between orthologous protein-coding genes or between exons within these genes. Regions of largely conserved distance between consecutive highly conserved elements (HCEs) generally overlap previously identified HCE clusters, but may be far longer (up to 20 Mb) and possibly cover close to 25% of the human genome sequence. Similar conservation of distance is found between bird (chicken) and mammalian genomes and is also discernible in comparisons between fish and mammals. The data suggest either that a substantial number of essential (functionally active) elements with lower sequence conservation occupy the space between the HCEs or that distance itself is an important factor in transcriptional regulation or chromatin modelling.
Affiliation(s)
- Hong Sun
- Bioinformatics Laboratory, Institute of Biophysics, Chinese Academy of Sciences, Beijing, P.R. China
|
81
|
Xie X, Kamal M, Lander ES. A family of conserved noncoding elements derived from an ancient transposable element. Proc Natl Acad Sci U S A 2006; 103:11659-64. [PMID: 16864796 PMCID: PMC1518811 DOI: 10.1073/pnas.0604768103] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.1] [Indexed: 01/29/2023] Open
Abstract
The evolutionary origin of the conserved noncoding elements (CNEs) in the human genome remains poorly understood but may hold important clues to their biological functions. Here, we report the discovery of a CNE family with approximately 124 instances in the human genome that demonstrates a clear signature of having been derived from an ancient transposon. The CNE family is also present in the chicken genome, although typically not at orthologous locations. The CNE family is closely related to the active transposon SINE3 in zebrafish and also to a previously uncharacterized transposon in the coelacanth, the so-called "living fossil" belonging to the lobe-finned fish lineage. The mammal, bird, zebrafish, and coelacanth families all share a highly similar core element of approximately 180 bp but have important differences in their 5' and 3' ends. The core element has thus been preserved over 450 million years of evolution, implying an important biological function. In addition, we identify 95 additional CNE families that likely predate the mammalian radiation. The results highlight both the creative role of transposons and the importance of CNE families.
Affiliation(s)
- Xiaohui Xie
- Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142
- Michael Kamal
- Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142
- Eric S. Lander
- Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142
- Whitehead Institute for Biomedical Research, Cambridge, MA 02142
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115
- To whom correspondence should be addressed. E-mail:
|
82
|
Bejerano G, Lowe CB, Ahituv N, King B, Siepel A, Salama SR, Rubin EM, Kent WJ, Haussler D. A distal enhancer and an ultraconserved exon are derived from a novel retroposon. Nature 2006; 441:87-90. [PMID: 16625209 DOI: 10.1038/nature04696] [Citation(s) in RCA: 369] [Impact Index Per Article: 19.4] [Received: 11/27/2005] [Accepted: 03/02/2006] [Indexed: 01/15/2023]
Abstract
Hundreds of highly conserved distal cis-regulatory elements have been characterized so far in vertebrate genomes. Many thousands more are predicted on the basis of comparative genomics. However, in stark contrast to the genes that they regulate, in invertebrates virtually none of these regions can be traced by using sequence similarity, leaving their evolutionary origins obscure. Here we show that a class of conserved, primarily non-coding regions in tetrapods originated from a previously unknown short interspersed repetitive element (SINE) retroposon family that was active in the Sarcopterygii (lobe-finned fishes and terrestrial vertebrates) in the Silurian period at least 410 million years ago (ref. 4), and seems to be recently active in the 'living fossil' Indonesian coelacanth, Latimeria menadoensis. Using a mouse enhancer assay we show that one copy, 0.5 million bases from the neuro-developmental gene ISL1, is an enhancer that recapitulates multiple aspects of Isl1 expression patterns. Several other copies represent new, possibly regulatory, alternatively spliced exons in the middle of pre-existing Sarcopterygian genes. One of these, a more than 200-base-pair ultraconserved region, 100% identical in mammals, and 80% identical to the coelacanth SINE, contains a 31-amino-acid-residue alternatively spliced exon of the messenger RNA processing gene PCBP2 (ref. 6). These add to a growing list of examples in which relics of transposable elements have acquired a function that serves their host, a process termed 'exaptation', and provide an origin for at least some of the many highly conserved vertebrate-specific genomic sequences.
Affiliation(s)
- Gill Bejerano
- Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, California 95064, USA.
|