1
Sabarís G, Ortíz DM, Laiker I, Mayansky I, Naik S, Cavalli G, Stern DL, Preger-Ben Noon E, Frankel N. The Density of Regulatory Information Is a Major Determinant of Evolutionary Constraint on Noncoding DNA in Drosophila. Mol Biol Evol 2024; 41:msae004. [PMID: 38364113 PMCID: PMC10871701 DOI: 10.1093/molbev/msae004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Revised: 11/26/2023] [Accepted: 01/05/2024] [Indexed: 02/18/2024]
Abstract
Evolutionary analyses have estimated that ∼60% of nucleotides in intergenic regions of the Drosophila melanogaster genome are functionally relevant, suggesting that regulatory information may be encoded more densely in intergenic regions than has been revealed by most functional dissections of regulatory DNA. Here, we approached this issue through a functional dissection of the regulatory region of the gene shavenbaby (svb). Most of the ∼90 kb of this large regulatory region is highly conserved in the genus Drosophila, though characterized enhancers occupy a small fraction of this region. By analyzing the regulation of svb in different contexts of Drosophila development, we found that the regulatory information that drives svb expression in the abdominal pupal epidermis is organized in a different way than the elements that drive svb expression in the embryonic epidermis. While in the embryonic epidermis svb is activated by compact enhancers separated by large inactive DNA regions, svb expression in the pupal epidermis is driven by regulatory information distributed over broader regions of svb cis-regulatory DNA. In the same vein, we observed that other developmental genes also display a dense distribution of putative regulatory elements in their regulatory regions. Furthermore, we found that a large percentage of conserved noncoding DNA of the Drosophila genome is contained within regions of open chromatin. These results suggest that part of the evolutionary constraint on noncoding DNA of Drosophila is explained by the density of regulatory information, which may be greater than previously appreciated.
Affiliation(s)
- Gonzalo Sabarís
  - Instituto de Fisiología, Biología Molecular y Neurociencias (IFIBYNE), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Universidad de Buenos Aires (UBA), Buenos Aires 1428, Argentina
  - Institute of Human Genetics, UMR 9002 CNRS-Université de Montpellier, Montpellier, France
- Daniela M Ortíz
  - Instituto de Fisiología, Biología Molecular y Neurociencias (IFIBYNE), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Universidad de Buenos Aires (UBA), Buenos Aires 1428, Argentina
- Ian Laiker
  - Instituto de Fisiología, Biología Molecular y Neurociencias (IFIBYNE), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Universidad de Buenos Aires (UBA), Buenos Aires 1428, Argentina
- Ignacio Mayansky
  - Instituto de Fisiología, Biología Molecular y Neurociencias (IFIBYNE), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Universidad de Buenos Aires (UBA), Buenos Aires 1428, Argentina
- Sujay Naik
  - Department of Genetics and Developmental Biology, The Rappaport Faculty of Medicine and Research Institute, Technion—Israel Institute of Technology, Haifa 3109601, Israel
- Giacomo Cavalli
  - Institute of Human Genetics, UMR 9002 CNRS-Université de Montpellier, Montpellier, France
- David L Stern
  - Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20147, USA
- Ella Preger-Ben Noon
  - Department of Genetics and Developmental Biology, The Rappaport Faculty of Medicine and Research Institute, Technion—Israel Institute of Technology, Haifa 3109601, Israel
- Nicolás Frankel
  - Instituto de Fisiología, Biología Molecular y Neurociencias (IFIBYNE), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Universidad de Buenos Aires (UBA), Buenos Aires 1428, Argentina
  - Departamento de Ecología, Genética y Evolución, Facultad de Ciencias Exactas y Naturales (FCEN), Universidad de Buenos Aires (UBA), Buenos Aires 1428, Argentina
2
Pizzollo J, Zintel TM, Babbitt CC. Differentially active and conserved neural enhancers define two forms of adaptive non-coding evolution in humans. Genome Biol Evol 2022; 14:6648393. [PMID: 35866592 PMCID: PMC9348619 DOI: 10.1093/gbe/evac108] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Accepted: 07/11/2022] [Indexed: 11/28/2022]
Abstract
The human and chimpanzee genomes are strikingly similar, but our neural phenotypes are very different. Many of these differences are likely driven by changes in gene expression, and some of those changes may have been adaptive during human evolution. Yet, the relative contributions of positive selection on regulatory regions or other functional regulatory changes are unclear. Where are these changes located throughout the human genome? Are functional regulatory changes near genes or are they in distal enhancer regions? In this study, we experimentally combined both human and chimpanzee cis-regulatory elements (CREs) that either (1) showed signs of accelerated evolution in humans or (2) have been shown to be active in the human brain. Using a massively parallel reporter assay, we tested the ability of orthologous human and chimpanzee CREs to activate transcription in induced pluripotent stem-cell-derived neural progenitor cells and neurons. With this assay, we identified 179 CREs with differential activity between human and chimpanzee; in contrast, we found 722 CREs with signs of positive selection in humans. CREs under selection and differentially expressed CREs differ strikingly in level of expression, size, and genomic location. We found a subset of 69 CREs in loci with genetic variants associated with neuropsychiatric diseases, which underscores the consequence of regulatory activity in these loci for proper neural development and function. Combining CREs that either experienced recent selection in humans or are functional brain enhancers presents a novel way of studying the evolution of noncoding elements that contribute to human neural phenotypes.
Affiliation(s)
- Jason Pizzollo
  - Molecular and Cellular Biology Graduate Program, University of Massachusetts Amherst, Amherst, MA 01003, USA
  - Department of Biology, University of Massachusetts Amherst, Amherst, MA 01003, USA
- Trisha M Zintel
  - Molecular and Cellular Biology Graduate Program, University of Massachusetts Amherst, Amherst, MA 01003, USA
  - Department of Biology, University of Massachusetts Amherst, Amherst, MA 01003, USA
- Courtney C Babbitt
  - Department of Biology, University of Massachusetts Amherst, Amherst, MA 01003, USA
3
tiRNAs: Insights into Their Biogenesis, Functions, and Future Applications in Livestock Research. Noncoding RNA 2022; 8:ncrna8030037. [PMID: 35736634 PMCID: PMC9231384 DOI: 10.3390/ncrna8030037] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/04/2022] [Revised: 05/20/2022] [Accepted: 05/23/2022] [Indexed: 11/29/2022]
Abstract
Transfer RNA (tRNA)-derived small RNAs (tsRNAs) belong to a group of transfer ribonucleic acid (tRNA)-derived fragments that have recently gained interest as molecules with specific biological functions. Their involvement in the regulation of physiological processes and pathological phenotypes suggests molecular roles similar to those of miRNAs. tsRNA biogenesis under specific physiological conditions will offer new perspectives in understanding diseases, and may provide new sources for biological marker design to determine and monitor the health status of farm animals. In this review, we focus on the latest discoveries about tsRNAs and give special attention to molecules initially thought to be mainly associated with tRNA-derived stress-induced RNAs (tiRNAs). We present an outline of their biological functions, offer a collection of useful databases, and discuss future research perspectives and applications in livestock basic and applied research.
4
McCole RB, Erceg J, Saylor W, Wu CT. Ultraconserved Elements Occupy Specific Arenas of Three-Dimensional Mammalian Genome Organization. Cell Rep 2019; 24:479-488. [PMID: 29996107 DOI: 10.1016/j.celrep.2018.06.031] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 12/11/2017] [Revised: 05/09/2018] [Accepted: 06/07/2018] [Indexed: 12/23/2022]
Abstract
This study explores the relationship between three-dimensional genome organization and ultraconserved elements (UCEs), an enigmatic set of DNA elements that are perfectly conserved between the reference genomes of distantly related species. Examining both human and mouse genomes, we interrogate the relationship of UCEs to three features of chromosome organization derived from Hi-C studies. We find that UCEs are enriched within contact domains and, further, that the subset of UCEs within domains shared across diverse cell types are linked to kidney-related and neuronal processes. In boundaries, UCEs are generally depleted, with those that do overlap boundaries being overrepresented in exonic UCEs. Regarding loop anchors, UCEs are neither overrepresented nor underrepresented, but those present in loop anchors are enriched for splice sites. Finally, as the relationships between UCEs and human Hi-C features are conserved in mouse, our findings suggest that UCEs contribute to interspecies conservation of genome organization and, thus, genome stability.
Affiliation(s)
- Ruth B McCole
  - Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- Jelena Erceg
  - Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- Wren Saylor
  - Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- Chao-Ting Wu
  - Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
5
Li L, Barth NKH, Hirth E, Taher L. Pairs of Adjacent Conserved Noncoding Elements Separated by Conserved Genomic Distances Act as Cis-Regulatory Units. Genome Biol Evol 2018; 10:2535-2550. [PMID: 30184074 PMCID: PMC6161761 DOI: 10.1093/gbe/evy196] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Accepted: 09/01/2018] [Indexed: 01/02/2023]
Abstract
Comparative genomic studies have identified thousands of conserved noncoding elements (CNEs) in the mammalian genome, many of which have been reported to exert cis-regulatory activity. We analyzed ∼5,500 pairs of adjacent CNEs in the human genome and found that despite divergence at the nucleotide sequence level, the inter-CNE distances of the pairs are under strong evolutionary constraint, with inter-CNE sequences featuring significantly lower transposon densities than expected. Further, we show that different degrees of conservation of the inter-CNE distance are associated with distinct cis-regulatory functions at the CNEs. Specifically, the CNEs in pairs with conserved and mildly contracted inter-CNE sequences are the most likely to represent active or poised enhancers. In contrast, CNEs in pairs with extremely contracted or expanded inter-CNE sequences are associated with no cis-regulatory activity. Furthermore, we observed that functional CNEs in a pair have very similar epigenetic profiles, hinting at a functional relationship between them. Taken together, our results support the existence of epistatic interactions between adjacent CNEs that are distance-sensitive and disrupted by transposon insertions and deletions, and contribute to our understanding of the selective forces acting on cis-regulatory elements, which are crucial for elucidating the molecular mechanisms underlying adaptive evolution and human genetic diseases.
Affiliation(s)
- Lifei Li
  - Division of Bioinformatics, Department of Biology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Nicolai K H Barth
  - Division of Bioinformatics, Department of Biology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Eva Hirth
  - Division of Bioinformatics, Department of Biology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Leila Taher
  - Division of Bioinformatics, Department of Biology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
6
Polychronopoulos D, Sellis D, Almirantis Y. Conserved noncoding elements follow power-law-like distributions in several genomes as a result of genome dynamics. PLoS One 2014; 9:e95437. [PMID: 24787386 PMCID: PMC4008492 DOI: 10.1371/journal.pone.0095437] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 11/20/2013] [Accepted: 03/26/2014] [Indexed: 12/31/2022]
Abstract
Conserved, ultraconserved and other classes of constrained elements (collectively referred to as CNEs here), identified by comparative genomics in a wide variety of genomes, are non-randomly distributed across chromosomes. These elements are defined using various degrees of conservation between organisms and several thresholds of minimal length. Here we investigate the chromosomal distribution of CNEs by studying the statistical properties of distances between consecutive CNEs. We find widespread power-law-like distributions, i.e. linearity in double logarithmic scale, in the inter-CNE distances, a feature which is connected with fractality and self-similarity. Given that CNEs are often found to be spatially associated with genes, especially with those that regulate developmental processes, we verify by appropriate gene masking that a power-law-like pattern emerges irrespective of whether elements found close to or inside genes are excluded. An evolutionary model is put forward for the understanding of these findings that includes segmental or whole genome duplication events and eliminations (loss) of most of the duplicated CNEs. Simulations reproduce the main features of the observed size distributions. Power-law-like patterns in the genomic distributions of CNEs are in accordance with current knowledge about their evolutionary history in several genomes.
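The "linearity in double logarithmic scale" mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the use of a cumulative distribution, and the ordinary least-squares fit are all assumptions chosen for illustration, and the coordinates are made up.

```python
import math

def inter_element_distances(elements):
    """Gaps between consecutive (start, end) intervals on one chromosome."""
    ordered = sorted(elements)
    gaps = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        gap = next_start - prev_end
        if gap > 0:  # ignore overlapping or touching elements
            gaps.append(gap)
    return gaps

def loglog_slope(distances):
    """Least-squares slope of log10 P(X >= d) plotted against log10 d."""
    n = len(distances)
    xs, ys = [], []
    for d in sorted(set(distances)):
        survivors = sum(1 for x in distances if x >= d)
        xs.append(math.log10(d))
        ys.append(math.log10(survivors / n))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical element coordinates, for illustration only.
elements = [(0, 10), (20, 30), (130, 140), (1140, 1150)]
gaps = inter_element_distances(elements)
slope = loglog_slope(gaps)
```

A roughly straight line (a stable negative slope over several orders of magnitude) in this kind of plot is what "power-law-like" refers to; a real analysis would need many more distances and a more careful fitting procedure than a plain regression.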
Affiliation(s)
- Dimitris Polychronopoulos
  - Institute of Biosciences and Applications, National Center for Scientific Research “Demokritos”, Athens, Greece
  - Department of Biochemistry and Molecular Biology, Faculty of Biology, National and Kapodistrian University of Athens, Athens, Greece
- Diamantis Sellis
  - Department of Biology, Stanford University, Stanford, California, United States of America
- Yannis Almirantis
  - Institute of Biosciences and Applications, National Center for Scientific Research “Demokritos”, Athens, Greece
7
Makunin IV, Shloma VV, Stephen SJ, Pheasant M, Belyakin SN. Comparison of ultra-conserved elements in drosophilids and vertebrates. PLoS One 2013; 8:e82362. [PMID: 24349264 PMCID: PMC3862641 DOI: 10.1371/journal.pone.0082362] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 07/08/2013] [Accepted: 10/24/2013] [Indexed: 11/18/2022]
Abstract
Metazoan genomes contain many ultra-conserved elements (UCEs), long sequences identical between distant species. In this study we identified UCEs in drosophilid and vertebrate species with a similar level of phylogenetic divergence measured at protein-coding regions, and demonstrated that both the length and number of UCEs are larger in vertebrates. The proportion of non-exonic UCEs declines in distant drosophilids whilst an opposite trend was observed in vertebrates. We generated a set of 2,126 Sophophora UCEs by merging elements identified in several Drosophila species and compared these to the eutherian UCEs identified in placental mammals. In contrast to vertebrates, the Sophophora UCEs are depleted around transcription start sites. Analysis of 52,954 P-element, piggyBac and Minos insertions in the D. melanogaster genome revealed depletion of the P-element and piggyBac insertions in and around the Sophophora UCEs. We examined eleven fly strains with transposon insertions into the intergenic UCEs and identified associated phenotypes in five strains. Four insertions behave as recessive lethals, and in one case we observed a suppression of the marker gene within the transgene, presumably by silenced chromatin around the integration site. To confirm that the lethality is caused by integration of transposons, we performed a phenotype rescue experiment for two stocks and demonstrated that the excision of the transposons from the intergenic UCEs restores viability. Sequencing of DNA after the transposon excision in one fly strain with the restored viability revealed a 47 bp insertion at the original transposon integration site, suggesting that the nature of the mutation is important for the appearance of the phenotype. Our results suggest that the UCEs in flies and vertebrates have both common and distinct features, and demonstrate that a significant proportion of intergenic Drosophila UCEs are sensitive to disruption.
Affiliation(s)
- Igor V. Makunin
  - Research Computing Centre, The University of Queensland, Brisbane, Queensland, Australia
  - Institute of Molecular and Cellular Biology SD RAS, Novosibirsk, Russia
- Viktor V. Shloma
  - Institute of Molecular and Cellular Biology SD RAS, Novosibirsk, Russia
- Stuart J. Stephen
  - Computational Biology Group, CSIRO Plant Industry, Canberra, Australian Capital Territory, Australia
- Michael Pheasant
  - Research Computing Centre, The University of Queensland, Brisbane, Queensland, Australia
8
When needles look like hay: how to find tissue-specific enhancers in model organism genomes. Dev Biol 2010; 350:239-54. [PMID: 21130761 DOI: 10.1016/j.ydbio.2010.11.026] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Received: 04/14/2010] [Revised: 11/11/2010] [Accepted: 11/22/2010] [Indexed: 01/22/2023]
Abstract
A major prerequisite for the investigation of tissue-specific processes is the identification of cis-regulatory elements. No generally applicable technique is available to distinguish them from any other type of genomic non-coding sequence. Therefore, researchers often have to identify these elements by elaborate in vivo screens, testing individual regions until the right one is found. Here, based on many examples from the literature, we summarize how functional enhancers have been isolated from other elements in the genome and how they have been characterized in transgenic animals. Covering computational and experimental studies, we provide an overview of the global properties of cis-regulatory elements, like their specific interactions with promoters and target gene distances. We describe conserved non-coding elements (CNEs) and their internal structure, nucleotide composition, binding site clustering and overlap, with a special focus on developmental enhancers. Conflicting data and unresolved questions on the nature of these elements are highlighted. Our comprehensive overview of the experimental shortcuts that have been found in the different model organism communities and the new field of high-throughput assays should help during the preparation phase of a screen for enhancers. The review is accompanied by a list of general guidelines for such a project.
9
TIAN J, ZHAO ZH, CHEN HP. [Conserved non-coding elements in human genome]. YI CHUAN = HEREDITAS 2009; 31:1067-1076. [PMID: 19933086 DOI: 10.3724/sp.j.1005.2009.01067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/28/2023]
Abstract
Comparative genomics studies have revealed that about 5% of the human genome is under purifying selection, 3.5% of which consists of conserved non-coding elements (CNEs), whereas coding regions comprise only a small part. In humans, CNEs are functionally important and may be associated with the establishment and maintenance of chromatin architecture, transcriptional regulation, and pre-mRNA processing. They are also related to mammalian ontogeny and human diseases. This review outlines the identification, functional significance, and evolutionary origin of CNEs and their effects on human genetic defects.
Affiliation(s)
- Jing TIAN
  - Institute of Biotechnology, Academy of Military Medical Science, Beijing 100071, China.
10
Moghadam HK, Ferguson MM, Danzmann RG. Comparative genomics and evolution of conserved noncoding elements (CNE) in rainbow trout. BMC Genomics 2009; 10:278. [PMID: 19549339 PMCID: PMC2711117 DOI: 10.1186/1471-2164-10-278] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 12/02/2008] [Accepted: 06/23/2009] [Indexed: 12/04/2022]
Abstract
Background: Recent advances in the accumulation of genetic mapping and DNA sequence information from several salmonid species support the long-standing view of an autopolyploid origin of these fishes (i.e., 4R). However, the paralogy relationships of the chromosomal segments descended from earlier polyploidization events (i.e., 2R/3R) largely remain unknown, mainly due to an unbalanced pseudogenization of paralogous genes that were once resident on the ancient duplicated segments. Inter-specific conserved noncoding elements (CNE) might hold the key to identifying these regions, if they are associated with arrays of genes that have been highly conserved in syntenic blocks through evolution. To test this hypothesis, we investigated the chromosomal positions of a subset of CNE in the rainbow trout genome using a comparative genomic framework.
Results: Through a genome-wide analysis, we selected 41 pairs of adjacent CNE located on various chromosomes in zebrafish and obtained their intervening, less conserved, sequence information from rainbow trout. We identified 56 distinct fragments corresponding to about 150 Kbp of sequence data that were localized to 67 different chromosomal regions in the rainbow trout genome. The genomic positions of many duplicated CNE provided additional support for some previously suggested homeologies in this species. Additionally, we now propose 40 new potential paralogous affinities by analyzing the variation in the segregation patterns of some multi-copy CNE along with the synteny association comparison using several model vertebrates. Some of these regions appear to carry signatures of the 1R, 2R or 3R duplications. A subset of these CNE markers also demonstrated high utility in identifying homologous chromosomal segments in the genomes of Atlantic salmon and Arctic charr.
Conclusion: CNE seem to be more efficacious than coding sequences in providing insights into the ancient paralogous affinities within the vertebrate genomes. Such a feature makes these elements extremely attractive for comparative genomics studies, as they can be treated as 'anchor' markers to investigate the association of distally located candidate genes on the homologous genomic segments of closely or distantly related organisms.
Affiliation(s)
- Hooman K Moghadam
  - Department of Integrative Biology, University of Guelph, Guelph, Ontario, Canada.
11
Sun H, Skogerbø G, Zheng X, Liu W, Li Y. Genomic regions with distinct genomic distance conservation in vertebrate genomes. BMC Genomics 2009; 10:133. [PMID: 19323843 PMCID: PMC2667192 DOI: 10.1186/1471-2164-10-133] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 10/21/2008] [Accepted: 03/27/2009] [Indexed: 11/15/2022]
Abstract
Background: A number of vertebrate highly conserved elements (HCEs) have been detected and their genomic interval distances have been reported to be more conserved than protein coding genes among mammalian genomes. A characteristic of the human – non-mammalian comparisons is a bimodal distribution of relative distance difference of conserved consecutive HCE pairs, and it is difficult to attribute such a profile to a random assortment. We therefore undertook an analysis of the human genomic regions confined by consecutive HCE pairs common to eight genomes (human, mouse, rat, chicken, frog, zebrafish, tetraodon and fugu).
Results: Among HCE pairs, we found that some consistently preserve highly conserved interval distance among genomes while others have relatively low distance conservation. Using a partition method, we detected two groups of inter-HCE regions (IHRs) with distinct distance conservation patterns in vertebrate genomes: IHR1s, which are bordered by HCE pairs with relatively small distance variation, and IHR2s, with larger distance difference values. Compared to random background, annotated repeat sequences are significantly less frequent in IHR1s than IHR2s, which reflects a correlation between repeat sequences and the length expansion of IHRs. Both groups of IHRs are unexpectedly more enriched in human indel (i.e., insertion and deletion) polymorphisms than random background. The correlation between the percentage of conserved sequence and human IHR length was stronger for IHR1 than IHR2. Both groups of IHRs are significantly enriched for CpG islands.
Conclusion: The data suggest that subsets of HCE pairs may undergo different evolutionary paths in light of their genomic distance conservation, and that sets of genomic regions pertaining to HCEs, as well as the region in which HCEs reside, should be treated as integrated domains.
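The idea of a "relative distance difference" between consecutive HCE pairs, and of splitting pairs into a low-variation and a high-variation group, can be sketched in Python as follows. The abstract gives neither the formula nor the partition method, so the normalisation by the larger distance, the `threshold` value, and all names and numbers below are assumptions made purely for illustration.

```python
def relative_distance_difference(dist_ref, dist_other):
    """Signed difference between two inter-element distances, scaled into (-1, 1).

    Assumed definition: (other - ref) / max(ref, other).
    """
    if dist_ref <= 0 or dist_other <= 0:
        raise ValueError("distances must be positive")
    return (dist_other - dist_ref) / max(dist_ref, dist_other)

def partition_pairs(pairs, threshold=0.2):
    """Split (name, dist_ref, dist_other) records by absolute relative difference.

    `threshold` is a hypothetical cutoff, not a value taken from the paper.
    """
    conserved, variable = [], []
    for name, dist_ref, dist_other in pairs:
        rdd = relative_distance_difference(dist_ref, dist_other)
        target = conserved if abs(rdd) <= threshold else variable
        target.append((name, rdd))
    return conserved, variable

# Made-up distances (bp) between one element pair in two genomes.
pairs = [
    ("pair_a", 10_000, 10_500),  # nearly unchanged spacing
    ("pair_b", 10_000, 40_000),  # strongly expanded
    ("pair_c", 8_000, 2_000),    # strongly contracted
]
conserved, variable = partition_pairs(pairs)
```

In this toy setup the `conserved` list plays a role loosely analogous to IHR1-type regions and `variable` to IHR2-type regions; a fixed cutoff is only the simplest stand-in for whatever partition method the study used.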
Affiliation(s)
- Hong Sun
  - Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, PR China.
12
Mello BP, Abrantes EF, Torres CH, Machado-Lima A, Fonseca RDS, Carraro DM, Brentani RR, Reis LFL, Brentani H. No-match ORESTES explored as tumor markers. Nucleic Acids Res 2009; 37:2607-17. [PMID: 19270067 PMCID: PMC2677862 DOI: 10.1093/nar/gkp074] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/29/2022]
Abstract
Sequencing technologies and new bioinformatics tools have led to the complete sequencing of various genomes. However, information regarding the human transcriptome and its annotation is yet to be completed. The Human Cancer Genome Project, using ORESTES (open reading frame EST sequences) methodology, contributed to this objective by generating data from about 1.2 million expressed sequence tags. Approximately 30% of these sequences did not align to ESTs in the public databases and were considered no-match ORESTES. On the basis that a set of these ESTs could represent new transcripts, we constructed a cDNA microarray. This platform was used to hybridize against 12 different normal or tumor tissues. We identified 3421 transcribed regions not associated with annotated transcripts, representing 83.3% of the platform. The total number of differentially expressed sequences was 1007. Also, 28% of analyzed sequences could represent noncoding RNAs. Our data reinforce the view that the human genome is pervasively transcribed, and point to molecular marker candidates for different cancers. To support our data, we confirmed, by real-time PCR, the differential expression of three out of eight potential tumor markers in prostate tissues. Lists of the 1007 differentially expressed sequences and the 291 potentially noncoding tumor markers are provided.
Affiliation(s)
- Barbara P Mello
  - Hospital A. C. Camargo, Rua Prof. Antônio Prudente 211, São Paulo, SP, Brazil
13
Abstract
Genomic DNA is being sequenced and annotated at a rapid rate, with terabases of DNA currently deposited in GenBank and other repositories. Genome browsers provide an essential collection of resources to visualize and analyze chromosomal DNA. The University of California, Santa Cruz (UCSC) Genome Browser provides annotations from the level of single nucleotides to whole chromosomes for four dozen metazoan and other species. The Genome Browser may be used to address a wide range of problems in bioinformatics (e.g., sequence analysis), comparative genomics, and evolution.
Affiliation(s)
- Jonathan Pevsner
  - Department of Neurology, Kennedy Krieger Institute, Baltimore, MD, USA
14
Sun H, Skogerbø G, Wang Z, Liu W, Li Y. Structural relationships between highly conserved elements and genes in vertebrate genomes. PLoS One 2008; 3:e3727. [PMID: 19008958 PMCID: PMC2579482 DOI: 10.1371/journal.pone.0003727] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 06/17/2008] [Accepted: 10/26/2008] [Indexed: 02/03/2023]
Abstract
Large numbers of sequence elements have been identified as highly conserved among vertebrate genomes. These highly conserved elements (HCEs) are often located in or around genes that are involved in transcription regulation and early development. They have been shown to be involved in cis-regulatory activities through both in vivo and additional computational studies. We have investigated the structural relationships between such elements and genes in six vertebrate genomes (human, mouse, rat, chicken, zebrafish and tetraodon) and detected several thousand cases of conserved HCE-gene associations, as well as cases of HCEs with no common target genes. A few examples underscore the potential significance of our findings for several individual genes. We found that the conserved associations between HCEs and genes are not restricted by the absolute distance between the elements and the genes on the genome. Notably, long-range associations were identified, and the molecular functions of the associated genes do not show any particular overrepresentation of the functional categories previously reported. HCEs in close proximity are found to be linked with different sets of genes. The results reflect the highly complex correlation between HCEs and their putative target genes.
Affiliation(s)
- Hong Sun
  - Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
  - Biological Technologies, Wyeth Research, Cambridge, Massachusetts, United States of America
  - Shanghai Center for Bioinformation Technology, Shanghai, China
  - Zhongxin Biotechnology Shanghai Co. Ltd., Shanghai, China
- Geir Skogerbø
  - Bioinformatics Laboratory and National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
- Zhen Wang
  - Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
- Wei Liu
  - Biological Technologies, Wyeth Research, Cambridge, Massachusetts, United States of America
- Yixue Li
  - Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
  - Shanghai Center for Bioinformation Technology, Shanghai, China
15
|
Xie HB, Irwin DM, Zhang YP. Evolution of conserved secondary structures and their function in transcriptional regulation networks. BMC Genomics 2008; 9:520. [PMID: 18976501 PMCID: PMC2584662 DOI: 10.1186/1471-2164-9-520] [Received: 06/05/2008] [Accepted: 11/02/2008]
Abstract
Background: Many conserved secondary structures have been identified within conserved elements in the human genome, but only a small fraction of them are known to be functional RNAs. The evolutionary variations of these conserved secondary structures in human populations and their biological functions have not been fully studied.
Results: We searched for polymorphisms within conserved secondary structures and identified a number of SNPs within these elements even though they are highly conserved among species. The density of SNPs in conserved secondary structures is about 65% of that of their flanking, non-conserved sequences. Classification of sites as stems or as loops/bulges revealed that the density of SNPs in stems is about 62% of that found in loops/bulges. Analysis of derived allele frequency data indicates that sites in stems are under stronger evolutionary constraint than sites in loops/bulges. Intergenic conserved secondary structures tend to associate with transcription factor-encoding genes, with genetic distance being the measure of regulator-gene associations. A substantial fraction of intergenic conserved secondary structures overlap characterized binding sites for multiple transcription factors.
Conclusion: Strong purifying selection implies that secondary structures are probably important carriers of biological functions for conserved sequences. The overlap between intergenic conserved secondary structures and transcription factor binding sites further suggests that intergenic conserved secondary structures have essential roles in directing gene expression in transcriptional regulation networks.
Affiliation(s)
- Hai-Bing Xie
  - State Key Laboratory of Genetic Resource and Evolution, Kunming Institute of Zoology, Kunming 650223, PR China
|
16
|
Pashos EE, Kague E, Fisher S. Evaluation of cis-regulatory function in zebrafish. Brief Funct Genomic Proteomic 2008; 7:465-73. [PMID: 18820318 DOI: 10.1093/bfgp/eln045]
Abstract
As increasing numbers of vertebrate genomes are sequenced, comparative genomics offers tremendous promise to unveil mechanisms of transcriptional gene regulation on a large scale. However, the challenge of analysing immense amounts of sequence data and relating primary sequence to function is daunting. Several teleost species occupy crucial niches in the world of comparative genomics, as experimental model organisms of wide utility and living roadmaps of molecular evolution. Extant species have evolved after a teleost-specific genome duplication, and offer the opportunity to examine the evolution of thousands of duplicate gene pairs. Transgenesis in zebrafish is being increasingly employed to functionally examine non-coding sequences, from fish and mammals. Here, we discuss current approaches to the study of gene regulation in teleosts, and the promise of future research.
|
17
|
Woolfe A, Elgar G. Organization of conserved elements near key developmental regulators in vertebrate genomes. Adv Genet 2008; 61:307-38. [PMID: 18282512 DOI: 10.1016/s0065-2660(07)00012-0]
Abstract
Sequence conservation has traditionally been used as a means to target functional regions of complex genomes. In addition to its use in identifying coding regions of genes, the recent availability of whole genome data for a number of vertebrates has permitted high-resolution analyses of the noncoding "dark matter" of the genome. This has resulted in the identification of a large number of highly conserved sequence elements that appear to be preserved in all bony vertebrates. Further positional analysis of these conserved noncoding elements (CNEs) in the genome demonstrates that they cluster around genes involved in developmental regulation. This chapter describes the identification and characterization of these elements, with particular reference to their composition and organization.
Affiliation(s)
- Adam Woolfe
  - School of Biological and Chemical Sciences, Queen Mary, University of London, London E1 4NS, United Kingdom
|
18
|
Kikuta H, Fredman D, Rinkwitz S, Lenhard B, Becker TS. Retroviral enhancer detection insertions in zebrafish combined with comparative genomics reveal genomic regulatory blocks - a fundamental feature of vertebrate genomes. Genome Biol 2007; 8 Suppl 1:S4. [PMID: 18047696 PMCID: PMC2106839 DOI: 10.1186/gb-2007-8-s1-s4]
Abstract
A large-scale enhancer detection screen was performed in the zebrafish using a retroviral vector carrying a basal promoter and a fluorescent protein reporter cassette. Analysis of insertional hotspots uncovered areas around developmental regulatory genes in which an insertion results in the same global expression pattern, irrespective of exact position. These areas coincide with vertebrate chromosomal segments containing identical gene order, a phenomenon known as conserved synteny and thought to be a vestige of evolution. Genomic comparative studies have found large numbers of highly conserved noncoding elements (HCNEs) spanning these and other loci. HCNEs are thought to act as transcriptional enhancers based on the finding that many of those that have been tested direct tissue-specific expression in transient or transgenic assays. Although gene order in hox and other gene clusters has long been known to be conserved because of shared regulatory sequences or overlapping transcriptional units, the chromosomal areas found through insertional hotspots contain only one or a few developmental regulatory genes as well as phylogenetically unrelated genes. We have termed these regions genomic regulatory blocks (GRBs), and show that they underlie the phenomenon of conserved synteny through all sequenced vertebrate genomes. After teleost whole genome duplication, a subset of GRBs were retained in two copies and underwent degenerative changes compared with tetrapod loci, which exist as single copies and can therefore be viewed as representing the ancestral form. We discuss these findings in light of the evolution of vertebrate chromosomal architecture and the identification of human disease mutations.
Affiliation(s)
- Hiroshi Kikuta
  - Sars Centre for Marine Molecular Biology, University of Bergen, Thormoehlensgate, 5008 Bergen, Norway
|
19
|
Abstract
While less than 1.5% of the mammalian genome encodes proteins, it is now evident that the vast majority is transcribed, mainly into non-protein-coding RNAs. This raises the question of what fraction of the genome is functional, i.e., composed of sequences that yield functional products, are required for the expression (regulation or processing) of these products, or are required for chromosome replication and maintenance. Many of the observed noncoding transcripts are differentially expressed, and, while most have not yet been studied, increasing numbers are being shown to be functional and/or trafficked to specific subcellular locations, as well as exhibit subtle evidence of selection. On the other hand, analyses of conservation patterns indicate that only approximately 5% (3%-8%) of the human genome is under purifying selection for functions common to mammals. However, these estimates rely on the assumption that reference sequences (usually ancient transposon-derived sequences) have evolved neutrally, which may not be the case, and if so would lead to an underestimate of the fraction of the genome under evolutionary constraint. These analyses also do not detect functional sequences that are evolving rapidly and/or have acquired lineage-specific functions. Indeed, many regulatory sequences and known functional noncoding RNAs, including many microRNAs, are not conserved over significant evolutionary distances, and recent evidence from the ENCODE project suggests that many functional elements show no detectable level of sequence constraint. Thus, it is likely that much more than 5% of the genome encodes functional information, and although the upper bound is unknown, it may be considerably higher than currently thought.
Affiliation(s)
- Michael Pheasant
  - ARC Special Research Centre for Functional and Applied Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Queensland 4072, Australia
|
20
|
Abstract
It is usually thought that the development of complex organisms is controlled by protein regulatory factors and morphogenetic signals exchanged between cells and differentiating tissues during ontogeny. However, it is now evident that the majority of all animal genomes is transcribed, apparently in a developmentally regulated manner, suggesting that these genomes largely encode RNA machines and that there may be a vast hidden layer of RNA regulatory transactions in the background. I propose that the epigenetic trajectories of differentiation and development are primarily programmed by feed-forward RNA regulatory networks and that most of the information required for multicellular development is embedded in these networks, with cell–cell signalling required to provide important positional information and to correct stochastic errors in the endogenous RNA-directed program.
Affiliation(s)
- John S Mattick
  - ARC Centre for Functional and Applied Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia QLD 4072, Australia
|
21
|
Abstract
The elucidation of a growing number of species' genomes heralds an unprecedented opportunity to ascertain functional attributes of non-coding sequences. In particular, cis-regulatory modules (CRMs) controlling gene expression constitute a rich treasure trove of data to be defined and experimentally validated. Such information will provide insight into cell lineage determination and differentiation and into the genetic basis of heritable diseases, and will aid the development of novel tools for restricting the inactivation of genes to specific cell types or conditions. Historically, the study of CRMs and their individual transcription factor binding sites has been limited to proximal regions around gene loci. Two important by-products of the genomics revolution, artificial chromosome vectors and comparative genomics, have fueled efforts to define an increasing number of CRMs acting remotely to control gene expression. Such regulation from a distance has challenged our perspectives of gene expression control and perhaps the very definition of a gene. This review summarizes current approaches to characterize remote control of gene expression in transgenic mice and inherent limitations for accurately interpreting the essential nature of CRM activity.
Affiliation(s)
- Xiaochun Long
  - Cardiovascular Research Institute, University of Rochester School of Medicine, Rochester, New York 14642, USA
|