1
|
Cornejo-Páramo P, Petrova V, Zhang X, Young RS, Wong ES. Emergence of enhancers at late DNA replicating regions. Nat Commun 2024; 15:3451. [PMID: 38658544 PMCID: PMC11043393 DOI: 10.1038/s41467-024-47391-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Accepted: 03/26/2024] [Indexed: 04/26/2024] Open
Abstract
Enhancers are fast-evolving genomic sequences that control spatiotemporal gene expression patterns. By examining enhancer turnover across mammalian species and in multiple tissue types, we uncover a relationship between the emergence of enhancers and genome organization as a function of germline DNA replication time. While enhancers are most abundant in euchromatic regions, enhancers emerge almost twice as often in late compared to early germline replicating regions, independent of transposable elements. Using a deep learning sequence model, we demonstrate that new enhancers are enriched for mutations that alter transcription factor (TF) binding. Recently evolved enhancers appear to be mostly neutrally evolving and enriched in eQTLs. They also show more tissue specificity than conserved enhancers, and the TFs that bind to these elements, as inferred by binding sequences, also show increased tissue-specific gene expression. We find a similar relationship with DNA replication time in cancer, suggesting that these observations may be time-invariant principles of genome evolution. Our work underscores that genome organization has a profound impact in shaping mammalian gene regulation.
Affiliation(s)
- Paola Cornejo-Páramo
  - Victor Chang Cardiac Research Institute, Darlinghurst, NSW, Australia
  - School of Biotechnology and Biomolecular Sciences, Sydney, NSW, Australia
- Veronika Petrova
  - Victor Chang Cardiac Research Institute, Darlinghurst, NSW, Australia
  - School of Biotechnology and Biomolecular Sciences, Sydney, NSW, Australia
- Xuan Zhang
  - Victor Chang Cardiac Research Institute, Darlinghurst, NSW, Australia
- Robert S Young
  - Usher Institute, University of Edinburgh, Teviot Place, Edinburgh, EH8 9AG, United Kingdom
  - Zhejiang University - University of Edinburgh Institute, Zhejiang University, 718 East Haizhou Road, 314400, Haining, PR China
- Emily S Wong
  - Victor Chang Cardiac Research Institute, Darlinghurst, NSW, Australia
  - School of Biotechnology and Biomolecular Sciences, Sydney, NSW, Australia
|
2
|
Bernardi G. The "Genomic Code": DNA Pervasively Moulds Chromatin Structures Leaving no Room for "Junk". Life (Basel) 2021; 11:342. [PMID: 33924668 PMCID: PMC8070607 DOI: 10.3390/life11040342] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/01/2021] [Revised: 04/06/2021] [Accepted: 04/07/2021] [Indexed: 02/07/2023] Open
Abstract
The chromatin of the human genome was analyzed at three DNA size levels. At the first, compartment level, two "gene spaces" were found many years ago: A GC-rich, gene-rich "genome core" and a GC-poor, gene-poor "genome desert", the former corresponding to open chromatin centrally located in the interphase nucleus, the latter to closed chromatin located peripherally. This bimodality was later confirmed and extended by the discoveries (1) of LADs, the Lamina-Associated Domains, and InterLADs; (2) of two "spatial compartments", A and B, identified on the basis of chromatin interactions; and (3) of "forests and prairies" characterized by high and low CpG islands densities. Chromatin compartments were shown to be associated with the compositionally different, flat and single- or multi-peak DNA structures of the two, GC-poor and GC-rich, "super-families" of isochores. At the second, sub-compartment, level, chromatin corresponds to flat isochores and to isochore loops (due to compositional DNA gradients) that are susceptible to extrusion. Finally, at the short-sequence level, two sets of sequences, GC-poor and GC-rich, define two different nucleosome spacings, a short one and a long one. In conclusion, chromatin structures are moulded according to a "genomic code" by DNA sequences that pervade the genome and leave no room for "junk".
Affiliation(s)
- Giorgio Bernardi
  - Science Department, Roma Tre University, Viale Marconi 446, 00146 Rome, Italy; Tel.: +39-33-540-5892
  - Stazione Zoologica Anton Dohrn, Villa Comunale, 80121 Naples, Italy
|
3
|
Lamolle G, Protasio AV, Iriarte A, Jara E, Simón D, Musto H. An Isochore-Like Structure in the Genome of the Flatworm Schistosoma mansoni. Genome Biol Evol 2016; 8:2312-8. [PMID: 27435793 PMCID: PMC5010904 DOI: 10.1093/gbe/evw170] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 11/14/2022] Open
Abstract
Eukaryotic genomes are compositionally heterogeneous, that is, composed of regions that differ in guanine-cytosine (GC) content (isochores). The best-documented case is that of vertebrates (mainly mammals), although heterogeneity has also been noted among unicellular eukaryotes and invertebrates. In the human genome, regarded as a typical mammalian genome, this heterogeneity is associated with several features. Specifically, genes located in the GC-richest regions are the GC3-richest, display CpG islands and have shorter introns. Furthermore, these genes are more heavily expressed and tend to be located at the extremes of the chromosomes. Although compositional heterogeneity seems to be widespread among eukaryotes, the associated properties noted in the human genome and other mammals have not been investigated in depth in other taxa. Here we provide evidence that the genome of the parasitic flatworm Schistosoma mansoni is compositionally heterogeneous and exhibits an isochore-like structure, displaying some features associated, until now, only with the human and other vertebrate genomes, with the exception of gene concentration.
Affiliation(s)
- Guillermo Lamolle
  - Laboratorio de Organización y Evolución del Genoma, Facultad de Ciencias, Udelar, Montevideo, Uruguay
- Anna V Protasio
  - Wellcome Trust Genome Campus, Wellcome Trust Sanger Institute, Cambridge, United Kingdom
- Andrés Iriarte
  - Laboratorio de Organización y Evolución del Genoma, Facultad de Ciencias, Udelar, Montevideo, Uruguay
  - Dpto. de Desarrollo Biotecnológico, Facultad de Medicina, Instituto de Higiene, Udelar, Montevideo, Uruguay
- Eugenio Jara
  - Laboratorio de Organización y Evolución del Genoma, Facultad de Ciencias, Udelar, Montevideo, Uruguay
- Diego Simón
  - Laboratorio de Organización y Evolución del Genoma, Facultad de Ciencias, Udelar, Montevideo, Uruguay
- Héctor Musto
  - Laboratorio de Organización y Evolución del Genoma, Facultad de Ciencias, Udelar, Montevideo, Uruguay
|
4
|
Price N, Graur D. Are Synonymous Sites in Primates and Rodents Functionally Constrained? J Mol Evol 2015; 82:51-64. [PMID: 26563252 DOI: 10.1007/s00239-015-9719-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 01/07/2015] [Accepted: 11/04/2015] [Indexed: 11/28/2022]
Abstract
It has been claimed that synonymous sites in mammals are under selective constraint. Furthermore, in many studies the selective constraint at such sites in primates was claimed to be more stringent than that in rodents. Given the larger effective population sizes in rodents than in primates, the theoretical expectation is that selection in rodents would be more effective than that in primates. To resolve this contradiction between expectations and observations, we used processed pseudogenes as a model for strict neutral evolution, and estimated selective constraint on synonymous sites using the rate of substitution at pseudosynonymous and pseudononsynonymous sites in pseudogenes as the neutral expectation. After controlling for the effects of GC content, our results were similar to those from previous studies, i.e., synonymous sites in primates exhibited evidence for higher selective constraint than those in rodents. Specifically, our results indicated that in primates up to 24% of synonymous sites could be under purifying selection, while in rodents synonymous sites evolved neutrally. To further control for shifts in GC content, we estimated selective constraint at fourfold degenerate sites using a maximum parsimony approach. This allowed us to estimate selective constraint using mutational patterns that cause a shift in GC content (GT ↔ TG, CT ↔ TC, GA ↔ AG, and CA ↔ AC) and ones that do not (AT ↔ TA and CG ↔ GC). Using this approach, we found that synonymous sites evolve neutrally in both primates and rodents. Apparent deviations from neutrality were caused by a higher rate of C → A and C → T mutations in pseudogenes. Such differences are most likely caused by the shift in GC content experienced by pseudogenes. We conclude that previous estimates according to which 20-40% of synonymous sites in primates were under selective constraint were most likely artifacts of the biased pattern of mutation.
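The split between mutational patterns that shift GC content and those that do not can be illustrated with a short sketch. It assumes the pairs listed in the abstract are read as single-nucleotide substitutions (for example, G to T and its reverse); the function names `classify_substitution` and `tally` are hypothetical and are not taken from the paper.

```python
GC_BASES = {"G", "C"}
AT_BASES = {"A", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Label a single-base substitution by its effect on GC content."""
    ref, alt = ref.upper(), alt.upper()
    valid = GC_BASES | AT_BASES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError(f"not a substitution between A/C/G/T: {ref}>{alt}")
    # GC content shifts only when the change crosses between the
    # G/C class and the A/T class (e.g. G>T, C>T, G>A, C>A and reverses).
    if (ref in GC_BASES) != (alt in GC_BASES):
        return "GC-changing"
    # A<->T and C<->G swaps leave GC content unchanged.
    return "GC-conserving"


def tally(substitutions):
    """Count GC-changing and GC-conserving events in (ref, alt) pairs."""
    counts = {"GC-changing": 0, "GC-conserving": 0}
    for ref, alt in substitutions:
        counts[classify_substitution(ref, alt)] += 1
    return counts
```

Rates estimated separately for the two classes could then be compared against the pseudogene-based neutral expectation; that comparison is outside the scope of this sketch.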
Affiliation(s)
- Nicholas Price
  - Department of Bioagricultural Sciences and Pest Management, Colorado State University, Fort Collins, CO, 80523, USA
- Dan Graur
  - Department of Biology and Biochemistry, University of Houston, Houston, TX, 77204-5001, USA
|
5
|
Fernandez-Vidal A, Guitton-Sert L, Cadoret JC, Drac M, Schwob E, Baldacci G, Cazaux C, Hoffmann JS. A role for DNA polymerase θ in the timing of DNA replication. Nat Commun 2014; 5:4285. [PMID: 24989122 DOI: 10.1038/ncomms5285] [Citation(s) in RCA: 57] [Impact Index Per Article: 5.7] [Received: 01/08/2014] [Accepted: 06/03/2014] [Indexed: 01/01/2023] Open
Abstract
Although DNA polymerase θ (Pol θ) is known to carry out translesion synthesis and has been implicated in DNA repair, its physiological function under normal growth conditions remains unclear. Here we present evidence that Pol θ plays a role in determining the timing of replication in human cells. We find that Pol θ binds to chromatin during early G1, interacts with the Orc2 and Orc4 components of the Origin recognition complex and that the association of Mcm proteins with chromatin is enhanced in G1 when Pol θ is downregulated. Pol θ-depleted cells exhibit a normal density of activated origins in S phase, but early-to-late and late-to-early shifts are observed at a number of replication domains. Pol θ overexpression, on the other hand, causes delayed replication. Our results therefore suggest that Pol θ functions during the earliest steps of DNA replication and influences the timing of replication initiation.
Affiliation(s)
- Anne Fernandez-Vidal
  - [1] Equipe Labellisée Ligue contre le Cancer 2013 INSERM Unit 1037; CNRS ERL 5294; CRCT (Cancer Research Center of Toulouse), BP3028, CHU Purpan, Toulouse 31024, France [2] Université Paul Sabatier, University of Toulouse III, Toulouse F-31062, France [3]
- Laure Guitton-Sert
  - [1] Equipe Labellisée Ligue contre le Cancer 2013 INSERM Unit 1037; CNRS ERL 5294; CRCT (Cancer Research Center of Toulouse), BP3028, CHU Purpan, Toulouse 31024, France [2] Université Paul Sabatier, University of Toulouse III, Toulouse F-31062, France [3]
- Jean-Charles Cadoret
  - [1] Institut Jacques Monod, UMR7592, CNRS and University Paris-Diderot, 15 Rue Hélène Brion, Paris, Cedex 13 75205, France [2]
- Marjorie Drac
  - Institut of Molecular Genetics, CNRS UMR5535 and University of Montpellier, Montpellier 34293, France
- Etienne Schwob
  - Institut of Molecular Genetics, CNRS UMR5535 and University of Montpellier, Montpellier 34293, France
- Giuseppe Baldacci
  - Institut Jacques Monod, UMR7592, CNRS and University Paris-Diderot, 15 Rue Hélène Brion, Paris, Cedex 13 75205, France
- Christophe Cazaux
  - [1] Equipe Labellisée Ligue contre le Cancer 2013 INSERM Unit 1037; CNRS ERL 5294; CRCT (Cancer Research Center of Toulouse), BP3028, CHU Purpan, Toulouse 31024, France [2] Université Paul Sabatier, University of Toulouse III, Toulouse F-31062, France
- Jean-Sébastien Hoffmann
  - [1] Equipe Labellisée Ligue contre le Cancer 2013 INSERM Unit 1037; CNRS ERL 5294; CRCT (Cancer Research Center of Toulouse), BP3028, CHU Purpan, Toulouse 31024, France [2] Université Paul Sabatier, University of Toulouse III, Toulouse F-31062, France
|
6
|
Implications of human genome structural heterogeneity: functionally related genes tend to reside in organizationally similar genomic regions. BMC Genomics 2014; 15:252. [PMID: 24684786 PMCID: PMC4234528 DOI: 10.1186/1471-2164-15-252] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/15/2012] [Accepted: 03/21/2014] [Indexed: 01/30/2023] Open
Abstract
Background: In an earlier study, we hypothesized that genomic segments with different sequence organization patterns (OPs) might display functional specificity despite their similar GC content. Here we tested this hypothesis by dividing the human genome into 100 kb segments, classifying these segments into five compositional groups according to GC content, and then characterizing each segment within the five groups by oligonucleotide counting (k-mer analysis; also referred to as compositional spectrum analysis, or CSA), to examine the distribution of sequence OPs in the segments. We performed the CSA on the entire DNA, i.e., its coding and non-coding parts, the latter being much more abundant in the genome than the former. Results: We identified 38 OP-type clusters of segments that differ in their compositional spectrum (CS) organization. Many of the segments that shared the same OP type were enriched with genes related to the same biological processes (developmental, signaling, etc.), components of biochemical complexes, or organelles. Thirteen OP-type clusters showed significant enrichment in genes connected to specific gene-ontology terms. Some of these clusters seemed to reflect certain events during periods of horizontal gene transfer and genome expansion, and subsequent evolution of genomic regions requiring coordinated regulation. Conclusions: There may be a tendency for genes that are involved in the same biological process, complex or organelle to use the same OP, even at a distance of ~100 kb from the genes. Although the intergenic DNA is non-coding, the general pattern of sequence organization (e.g., reflected in over-represented oligonucleotide "words") may be important and was protected, to some extent, in the course of evolution.
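The first two steps described in this abstract, cutting a sequence into fixed-size segments and assigning each to one of five GC groups, might look like the following minimal sketch. The cut-offs in `ILLUSTRATIVE_BOUNDS` are an assumption made for illustration, since the abstract does not state which boundaries were used, and the function names are hypothetical.

```python
import bisect

# Illustrative GC cut-offs separating five groups (an assumption; the
# abstract does not say which boundaries were used).
ILLUSTRATIVE_BOUNDS = (0.37, 0.41, 0.46, 0.53)


def gc_fraction(seq):
    """GC fraction over unambiguous bases, or None if the window has none."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at) if gc + at else None


def gc_group(gc, bounds=ILLUSTRATIVE_BOUNDS):
    """Index 0..4 of the compositional group a GC fraction falls into."""
    return bisect.bisect_right(bounds, gc)


def segment_groups(seq, size=100_000):
    """Yield (start, gc, group) for each full, non-overlapping window."""
    for start in range(0, len(seq) - size + 1, size):
        gc = gc_fraction(seq[start:start + size])
        if gc is not None:  # skip windows made only of N or gap characters
            yield start, gc, gc_group(gc)
```

A trailing partial window is dropped here for simplicity; whether the study kept such remainders is not stated in the abstract.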
|
7
|
Abstract
Patterns of replication within eukaryotic genomes correlate with gene expression, chromatin structure, and genome evolution. Recent advances in genome-scale mapping of replication kinetics have allowed these correlations to be explored in many species, cell types, and growth conditions, and these large data sets have allowed quantitative and computational analyses. One striking new correlation to emerge from these analyses is between replication timing and the three-dimensional structure of chromosomes. This correlation, which is significantly stronger than with any single histone modification or chromosome-binding protein, suggests that replication timing is controlled at the level of chromosomal domains. This conclusion dovetails with parallel work on the heterogeneity of origin firing and the competition between origins for limiting activators to suggest a model in which the stochastic probability of individual origin firing is modulated by chromosomal domain structure to produce patterns of replication. Whether these patterns have inherent biological functions or simply reflect higher-order genome structure is an open question.
Affiliation(s)
- Nicholas Rhind
  - Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, Massachusetts 01605, USA
|
8
|
Mugal CF, Arndt PF, Ellegren H. Twisted signatures of GC-biased gene conversion embedded in an evolutionary stable karyotype. Mol Biol Evol 2013; 30:1700-12. [PMID: 23564940 PMCID: PMC3684855 DOI: 10.1093/molbev/mst067] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Indexed: 12/14/2022] Open
Abstract
The genomes of many vertebrates show a characteristic heterogeneous distribution of GC content, the so-called GC isochore structure. The origin of isochores has been explained via the mechanism of GC-biased gene conversion (gBGC). However, although the isochore structure is declining in many mammalian genomes, the heterogeneity in GC content is being reinforced in the avian genome. Despite this discrepancy, which remains unexplained, examinations of individual substitution frequencies in mammals and birds are both consistent with the gBGC model of isochore evolution. On the other hand, a negative correlation between substitution and recombination rate found in the chicken genome is inconsistent with the gBGC model. It should therefore be important to consider along with gBGC other consequences of recombination on the origin and fate of mutations, as well as to account for relationships between recombination rate and other genomic features. We therefore developed an analytical model to describe the substitution patterns found in the chicken genome, and further investigated the relationships between substitution patterns and several genomic features in a rigorous statistical framework. Our analysis indicates that GC content itself, either directly or indirectly via interrelations to other genomic features, has an impact on the substitution pattern. Further, we suggest that this phenomenon is particularly visible in avian genomes due to their unusually low rate of chromosomal evolution. Because of this, interrelations between GC content and other genomic features are being reinforced, and are as such more pronounced in avian genomes as compared with other vertebrate genomes with a less stable karyotype.
Affiliation(s)
- Carina F Mugal
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
|
9
|
Frenkel S, Kirzhner V, Korol A. Organizational heterogeneity of vertebrate genomes. PLoS One 2012; 7:e32076. [PMID: 22384143 PMCID: PMC3288070 DOI: 10.1371/journal.pone.0032076] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 08/15/2011] [Accepted: 01/23/2012] [Indexed: 01/06/2023] Open
Abstract
Genomes of higher eukaryotes are mosaics of segments with various structural, functional, and evolutionary properties. The availability of whole-genome sequences allows the investigation of their structure as "texts" using different statistical and computational methods. One such method, referred to as Compositional Spectra (CS) analysis, is based on scoring the occurrences of fixed-length oligonucleotides (k-mers) in the target DNA sequence. CS analysis allows generating species- or region-specific characteristics of the genome, regardless of their length and the presence of coding DNA. In this study, we consider the heterogeneity of vertebrate genomes as a joint effect of regional variation in sequence organization superimposed on the differences in nucleotide composition. We estimated compositional and organizational heterogeneity of genome and chromosome sequences separately and found that both heterogeneity types vary widely among genomes as well as among chromosomes in all investigated taxonomic groups. The high correspondence of heterogeneity scores obtained on three genome fractions, coding, repetitive, and the remaining part of the noncoding DNA (the genome dark matter--GDM) allows the assumption that CS-heterogeneity may have functional relevance to genome regulation. Of special interest for such interpretation is the fact that natural GDM sequences display the highest deviation from the corresponding reshuffled sequences.
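As a rough illustration of scoring fixed-length oligonucleotides and comparing a natural sequence with its reshuffled counterpart, the sketch below counts exact k-mers and measures an L1 distance between two frequency spectra. Both choices are assumptions: published Compositional Spectra implementations may use a different word set, mismatch tolerance and distance measure, and the function names here are hypothetical.

```python
import random
from collections import Counter


def kmer_spectrum(seq, k):
    """Relative frequencies of exact k-mers made only of A, C, G, T."""
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")
    )
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def spectrum_distance(p, q):
    """L1 distance between two spectra (the metric is an assumption)."""
    return sum(abs(p.get(w, 0.0) - q.get(w, 0.0)) for w in set(p) | set(q))


def deviation_from_shuffled(seq, k, seed=0):
    """Distance between a sequence's spectrum and that of a shuffled copy."""
    bases = list(seq.upper())
    random.Random(seed).shuffle(bases)  # preserves base composition only
    return spectrum_distance(kmer_spectrum(seq, k), kmer_spectrum("".join(bases), k))
```

Because shuffling preserves base composition, the deviation at k = 1 is zero by construction; any signal at larger k reflects sequence organization rather than composition, which is the distinction the abstract draws.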
Affiliation(s)
- Abraham Korol
  - Department of Evolutionary and Environmental Biology and Institute of Evolution, University of Haifa, Mount Carmel, Haifa, Israel
|
10
|
Weber CC, Pink CJ, Hurst LD. Late-replicating domains have higher divergence and diversity in Drosophila melanogaster. Mol Biol Evol 2011; 29:873-82. [PMID: 22046001 DOI: 10.1093/molbev/msr265] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Indexed: 12/26/2022] Open
Abstract
Several reports from mammals indicate that an increase in the mutation rate in late-replicating regions may, in part, be responsible for the observed genomic heterogeneity in neutral substitution rates and levels of diversity, although the mechanisms for this remain poorly understood. Recent evidence also suggests that late replication is associated with high mutability in yeast. This then raises the question as to whether a similar effect is operating across all eukaryotes. Limited evidence from one chromosome arm in Drosophila melanogaster suggests the opposite pattern, with regions overlapping early-firing origins showing increased levels of diversity and divergence. Given the availability of genome-wide replication timing profiles for D. melanogaster, we now return to this issue. Consistent with what is seen in other taxa, we find that divergence at synonymous sites in exon cores, as well as divergence at putatively unconstrained intronic sites, is elevated in late-replicating regions. Analysis of genes with low codon usage bias suggests a ∼30% difference in mutation rate between the earliest and the latest replicating sequence. Intronic sequence suggests a more modest difference. We additionally show that an increase in diversity in late-replicating sequences is not owing to replication timing covarying with the local recombination rate. If anything, the effects of recombination mask the impact of replication timing. We conclude that, contrary to prior reports and consistent with what is seen in mammals and yeast, there is indeed a relationship between rates of nucleotide divergence and diversity and replication timing that is consistent with an increase in the mutation rate during late S-phase in D. melanogaster. It is therefore plausible that such an effect might be common among eukaryotes. The result may have implications for the inference of positive selection.
Affiliation(s)
- Claudia C Weber
  - Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
|
11
|
Late replicating domains are highly recombining in females but have low male recombination rates: implications for isochore evolution. PLoS One 2011; 6:e24480. [PMID: 21949720 PMCID: PMC3176772 DOI: 10.1371/journal.pone.0024480] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 06/06/2011] [Accepted: 08/11/2011] [Indexed: 01/01/2023] Open
Abstract
In mammals, sequences that are either late replicating or highly recombining have high rates of evolution at putatively neutral sites. As early replicating domains and highly recombining domains both tend to be GC rich, we a priori expect these two variables to covary. If so, the relative contribution of either of these variables to the local neutral substitution rate might have been wrongly estimated owing to covariance with the other. Against our expectations, we find that sex-averaged recombination rates show little or no correlation with replication timing, suggesting that they are independent determinants of substitution rates. However, this result masks significant sex-specific complexity: late replicating domains tend to have high recombination rates in females but low recombination rates in males. That these trends are antagonistic explains why sex-averaged recombination is not correlated with replication timing. This unexpected result has several important implications. First, although both male and female recombination rates covary significantly with intronic substitution rates, the magnitude of this correlation is moderately underestimated for male recombination and slightly overestimated for female recombination, owing to covariance with replication timing. Second, the result could explain why male recombination is strongly correlated with GC content but female recombination is not. If, to explain the correlation between GC content and replication timing, we suppose that late replication forces reduced GC content, then GC promotion by biased gene conversion during female recombination is partly countered by the antagonistic effect of later replicating sequence tending to increase AT content. Indeed, the strength of the correlation between female recombination rate and local GC content is more than doubled by control for replication timing. Our results underpin the need to consider sex-specific recombination rates and potential covariates in analysis of GC content and rates of evolution.
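One common way to "control for" a covariate such as replication timing is a first-order partial correlation. The sketch below shows that calculation under the assumption that a Pearson-based partial correlation is an acceptable stand-in; the abstract does not specify which statistic the authors used, and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))
```

When the covariate pushes the two variables in opposite directions, the raw correlation can be much weaker than the partial one, which is the kind of masking the abstract reports for female recombination rate and GC content.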
|
12
|
Cserzo M, Turu G, Varnai P, Hunyady L. Relating underrepresented genomic DNA patterns and tiRNAs: the rule behind the observation and beyond. Biol Direct 2010; 5:56. [PMID: 20860791 PMCID: PMC3583238 DOI: 10.1186/1745-6150-5-56] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 09/14/2010] [Accepted: 09/22/2010] [Indexed: 11/10/2022] Open
Abstract
Background: One of the central problems of post-genomic biology is understanding the regulatory network of genes. Traditionally the problem is approached from the protein-DNA interaction perspective. In recent years various types of noncoding RNAs have appeared on the scene as potent new players. The exact role of these molecules in gene expression control is mostly unknown at present, while their importance is generally recognized. Results: The human and mouse genomes have been screened with a statistical model for sequence patterns underrepresented in these genomes, and a subset of motifs, named spanions, has been identified. The common portion of the motif lists of the two species is 75%, indicating evolutionary conservation of this feature. These motifs are arranged in clusters in close proximity to distinct genetic landmarks: 5' ends of genes, the exon side of exon/intron junctions and 5' ends of 3' UTRs. The length of the clusters is typically in the range of 20 to 25 bases. The findings are in agreement with the known C/G bias of promoter regions while accessing much more sequence information than the simple composition-based model. In the human genome the recently reported transcription initiation RNAs (tiRNAs) are typically transcribed from these spanion clusters according to the presented results. The spanion clusters account for 70% of the published tiRNAs. Apparently, the model accesses the common statistical feature of this new and mostly uncharacterized non-coding RNA class and, in this way, supports the experimental observations with theoretical background. Conclusions: The presented results seem to support the emerging model of RNA-driven eukaryotic gene expression control. Beyond that, the model detects spanion clusters at genetic positions where no tiRNA counterpart was considered and reported. The GO-term analysis of genes with a high concentration of spanion clusters in their promoter-proximal region indicates involvement in gene regulatory processes. The results of the analysis suggest that the gene regulatory potential of the small non-coding RNAs is grossly underestimated at present. Reviewers: This article was reviewed by Frank Eisenhaber, Sandor Pongor and Rotem Sorek (nominated by Doron Lancet).
Affiliation(s)
- Miklos Cserzo
  - Department of Physiology, Semmelweis University, Budapest, Tuzolto Street, 37-47, 1094, Hungary, EU
|
13
|
Contrasting GC-content dynamics across 33 mammalian genomes: relationship with life-history traits and chromosome sizes. Genome Res 2010; 20:1001-9. [PMID: 20530252 DOI: 10.1101/gr.104372.109] [Citation(s) in RCA: 148] [Impact Index Per Article: 10.6] [Indexed: 12/20/2022]
Abstract
The origin, evolution, and functional relevance of genomic variations in GC content are a long-debated topic, especially in mammals. Most of the existing literature, however, has focused on a small number of model species and/or limited sequence data sets. We analyzed more than 1000 orthologous genes in 33 fully sequenced mammalian genomes, reconstructed their ancestral isochore organization in the maximum likelihood framework, and explored the evolution of third-codon position GC content in representatives of 16 orders and 27 families. We showed that the previously reported erosion of GC-rich isochores is not a general trend. Several species (e.g., shrew, microbat, tenrec, rabbit) have independently undergone a marked increase in GC content, with a widening gap between the GC-poorest and GC-richest classes of genes. The intensively studied apes and (especially) murids do not reflect the general placental pattern. We correlated GC-content evolution with species life-history traits and cytology. Significant effects of body mass and genome size were detected, with each being consistent with the GC-biased gene conversion model.
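Third-codon-position GC content (GC3), the quantity tracked in this study, can be computed from a coding sequence as in the minimal sketch below. It assumes the input is an in-frame coding sequence starting at codon position 1; the function name `gc3` is hypothetical, and the treatment of ambiguous bases and trailing partial codons is a simplification rather than the authors' procedure.

```python
def gc3(cds):
    """Fraction of G or C among unambiguous third codon positions."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3       # ignore a trailing partial codon
    third = [b for b in cds[2:usable:3] if b in "ACGT"]
    if not third:
        raise ValueError("no unambiguous third codon positions found")
    return sum(b in "GC" for b in third) / len(third)
```

For example, the codons ATG, GCC and AAA have third positions G, C and A, so two of the three third positions are G or C.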
|
14
|
Abstract
Studies of replication timing provide a handle into previously impenetrable higher-order levels of chromosome organization and their plasticity during development. Although mechanisms regulating replication timing are not clear, novel genome-wide studies provide a thorough survey of the extent to which replication timing is regulated during most of the early cell fate transitions in mammals, revealing coordinated changes of a defined set of 400-800 kb chromosomal segments that involve at least half the genome. Furthermore, changes in replication time are linked to changes in sub-nuclear organization and domain-wide transcriptional potential, and tissue-specific replication timing profiles are conserved from mouse to human, suggesting that the program has developmental significance. Hence, these studies have provided a solid foundation for linking megabase level chromosome structure to function, and suggest a central role for replication in domain-level genome organization.
|
15
|
Abstract
The discovery of the DNA double helix structure half a century ago immediately suggested a mechanism for its duplication by semi-conservative copying of the nucleotide sequence into two DNA daughter strands. Shortly after, a second fundamental step toward the elucidation of the mechanism of DNA replication was taken with the isolation of the first enzyme able to polymerize DNA from a template. In the subsequent years, the basic mechanism of DNA replication and its enzymatic machinery components were elucidated, mostly through genetic approaches and in vitro biochemistry. Most recently, the spatial and temporal organization of the DNA replication process in vivo, within the context of chromatin and inside the intact cell, is finally beginning to be elucidated. On the one hand, recent advances in genome-wide high-throughput techniques are providing a new wave of information on the progression of genome replication at high spatial resolution. On the other hand, novel super-resolution microscopy techniques are just starting to give us the first glimpses of how DNA replication is organized within the context of single intact cells with high spatial resolution. The integration of these data with time-lapse microscopy analysis will give us the ability to film and dissect the replication of the genome in situ and in real time.
Affiliation(s)
- Vadim O Chagin
- Department of Biology, Technische Universität Darmstadt, Germany
|
16
|
Hiratani I, Gilbert DM. Autosomal Lyonization of Replication Domains During Early Mammalian Development. Adv Exp Med Biol 2010; 695:41-58. [DOI: 10.1007/978-1-4419-7037-4_4] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 01/23/2023]
|
17
|
Abstract
Although early replication has long been associated with accessible chromatin, replication timing is not included in most discussions of epigenetic marks. This is partly due to a lack of understanding of the mechanisms behind this association but the issue has also been confounded by studies concluding that there are very few changes in replication timing during development. Recently, the first genome-wide study of replication timing during the course of differentiation revealed extensive changes that were strongly associated with changes in transcriptional activity and subnuclear organization. Domains of temporally coordinate replication delineate discrete units of chromosome structure and function that are characteristic of particular differentiation states. Hence, although we are still a long way from understanding the functional significance of replication timing, it is clear that replication timing is a distinct epigenetic signature of cell differentiation state.
Affiliation(s)
- Ichiro Hiratani
- Department of Biological Science, Florida State University, Tallahassee, FL 32306, USA
|
18
|
Hiratani I, Ryba T, Itoh M, Yokochi T, Schwaiger M, Chang CW, Lyou Y, Townes TM, Schübeler D, Gilbert DM. Global reorganization of replication domains during embryonic stem cell differentiation. PLoS Biol 2008; 6:e245. [PMID: 18842067 PMCID: PMC2561079 DOI: 10.1371/journal.pbio.0060245] [Citation(s) in RCA: 415] [Impact Index Per Article: 25.9] [Received: 05/19/2008] [Accepted: 08/27/2008] [Indexed: 01/20/2023] Open
Abstract
DNA replication in mammals is regulated via the coordinate firing of clusters of replicons that duplicate megabase-sized chromosome segments at specific times during S-phase. Cytogenetic studies show that these “replicon clusters” coalesce as subchromosomal units that persist through multiple cell generations, but the molecular boundaries of such units have remained elusive. Moreover, the extent to which changes in replication timing occur during differentiation and their relationship to transcription changes has not been rigorously investigated. We have constructed high-resolution replication-timing profiles in mouse embryonic stem cells (mESCs) before and after differentiation to neural precursor cells. We demonstrate that chromosomes can be segmented into multimegabase domains of coordinate replication, which we call “replication domains,” separated by transition regions whose replication kinetics are consistent with large originless segments. The molecular boundaries of replication domains are remarkably well conserved between distantly related ESC lines and induced pluripotent stem cells. Unexpectedly, ESC differentiation was accompanied by the consolidation of smaller differentially replicating domains into larger coordinately replicated units whose replication time was more aligned to isochore GC content and the density of LINE-1 transposable elements, but not gene density. Replication-timing changes were coordinated with transcription changes for weak promoters more than strong promoters, and were accompanied by rearrangements in subnuclear position. We conclude that replication profiles are cell-type specific, and changes in these profiles reveal chromosome segments that undergo large changes in organization during differentiation. Moreover, smaller replication domains and a higher density of timing transition regions that interrupt isochore replication timing define a novel characteristic of the pluripotent state. 
Microscopy studies have suggested that chromosomal DNA is composed of multiple, megabase-sized segments, each replicated at different times during S-phase of the cell cycle. However, a molecular definition of these coordinately replicated sequences and the stability of the boundaries between them has not been established. We constructed genome-wide replication-timing maps in mouse embryonic stem cells, identifying multimegabase coordinately replicated chromosome segments—“replication domains”—separated by remarkably distinct temporal boundaries. These domain boundaries were shared between several unrelated embryonic stem cell lines, including somatic cells reprogrammed to pluripotency (so-called induced pluripotent stem cells). However, upon differentiation to neural precursor cells, domains encompassing approximately 20% of the genome changed their replication timing, temporally consolidating into fewer, larger replication domains that were conserved between different neural precursor cell lines. Domains that changed replication timing showed a unique sequence composition, a strongly biased directionality for changes in resident gene expression, and altered radial positioning within the three-dimensional space in the cell nucleus, suggesting that changes in replication timing are related to the reorganization of higher-order chromosome structure and function during differentiation. Moreover, the property of smaller discordantly replicating domains may define a novel characteristic of pluripotency. Analyzing the temporal order of DNA replication across the genome during embryonic stem cell differentiation reveals stable boundaries between coordinately replicated regions that consolidate into fewer, larger domains during differentiation.
Affiliation(s)
- Ichiro Hiratani
- Department of Biological Science, Florida State University, Tallahassee, Florida, United States of America
- Tyrone Ryba
- Department of Biological Science, Florida State University, Tallahassee, Florida, United States of America
- Mari Itoh
- Department of Biological Science, Florida State University, Tallahassee, Florida, United States of America
- Tomoki Yokochi
- Department of Biological Science, Florida State University, Tallahassee, Florida, United States of America
- Michaela Schwaiger
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
- Chia-Wei Chang
- Department of Biochemistry and Molecular Genetics, University of Alabama at Birmingham, Schools of Medicine and Dentistry, Birmingham, Alabama, United States of America
- Yung Lyou
- Department of Biochemistry and Molecular Biology, State University of New York, Upstate Medical University, Syracuse, New York, United States of America
- Tim M Townes
- Department of Biochemistry and Molecular Genetics, University of Alabama at Birmingham, Schools of Medicine and Dentistry, Birmingham, Alabama, United States of America
- Dirk Schübeler
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
- David M Gilbert
- Department of Biological Science, Florida State University, Tallahassee, Florida, United States of America
- * To whom correspondence should be addressed. E-mail:
|
19
|
Schmidt T, Frishman D. Assignment of isochores for all completely sequenced vertebrate genomes using a consensus. Genome Biol 2008; 9:R104. [PMID: 18590563 PMCID: PMC2481423 DOI: 10.1186/gb-2008-9-6-r104] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 03/13/2008] [Revised: 05/22/2008] [Accepted: 06/30/2008] [Indexed: 11/16/2022] Open
Abstract
A new consensus isochore assignment method and a database of isochore maps for all completely sequenced vertebrate genomes are presented. We show that although the currently available isochore mapping methods agree on the isochore classification of about two-thirds of the human DNA, they produce significantly different results with regard to the location of isochore boundaries and isochore length distribution. We present a new consensus isochore assignment method based on majority voting and provide IsoBase, a comprehensive on-line database of isochore maps for all completely sequenced vertebrate genomes.
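The majority-voting idea named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes every mapping method has already labelled the same fixed genomic windows, and the function name `consensus_labels`, the window layout, and the example calls are hypothetical. A window is left unassigned (`None`) when no label wins a strict majority.

```python
from collections import Counter

def consensus_labels(method_calls):
    """Majority vote per window across methods.

    method_calls: list of per-method label lists, all the same length
    (one label per genomic window). Returns one label per window, or
    None where no label is chosen by more than half of the methods.
    """
    n_methods = len(method_calls)
    consensus = []
    for calls in zip(*method_calls):
        label, votes = Counter(calls).most_common(1)[0]
        # Strict majority: more than half of the methods must agree.
        consensus.append(label if votes * 2 > n_methods else None)
    return consensus

# Hypothetical calls from three methods over four windows
# (L1, L2, H1, H2, H3 are the usual isochore family names).
calls = [
    ["L1", "L2", "H1", "H2"],
    ["L1", "L2", "H1", "H3"],
    ["L1", "H1", "H1", "H1"],
]
result = consensus_labels(calls)
```

Leaving contested windows unassigned rather than breaking ties arbitrarily keeps the disagreement between methods visible, which matters given that the abstract reports agreement on only about two-thirds of the human DNA.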
Affiliation(s)
- Thorsten Schmidt
- Department of Genome-Oriented Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, D-85350 Freising, Germany
|
20
|
Chojnowski JL, Braun EL. Turtle isochore structure is intermediate between amphibians and other amniotes. Integr Comp Biol 2008; 48:454-62. [PMID: 21669806 DOI: 10.1093/icb/icn062] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 11/14/2022] Open
Abstract
Vertebrate genomes are composed of isochores, which are relatively long (>100 kb) regions with a relatively homogeneous (either GC-rich or AT-rich) base composition and rather sharp boundaries with neighboring isochores. Mammals and living archosaurs (birds and crocodilians) have heterogeneous genomes that include very GC-rich isochores. In sharp contrast, the genomes of amphibians and fishes are more homogeneous and they have a lower overall GC content. Because DNA with higher GC content is more thermostable, the elevated GC content of mammalian and archosaurian DNA has been hypothesized to be an adaptation to higher body temperatures. This hypothesis can be tested by examining the structure of isochores across the reptilian clade, which includes the archosaurs, testudines (turtles), and lepidosaurs (lizards and snakes), because reptiles exhibit diverse body sizes, metabolic rates, and patterns of thermoregulation. This study focuses on a comparative analysis of a new set of expressed genes of the red-eared slider turtle and orthologs of the turtle genes in mammalian (human, mouse, dog, and opossum), archosaurian (chicken and alligator), and amphibian (western clawed frog) genomes. EST (expressed sequence tag) data from a turtle cDNA library enriched for genes that have specialized functions (developmental genes) revealed that using the GC content of the third codon position to examine isochore structure requires careful consideration of the types of genes examined. The more highly expressed genes (e.g., housekeeping genes) are more likely to be GC-rich than are genes with specialized functions. However, the set of highly expressed turtle genes demonstrated that the turtle genome has a GC content that is intermediate between the GC-poor amphibians and the GC-rich mammals and archosaurs.
There was a strong correlation between the GC content of all turtle genes and the GC content of other vertebrate genes, with the slope of the line describing this relationship also indicating that the isochore structure of turtles is intermediate between that of amphibians and other amniotes. These data are consistent with some thermal hypotheses of isochore evolution, but we believe that the credible set of models for isochore evolution still includes a variety of models. These data expand the amount of genomic data available from reptiles upon which future studies of reptilian genomics can build.
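The third-codon-position GC content (GC3) used in this abstract as a proxy for isochore composition can be computed as sketched below. This is an illustrative helper, not code from the study: the function name `gc3` is hypothetical, and it assumes the input is an in-frame coding sequence starting at the first codon position, with any trailing partial codon ignored.

```python
def gc3(cds):
    """Fraction of G or C bases at third codon positions.

    Assumes `cds` is an in-frame coding sequence beginning at codon
    position 1; a trailing partial codon, if any, is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3   # drop incomplete final codon
    third = cds[2:usable:3]            # bases at positions 3, 6, 9, ...
    if not third:
        raise ValueError("sequence shorter than one codon")
    return sum(base in "GC" for base in third) / len(third)

# Codons ATG, GCC, AAA have third bases G, C, A, so two of three are G/C.
example = gc3("ATGGCCAAA")
```

Comparing `gc3` values for orthologous genes across species is the kind of per-gene comparison the abstract describes, with the caveat it raises: the gene set chosen (highly expressed versus specialized) shifts the result.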
Affiliation(s)
- Jena L Chojnowski
- Department of Zoology, University of Florida, 223 Bartram Hall, PO Box 118525, Gainesville, FL 32611, USA
|
21
|
Gao F, Zhang CT. Prediction of replication time zones at single nucleotide resolution in the human genome. FEBS Lett 2008; 582:2441-4. [PMID: 18555015 DOI: 10.1016/j.febslet.2008.06.008] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/06/2008] [Revised: 06/03/2008] [Accepted: 06/04/2008] [Indexed: 10/22/2022]
Abstract
The human genome is structured at multiple levels: it is organized into a series of replication time zones, and meanwhile it is composed of isochores. Accumulating evidence suggests a match between these two genome features. Based on newly developed software GC-Profile, we obtained a complete coverage of the human genome by 3198 isochores with boundaries at single nucleotide resolution. Interestingly, the experimentally confirmed replication timing sites in the regions of 1p36.1, 6p21.32, 17q11.2 and 22q12.1 nearly all coincide with the determined isochore boundaries. The precise boundaries of the 3198 isochores are available via the website: http://tubic.tju.edu.cn/isomap/.
Affiliation(s)
- Feng Gao
- Department of Physics, Tianjin University, Tianjin 300072, China
|
22
|
Abstract
Chromosome replication timing is biphasic (early-late) in the cell cycle of vertebrates and of most (possibly all) eukaryotes. In the present work we have compared the extended, detailed replication timing maps that are available, namely those of human chromosomes 6, 11q, and 21q, with chromosomal bands as visualized at low (400 bands), high (850 bands), and highest (3,200 isochores) resolution. We have observed that the replicons located in a given isochore practically always show either all early or all late replication timing and that early-replicating isochores are short and GC-rich and late-replicating isochores are long and GC-poor. In the vast majority of cases, replicons are clustered in isochores, which are themselves most often clustered in early- or late-replication timing zones and may often reach the size of high-resolution bands and, very rarely, even that of low-resolution bands. Finally, we show that our results should be representative of the whole human genome and thus help to predict replication timing zones in all chromosomes.
|
23
|
Zheng WX, Zhang CT. Biological Implications of Isochore Boundaries in the Human Genome. J Biomol Struct Dyn 2008; 25:327-36. [DOI: 10.1080/07391102.2008.10507181] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 10/28/2022]
|
24
|
Darai-Ramqvist E, Sandlund A, Müller S, Klein G, Imreh S, Kost-Alimova M. Segmental duplications and evolutionary plasticity at tumor chromosome break-prone regions. Genome Res 2008; 18:370-9. [PMID: 18230801 DOI: 10.1101/gr.7010208] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.4] [Indexed: 11/24/2022]
Abstract
We have previously found that the borders of evolutionarily conserved chromosomal regions often coincide with tumor-associated deletion breakpoints within human 3p12-p22. Moreover, a detailed analysis of a frequently deleted region at 3p21.3 (CER1) showed associations between tumor breaks and gene duplications. We now report on the analysis of 54 chromosome 3 breaks by multipoint FISH (mpFISH) in 10 carcinoma-derived cell lines. The centromeric region was broken in five lines. In lines with highly complex karyotypes, breaks were clustered near known fragile sites, FRA3B, FRA3C, and FRA3D (three lines), and in two other regions: 3p12.3-p13 (approximately 75 Mb position) and 3q21.3-q22.1 (approximately 130 Mb position) (six lines). All locations are shown based on NCBI Build 36.1 human genome sequence. The last two regions participated in three of four chromosome 3 inversions during primate evolution. Regions at 75, 127, and 131 Mb positions carry a large (approximately 250 kb) segmental duplication (tumor break-prone segmental duplication [TBSD]). TBSD homologous sequences were found at 15 sites on different chromosomes. They were located within bands frequently involved in carcinoma-associated breaks. Thirteen of them have been involved in inversions during primate evolution; 10 were reused by breaks during mammalian evolution; 14 showed copy number polymorphism in man. TBSD sites showed an increase in satellite repeats, retrotransposed sequences, and other segmental duplications. We propose that the instability of these sites stems from specific organization of the chromosomal region, associated with location at a boundary between different CG-content isochores and with the presence of TBSDs and "instability elements," including satellite repeats and retroviral sequences.
Affiliation(s)
- Eva Darai-Ramqvist
- Department of Microbiology, Tumor and Cell Biology, Karolinska Institute, Stockholm SE-171 77, Sweden
|
25
|
Audit B, Nicolay S, Huvet M, Touchon M, d'Aubenton-Carafa Y, Thermes C, Arneodo A. DNA replication timing data corroborate in silico human replication origin predictions. Phys Rev Lett 2007; 99:248102. [PMID: 18233493 DOI: 10.1103/physrevlett.99.248102] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.3] [Received: 05/31/2007] [Indexed: 05/25/2023]
Abstract
We develop a wavelet-based multiscale pattern recognition methodology to disentangle the replication- from the transcription-associated compositional strand asymmetries observed in the human genome. Comparing replication skew profiles to recent high-resolution replication timing data reveals that most of the putative replication origins that border the so-identified replication domains are replicated earlier than their surroundings whereas the central regions replicate late in the S phase. We discuss the implications of this first experimental confirmation of these replication origin predictions that are likely to be early replicating and active in most tissues.
Affiliation(s)
- B Audit
- Laboratoire Joliot-Curie, ENS-Lyon, CNRS, France
|