1
Cornejo-Páramo P, Petrova V, Zhang X, Young RS, Wong ES. Emergence of enhancers at late DNA replicating regions. Nat Commun 2024; 15:3451. [PMID: 38658544 PMCID: PMC11043393 DOI: 10.1038/s41467-024-47391-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Accepted: 03/26/2024] [Indexed: 04/26/2024] [Open Access]
Abstract
Enhancers are fast-evolving genomic sequences that control spatiotemporal gene expression patterns. By examining enhancer turnover across mammalian species and in multiple tissue types, we uncover a relationship between the emergence of enhancers and genome organization as a function of germline DNA replication time. While enhancers are most abundant in euchromatic regions, enhancers emerge almost twice as often in late compared to early germline replicating regions, independent of transposable elements. Using a deep learning sequence model, we demonstrate that new enhancers are enriched for mutations that alter transcription factor (TF) binding. Recently evolved enhancers appear to be mostly neutrally evolving and enriched in eQTLs. They also show more tissue specificity than conserved enhancers, and the TFs that bind to these elements, as inferred by binding sequences, also show increased tissue-specific gene expression. We find a similar relationship with DNA replication time in cancer, suggesting that these observations may be time-invariant principles of genome evolution. Our work underscores that genome organization has a profound impact in shaping mammalian gene regulation.
Affiliation(s)
- Paola Cornejo-Páramo
  - Victor Chang Cardiac Research Institute, Darlinghurst, NSW, Australia
  - School of Biotechnology and Biomolecular Sciences, Sydney, NSW, Australia
- Veronika Petrova
  - Victor Chang Cardiac Research Institute, Darlinghurst, NSW, Australia
  - School of Biotechnology and Biomolecular Sciences, Sydney, NSW, Australia
- Xuan Zhang
  - Victor Chang Cardiac Research Institute, Darlinghurst, NSW, Australia
- Robert S Young
  - Usher Institute, University of Edinburgh, Teviot Place, Edinburgh, EH8 9AG, United Kingdom
  - Zhejiang University - University of Edinburgh Institute, Zhejiang University, 718 East Haizhou Road, 314400, Haining, PR China
- Emily S Wong
  - Victor Chang Cardiac Research Institute, Darlinghurst, NSW, Australia
  - School of Biotechnology and Biomolecular Sciences, Sydney, NSW, Australia
2
Herrick J. DNA Damage, Genome Stability, and Adaptation: A Question of Chance or Necessity? Genes (Basel) 2024; 15:520. [PMID: 38674454 PMCID: PMC11049855 DOI: 10.3390/genes15040520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2024] [Revised: 04/14/2024] [Accepted: 04/18/2024] [Indexed: 04/28/2024] [Open Access]
Abstract
DNA damage causes the mutations that are the principal source of genetic variation. DNA damage detection and repair mechanisms therefore play a determining role in generating the genetic diversity on which natural selection acts. Speciation, it is commonly assumed, occurs at a rate set by the level of standing allelic diversity in a population. The process of speciation is driven by a combination of two evolutionary forces: genetic drift and ecological selection. Genetic drift takes place under the conditions of relaxed selection, and results in a balance between the rates of mutation and the rates of genetic substitution. These two processes, drift and selection, are necessarily mediated by a variety of mechanisms guaranteeing genome stability in any given species. One of the outstanding questions in evolutionary biology concerns the origin of the widely varying phylogenetic distribution of biodiversity across the Tree of Life and how the forces of drift and selection contribute to shaping that distribution. The following examines some of the molecular mechanisms underlying genome stability and the adaptive radiations that are associated with biodiversity and the widely varying species richness and evenness in the different eukaryotic lineages.
Affiliation(s)
- John Herrick
  - Independent Researcher at 3, Rue des Jeûneurs, 75002 Paris, France
3
de Manuel M, Wu FL, Przeworski M. A paternal bias in germline mutation is widespread in amniotes and can arise independently of cell division numbers. eLife 2022; 11:e80008. [PMID: 35916372 PMCID: PMC9439683 DOI: 10.7554/elife.80008] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 05/05/2022] [Accepted: 08/01/2022] [Indexed: 11/13/2022] [Open Access]
Abstract
In humans and other mammals, germline mutations are more likely to arise in fathers than in mothers. Although this sex bias has long been attributed to DNA replication errors in spermatogenesis, recent evidence from humans points to the importance of mutagenic processes that do not depend on cell division, calling into question our understanding of this basic phenomenon. Here, we infer the ratio of paternal-to-maternal mutations, α, in 42 species of amniotes, from putatively neutral substitution rates of sex chromosomes and autosomes. Despite marked differences in gametogenesis, physiologies and environments across species, fathers consistently contribute more mutations than mothers in all the species examined, including mammals, birds, and reptiles. In mammals, α is as high as 4 and correlates with generation times; in birds and snakes, α appears more stable around 2. These observations are consistent with a simple model, in which mutations accrue at equal rates in both sexes during early development and at a higher rate in the male germline after sexual differentiation, with a conserved paternal-to-maternal ratio across species. Thus, α may reflect the relative contributions of two or more developmental phases to total germline mutations, and is expected to depend on generation time even if mutations do not track cell divisions.
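The two-phase model summarized in this abstract can be illustrated with a toy calculation. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `alpha_two_phase` and every default value (early mutation count, late-phase maternal rate, late-phase ratio of 4) are hypothetical and serve only to show why α would be expected to rise with generation time under such a model.

```python
def alpha_two_phase(generation_time_years: float,
                    early_mutations: float = 5.0,
                    maternal_late_rate: float = 0.5,
                    late_phase_ratio: float = 4.0,
                    early_phase_years: float = 0.0) -> float:
    """Toy paternal-to-maternal mutation ratio under a two-phase model.

    All defaults are hypothetical illustration values, not estimates.
    Early phase: both sexes accrue the same number of mutations.
    Late phase: the paternal rate is a fixed multiple of the maternal rate.
    """
    if generation_time_years < early_phase_years:
        raise ValueError("generation time cannot be shorter than the early phase")
    late_years = generation_time_years - early_phase_years
    maternal = early_mutations + maternal_late_rate * late_years
    paternal = early_mutations + late_phase_ratio * maternal_late_rate * late_years
    if maternal <= 0:
        raise ValueError("maternal mutation count must be positive")
    return paternal / maternal


if __name__ == "__main__":
    # Longer generation times give the late phase more weight,
    # so alpha moves from 1 toward the late-phase ratio.
    for g in (1, 5, 30):
        print(g, round(alpha_two_phase(g), 2))
```

In this sketch α equals 1 when the late phase has zero length and approaches the late-phase ratio as generation time grows, which matches the qualitative dependence on generation time described above without invoking cell division counts.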
Affiliation(s)
- Marc de Manuel
  - Department of Biological Sciences, Columbia University, New York, United States
- Felix L Wu
  - Department of Biological Sciences, Columbia University, New York, United States
- Molly Przeworski
  - Department of Systems Biology, Columbia University, New York, United States
4
Acosta A, Martínez-Pacheco ML, Díaz-Barba K, Porras N, Gutiérrez-Mariscal M, Cortez D. Deciphering Ancestral Sex Chromosome Turnovers Based on Analysis of Male Mutation Bias. Genome Biol Evol 2020; 11:3054-3067. [PMID: 31605487 PMCID: PMC6823514 DOI: 10.1093/gbe/evz221] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Accepted: 10/05/2019] [Indexed: 12/13/2022] [Open Access]
Abstract
The age of sex chromosomes is commonly obtained by comparing the substitution rates of XY gametologs. Coupled with phylogenetic reconstructions, one can refine the origin of a sex chromosome system relative to specific speciation events. However, these approaches are insufficient to determine the presence and duration of ancestral sex chromosome systems that were lost in some species. In this study, we worked with genomic and transcriptomic data from mammals and squamates and analyzed the effect of male mutation bias on X-linked sequences in these groups. We searched for signatures indicating whether monotremes shared the same sex chromosomes with placental mammals or whether pleurodonts and acrodonts had a common ancestral sex chromosome system. Our analyses indicate that platypus did not share the XY chromosomes with placental mammals, in agreement with previous work. In contrast, analyses of agamids showed that this lineage maintained the pleurodont XY chromosomes for several million years. We performed multiple simulations using different strengths of male mutation bias to confirm the results. Overall, our work shows that variations in substitution rates due to male mutation bias could be applied to uncover signatures of ancestral sex chromosome systems.
Affiliation(s)
- Diego Cortez
  - Center for Genome Sciences, UNAM, Cuernavaca, Mexico
5
Hulke ML, Siefert JC, Sansam CL, Koren A. Germline Structural Variations Are Preferential Sites of DNA Replication Timing Plasticity during Development. Genome Biol Evol 2019; 11:1663-1678. [PMID: 31076752 PMCID: PMC6582765 DOI: 10.1093/gbe/evz098] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Accepted: 05/02/2019] [Indexed: 02/06/2023] [Open Access]
Abstract
The DNA replication timing program is modulated throughout development and is also one of the main factors influencing the distribution of mutation rates across the genome. However, the relationship between the mutagenic influence of replication timing and its developmental plasticity remains unexplored. Here, we studied the distribution of copy number variations (CNVs) and single nucleotide polymorphisms across the zebrafish genome in relation to changes in DNA replication timing during embryonic development in this model vertebrate species. We show that CNV sites exhibit strong replication timing plasticity during development, replicating significantly early during early development but significantly late during more advanced developmental stages. Reciprocally, genomic regions that changed their replication timing during development contained a higher proportion of CNVs than developmentally constant regions. Developmentally plastic CNV sites, in particular those that become delayed in their replication timing, were enriched for the clustered protocadherins, a set of genes important for neuronal development that have undergone extensive genetic and epigenetic diversification during zebrafish evolution. In contrast, single nucleotide polymorphism sites replicated consistently early throughout embryonic development, highlighting a unique aspect of the zebrafish genome. Our results uncover a hitherto unrecognized interface between development and evolution.
Affiliation(s)
- Michelle L Hulke
  - Department of Molecular Biology and Genetics, Cornell University
- Joseph C Siefert
  - Cell Cycle and Cancer Biology Research Program, Oklahoma Medical Research Foundation
  - Department of Cell Biology, University of Oklahoma Health Sciences Center
- Christopher L Sansam
  - Cell Cycle and Cancer Biology Research Program, Oklahoma Medical Research Foundation
  - Department of Cell Biology, University of Oklahoma Health Sciences Center
- Amnon Koren
  - Department of Molecular Biology and Genetics, Cornell University
6
Link V, Aguilar-Gómez D, Ramírez-Suástegui C, Hurst LD, Cortez D. Male Mutation Bias Is the Main Force Shaping Chromosomal Substitution Rates in Monotreme Mammals. Genome Biol Evol 2018; 9:2198-2210. [PMID: 28922870 PMCID: PMC5604096 DOI: 10.1093/gbe/evx155] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Accepted: 08/22/2017] [Indexed: 12/12/2022] [Open Access]
Abstract
In many species, spermatogenesis involves more cell divisions than oogenesis, and the male germline, therefore, accumulates more DNA replication errors, a phenomenon known as male mutation bias. The extent of male mutation bias (α) is estimated by comparing substitution rates of the X, Y, and autosomal chromosomes, as these chromosomes spend different proportions of their time in the germlines of the two sexes. Male mutation bias has been characterized in placental and marsupial mammals as well as birds, but analyses in monotremes failed to detect any such bias. Monotremes are an ancient lineage of egg-laying mammals with distinct biological properties, which include unique germline features. Here, we sought to assess the presence and potential characteristics of male mutation bias in platypus and the short-beaked echidna based on substitution rate analyses of X, Y, and autosomes. We established the presence of moderate male mutation bias in monotremes, corresponding to an α value of 2.12–3.69. Given that it has been unclear what proportion of the variation in substitution rates on the different chromosomal classes is really due to differential number of replications, we analyzed the influence of other confounding forces (selection, replication-timing, etc.) and found that male mutation bias is the main force explaining the between-chromosome classes differences in substitution rates. Finally, we estimated the proportion of variation at the gene level in substitution rates that is owing to replication effects and found that this phenomenon can explain >68% of these variations in monotremes, and in control species, rodents, and primates.
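The idea of estimating α from chromosome-level substitution rates can be sketched with the classic X-to-autosome relation. This is an illustrative sketch, not necessarily the estimator used in the study: it assumes that rate differences between the X and the autosomes arise only from sex-specific mutation rates, that X-linked sequence spends two thirds of its time in females, and that autosomal sequence spends half. The function names are hypothetical.

```python
def x_to_autosome_ratio(alpha: float) -> float:
    """Expected X/A substitution-rate ratio for a given male mutation bias.

    Assumes mu_X = (2*mu_f + mu_m) / 3 and mu_A = (mu_f + mu_m) / 2,
    with alpha = mu_m / mu_f.
    """
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    return 2.0 * (2.0 + alpha) / (3.0 * (1.0 + alpha))


def alpha_from_x_to_autosome(x_rate: float, autosome_rate: float) -> float:
    """Invert the relation above: alpha = (4 - 3R) / (3R - 2), with R = X/A."""
    ratio = x_rate / autosome_rate
    # R is bounded by 2/3 (alpha -> infinity) and 4/3 (alpha = 0).
    if not (2.0 / 3.0 < ratio <= 4.0 / 3.0):
        raise ValueError("X/A ratio must lie in (2/3, 4/3] under this model")
    return (4.0 - 3.0 * ratio) / (3.0 * ratio - 2.0)


if __name__ == "__main__":
    # Hypothetical rates chosen for illustration only.
    print(alpha_from_x_to_autosome(x_rate=0.85, autosome_rate=1.0))
```

Because the X/A ratio flattens out as α grows, small errors in the measured rates translate into large uncertainty in α when the bias is strong, which is one reason comparisons involving the Y chromosome are also used.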
Affiliation(s)
- Vivian Link
  - Department of Biology, University of Fribourg, Switzerland
- Laurence D Hurst
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Somerset, United Kingdom
- Diego Cortez
  - Center for Genomic Sciences, UNAM, Cuernavaca, México
7
Blumenfeld B, Ben-Zimra M, Simon I. Perturbations in the Replication Program Contribute to Genomic Instability in Cancer. Int J Mol Sci 2017; 18:E1138. [PMID: 28587102 PMCID: PMC5485962 DOI: 10.3390/ijms18061138] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 04/06/2017] [Revised: 05/08/2017] [Accepted: 05/21/2017] [Indexed: 12/14/2022] [Open Access]
Abstract
Cancer and genomic instability are highly impacted by the deoxyribonucleic acid (DNA) replication program. Inaccuracies in DNA replication lead to the increased acquisition of mutations and structural variations. These inaccuracies mainly stem from loss of DNA fidelity due to replication stress or due to aberrations in the temporal organization of the replication process. Here we review the mechanisms and impact of these major sources of error to the replication program.
Affiliation(s)
- Britny Blumenfeld
  - Department of Microbiology and Molecular Genetics, IMRIC, Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem 91120, Israel
- Micha Ben-Zimra
  - Department of Microbiology and Molecular Genetics, IMRIC, Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem 91120, Israel
  - Pharmacology and Experimental Therapeutics Unit, The Institute for Drug Research, School of Pharmacy, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 91120, Israel
- Itamar Simon
  - Department of Microbiology and Molecular Genetics, IMRIC, Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem 91120, Israel
8
Price N, Graur D. Are Synonymous Sites in Primates and Rodents Functionally Constrained? J Mol Evol 2015; 82:51-64. [PMID: 26563252 DOI: 10.1007/s00239-015-9719-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 01/07/2015] [Accepted: 11/04/2015] [Indexed: 11/28/2022]
Abstract
It has been claimed that synonymous sites in mammals are under selective constraint. Furthermore, in many studies the selective constraint at such sites in primates was claimed to be more stringent than that in rodents. Given the larger effective population sizes in rodents than in primates, the theoretical expectation is that selection in rodents would be more effective than that in primates. To resolve this contradiction between expectations and observations, we used processed pseudogenes as a model for strict neutral evolution, and estimated selective constraint on synonymous sites using the rate of substitution at pseudosynonymous and pseudononsynonymous sites in pseudogenes as the neutral expectation. After controlling for the effects of GC content, our results were similar to those from previous studies, i.e., synonymous sites in primates exhibited evidence for higher selective constraint than those in rodents. Specifically, our results indicated that in primates up to 24% of synonymous sites could be under purifying selection, while in rodents synonymous sites evolved neutrally. To further control for shifts in GC content, we estimated selective constraint at fourfold degenerate sites using a maximum parsimony approach. This allowed us to estimate selective constraint using mutational patterns that cause a shift in GC content (GT ↔ TG, CT ↔ TC, GA ↔ AG, and CA ↔ AC) and ones that do not (AT ↔ TA and CG ↔ GC). Using this approach, we found that synonymous sites evolve neutrally in both primates and rodents. Apparent deviations from neutrality were caused by a higher rate of C → A and C → T mutations in pseudogenes. Such differences are most likely caused by the shift in GC content experienced by pseudogenes. We conclude that previous estimates according to which 20-40% of synonymous sites in primates were under selective constraint were most likely artifacts of the biased pattern of mutation.
Affiliation(s)
- Nicholas Price
  - Department of Bioagricultural Sciences and Pest Management, Colorado State University, Fort Collins, CO, 80523, USA
- Dan Graur
  - Department of Biology and Biochemistry, University of Houston, Houston, TX, 77204-5001, USA
9
Abstract
Mutational heterogeneity must be taken into account when reconstructing evolutionary histories, calibrating molecular clocks, and predicting links between genes and disease. Selective pressures and various DNA transactions have been invoked to explain the heterogeneous distribution of genetic variation between species, within populations, and in tissue-specific tumors. To examine relationships between such heterogeneity and variations in leading- and lagging-strand replication fidelity and mismatch repair, we accumulated 40,000 spontaneous mutations in eight diploid yeast strains in the absence of selective pressure. We found that replicase error rates vary by fork direction, coding state, nucleosome proximity, and sequence context. Further, error rates and DNA mismatch repair efficiency both vary by mismatch type, responsible polymerase, replication time, and replication origin proximity. Mutation patterns implicate replication infidelity as one driver of variation in somatic and germline evolution, suggest mechanisms of mutual modulation of genome stability and composition, and predict future observations in specific cancers.
10
Koren A. DNA replication timing: Coordinating genome stability with genome regulation on the X chromosome and beyond. Bioessays 2014; 36:997-1004. [PMID: 25138663 DOI: 10.1002/bies.201400077] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 12/26/2022]
Abstract
Recent studies based on next-generation DNA sequencing have revealed that the female inactive X chromosome is replicated in a rapid, unorganized manner, and undergoes increased rates of mutation. These observations link the organization of DNA replication timing to gene regulation on one hand, and to the generation of mutations on the other hand. More generally, the exceptional biology of the inactive X chromosome highlights general principles of genome replication. Cells may control replication timing by a combination of intrinsic replication origin properties, local chromatin states and global levels of replication factors, leading to a functional separation between the activity of genes and their mutation.
Affiliation(s)
- Amnon Koren
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
11
Pant S, Weiner R, Marton MJ. Navigating the rapids: the development of regulated next-generation sequencing-based clinical trial assays and companion diagnostics. Front Oncol 2014; 4:78. [PMID: 24860780 PMCID: PMC4029014 DOI: 10.3389/fonc.2014.00078] [Citation(s) in RCA: 64] [Impact Index Per Article: 6.4] [Received: 12/19/2013] [Accepted: 03/28/2014] [Indexed: 12/11/2022] [Open Access]
Abstract
Over the past decade, next-generation sequencing (NGS) technology has experienced meteoric growth in the aspects of platform, technology, and supporting bioinformatics development allowing its widespread and rapid uptake in research settings. More recently, NGS-based genomic data have been exploited to better understand disease development and patient characteristics that influence response to a given therapeutic intervention. Cancer, as a disease characterized by and driven by the tumor genetic landscape, is particularly amenable to NGS-based diagnostic (Dx) approaches. NGS-based technologies are particularly well suited to studying cancer disease development, progression and emergence of resistance, all key factors in the development of next-generation cancer Dxs. Yet, to achieve the promise of NGS-based patient treatment, drug developers will need to overcome a number of operational, technical, regulatory, and strategic challenges. Here, we provide a succinct overview of the state of the clinical NGS field in terms of the available clinically targeted platforms and sequencing technologies. We discuss the various operational and practical aspects of clinical NGS testing that will facilitate or limit the uptake of such assays in routine clinical care. We examine the current strategies for analytical validation and Food and Drug Administration (FDA)-approval of NGS-based assays and ongoing efforts to standardize clinical NGS and build quality control standards for the same. The rapidly evolving companion diagnostic (CDx) landscape for NGS-based assays will be reviewed, highlighting the key areas of concern and suggesting strategies to mitigate risk. The review will conclude with a series of strategic questions that face drug developers and a discussion of the likely future course of NGS-based CDx development efforts.
Affiliation(s)
- Saumya Pant
  - Merck Research Laboratories, Molecular Biomarkers and Diagnostics, Rahway, NJ, USA
- Russell Weiner
  - Merck Research Laboratories, Molecular Biomarkers and Diagnostics, Rahway, NJ, USA
- Matthew J Marton
  - Merck Research Laboratories, Molecular Biomarkers and Diagnostics, Rahway, NJ, USA
12
Sima J, Gilbert DM. Complex correlations: replication timing and mutational landscapes during cancer and genome evolution. Curr Opin Genet Dev 2014; 25:93-100. [PMID: 24598232 DOI: 10.1016/j.gde.2013.11.022] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.1] [Received: 09/19/2013] [Accepted: 11/29/2013] [Indexed: 12/23/2022]
Abstract
A recent flurry of reports correlates replication timing (RT) with mutation rates during both evolution and cancer. Specifically, point mutations and copy number losses correlate with late replication, while copy number gains and other rearrangements correlate with early replication. In some cases, plausible mechanisms have been proposed. Point mutation rates may reflect temporal variation in repair mechanisms. Transcription-induced double-strand breaks are expected to occur in transcriptionally active early replicating chromatin. Fusion partners are generally in close proximity, and chromatin in close proximity replicates at similar times. However, temporal enrichment of copy number gains and losses remains an enigma. Moreover, many conclusions are compromised by a lack of matched RT and sequence datasets, the filtering out of developmental variation in RT, and the use of somatic cell lines to make inferences about germline evolution.
Affiliation(s)
- Jiao Sima
  - Department of Biological Science, Florida State University, Tallahassee, FL 32306, USA
- David M Gilbert
  - Department of Biological Science, Florida State University, Tallahassee, FL 32306, USA
13
Eyre-Walker YC, Eyre-Walker A. The role of mutation rate variation and genetic diversity in the architecture of human disease. PLoS One 2014; 9:e90166. [PMID: 24587257 PMCID: PMC3937440 DOI: 10.1371/journal.pone.0090166] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 08/23/2013] [Accepted: 01/28/2014] [Indexed: 11/18/2022] [Open Access]
Abstract
BACKGROUND: We have investigated the role that the mutation rate and the structure of genetic variation at a locus play in determining whether a gene is involved in disease. We predict that the mutation rate and its genetic diversity should be higher in genes associated with disease, unless all genes that could cause disease have already been identified.
RESULTS: Consistent with our predictions we find that genes associated with Mendelian and complex disease are substantially longer than non-disease genes. However, we find that both Mendelian and complex disease genes are found in regions of the genome with relatively low mutation rates, as inferred from intron divergence between humans and chimpanzees, and they are predicted to have similar rates of non-synonymous mutation as other genes. Finally, we find that disease genes are in regions of significantly elevated genetic diversity, even when variation in the rate of mutation is controlled for. The effect is small nevertheless.
CONCLUSIONS: Our results suggest that gene length contributes to whether a gene is associated with disease. However, the mutation rate and the genetic architecture of the locus appear to play only a minor role in determining whether a gene is associated with disease.
Affiliation(s)
- Adam Eyre-Walker
  - School of Life Sciences, University of Sussex, Brighton, United Kingdom
14
Juan D, Rico D, Marques-Bonet T, Fernández-Capetillo Ó, Valencia A. Late-replicating CNVs as a source of new genes. Biol Open 2013; 2:1402-11. [PMID: 24285712 PMCID: PMC3863426 DOI: 10.1242/bio.20136924] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 10/15/2013] [Accepted: 10/23/2013] [Indexed: 01/09/2023] [Open Access]
Abstract
Asynchronous replication of the genome has been associated with different rates of point mutation and copy number variation (CNV) in human populations. Here, our aim was to investigate whether the bias in the generation of CNV that is associated with DNA replication timing might have conditioned the birth of new protein-coding genes during evolution. We show that genes that were duplicated during primate evolution are more commonly found among the human genes located in late-replicating CNV regions. We traced the relationship between replication timing and the evolutionary age of duplicated genes. Strikingly, we found that there is a significant enrichment of evolutionary younger duplicates in late-replicating regions of the human and mouse genome. Indeed, the presence of duplicates in late-replicating regions gradually decreases as the evolutionary time since duplication extends. Our results suggest that the accumulation of recent duplications in late-replicating CNV regions is an active process influencing genome evolution.
Affiliation(s)
- David Juan
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Center (CNIO), Melchor Fernández Almagro 3, 28029 Madrid, Spain
- Daniel Rico
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Center (CNIO), Melchor Fernández Almagro 3, 28029 Madrid, Spain
- Tomas Marques-Bonet
  - Institut Catala de Recerca i Estudis Avancats (ICREA) and Institut de Biologia Evolutiva (UPF/CSIC), Dr Aiguader 88, PRBB, 08003 Barcelona, Spain
- Óscar Fernández-Capetillo
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Center (CNIO), Melchor Fernández Almagro 3, 28029 Madrid, Spain
- Alfonso Valencia
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Center (CNIO), Melchor Fernández Almagro 3, 28029 Madrid, Spain
15
Hara Y, Imanishi T, Satta Y. Reconstructing the demographic history of the human lineage using whole-genome sequences from human and three great apes. Genome Biol Evol 2013; 4:1133-45. [PMID: 22975719 PMCID: PMC3752010 DOI: 10.1093/gbe/evs075] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 01/03/2023] [Open Access]
Abstract
The demographic history of humans would provide helpful information for identifying the evolutionary events that shaped humanity, but it remains controversial even in the genomic era. To settle the controversies, we inferred the speciation times (T) and ancestral population sizes (N) in the lineage leading to human and great apes based on whole-genome alignment. A coalescence simulation determined the sizes of alignment blocks and intervals between them required to obtain recombination-free blocks with a high frequency. This simulation revealed that the size of the block strongly affects the parameter inference, indicating that recombination is an important factor for achieving optimum parameter inference. From the whole genome alignments (1.9 giga-bases) of human (H), chimpanzee (C), gorilla (G), and orangutan, 100-bp alignment blocks separated by ≥5-kb intervals were sampled and used to estimate τ = μT and θ = 4μgN using the Markov chain Monte Carlo method, where μ is the mutation rate and g is the generation time. Although the estimated τHC differed across chromosomes, τHC and τHCG were strongly correlated across chromosomes, indicating that variation in τ is subject to variation in μ, rather than T, and thus, all chromosomes share a single speciation time. Subsequently, we estimated Ts of the human lineage from chimpanzee, gorilla, and orangutan to be 6.0–7.6, 7.6–9.7, and 15–19 Ma, respectively, assuming variable μ across lineages and chromosomes. These speciation times were consistent with the fossil records. We conclude that the speciation times in our recombination-free analysis would be conclusive and the speciation between human and chimpanzee was a single event.
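The definitions τ = μT and θ = 4μgN given in this abstract can be inverted to recover T and N once μ and g are fixed. The snippet below is a minimal sketch of that arithmetic only, assuming μ is expressed per site per year and g in years; the function names and all numeric inputs are hypothetical illustration values, not estimates from the study.

```python
def speciation_time_years(tau: float, mu_per_site_per_year: float) -> float:
    """Solve tau = mu * T for T (in years)."""
    if mu_per_site_per_year <= 0:
        raise ValueError("mutation rate must be positive")
    return tau / mu_per_site_per_year


def ancestral_population_size(theta: float,
                              mu_per_site_per_year: float,
                              generation_time_years: float) -> float:
    """Solve theta = 4 * mu * g * N for N."""
    if mu_per_site_per_year <= 0 or generation_time_years <= 0:
        raise ValueError("mutation rate and generation time must be positive")
    return theta / (4.0 * mu_per_site_per_year * generation_time_years)


if __name__ == "__main__":
    # Hypothetical inputs chosen only to show the unit conversion.
    mu = 1.0e-9   # substitutions per site per year (assumed)
    g = 20.0      # generation time in years (assumed)
    print(speciation_time_years(tau=0.0066, mu_per_site_per_year=mu))
    print(ancestral_population_size(theta=0.004,
                                    mu_per_site_per_year=mu,
                                    generation_time_years=g))
```

Since T is obtained by dividing τ by μ, any chromosome-to-chromosome variation in μ changes τ without changing T, which is the reasoning behind the abstract's single-speciation-time conclusion.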
Affiliation(s)
- Yuichiro Hara
  - Biomedicinal Information Research Center, National Institute of Advanced Industrial Science and Technology, Koto-ku, Tokyo, Japan
16
Koren A, Polak P, Nemesh J, Michaelson JJ, Sebat J, Sunyaev SR, McCarroll SA. Differential relationship of DNA replication timing to different forms of human mutation and variation. Am J Hum Genet 2012. [PMID: 23176822 DOI: 10.1016/j.ajhg.2012.10.018] [Citation(s) in RCA: 204] [Impact Index Per Article: 17.0] [Indexed: 11/19/2022] [Open Access]
Abstract
Human genetic variation is distributed nonrandomly across the genome, though the principles governing its distribution are only partially known. DNA replication creates opportunities for mutation, and the timing of DNA replication correlates with the density of SNPs across the human genome. To enable deeper investigation of how DNA replication timing relates to human mutation and variation, we generated a high-resolution map of the human genome's replication timing program and analyzed its relationship to point mutations, copy number variations, and the meiotic recombination hotspots utilized by males and females. DNA replication timing associated with point mutations far more strongly than predicted from earlier analyses and showed a stronger relationship to transversion than transition mutations. Structural mutations arising from recombination-based mechanisms and recombination hotspots used more extensively by females were enriched in early-replicating parts of the genome, though these relationships appeared to relate more strongly to the genomic distribution of causative sequence features. These results indicate differential and sex-specific relationship of DNA replication timing to different forms of mutation and recombination.
Affiliation(s)
- Amnon Koren
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
|
17
|
Nellåker C, Keane TM, Yalcin B, Wong K, Agam A, Belgard TG, Flint J, Adams DJ, Frankel WN, Ponting CP. The genomic landscape shaped by selection on transposable elements across 18 mouse strains. Genome Biol 2012; 13:R45. [PMID: 22703977 PMCID: PMC3446317 DOI: 10.1186/gb-2012-13-6-r45] [Citation(s) in RCA: 127] [Impact Index Per Article: 10.6] [Received: 04/04/2012] [Revised: 05/25/2012] [Accepted: 06/15/2012] [Indexed: 12/20/2022]
Abstract
Background: Transposable element (TE)-derived sequence dominates the landscape of mammalian genomes and can modulate gene function by dysregulating transcription and translation. Our current knowledge of TEs in laboratory mouse strains is limited primarily to those present in the C57BL/6J reference genome, with most mouse TEs being drawn from three distinct classes, namely short interspersed nuclear elements (SINEs), long interspersed nuclear elements (LINEs) and the endogenous retrovirus (ERV) superfamily. Despite their high prevalence, the different genomic and gene properties controlling whether TEs are preferentially purged from, or are retained by, genetic drift or positive selection in mammalian genomes remain poorly defined. Results: Using whole genome sequencing data from 13 classical laboratory and 4 wild-derived mouse inbred strains, we developed a comprehensive catalogue of 103,798 polymorphic TE variants. We employ this extensive data set to characterize TE variants across the Mus lineage, and to infer neutral and selective processes that have acted over 2 million years. Our results indicate that the majority of TE variants are introduced through the male germline and that only a minority of TE variants exert detectable changes in gene expression. However, among genes with differential expression across the strains, there are twice as many TE variants identified as putative causal variants as expected. Conclusions: Most TE variants that cause gene expression changes appear to be purged rapidly by purifying selection. Our findings demonstrate that past TE insertions have often been highly deleterious, and help to prioritize TE variants according to their likely contribution to gene expression or phenotype variation.
Affiliation(s)
- Christoffer Nellåker
- MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, UK.
|
18
|
Weber CC, Pink CJ, Hurst LD. Late-replicating domains have higher divergence and diversity in Drosophila melanogaster. Mol Biol Evol 2011; 29:873-82. [PMID: 22046001 DOI: 10.1093/molbev/msr265] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Indexed: 12/26/2022]
Abstract
Several reports from mammals indicate that an increase in the mutation rate in late-replicating regions may, in part, be responsible for the observed genomic heterogeneity in neutral substitution rates and levels of diversity, although the mechanisms for this remain poorly understood. Recent evidence also suggests that late replication is associated with high mutability in yeast. This then raises the question as to whether a similar effect is operating across all eukaryotes. Limited evidence from one chromosome arm in Drosophila melanogaster suggests the opposite pattern, with regions overlapping early-firing origins showing increased levels of diversity and divergence. Given the availability of genome-wide replication timing profiles for D. melanogaster, we now return to this issue. Consistent with what is seen in other taxa, we find that divergence at synonymous sites in exon cores, as well as divergence at putatively unconstrained intronic sites, is elevated in late-replicating regions. Analysis of genes with low codon usage bias suggests a ∼30% difference in mutation rate between the earliest and the latest replicating sequence. Intronic sequence suggests a more modest difference. We additionally show that an increase in diversity in late-replicating sequences is not owing to replication timing covarying with the local recombination rate. If anything, the effects of recombination mask the impact of replication timing. We conclude that, contrary to prior reports and consistent with what is seen in mammals and yeast, there is indeed a relationship between rates of nucleotide divergence and diversity and replication timing that is consistent with an increase in the mutation rate during late S-phase in D. melanogaster. It is therefore plausible that such an effect might be common among eukaryotes. The result may have implications for the inference of positive selection.
Affiliation(s)
- Claudia C Weber
- Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
|
19
|
Abstract
It has been known for many years that the mutation rate varies across the genome. However, only with the advent of large genomic data sets is the full extent of this variation becoming apparent. The mutation rate varies over many different scales, from adjacent sites to whole chromosomes, with the strongest variation seen at the smallest scales. Some of these patterns have clear mechanistic bases, but much of the rate variation remains unexplained, and some of it is deeply perplexing. Variation in the mutation rate has important implications in evolutionary biology and underexplored implications for our understanding of hereditary disease and cancer.
|
20
|
Late replicating domains are highly recombining in females but have low male recombination rates: implications for isochore evolution. PLoS One 2011; 6:e24480. [PMID: 21949720 PMCID: PMC3176772 DOI: 10.1371/journal.pone.0024480] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 06/06/2011] [Accepted: 08/11/2011] [Indexed: 01/01/2023]
Abstract
In mammals, sequences that are either late replicating or highly recombining have high rates of evolution at putatively neutral sites. As early replicating domains and highly recombining domains both tend to be GC rich, we a priori expect these two variables to covary. If so, the relative contribution of either of these variables to the local neutral substitution rate might have been wrongly estimated owing to covariance with the other. Against our expectations, we find that sex-averaged recombination rates show little or no correlation with replication timing, suggesting that they are independent determinants of substitution rates. However, this result masks significant sex-specific complexity: late replicating domains tend to have high recombination rates in females but low recombination rates in males. That these trends are antagonistic explains why sex-averaged recombination is not correlated with replication timing. This unexpected result has several important implications. First, although both male and female recombination rates covary significantly with intronic substitution rates, the magnitude of this correlation is moderately underestimated for male recombination and slightly overestimated for female recombination, owing to covariance with replication timing. Second, the result could explain why male recombination is strongly correlated with GC content but female recombination is not. If, to explain the correlation between GC content and replication timing, we suppose that late replication forces reduced GC content, then GC promotion by biased gene conversion during female recombination is partly countered by the antagonistic effect of later replicating sequence tending to increase AT content. Indeed, the strength of the correlation between female recombination rate and local GC content is more than doubled by control for replication timing. Our results underpin the need to consider sex-specific recombination rates and potential covariates in analysis of GC content and rates of evolution.
|
21
|
Abstract
Mutation rates vary significantly within the genome and across species. Recent studies revealed a long suspected replication-timing effect on mutation rate, but the mechanisms that regulate the increase in mutation rate as the genome is replicated remain unclear. Evidence is emerging, however, that DNA repair systems, in general, are less efficient in late replicating heterochromatic regions compared to early replicating euchromatic regions of the genome. At the same time, mutation rates in both vertebrates and invertebrates have been shown to vary with generation time (GT). GT is correlated with genome size, which suggests a possible nucleotypic effect on species-specific mutation rates. These and other observations all converge on a role for DNA replication checkpoints in modulating generation times and mutation rates during the DNA synthetic phase (S phase) of the cell cycle. The following will examine the potential role of the intra-S checkpoint in regulating cell cycle times (GT) and mutation rates in eukaryotes. This article was published online on August 5, 2011. An error was subsequently identified. This notice is included in the online and print versions to indicate that both have been corrected October 4, 2011.
Affiliation(s)
- John Herrick
- Department of Physics, Simon Fraser University, 8888 University Drive, Burnaby, British Columbia, Canada.
|
22
|
Abstract
Mechanisms regulating where and when eukaryotic DNA replication initiates remain a mystery. Recently, genome-scale methods have been brought to bear on this problem. The identification of replication origins and their associated proteins in yeasts is a well-integrated investigative tool, but corresponding data sets from multicellular organisms are scarce. By contrast, standardized protocols for evaluating replication timing have generated informative data sets for most eukaryotic systems. Here, I summarize the genome-scale methods that are most frequently used to analyse replication in eukaryotes, the kinds of questions each method can address and the technical hurdles that must be overcome to gain a complete understanding of the nature of eukaryotic replication origins.
|
23
|
Comparative analysis of DNA replication timing reveals conserved large-scale chromosomal architecture. PLoS Genet 2010; 6:e1001011. [PMID: 20617169 PMCID: PMC2895651 DOI: 10.1371/journal.pgen.1001011] [Citation(s) in RCA: 128] [Impact Index Per Article: 9.1] [Received: 09/03/2009] [Accepted: 06/01/2010] [Indexed: 01/02/2023]
Abstract
Recent evidence suggests that the timing of DNA replication is coordinated across megabase-scale domains in metazoan genomes, yet the importance of this aspect of genome organization is unclear. Here we show that replication timing is remarkably conserved between human and mouse, uncovering large regions that may have been governed by similar replication dynamics since these species diverged. This conservation is both tissue-specific and independent of the genomic G+C content conservation. Moreover, we show that time of replication is globally conserved despite numerous large-scale genome rearrangements. We systematically identify rearrangement fusion points and demonstrate that replication time can be locally diverged at these loci. Conversely, rearrangements are shown to be correlated with early replication and physical chromosomal proximity. These results suggest that large chromosomal domains of coordinated replication are shuffled by evolution while conserving the large-scale nuclear architecture of the genome.
During S-phase of the cell cycle, chromosomal DNA is replicated in a complex process involving the coordinated activity of thousands of replication forks, each of which duplicates a long stretch of DNA. Recent experiments revealed that the genome replicates as a mosaic of large-scale early and late chromosomal domains and that this high-level domain organization is correlated with genomic properties like gene density and nucleotide composition. We compared genome-wide replication time maps of compatible human and mouse cells and revealed that their organization into replication domains is highly conserved despite the numerous large-scale genome rearrangements separating the two species. Analysis of recent chromosomal interaction data shows that regions with similar time of replication interact with each other more frequently than expected. The data also show that evolutionary rearrangements have predominantly occurred between regions that have similar time of replication and higher-than-expected chromosomal proximity. Our data suggest that the genome, while being continuously rearranged by evolution, maintains a conserved domain organization. Whether this conservation is driven by selection, or is a consequence of the rearrangement process itself, can be resolved by enhancing the comparative approach proposed here.
|