Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Green P, Ewing B, Miller W, Thomas PJ, Green ED. Transcription-associated mutational asymmetry in mammalian evolution. Nat Genet 2003;33:514-7. [PMID: 12612582 DOI: 10.1038/ng1103] [Citation(s) in RCA: 202] [Impact Index Per Article: 9.6] [Received: 12/30/2002] [Accepted: 01/19/2003] [Indexed: 11/09/2022]

Number

Cited by Other Article(s)

101

Marsolier-Kergoat MC, Goldar A. DNA replication induces compositional biases in yeast. Mol Biol Evol 2011;29:893-904. [PMID: 21948086 DOI: 10.1093/molbev/msr240] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 02/06/2023]

102

Voets AM, van den Bosch BJC, Stassen AP, Hendrickx AT, Hellebrekers DM, Van Laer L, Van Eyken E, Van Camp G, Pyle A, Baudouin SV, Chinnery PF, Smeets HJM. Large scale mtDNA sequencing reveals sequence and functional conservation as major determinants of homoplasmic mtDNA variant distribution. Mitochondrion 2011;11:964-72. [PMID: 21946566 DOI: 10.1016/j.mito.2011.09.003] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 11/03/2010] [Revised: 04/19/2011] [Accepted: 09/09/2011] [Indexed: 02/07/2023]

103

Comparative genomic analysis of dinucleotide repeats in Tritryps. Gene 2011;487:29-37. [PMID: 21824509 DOI: 10.1016/j.gene.2011.07.022] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 02/27/2011] [Revised: 07/12/2011] [Accepted: 07/14/2011] [Indexed: 12/29/2022]

104

Bainbridge MN, Wang M, Wu Y, Newsham I, Muzny DM, Jefferies JL, Albert TJ, Burgess DL, Gibbs RA. Targeted enrichment beyond the consensus coding DNA sequence exome reveals exons with higher variant densities. Genome Biol 2011;12:R68. [PMID: 21787409 PMCID: PMC3218830 DOI: 10.1186/gb-2011-12-7-r68] [Citation(s) in RCA: 166] [Impact Index Per Article: 12.8] [Received: 11/23/2010] [Revised: 01/16/2011] [Accepted: 07/25/2011] [Indexed: 01/12/2023]

105

Mugal CF, Ellegren H. Substitution rate variation at human CpG sites correlates with non-CpG divergence, methylation level and GC content. Genome Biol 2011;12:R58. [PMID: 21696599 PMCID: PMC3218846 DOI: 10.1186/gb-2011-12-6-r58] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.2] [Received: 02/06/2011] [Revised: 05/04/2011] [Accepted: 06/22/2011] [Indexed: 01/08/2023]

106

Du P, Yang Y, Wang H, Liu D, Gao GF, Chen C. A large scale comparative genomic analysis reveals insertion sites for newly acquired genomic islands in bacterial genomes. BMC Microbiol 2011;11:135. [PMID: 21672261 PMCID: PMC3148964 DOI: 10.1186/1471-2180-11-135] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 08/18/2010] [Accepted: 06/15/2011] [Indexed: 01/15/2023]

107

Calistri E, Livi R, Buiatti M. Evolutionary trends of GC/AT distribution patterns in promoters. Mol Phylogenet Evol 2011;60:228-35. [PMID: 21554969 DOI: 10.1016/j.ympev.2011.04.015] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 11/19/2010] [Revised: 03/25/2011] [Accepted: 04/17/2011] [Indexed: 11/18/2022]

108

Unexpected functional similarities between gatekeeper tumour suppressor genes and proto-oncogenes revealed by systems biology. J Hum Genet 2011;56:369-76. [PMID: 21368766 DOI: 10.1038/jhg.2011.21] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 12/13/2022]

109

Chen CL, Duquenne L, Audit B, Guilbaud G, Rappailles A, Baker A, Huvet M, d'Aubenton-Carafa Y, Hyrien O, Arneodo A, Thermes C. Replication-associated mutational asymmetry in the human genome. Mol Biol Evol 2011;28:2327-37. [PMID: 21368316 DOI: 10.1093/molbev/msr056] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.6] [Indexed: 01/12/2023]

Abstract

During evolution, mutations occur at rates that can differ between the two DNA strands. In the human genome, nucleotide substitutions occur at different rates on the transcribed and non-transcribed strands that may result from transcription-coupled repair. These mutational asymmetries generate transcription-associated compositional skews. To date, the existence of such asymmetries associated with replication has not yet been established. Here, we compute the nucleotide substitution matrices around replication initiation zones identified as sharp peaks in replication timing profiles and associated with abrupt jumps in the compositional skew profile. We show that the substitution matrices computed in these regions fully explain the jumps in the compositional skew profile when crossing initiation zones. In intergenic regions, we observe mutational asymmetries measured as differences between complementary substitution rates; their sign changes when crossing initiation zones. These mutational asymmetries are unlikely to result from cryptic transcription but can be explained by a model based on replication errors and strand-biased repair. In transcribed regions, mutational asymmetries associated with replication superimpose on the previously described mutational asymmetries associated with transcription. We separate the substitution asymmetries associated with both mechanisms, which allows us to determine, for the first time in eukaryotes, the mutational asymmetries associated with replication and to reevaluate those associated with transcription. Replication-associated mutational asymmetry may result from unequal rates of complementary base misincorporation by the DNA polymerases coupled with DNA mismatch repair (MMR) acting with different efficiencies on the leading and lagging strands. Replication, acting in germ line cells during long evolutionary times, contributed equally with transcription to produce the present abrupt jumps in the compositional skew. These results demonstrate that DNA replication is one of the major processes that shape human genome composition.
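
The asymmetry measure named in this abstract, the difference between complementary substitution rates, can be illustrated with a minimal Python sketch. The function name `complementary_asymmetries`, the layout of the rate table, and the toy rate values are assumptions made for illustration; they are not taken from the paper.

```python
# Watson-Crick complements, used to pair each substitution with its mirror image.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complementary_asymmetries(rates):
    """Return rate(X->Y) - rate(comp(X)->comp(Y)) for each complementary pair.

    `rates` maps (from_base, to_base) to a substitution rate measured on one
    reference strand and must contain all 12 single-nucleotide substitutions.
    Each of the 6 complementary pairs is reported once.
    """
    asymmetries = {}
    for (src, dst), rate in rates.items():
        mirror = (COMPLEMENT[src], COMPLEMENT[dst])
        if (src, dst) < mirror:  # keep one representative per pair
            asymmetries[(src, dst)] = rate - rates[mirror]
    return asymmetries


# Hypothetical toy rates: every substitution at 1.0 except an elevated A->G.
toy_rates = {(a, b): 1.0 for a in "ACGT" for b in "ACGT" if a != b}
toy_rates[("A", "G")] = 1.5
toy_asymmetries = complementary_asymmetries(toy_rates)
```

With strand-symmetric mutation and repair, every entry would be zero; a value that changes sign on either side of an initiation zone is the kind of pattern the abstract describes.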

110

Steele EJ, Williamson JF, Lester S, Stewart BJ, Millman JA, Carnegie P, Lindley RA, Pain GN, Dawkins RL. Genesis of ancestral haplotypes: RNA modifications and reverse transcription-mediated polymorphisms. Hum Immunol 2010;72:283-293.e1. [PMID: 21156194 DOI: 10.1016/j.humimm.2010.12.005] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 05/31/2010] [Revised: 11/15/2010] [Accepted: 12/06/2010] [Indexed: 11/30/2022]

Abstract

Understanding the genesis of the block haplotype structure of the genome is a major challenge. With the completion of the sequencing of the Human Genome and the initiation of the HapMap project the concept that the chromosomes of the mammalian genome are a mosaic, or patchwork, of conserved extended block haplotype sequences is now accepted by the mainstream genomics research community. Ancestral Haplotypes (AHs) can be viewed as a recombined string of smaller Polymorphic Frozen Blocks (PFBs). How have such variant extended DNA sequence tracts emerged in evolution? Here the relevant literature on the problem is reviewed from various fields of molecular and cell biology particularly molecular immunology and comparative and functional genomics. Based on our synthesis we then advance a testable molecular and cellular model. A critical part of the analysis concerns the origin of the strand biased mutation signatures in the transcribed regions of the human and higher primate genome, A-to-G versus T-to-C (ratio ∼ 1.5 fold) and C-to-T versus G-to-A (≥ 1.5 fold). A comparison and evaluation of the current state of the fields of immunoglobulin Somatic Hypermutation (SHM) and Transcription-Coupled DNA Repair focused on how mutations in newly synthesized RNA might be copied back to DNA thus accounting for some of the genome-wide strand biases (e.g., the A-to-G vs T-to-C component of the strand biased spectrum). We hypothesize that the genesis of PFBs and extended AHs occurs during mutagenic episodes in evolution (e.g., retroviral infections) and that many of the critical DNA sequence diversifying events occur first at the RNA level, e.g., recombination between RNA strings resulting in tandem and dispersed RNA duplications (retroduplications), RNA mutations via adenosine-to-inosine pre-mRNA editing events as well as error prone RNA synthesis. These are then copied back into DNA by a cellular reverse transcription process (also likely to be error-prone) that we have called "reverse transcription-mediated long DNA conversion." Finally we suggest that all these activities and others can be envisaged as being brought physically under the umbrella of special sites in the nucleus involved in transcription known as "transcription factories."

111

Nakken S, Rødland EA, Hovig E. Impact of DNA physical properties on local sequence bias of human mutation. Hum Mutat 2010;31:1316-25. [PMID: 20886615 DOI: 10.1002/humu.21371] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 01/26/2010] [Accepted: 08/31/2010] [Indexed: 01/07/2023]

112

Weber CC, Hurst LD. Intronic AT skew is a defendable proxy for germline transcription but does not predict crossing-over or protein evolution rates in Drosophila melanogaster. J Mol Evol 2010;71:415-26. [PMID: 20938653 DOI: 10.1007/s00239-010-9395-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 03/24/2010] [Accepted: 09/17/2010] [Indexed: 01/28/2023]

113

Wei SJ, Shi M, Chen XX, Sharkey MJ, van Achterberg C, Ye GY, He JH. New views on strand asymmetry in insect mitochondrial genomes. PLoS One 2010;5:e12708. [PMID: 20856815 PMCID: PMC2939890 DOI: 10.1371/journal.pone.0012708] [Citation(s) in RCA: 198] [Impact Index Per Article: 14.1] [Received: 10/05/2009] [Accepted: 08/20/2010] [Indexed: 01/16/2023]

114

Overlapping codes within protein-coding sequences. Genome Res 2010;20:1582-9. [PMID: 20841429 DOI: 10.1101/gr.105072.110] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.4] [Indexed: 11/25/2022]

115

Mutation biases and mutation rate variation around very short human microsatellites revealed by human-chimpanzee-orangutan genomic sequence alignments. J Mol Evol 2010;71:192-201. [PMID: 20700734 DOI: 10.1007/s00239-010-9377-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 04/12/2010] [Accepted: 07/26/2010] [Indexed: 01/21/2023]

116

Baele G, Van de Peer Y, Vansteelandt S. Modelling the ancestral sequence distribution and model frequencies in context-dependent models for primate non-coding sequences. BMC Evol Biol 2010;10:244. [PMID: 20698960 PMCID: PMC2928787 DOI: 10.1186/1471-2148-10-244] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Received: 03/30/2010] [Accepted: 08/10/2010] [Indexed: 12/04/2022]

Abstract

Background

Recent approaches for context-dependent evolutionary modelling assume that the evolution of a given site depends upon its ancestor and that ancestor's immediate flanking sites. Because such a dependency pattern cannot be imposed on the root sequence, we consider the use of different orders of Markov chains to model dependence at the ancestral root sequence. Root distributions which are coupled to the context-dependent model across the underlying phylogenetic tree are deemed more realistic than decoupled Markov chain models, as the evolutionary process is responsible for shaping the composition of the ancestral root sequence.

Results

We find strong support, in terms of Bayes Factors, for using a second-order Markov chain at the ancestral root sequence along with a context-dependent model throughout the remainder of the phylogenetic tree in an ancestral repeats dataset, and for using a first-order Markov chain at the ancestral root sequence in a pseudogene dataset. Relaxing the assumption of a single context-independent set of independent model frequencies as presented in previous work, yields a further drastic increase in model fit. We show that the substitution rates associated with the CpG-methylation-deamination process can be modelled through context-dependent model frequencies and that their accuracy depends on the (order of the) Markov chain imposed at the ancestral root sequence. In addition, we provide evidence that this approach (which assumes that root distribution and evolutionary model are decoupled) outperforms an approach inspired by the work of Arndt et al., where the root distribution is coupled to the evolutionary model. We show that the continuous-time approximation of Hwang and Green has stronger support in terms of Bayes Factors, but the parameter estimates show minimal differences.

Conclusions

We show that the combination of a dependency scheme at the ancestral root sequence and a context-dependent evolutionary model across the remainder of the tree allows for accurate estimation of the model's parameters. The different assumptions tested in this manuscript clearly show that designing accurate context-dependent models is a complex process, with many different assumptions that require validation. Further, these assumptions are shown to change across different datasets, making the search for an adequate model for a given dataset quite challenging.

117

McVicker G, Green P. Genomic signatures of germline gene expression. Genome Res 2010;20:1503-11. [PMID: 20686123 DOI: 10.1101/gr.106666.110] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Indexed: 02/04/2023]

118

Baele G, Van de Peer Y, Vansteelandt S. Using non-reversible context-dependent evolutionary models to study substitution patterns in primate non-coding sequences. J Mol Evol 2010;71:34-50. [PMID: 20623275 DOI: 10.1007/s00239-010-9362-y] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 12/08/2009] [Accepted: 05/26/2010] [Indexed: 11/28/2022]

119

Polak P, Querfurth R, Arndt PF. The evolution of transcription-associated biases of mutations across vertebrates. BMC Evol Biol 2010;10:187. [PMID: 20565875 PMCID: PMC2927911 DOI: 10.1186/1471-2148-10-187] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 07/07/2009] [Accepted: 06/18/2010] [Indexed: 02/03/2024]

Abstract

Background

The interplay between transcription and mutational processes can lead to particular mutation patterns in transcribed regions of the genome. Transcription introduces several biases in mutational patterns; in particular it invokes strand specific mutations. In order to understand the forces that have shaped transcripts during evolution, one has to study mutation patterns associated with transcription across animals.

Results

Using multiple alignments of related species we estimated the regional single-nucleotide substitution patterns along genes in four vertebrate taxa: primates, rodents, laurasiatheria and bony fishes. Our analysis is focused on intronic and intergenic regions and reveals differences in the patterns of substitution asymmetries between mammals and fishes. In mammals, the levels of asymmetries are stronger for genes starting within CpG islands than in genes lacking this property. In contrast to all other species analyzed, we found a mutational pressure in dog and stickleback, promoting an increase of GC-contents in the proximity to transcriptional start sites.

Conclusions

We propose that the asymmetric patterns in transcribed regions result from transcription-associated mutagenic processes and transcription-coupled repair, both of which seem to evolve in a taxon-related manner. We also discuss alternative mechanisms that can generate strand biases and involve error-prone DNA polymerases and reverse transcription. A localized increase of the GC content near the transcription start site is a signature of biased gene conversion (BGC) that occurs during recombination and heteroduplex formation. Since dog and stickleback are known to be subject to rapid adaptations due to population bottlenecks and breeding, we further hypothesize that an increase in recombination rates near gene starts has been part of an adaptive process.

120

Walser JC, Furano AV. The mutational spectrum of non-CpG DNA varies with CpG content. Genome Res 2010;20:875-82. [PMID: 20498119 DOI: 10.1101/gr.103283.109] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.6] [Indexed: 11/25/2022]

121

Kim H, Lee BS, Tomita M, Kanai A. Transcription-associated mutagenesis increases protein sequence diversity more effectively than does random mutagenesis in Escherichia coli. PLoS One 2010;5:e10567. [PMID: 20479947 PMCID: PMC2866735 DOI: 10.1371/journal.pone.0010567] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 10/19/2009] [Accepted: 04/19/2010] [Indexed: 01/15/2023]

Abstract

Background

During transcription, the nontranscribed DNA strand becomes single-stranded DNA (ssDNA), which can form secondary structures. Unpaired bases in the ssDNA are less protected from mutagens and hence experience more mutations than do paired bases. These mutations are called transcription-associated mutations. Transcription-associated mutagenesis is increased under stress and depends on the DNA sequence. Therefore, selection might significantly influence protein-coding sequences in terms of the transcription-associated mutability per transcription event under stress to improve the survival of Escherichia coli.

Methodology/Principal Findings

The mutability index (MI) was developed by Wright et al. to estimate the relative transcription-associated mutability of bases per transcription event. Using the most stable fold of each ssDNA that have an average length n, MI was defined as (the number of folds in which the base is unpaired)/n×(highest –ΔG of all n folds in which the base is unpaired), where ΔG is the free energy. The MI values show a significant correlation with mutation data under stress but not with spontaneous mutations in E. coli. Protein sequence diversity is preferred under stress but not under favorable conditions. Therefore, we evaluated the selection pressure on MI in terms of the protein sequence diversity for all the protein-coding sequences in E. coli. The distributions of the MI values were lower at bases that could be substituted with each of the other three bases without affecting the amino acid sequence than at bases that could not be so substituted. Start codons had lower distributions of MI values than did nonstart codons.

Conclusions/Significance

Our results suggest that the majority of protein-coding sequences have evolved to promote protein sequence diversity and to reduce gene knockout under stress. Consequently, transcription-associated mutagenesis increases protein sequence diversity more effectively than does random mutagenesis under stress. Nonrandom transcription-associated mutagenesis under stress should improve the survival of E. coli.
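
The mutability index (MI) definition quoted in this abstract can be written out as a short Python sketch. The input layout (one `(delta_g, unpaired)` pair per fold), the function name, and the convention of returning 0 for a base that is never unpaired are assumptions made here for illustration; the abstract does not specify them, and the fold energies below are hypothetical.

```python
def mutability_index(folds):
    """Compute MI for a single base from its n candidate folds.

    `folds` is a list of (delta_g, unpaired) pairs: the free energy of a fold
    and whether the base of interest is unpaired in that fold.
    MI = (number of folds with the base unpaired) / n
         * (highest -delta_g among the folds with the base unpaired)
    """
    n = len(folds)
    if n == 0:
        raise ValueError("at least one fold is required")
    neg_dg_unpaired = [-delta_g for delta_g, unpaired in folds if unpaired]
    if not neg_dg_unpaired:
        return 0.0  # assumed convention: a base that is never unpaired gets MI 0
    return len(neg_dg_unpaired) / n * max(neg_dg_unpaired)


# Hypothetical folds: the base is unpaired in 3 of 4 folds, and the most
# stable of those has delta_g = -8.0, so MI = 3/4 * 8.0.
example_mi = mutability_index([(-5.0, True), (-3.0, False), (-8.0, True), (-2.0, True)])
```

Under this reading, a base scores high when it is unpaired in many folds and at least one of those folds is very stable.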

122

Kondrashov FA, Kondrashov AS. Measurements of spontaneous rates of mutations in the recent past and the near future. Philos Trans R Soc Lond B Biol Sci 2010;365:1169-76. [PMID: 20308091 PMCID: PMC2871817 DOI: 10.1098/rstb.2009.0286] [Citation(s) in RCA: 72] [Impact Index Per Article: 5.1] [Indexed: 12/13/2022]

123

Eory L, Halligan DL, Keightley PD. Distributions of selectively constrained sites and deleterious mutation rates in the hominid and murid genomes. Mol Biol Evol 2010;27:177-92. [PMID: 19759235 DOI: 10.1093/molbev/msp219] [Citation(s) in RCA: 81] [Impact Index Per Article: 5.8] [Indexed: 12/30/2022]

Abstract

Protein-coding sequences make up only about 1% of the mammalian genome. Much of the remaining 99% has been long assumed to be junk DNA, with little or no functional significance. Here, we show that in hominids, a group with historically low effective population sizes, all classes of noncoding DNA evolve more slowly than ancestral transposable elements and so appear to be subject to significant evolutionary constraints. Under the nearly neutral theory, we expected to see lower levels of selective constraints on most sequence types in hominids than murids, a group that is thought to have a higher effective population size. We found that this is the case for many sequence types examined, the most extreme example being 5'UTRs, for which constraint in hominids is only about one-third that of murids. Surprisingly, however, we observed higher constraints for some sequence types in hominids, notably 4-fold sites, where constraint is more than twice as high as in murids. This implies that more than about one-fifth of mutations at 4-fold sites are effectively selected against in hominids. The higher constraint at 4-fold sites in hominids suggests a more complex protein-coding gene structure than murids and indicates that methods for detecting selection on protein-coding sequences (e.g., using the d(N)/d(S) ratio), with 4-fold sites as a neutral standard, may lead to biased estimates, particularly in hominids. Our constraint estimates imply that 5.4% of nucleotide sites in the human genome are subject to effective negative selection and that there are three times as many constrained sites within noncoding sequences as within protein-coding sequences. Including coding and noncoding sites, we estimate that the genomic deleterious mutation rate U = 4.2. The mutational load predicted under a multiplicative model is therefore about 99% in hominids.
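
The closing figure in this abstract is consistent with the classical multiplicative-load relation L = 1 - exp(-U). The abstract does not state the formula it used, so applying this relation is an assumption; the short Python check below only shows that U = 4.2 gives a load close to the quoted value.

```python
import math


def multiplicative_load(u):
    """Mutational load under multiplicative fitness effects, L = 1 - exp(-U).

    Assumed textbook relation; `u` is the genomic deleterious mutation rate.
    """
    return 1.0 - math.exp(-u)


# U = 4.2 is the estimate quoted in the abstract.
load = multiplicative_load(4.2)
print(f"{load:.3f}")  # prints 0.985
```

A load of roughly 0.985 rounds to the "about 99%" reported in the abstract.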

124

Ying H, Epps J, Williams R, Huttley G. Evidence that localized variation in primate sequence divergence arises from an influence of nucleosome placement on DNA repair. Mol Biol Evol 2010;27:637-49. [PMID: 19843619 PMCID: PMC2822288 DOI: 10.1093/molbev/msp253] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Indexed: 12/13/2022]

125

Geraci G, D'Elia I, del Gaudio R, Di Giaimo R. Evidence of genetic instability in tumors and normal nearby tissues. PLoS One 2010;5:e9343. [PMID: 20186333 PMCID: PMC2826410 DOI: 10.1371/journal.pone.0009343] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 07/29/2009] [Accepted: 01/18/2010] [Indexed: 11/18/2022]

Abstract

BACKGROUND

Comprehensive analyses have recently been performed on many human cancer tissues, leading to the identification of a number of mutated genes but providing no information on the variety of mutations present in each of them. This information is of interest to understand the possible origin of gene mutations that cause tumors.

METHODOLOGY/PRINCIPAL FINDINGS

We have analyzed the sequence heterogeneity of the transcripts of the human HPRT and G6PD single copy genes that are not considered tumor markers. Analyses have been performed on different colon cancers and on the nearby histologically normal tissues of two male patients. Several copies of each cDNA, which were produced by cloning the RT-PCR-amplified fragments of the specific mRNA, have been sequenced. Similar analyses have been performed on blood samples of two ostensibly healthy males as reference controls. The sequence heterogeneity of the HPRT and G6PD genes was also determined on DNA from tumor tissues. The employed analytical approach revealed the presence of low-frequency mutations not detectable by other procedures. The results show that genetic heterogeneity is detectable in HPRT and G6PD transcripts in both tumors and nearby healthy tissues of the two studied colon tumors. Similar frequencies of mutations are observed in patient genomic DNA, indicating that mutations have a somatic origin. HPRT transcripts show genetic heterogeneity also in healthy individuals, in agreement with previous results on human T-cells, while G6PD transcript heterogeneity is a characteristic of the patient tissues. Interestingly, data on TP53 show little, if any, heterogeneity in the same tissues.

CONCLUSIONS/SIGNIFICANCE

These findings show that genetic heterogeneity is a peculiarity not only of cancer cells but also of the normal tissue where a tumor arises.

126

Helleday T. Mutagenesis: mutating a gene while reading it. Curr Biol 2010;20:R57-8. [PMID: 20129038 DOI: 10.1016/j.cub.2009.11.047] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/27/2022]

127

Mugal CF, Wolf JBW, von Grünberg HH, Ellegren H. Conservation of neutral substitution rate and substitutional asymmetries in mammalian genes. Genome Biol Evol 2010;2:19-28. [PMID: 20333222 PMCID: PMC2839347 DOI: 10.1093/gbe/evp056] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Accepted: 12/22/2009] [Indexed: 12/21/2022]

128

Pink CJ, Hurst LD. Timing of replication is a determinant of neutral substitution rates but does not explain slow Y chromosome evolution in rodents. Mol Biol Evol 2009;27:1077-86. [PMID: 20026481 DOI: 10.1093/molbev/msp314] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Indexed: 11/14/2022]

129

Relative mutation rates of each nucleotide for another estimated from allele frequency spectra at human gene loci. Genet Res (Camb) 2009;91:293-303. [PMID: 19640324 DOI: 10.1017/s0016672309990164] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/07/2022]

Abstract

This study aims to comprehensively examine the mutation rates of one base for another in human gene loci. In contrast to most previous efforts based on divergence data from untranscribed regions, the present study employs the basic theory of the reversible recurrent mutation model using large-scale, high-quality re-sequencing data from public databases of gene loci. Population mutation parameters (4Nnu and 4Nmu) are obtained for each pair of base substitutions. The estimated parameters show good strand reversal symmetry, supporting the existence of mutation-drift equilibrium. Analysis of specific gene regions including mRNA, coding sequence (CDS), 5'-untranslated region (5'-UTRs), 3'-UTR and intron shows that there are clear differences in the mutation rates of each base for another depending on the location of the base in question. Results from analyses that take the adjacent bases into account exhibit excellent strand reversal symmetry, confirming that the identity of an adjacent base influences mutation rates. The CpG to TpG (or CpG to CpA) substitution is found at a rate approximately seven-fold higher than the reverse transition in intron regions due to cytosine deamination, but the effect is strongly reduced in mRNA regions and almost entirely lost in 5'-UTRs. However, from the overall increased transitions in sites other than CpGs and the proportion of CpGs in the total sequence, CpG methylation is not the main factor responsible for the increased rate of transitions as compared with transversions. In this report, after adjusting average mutation rates to the sequence compositions, no substitution bias is found between A+T and C+G, indicating base composition equilibrium in human gene loci. Population differences are also identified between groups of people of African and European descent, presumably due to past population histories. By applying the basic theory of population genetics to re-sequenced data, this study contributes new, detailed information regarding mutations in human gene regions.

130

Mutation patterns in cancer genomes. Proc Natl Acad Sci U S A 2009;106:21766-70. [PMID: 19995982 DOI: 10.1073/pnas.0912499106] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.5] [Indexed: 11/18/2022]

131

Pollard KS, Hubisz MJ, Rosenbloom KR, Siepel A. Detection of nonneutral substitution rates on mammalian phylogenies. Genome Res 2009;20:110-21. [PMID: 19858363 DOI: 10.1101/gr.097857.109] [Citation(s) in RCA: 1515] [Impact Index Per Article: 101.0] [Indexed: 11/25/2022]

132

Increased rate of human mutations where DNA and RNA polymerases collide. Trends Genet 2009;25:523-7. [PMID: 19853958 DOI: 10.1016/j.tig.2009.10.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 08/25/2009] [Revised: 10/05/2009] [Accepted: 10/05/2009] [Indexed: 12/27/2022]

133

Yap VB, Lindsay H, Easteal S, Huttley G. Estimates of the effect of natural selection on protein-coding content. Mol Biol Evol 2009;27:726-34. [PMID: 19815689 PMCID: PMC2822286 DOI: 10.1093/molbev/msp232] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.5] [Indexed: 01/19/2023] Open

134

Polak P, Arndt PF. Long-range bidirectional strand asymmetries originate at CpG islands in the human genome. Genome Biol Evol 2009;1:189-97. [PMID: 20333189 PMCID: PMC2817419 DOI: 10.1093/gbe/evp024] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Accepted: 07/22/2009] [Indexed: 12/24/2022] Open

135

Poptsova MS, Larionov SA, Ryadchenko EV, Rybalko SD, Zakharov IA, Loskutov A. Hidden chromosome symmetry: in silico transformation reveals symmetry in 2D DNA walk trajectories of 671 chromosomes. PLoS One 2009;4:e6396. [PMID: 19636424 PMCID: PMC2712679 DOI: 10.1371/journal.pone.0006396] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 04/07/2009] [Accepted: 06/23/2009] [Indexed: 11/18/2022] Open

136

Understanding what determines the frequency and pattern of human germline mutations. Nat Rev Genet 2009;10:478-88. [PMID: 19488047 DOI: 10.1038/nrg2529] [Citation(s) in RCA: 98] [Impact Index Per Article: 6.5] [Indexed: 11/08/2022]

137

Kvikstad EM, Chiaromonte F, Makova KD. Ride the wavelet: A multiscale analysis of genomic contexts flanking small insertions and deletions. Genome Res 2009;19:1153-64. [PMID: 19502380 DOI: 10.1101/gr.088922.108] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Indexed: 11/24/2022]

138

Pink CJ, Swaminathan SK, Dunham I, Rogers J, Ward A, Hurst LD. Evidence that replication-associated mutation alone does not explain between-chromosome differences in substitution rates. Genome Biol Evol 2009;1:13-22. [PMID: 20333173 PMCID: PMC2817397 DOI: 10.1093/gbe/evp001] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Accepted: 03/05/2009] [Indexed: 12/12/2022] Open

139

Duret L. Mutation patterns in the human genome: more variable than expected. PLoS Biol 2009;7:e1000028. [PMID: 19192948 PMCID: PMC2634789 DOI: 10.1371/journal.pbio.1000028] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.4] [Indexed: 12/04/2022] Open

140

Hodgkinson A, Ladoukakis E, Eyre-Walker A. Cryptic variation in the human mutation rate. PLoS Biol 2009;7:e1000027. [PMID: 19192947 PMCID: PMC2634788 DOI: 10.1371/journal.pbio.1000027] [Citation(s) in RCA: 96] [Impact Index Per Article: 6.4] [Received: 07/08/2008] [Accepted: 12/12/2008] [Indexed: 11/18/2022] Open

141

Sasaki S, Mello CC, Shimada A, Nakatani Y, Hashimoto SI, Ogawa M, Matsushima K, Gu SG, Kasahara M, Ahsan B, Sasaki A, Saito T, Suzuki Y, Sugano S, Kohara Y, Takeda H, Fire A, Morishita S. Chromatin-associated periodicity in genetic variation downstream of transcriptional start sites. Science 2009;323:401-4. [PMID: 19074313 PMCID: PMC2757552 DOI: 10.1126/science.1163183] [Citation(s) in RCA: 108] [Impact Index Per Article: 7.2] [Indexed: 11/02/2022]

142

Necsulea A, Guillet C, Cadoret JC, Prioleau MN, Duret L. The relationship between DNA replication and human genome organization. Mol Biol Evol 2009;26:729-41. [PMID: 19126867 DOI: 10.1093/molbev/msn303] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Indexed: 12/23/2022] Open

143

Lindsay H, Yap VB, Ying H, Huttley GA. Pitfalls of the most commonly used models of context dependent substitution. Biol Direct 2008;3:52. [PMID: 19087239 PMCID: PMC2628887 DOI: 10.1186/1745-6150-3-52] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 12/09/2008] [Accepted: 12/16/2008] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Neighboring nucleotides exert a striking influence on mutation, with the hypermutability of CpG dinucleotides in many genomes being an exemplar. Among the approaches employed to measure the relative importance of sequence neighbors on molecular evolution have been continuous-time Markov process models for substitutions that treat sequences as a series of independent tuples. The most widely used examples are the codon substitution models. We evaluated the suitability of derivatives of the nucleotide frequency weighted (hereafter NF) and tuple frequency weighted (hereafter TF) models for measuring sequence context dependent substitution. Critical properties we address are their relationships to an independent nucleotide process and the robustness of parameter estimation to changes in sequence composition. We then consider the impact on inference concerning dinucleotide substitution processes from application of these two forms to intron sequence alignments from primates.

RESULTS

We prove that the NF form always nests the independent nucleotide process and that this is not true for the TF form. As a consequence, using TF to study context effects can be misleading, which is shown by both theoretical calculations and simulations. We describe a simple example where a context parameter estimated under TF is confounded with composition terms unless all sequence states are equi-frequent. We illustrate this for the dinucleotide case by simulation under a nucleotide model, showing that the TF form identifies a CpG effect when none exists. Our analysis of primate introns revealed that the effect of nucleotide neighbors is over-estimated under TF compared with NF. Parameter estimates for a number of contexts are also strikingly discordant between the two model forms.

CONCLUSION

Our results establish that the NF form should be used for analysis of independent-tuple context dependent processes. Although neighboring effects in general are still important, prominent influences such as the elevated CpG transversion rate previously identified using the TF form are an artifact. Our results further suggest as few as 5 parameters may account for approximately 85% of neighboring nucleotide influence.

144

Mugal CF, von Grünberg HH, Peifer M. Transcription-induced mutational strand bias and its effect on substitution rates in human genes. Mol Biol Evol 2008;26:131-42. [PMID: 18974087 DOI: 10.1093/molbev/msn245] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.6] [Indexed: 12/13/2022] Open

145

Marín A, Xia X. GC skew in protein-coding genes between the leading and lagging strands in bacterial genomes: New substitution models incorporating strand bias. J Theor Biol 2008;253:508-13. [DOI: 10.1016/j.jtbi.2008.04.004] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.1] [Received: 11/05/2007] [Revised: 02/29/2008] [Accepted: 04/04/2008] [Indexed: 10/22/2022]

146

Transcription induces strand-specific mutations at the 5' end of human genes. Genome Res 2008;18:1216-23. [PMID: 18463301 DOI: 10.1101/gr.076570.108] [Citation(s) in RCA: 71] [Impact Index Per Article: 4.4] [Indexed: 11/25/2022]

147

RNA landscape of evolution for optimal exon and intron discrimination. Proc Natl Acad Sci U S A 2008;105:5797-802. [PMID: 18391195 DOI: 10.1073/pnas.0801692105] [Citation(s) in RCA: 86] [Impact Index Per Article: 5.4] [Indexed: 11/18/2022] Open

148

Aguilera A, Gómez-González B. Genome instability: a mechanistic view of its causes and consequences. Nat Rev Genet 2008;9:204-17. [PMID: 18227811 DOI: 10.1038/nrg2268] [Citation(s) in RCA: 555] [Impact Index Per Article: 34.7] [Indexed: 11/09/2022]

149

Krishnan NM, Seligmann H, Rao BJ. Relationship between mRNA secondary structure and sequence variability in Chloroplast genes: possible life history implications. BMC Genomics 2008;9:48. [PMID: 18226235 PMCID: PMC2276208 DOI: 10.1186/1471-2164-9-48] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 08/16/2007] [Accepted: 01/28/2008] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Synonymous sites are freer to vary because of redundancy in the genetic code. Messenger RNA secondary structure restricts this freedom, as revealed by previous findings in mitochondrial genes that mutations at third codon position nucleotides in helices are more strongly selected against than those in loops. This motivated us to explore the constraints imposed by mRNA secondary structure on evolutionary variability at all codon positions in chloroplast systems.

RESULTS

We found that the evolutionary variability and intrinsic secondary structure stability of these sequences share an inverse relationship. Simulations of most likely single nucleotide evolution in Psilotum nudum and Nephroselmis olivacea mRNAs indicate that helix-forming propensities of mutated mRNAs are greater than those of the natural mRNAs for short sequences and vice versa for long sequences. Moreover, helix-forming propensity estimated by the percentage of total mRNA in helices increases gradually with mRNA length, saturating beyond 1000 nucleotides. Protection levels of functionally important sites vary across plants and proteins: r-strategists minimize mutation costs in large genes; K-strategists do the opposite.

CONCLUSION

mRNA length presumably predisposes shorter mRNAs to evolve under different constraints than longer mRNAs. The positive correlation between secondary structure protection and functional importance of sites suggests that some sites might be conserved due to packing-protection constraints at the nucleic acid level in addition to protein level constraints. Consequently, nucleic acid secondary structure a priori biases mutations. The converse (exposure of conserved sites) apparently occurs in a smaller number of cases, indicating a different evolutionary adaptive strategy in these plants. The differences between the protection levels of functionally important sites for r- and K-strategists reflect their respective molecular adaptive strategies. These converge with increasing domestication levels of K-strategists, perhaps because domestication increases reproductive output.

150

Evans KJ. Genomic DNA from animals shows contrasting strand bias in large and small subsequences. BMC Genomics 2008;9:43. [PMID: 18221531 PMCID: PMC2267173 DOI: 10.1186/1471-2164-9-43] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 03/07/2007] [Accepted: 01/25/2008] [Indexed: 01/09/2023] Open

Abstract

Background

For eukaryotes, there is almost no strand bias with regard to base composition, with exceptions for origins of replication and transcription start sites and transcribed regions. This paper revisits the question for subsequences of DNA taken at random from the genome.

Results

For a typical mammal, for example mouse or human, there is a small strand bias throughout the genomic DNA: there is a correlation between (G - C) and (A - T) on the same strand (that is, between the difference in the number of guanine and cytosine bases and the difference in the number of adenine and thymine bases). For small subsequences – up to 1 kb – this correlation is weak but positive; but for large windows – around 50 kb to 2 Mb – the correlation is strong and negative. This effect is largely independent of GC%. Transcribed and untranscribed regions give similar correlations both for small and large subsequences, but there is a difference in these regions for intermediate sized subsequences. An analysis of the human genome showed that position within the isochore structure did not affect these correlations. An analysis of available genomes of different species shows that this contrast between large and small windows is a general feature of mammals and birds. Further down the evolutionary tree, other organisms show a similar but smaller effect. Except for the nematode, all the animals analysed showed at least a small effect.

Conclusion

The correlations on the large scale may be explained by DNA replication. Transcription may be a modifier of these effects but is not the fundamental cause. These results cast light on how DNA mutations affect the genome over evolutionary time. At least for vertebrates, there is a broad relationship between body temperature and the size of the correlation. The genome of mammals and birds has a structure marked by strand bias segments.
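
The quantity this abstract reports, the correlation between (G - C) and (A - T) across windows of one strand, can be sketched roughly as follows. This is only an illustration and not the author's code: the function names `skew_pairs` and `pearson`, the use of non-overlapping windows, the choice of the Pearson coefficient, and the toy sequence are all assumptions made here.

```python
from math import sqrt


def skew_pairs(seq, window):
    """Return (G - C, A - T) count differences for consecutive,
    non-overlapping windows of the given size (hypothetical helper)."""
    seq = seq.upper()
    pairs = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        gc_diff = chunk.count("G") - chunk.count("C")
        at_diff = chunk.count("A") - chunk.count("T")
        pairs.append((gc_diff, at_diff))
    return pairs


def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two windows")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Toy sequence, for illustration only (not real genomic data).
    toy = "GGAA" + "CCTT" + "GGGA" + "CTTT"
    print(pearson(skew_pairs(toy, window=4)))
```

Repeating the calculation with different `window` values on a real sequence would be one way to compare small and large subsequences, although the abstract does not state how windows were chosen.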
