1
Frequent nonallelic gene conversion on the human lineage and its effect on the divergence of gene duplicates. Proc Natl Acad Sci U S A 2017; 114:12779-12784. [PMID: 29138319 DOI: 10.1073/pnas.1708151114] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Indexed: 12/18/2022] Open
Abstract
Gene conversion is the copying of a genetic sequence from a "donor" region to an "acceptor." In nonallelic gene conversion (NAGC), the donor and the acceptor are at distinct genetic loci. Despite the role NAGC plays in various genetic diseases and the concerted evolution of gene families, the parameters that govern NAGC are not well characterized. Here, we survey duplicate gene families and identify converted tracts in 46% of them. These conversions reflect a large GC bias of NAGC. We develop a sequence evolution model that leverages substantially more information in duplicate sequences than used by previous methods and use it to estimate the parameters that govern NAGC in humans: a mean converted tract length of 250 bp and a probability of [Formula: see text] per generation for a nucleotide to be converted (an order of magnitude higher than the point mutation rate). Despite this high baseline rate, we show that NAGC slows down as duplicate sequences diverge, until an eventual "escape" of the sequences from its influence. As a result, NAGC has a small average effect on the sequence divergence of duplicates. This work improves our understanding of the NAGC mechanism and the role that it plays in the evolution of gene duplicates.
2
Ji X, Griffing A, Thorne JL. A Phylogenetic Approach Finds Abundant Interlocus Gene Conversion in Yeast. Mol Biol Evol 2016; 33:2469-76. [PMID: 27297467 DOI: 10.1093/molbev/msw114] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 11/12/2022] Open
Abstract
Interlocus gene conversion (IGC) homogenizes repeats. While genomes can be repeat-rich, the evolutionary importance of IGC is poorly understood. Additional statistical tools for characterizing it are needed. We propose a composite likelihood strategy for incorporating IGC into widely-used probabilistic models for sequence changes that originate with point mutation. We estimated the percentage of nucleotide substitutions that originate with an IGC event rather than a point mutation in 14 groups of yeast ribosomal protein-coding genes, and found values ranging from 20% to 38%. We designed and applied a procedure to determine whether these percentages are inflated due to artifacts arising from model misspecification. The results of this procedure are consistent with IGC having had an important role in the evolution of each of these 14 gene families. We further investigate the properties of our IGC approach via simulation. In contrast to usual practice, our findings suggest that the IGC should and can be considered when multigene family evolution is investigated.
Affiliation(s)
- Xiang Ji: Bioinformatics Research Center, North Carolina State University; Department of Statistics, North Carolina State University
- Alexander Griffing: Bioinformatics Research Center, North Carolina State University; Department of Biological Sciences, North Carolina State University
- Jeffrey L Thorne: Bioinformatics Research Center, North Carolina State University; Department of Statistics, North Carolina State University; Department of Biological Sciences, North Carolina State University
3
Dumont BL. Interlocus gene conversion explains at least 2.7% of single nucleotide variants in human segmental duplications. BMC Genomics 2015; 16:456. [PMID: 26077037 PMCID: PMC4467073 DOI: 10.1186/s12864-015-1681-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 02/24/2015] [Accepted: 06/01/2015] [Indexed: 01/24/2023] Open
Abstract
Background: Interlocus gene conversion (IGC) is a recombination-based mechanism that results in the unidirectional transfer of short stretches of sequence between paralogous loci. Although IGC is a well-established mechanism of human disease, the extent to which this mutagenic process has shaped overall patterns of segregating variation in multi-copy regions of the human genome remains unknown. One expected manifestation of IGC in population genomic data is the presence of one-to-one paralogous SNPs that segregate identical alleles. Results: Here, I use SNP genotype calls from the low-coverage phase 3 release of the 1000 Genomes Project to identify 15,790 parallel, shared SNPs in duplicated regions of the human genome. My approach for identifying these sites accounts for the potential redundancy of short read mapping in multi-copy genomic regions, thereby effectively eliminating false positive SNP calls arising from paralogous sequence variation. I demonstrate that independent mutation events to identical nucleotides at paralogous sites are not a significant source of shared polymorphisms in the human genome, consistent with the interpretation that these sites are the outcome of historical IGC events. These putative signals of IGC are enriched in genomic contexts previously associated with non-allelic homologous recombination, including clear signals in gene families that form tandem intra-chromosomal clusters. Conclusions: Taken together, my analyses implicate IGC, not point mutation, as the mechanism generating at least 2.7% of single nucleotide variants in duplicated regions of the human genome.
Affiliation(s)
- Beth L Dumont: Initiative in Biological Complexity, North Carolina State University, 112 Derieux Place, 3510 Thomas Hall, Campus Box 7614, Raleigh, NC, 27695-7614, USA.
4
Piscopo SP, Drouin G. [High gene conversion frequency between genes encoding 2-deoxyglucose-6-phosphate phosphatase in 3 Saccharomyces species]. Genome 2014; 57:303-8. [PMID: 25188289 DOI: 10.1139/gen-2014-0068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
Gene conversions are nonreciprocal sequence exchanges between genes. They are relatively common in Saccharomyces cerevisiae, but few studies have investigated the evolutionary fate of gene conversions or their functional impacts. Here, we analyze the evolution and impact of gene conversions between the two genes encoding 2-deoxyglucose-6-phosphate phosphatase in S. cerevisiae, Saccharomyces paradoxus and Saccharomyces mikatae. Our results demonstrate that the last half of these genes is subject to gene conversions among these three species. The greater similarity and the greater percentage of GC nucleotides in the converted regions, as well as the absence of long regions of adjacent common converted sites, suggest that these gene conversions are frequent and occur independently in all three species. The high frequency of these conversions probably results from the fact that they have little impact on the protein sequences encoded by these genes.
Affiliation(s)
- Sara-Pier Piscopo: Département de biologie et Centre de recherche avancée en génomique environnementale, Université d'Ottawa, 30 Marie Curie, Ottawa, ON K1N 6N5, Canada
5
Mussotter T, Bengesser K, Högel J, Cooper DN, Kehrer-Sawatzki H. Population-specific differences in gene conversion patterns between human SUZ12 and SUZ12P are indicative of the dynamic nature of interparalog gene conversion. Hum Genet 2014; 133:383-401. [PMID: 24385046 DOI: 10.1007/s00439-013-1410-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 10/04/2013] [Accepted: 12/08/2013] [Indexed: 11/29/2022]
Abstract
Nonallelic homologous gene conversion (NAHGC) resulting from interparalog recombination without crossover represents an important influence on the evolution of duplicated sequences in the human genome. In 17q11.2, different paralogous sequences mediate large NF1 deletions by nonallelic homologous recombination with crossover (NAHR). Among these paralogs are SUZ12 and its pseudogene SUZ12P which harbour the breakpoints of type-2 (1.2-Mb) NF1 deletions. Such deletions are caused predominantly by mitotic NAHR since somatic mosaicism with normal cells is evident in most patients. Investigating whether SUZ12 and SUZ12P have also been involved in NAHGC, we observed gene conversion tracts between these paralogs in both Africans (AFR) and Europeans (EUR). Since germline type-2 NF1 deletions resulting from meiotic NAHR are very rare, the vast majority of the gene conversion tracts in SUZ12 and SUZ12P are likely to have resulted from mitotic recombination during premeiotic cell divisions of germ cells. A higher number of gene conversion tracts were noted within SUZ12 and SUZ12P in AFR as compared to EUR. Further, the distinctive signature of NAHGC (a high number of SNPs per paralog and a high number of shared SNPs between paralogs), a characteristic of many actively recombining paralogs, was observed in both SUZ12 and SUZ12P but only in AFR and not in EUR. A novel polymorphic 2.3-kb deletion in SUZ12P was identified which exhibited a high allele frequency in EUR. We postulate that this interparalog structural difference, together with low allelic recombination rates, could have caused a reduction in NAHGC between SUZ12 and SUZ12P during human evolution.
Affiliation(s)
- Tanja Mussotter: Institute of Human Genetics, University of Ulm, Albert-Einstein-Allee 11, 89081, Ulm, Germany
6
Dumont BL, Eichler EE. Signals of historical interlocus gene conversion in human segmental duplications. PLoS One 2013; 8:e75949. [PMID: 24124524 PMCID: PMC3790853 DOI: 10.1371/journal.pone.0075949] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 06/11/2013] [Accepted: 08/17/2013] [Indexed: 12/04/2022] Open
Abstract
Standard methods of DNA sequence analysis assume that sequences evolve independently, yet this assumption may not be appropriate for segmental duplications that exchange variants via interlocus gene conversion (IGC). Here, we use high quality multiple sequence alignments from well-annotated segmental duplications to systematically identify IGC signals in the human reference genome. Our analysis combines two complementary methods: (i) a paralog quartet method that uses DNA sequence simulations to identify a statistical excess of sites consistent with inter-paralog exchange, and (ii) the alignment-based method implemented in the GENECONV program. One-quarter (25.4%) of the paralog families in our analysis harbor clear IGC signals by the quartet approach. Using GENECONV, we identify 1477 gene conversion tracks that cumulatively span 1.54 Mb of the genome. Our analyses confirm the previously reported high rates of IGC in subtelomeric regions and Y-chromosome palindromes, and identify multiple novel IGC hotspots, including the pregnancy specific glycoproteins and the neuroblastoma breakpoint gene families. Although the duplication history of a paralog family is described by a single tree, we show that IGC has introduced incredible site-to-site variation in the evolutionary relationships among paralogs in the human genome. Our findings indicate that IGC has left significant footprints in patterns of sequence diversity across segmental duplications in the human genome, out-pacing the contributions of single base mutation by orders of magnitude. Collectively, the IGC signals we report comprise a catalog that will provide a critical reference for interpreting observed patterns of DNA sequence variation across duplicated genomic regions, including targets of recent adaptive evolution in humans.
Affiliation(s)
- Beth L. Dumont: Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Evan E. Eichler: Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America; Howard Hughes Medical Institute, Seattle, Washington, United States of America
7
Fawcett JA, Innan H. The role of gene conversion in preserving rearrangement hotspots in the human genome. Trends Genet 2013; 29:561-8. [PMID: 23953668 DOI: 10.1016/j.tig.2013.07.002] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 04/18/2013] [Revised: 06/20/2013] [Accepted: 07/08/2013] [Indexed: 11/27/2022]
Abstract
Hotspots of non-allelic homologous recombination (NAHR) have a crucial role in creating genetic diversity and are also associated with dozens of genomic disorders. Recent studies suggest that many human NAHR hotspots have been preserved throughout the evolution of primates. NAHR hotspots are likely to remain active as long as the segmental duplications (SDs) promoting NAHR retain sufficient similarity. Here, we propose an evolutionary model of SDs that incorporates the effect of gene conversion and compare it with a null model that assumes SDs evolve independently without gene conversion. The gene conversion model predicts a much longer lifespan of NAHR hotspots compared with the null model. We show that the literature on copy number variants (CNVs) and genomic disorders, and also the results of additional analysis of CNVs, are all more consistent with the gene conversion model.
Affiliation(s)
- Jeffrey A Fawcett: Graduate University for Advanced Studies, Hayama, Kanagawa 240-0193, Japan
8
Rapid and accurate large-scale genotyping of duplicated genes and discovery of interlocus gene conversions. Nat Methods 2013; 10:903-9. [PMID: 23892896 PMCID: PMC3985568 DOI: 10.1038/nmeth.2572] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 03/04/2013] [Accepted: 06/06/2013] [Indexed: 01/17/2023]
Abstract
Over 900 genes have been annotated within duplicated regions of the human genome, yet their functions and potential roles in disease remain largely unknown. One major obstacle has been the inability to accurately and comprehensively assay genetic variation for these genes in a high-throughput manner. We developed a sequencing-based method for rapid and high-throughput genotyping of duplicated genes using molecular inversion probes designed to target unique paralogous sequence variants. We applied this method to genotype all members of two gene families, SRGAP2 and RH, among a diversity panel of 1,056 humans. The approach could accurately distinguish copy number in paralogs having up to ∼99.6% sequence identity, identify small gene-disruptive deletions, detect single-nucleotide variants, define breakpoints of unequal crossover and discover regions of interlocus gene conversion. The ability to rapidly and accurately genotype multiple gene families in thousands of individuals at low cost enables the development of genome-wide gene conversion maps and 'unlocks' many previously inaccessible duplicated genes for association with human traits.
9
Abstract
The flow of genes between different species represents a form of genetic variation whose implications have not been fully appreciated. Here I examine some key findings on the extent of horizontal gene transfer (HGT) revealed by comparative genome analysis and their theoretical implications. In theoretical terms, HGT affects ideas pertaining to the tree of life, the notion of a last universal common ancestor, and the biological unities, as well as the rules of taxonomic nomenclature. This review discusses the emergence of the eukaryotic cell and the occurrence of HGT among metazoan phyla involving both transposable elements and structural genes for normal housekeeping functions. I also discuss the bacterial pangenome, which provides an important case study on the permeability of species boundaries. An interesting observation about bdelloid rotifers and their reversion to asexual reproduction as it pertains to HGT is included.
Affiliation(s)
- Michael Syvanen: Department of Microbiology and Immunology, School of Medicine, University of California, Davis, California 95616, USA.
10
Aleshin A, Zhi D. Recombination-associated sequence homogenization of neighboring Alu elements: signature of nonallelic gene conversion. Mol Biol Evol 2010; 27:2300-11. [PMID: 20453015 DOI: 10.1093/molbev/msq116] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 12/13/2022] Open
Abstract
Recently, researchers have begun to recognize that, in order to establish neutral models for disease association and evolutionary genomics studies, it is crucial to have a clear understanding of the genomic impact of nonallelic gene conversion. Drawing on previous successes in characterizing this phenomenon over protein-coding gene families, we undertook a computational analysis of neighboring Alu sequences at the genome scale. For this purpose, we developed adjusted comutation rate (aCMR), a novel statistical method measuring the excess number of identical point mutations shared by adjacent Alu sequences, vis-à-vis random pairs. Using aCMR, we uncovered a remarkable genome-wide sequence homogenization of neighboring Alus, with the strongest signal observed in the pseudoautosomal regions of the X and Y chromosomes. The magnitude of sequence homogenization between Alu pairs is greater with shorter interlocus distance, higher sequence identity, and parallel orientation. Moreover, shared substitutions show a strong directionality toward GC nucleotides, with multiple substitutions tending to cluster within the Alu sequence. Taken together, these observed recombination-associated sequence homogenization patterns are best explained by frequent, ubiquitous gene conversion events between neighboring Alus. We believe that these observations help to illuminate the nature and impact of the enigmatic phenomenon of gene conversion.
Affiliation(s)
- Alexey Aleshin: Department of Medicine, Division of Hematology, Oncology, David Geffen School of Medicine, University of California, Los Angeles, USA
11
Ezawa K, Ikeo K, Gojobori T, Saitou N. Evolutionary Pattern of Gene Homogenization between Primate-Specific Paralogs after Human and Macaque Speciation Using the 4-2-4 Method. Mol Biol Evol 2010; 27:2152-71. [DOI: 10.1093/molbev/msq109] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 11/12/2022] Open
12
Abstract
Nonallelic gene conversion has been proposed as a major force in homogenizing the sequences of paralogous genes. In this work, we investigate the extent and characteristics of gene conversion among gene families in nine species of the genus Drosophila. We carried out a genome-wide study of 2855 gene families (including 17,742 genes) and determined that conversion events involved 2628 genes. The proportion of converted genes ranged across species from 1 to 9% when paralogs of all ages were included. Although higher levels of gene conversion were found among young gene duplicates, at most 1-2% of the coding sequences of these duplicates were affected by conversion. Using a second approach relying on gene family size changes and gene-tree/species-tree reconciliation methods, we estimate that only 1-15% of gene trees are misled by gene conversion, depending on the lineage considered. Several features of paralogous genes correlate with gene conversion, such as intra-/interchromosomal location, level of nucleotide divergence, and GC content, although we found no definitive evidence for biased substitution patterns. After considering species-specific differences in the age and distance between paralogs, we found a highly significant difference in the amount of gene conversion among species. In particular, members of the melanogaster group showed the lowest proportion of converted genes. Our data therefore suggest underlying differences in the mechanistic basis of gene conversion among species.
13
Minimal effect of ectopic gene conversion among recent duplicates in four mammalian genomes. Genetics 2009; 182:615-22. [PMID: 19307604 DOI: 10.1534/genetics.109.101428] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.8] [Indexed: 11/18/2022] Open
Abstract
Gene conversion between duplicated genes has been implicated in homogenization of gene families and reassortment of variation among paralogs. If conversion is common, this process could lead to errors in gene tree inference and subsequent overestimation of rates of gene duplication. After performing simulations to assess our power to detect gene conversion events, we determined rates of conversion among young, lineage-specific gene duplicates in four mammal species: human, rhesus macaque, mouse, and rat. Gene conversion rates (number of conversion events/number of gene pairs) among young duplicates range from 8.3% in macaque to 18.96% in rat, including a 5% false-positive rate. For all lineages, only 1-3% of the total amount of sequence examined was converted. There is no increase in GC content in conversion tracts compared to flanking regions of the same genes nor in conversion tracts compared to the same region in nonconverted gene-family members, suggesting that ectopic gene conversion does not significantly alter nucleotide composition in these duplicates. While the majority of gene duplicate pairs reside on different chromosomes in mammalian genomes, the majority of gene conversion events occur between duplicates on the same chromosome, even after controlling for divergence between duplicates. Among intrachromosomal duplicates, however, there is no correlation between the probability of conversion and physical distance between duplicates after controlling for divergence. Finally, we use a novel method to show that at most 5-10% of all gene trees involving young duplicates are likely to be incorrect due to gene conversion. We conclude that gene conversion has had only a small effect on mammalian genomes and gene duplicate evolution in general.
14
Hsu CH, Zhang Y, Hardison R, Miller W. Whole-Genome Analysis of Gene Conversion Events. Comparative Genomics 2009. [DOI: 10.1007/978-3-642-04744-2_15] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 01/30/2023]
15
Rudd MK, Endicott RM, Friedman C, Walker M, Young JM, Osoegawa K, de Jong PJ, Green ED, Trask BJ. Comparative sequence analysis of primate subtelomeres originating from a chromosome fission event. Genome Res 2008; 19:33-41. [PMID: 18952852 DOI: 10.1101/gr.083170.108] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Indexed: 01/17/2023]
Abstract
Subtelomeres are concentrations of interchromosomal segmental duplications capped by telomeric repeats at the ends of chromosomes. The nature of the segments shared by different sets of human subtelomeres reflects their high rate of recent interchromosomal exchange. Here, we characterize the rearrangements incurred by the 15q subtelomere after it arose from a chromosome fission event in the common ancestor of great apes. We used FISH, sequencing of genomic clones, and PCR to map the breakpoint of this fission and track the fate of flanking sequence in human, chimpanzee, gorilla, orangutan, and macaque genomes. The ancestral locus, a cluster of olfactory receptor (OR) genes, lies internally on macaque chromosome 7. Sequence originating from this fission site is split between the terminus of 15q and the pericentromere of 14q in the great apes. Numerous structural rearrangements, including interstitial deletions and transfers of material to or from other subtelomeres, occurred subsequent to the fission, such that each species has a unique 15q structure and unique collection of ORs derived from the fission locus. The most striking rearrangement involved transfer of at least 200 kb from the fission-site region to the end of chromosome 4q, where much still resides in chimpanzee and gorilla, but not in human. This gross structural difference places the subtelomeric defect underlying facioscapulohumeral muscular dystrophy (FSHD) much closer to the telomere in human 4q than in the hybrid 4q-15q subtelomere of chimpanzee.
Affiliation(s)
- M Katharine Rudd: Division of Human Biology, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA
16
Ectopic gene conversions in the human genome. Genomics 2008; 93:27-32. [PMID: 18848875 DOI: 10.1016/j.ygeno.2008.09.007] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.7] [Received: 06/25/2008] [Revised: 08/23/2008] [Accepted: 09/18/2008] [Indexed: 11/22/2022]
Abstract
We used the GENECONV method to characterize the gene conversions that occurred amongst the 1434 protein coding human gene families with three or more genes. Conversions occur at a frequency of 0.88% (483 conversion events/55,050 gene pairs compared) and have an average length of 371 ± 752 bp (± standard deviation). Both the size and the frequency of conversions are positively correlated with the similarity of the sequences involved in these conversions. The frequency of conversions and the local recombination rate are also positively correlated. Intrachromosomal conversions are almost 5 times more frequent than interchromosomal conversions and the frequency of intrachromosomal conversions increases as the distance between genes decreases. However, the higher frequency of conversions between nearby genes with the same transcriptional orientation is due to the fact that most functional duplicated genes are found next to one another and in the same transcriptional orientation. The average length of a conversion spanning only an intron region is significantly smaller than conversions spanning both exons and introns or only exons. This suggests that the smaller degree of sequence similarity of introns limits the size of conversions between duplicated human genes. The significant excess of conversions at the 3'-end of human genes suggests that incomplete cDNA molecules are often involved in conversions with chromosomal gene copies.
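The conversion frequency quoted in this abstract can be checked directly from the two counts it reports. The short Python sketch below only reproduces that arithmetic; the variable names are illustrative and do not come from the paper.

```python
# Counts taken from the abstract above; variable names are illustrative only.
conversion_events = 483
gene_pairs_compared = 55_050

# Frequency = conversion events per gene pair compared, expressed as a percentage.
frequency_percent = round(conversion_events / gene_pairs_compared * 100, 2)

print(f"{frequency_percent}%")  # prints "0.88%"
```

The unrounded value is about 0.877%, which rounds to the 0.88% stated in the abstract.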
17
Münch C, Kirsch S, Fernandes AMG, Schempp W. Evolutionary analysis of the highly dynamic CHEK2 duplicon in anthropoids. BMC Evol Biol 2008; 8:269. [PMID: 18831734 PMCID: PMC2566985 DOI: 10.1186/1471-2148-8-269] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 05/06/2008] [Accepted: 10/02/2008] [Indexed: 12/03/2022] Open
Abstract
Background: Segmental duplications (SDs) are euchromatic portions of genomic DNA (≥ 1 kb) that occur at more than one site within the genome, and typically share a high level of sequence identity (>90%). Approximately 5% of the human genome is composed of such duplicated sequences. Here we report the detailed investigation of CHEK2 duplications. CHEK2 is a multiorgan cancer susceptibility gene encoding a cell cycle checkpoint kinase acting in the DNA-damage response signalling pathway. The continuous presence of the CHEK2 gene in all eukaryotes and its important role in maintaining genome stability prompted us to investigate the duplicative evolution and phylogeny of CHEK2 and its paralogs during anthropoid evolution. Results: To study CHEK2 duplicon evolution in anthropoids we applied a combination of comparative FISH and in silico analyses. Our comparative FISH results with a CHEK2 fosmid probe revealed the single-copy status of CHEK2 in New World monkeys, Old World monkeys and gibbons. Whereas a single CHEK2 duplication was detected in orangutan, a multi-site signal pattern indicated a burst of duplication in African great apes and human. Phylogenetic analysis of paralogous and ancestral CHEK2 sequences in human, chimpanzee and rhesus macaque confirmed this burst of duplication, which occurred after the radiation of orangutan and African great apes. In addition, we used inter-species quantitative PCR to determine CHEK2 copy numbers. An amplification of CHEK2 was detected in African great apes and the highest CHEK2 copy number of all analysed species was observed in the human genome. Furthermore, we detected variation in CHEK2 copy numbers within the analysed set of human samples. Conclusion: Our detailed analysis revealed the highly dynamic nature of CHEK2 duplication during anthropoid evolution. We determined a burst of CHEK2 duplication after the radiation of orangutan and African great apes and identified the highest CHEK2 copy number in human. In conclusion, our analysis of CHEK2 duplicon evolution revealed that SDs contribute to inter-species variation. Furthermore, our qPCR analysis led us to presume CHEK2 copy number variation in human, and molecular diagnostics of the cancer susceptibility gene CHEK2 inside the duplicated region might be hampered by the individual-specific set of duplicons.
Affiliation(s)
- Claudia Münch: Institute of Human Genetics and Anthropology, University of Freiburg, Breisacher Str. 33, 79106 Freiburg, Germany
- Stefan Kirsch: Institute of Human Genetics and Anthropology, University of Freiburg, Breisacher Str. 33, 79106 Freiburg, Germany
- António MG Fernandes: Institute of Human Genetics and Anthropology, University of Freiburg, Breisacher Str. 33, 79106 Freiburg, Germany
- Werner Schempp: Institute of Human Genetics and Anthropology, University of Freiburg, Breisacher Str. 33, 79106 Freiburg, Germany
18
Kirsch S, Pasantes J, Wolf A, Bogdanova N, Münch C, Pennekamp P, Krawczak M, Dworniczak B, Schempp W. Chromosomal evolution of the PKD1 gene family in primates. BMC Evol Biol 2008; 8:263. [PMID: 18822117 PMCID: PMC2564946 DOI: 10.1186/1471-2148-8-263] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 12/11/2007] [Accepted: 09/26/2008] [Indexed: 11/10/2022] Open
Abstract
Background: Autosomal dominant polycystic kidney disease (ADPKD) is mostly caused by mutations in the PKD1 (polycystic kidney disease 1) gene located in 16p13.3. Moreover, there are six pseudogenes of PKD1 that are located proximal to the master gene in 16p13.1. In contrast, no pseudogene could be detected in the mouse genome, only a single copy gene on chromosome 17. The question arises how the human situation originated phylogenetically. To address this question we applied comparative FISH-mapping of a human PKD1-containing genomic BAC clone and a PKD1-cDNA clone to chromosomes of a variety of primate species and the dog as a non-primate outgroup species. Results: Comparative FISH with the PKD1-cDNA clone clearly shows that in all primate species studied distinct single signals map in subtelomeric chromosomal positions orthologous to the short arm of human chromosome 16 harbouring the master PKD1 gene. Only in human and African great apes, but not in orangutan, FISH with both BAC and cDNA clones reveals additional signal clusters located proximal to and clearly separated from the PKD1 master genes, indicating the chromosomal position of PKD1 pseudogenes in 16p of these species, respectively. Indeed, this is in accordance with sequencing data in human, chimpanzee and orangutan. Apart from the master PKD1 gene, six pseudogenes are identified in both human and chimpanzee, while only a single-copy gene is present in the whole-genome sequence of orangutan. The phylogenetic reconstruction of the PKD1-tree reveals that all human pseudogenes are closely related to the human PKD1 gene, and all chimpanzee pseudogenes are closely related to the chimpanzee PKD1 gene. However, our statistical analyses provide strong indication that gene conversion events may have occurred within the PKD1 family members of human and chimpanzee, respectively. Conclusion: PKD1 must have undergone amplification very recently in hominid evolution. Duplicative transposition of the PKD1 gene and further amplification and evolution of the PKD1 pseudogenes may have arisen in a common ancestor of Homo, Pan and Gorilla approximately 8 MYA. Reticulate evolutionary processes such as gene conversion and non-allelic homologous recombination (NAHR) may have resulted in concerted evolution of PKD1 family members in human and chimpanzee and, thus, simulate an independent evolution of the PKD1 pseudogenes from their master PKD1 genes in human and chimpanzee.
Affiliation(s)
- Stefan Kirsch
  - Institut für Humangenetik und Anthropologie, Universität Freiburg, Breisacher Str. 33, 79106 Freiburg, Germany
- Juanjo Pasantes
  - Institut für Humangenetik und Anthropologie, Universität Freiburg, Breisacher Str. 33, 79106 Freiburg, Germany
  - Department of Biochemistry, Genetics & Immunology, University of Vigo, E-36200 Vigo, Spain
- Andreas Wolf
  - Institut für Medizinische Informatik und Statistik, Universität Kiel, Brunswiker Str. 10, 24105 Kiel, Germany
- Nadia Bogdanova
  - Institut für Humangenetik, Universität Münster, Vesaliusweg 12-14, 48129 Münster, Germany
- Claudia Münch
  - Institut für Humangenetik und Anthropologie, Universität Freiburg, Breisacher Str. 33, 79106 Freiburg, Germany
- Petra Pennekamp
  - Institut für Humangenetik, Universität Münster, Vesaliusweg 12-14, 48129 Münster, Germany
- Michael Krawczak
  - Institut für Medizinische Informatik und Statistik, Universität Kiel, Brunswiker Str. 10, 24105 Kiel, Germany
- Bernd Dworniczak
  - Institut für Humangenetik, Universität Münster, Vesaliusweg 12-14, 48129 Münster, Germany
- Werner Schempp
  - Institut für Humangenetik und Anthropologie, Universität Freiburg, Breisacher Str. 33, 79106 Freiburg, Germany
|
19
|
Lee AS, Gutiérrez-Arcelus M, Perry GH, Vallender EJ, Johnson WE, Miller GM, Korbel JO, Lee C. Analysis of copy number variation in the rhesus macaque genome identifies candidate loci for evolutionary and human disease studies. Hum Mol Genet 2008; 17:1127-36. [PMID: 18180252 DOI: 10.1093/hmg/ddn002] [Citation(s) in RCA: 93] [Impact Index Per Article: 5.8] [Indexed: 12/25/2022] Open
Abstract
Copy number variants (CNVs) are heritable gains and losses of genomic DNA in normal individuals. While copy number variation is widely studied in humans, our knowledge of CNVs in other mammalian species is more limited. We have designed a custom array-based comparative genomic hybridization (aCGH) platform with 385 000 oligonucleotide probes based on the reference genome sequence of the rhesus macaque (Macaca mulatta), the most widely studied non-human primate in biomedical research. We used this platform to identify 123 CNVs among 10 unrelated macaque individuals, with 24% of the CNVs observed in multiple individuals. We found that segmental duplications were significantly enriched at macaque CNV loci. We also observed significant overlap between rhesus macaque and human CNVs, suggesting that certain genomic regions are prone to recurrent CNV formation and instability, even across a total of approximately 50 million years of primate evolution (approximately 25 million years in each lineage). Furthermore, for eight of the CNVs that were observed in both humans and macaques, previous human studies have reported a relationship between copy number and gene expression or disease susceptibility. Therefore, the rhesus macaque offers an intriguing, non-human primate outbred model organism with which hypotheses concerning the specific functions of phenotypically relevant human CNVs can be tested.
Affiliation(s)
- Arthur S Lee
- Department of Pathology, Brigham and Women's Hospital, 221 Longwood Ave., Boston, MA 02115, USA
|
20
|
Abstract
Chromosomal inversions have an important role in evolution, and an increasing number of inversion polymorphisms are being identified in the human population. The evolutionary history of these inversions and the mechanisms by which they arise are therefore of significant interest. Previously, a polymorphic inversion on human chromosome Xq28 that includes the FLNA and EMD loci was discovered and hypothesized to have been the result of nonallelic homologous recombination (NAHR) between near-identical inverted duplications flanking this region. Here, we carried out an in-depth study of the orthologous region in 27 additional eutherians and report that this inversion is not specific to humans, but has occurred independently and repeatedly at least 10 times in multiple eutherian lineages. Moreover, inverted duplications flank the FLNA-EMD region in all 16 species for which high-quality sequence assemblies are available. Based on detailed sequence analyses, we propose a model in which the observed inverted duplications originated from a common duplication event that predates the eutherian radiation. Subsequent gene conversion homogenized the duplications, thereby providing a continuous substrate for NAHR that led to the recurrent inversion of this segment of the genome. These results provide an extreme example in support of the evolutionary breakpoint reusage hypothesis and point out that some near-identical human segmental duplications may, in fact, have originated >100 million years ago.
|
21
|
Ancestral reconstruction of segmental duplications reveals punctuated cores of human genome evolution. Nat Genet 2007; 39:1361-8. [PMID: 17922013 DOI: 10.1038/ng.2007.9] [Citation(s) in RCA: 143] [Impact Index Per Article: 8.4] [Received: 04/26/2007] [Accepted: 08/07/2007] [Indexed: 01/22/2023]
Abstract
Human segmental duplications are hotspots for nonallelic homologous recombination leading to genomic disorders, copy-number polymorphisms and gene and transcript innovations. The complex structure and history of these regions have precluded a global evolutionary analysis. Combining a modified A-Bruijn graph algorithm with comparative genome sequence data, we identify the origin of 4,692 ancestral duplication loci and use these to cluster 437 complex duplication blocks into 24 distinct groups. The sequence-divergence data between ancestral-derivative pairs and a comparison with the chimpanzee and macaque genome support a 'punctuated' model of evolution. Our analysis reveals that human segmental duplications are frequently organized around 'core' duplicons, which are enriched for transcripts and, in some cases, encode primate-specific genes undergoing positive selection. We hypothesize that the rapid expansion and fixation of some intrachromosomal segmental duplications during great-ape evolution has been due to the selective advantage conferred by these genes and transcripts embedded within these core duplications.
|
22
|
Foltz DW. An Ancient Repeat Sequence in the ATP Synthase β-Subunit Gene of Forcipulate Sea Stars. J Mol Evol 2007; 65:564-73. [PMID: 17909692 DOI: 10.1007/s00239-007-9036-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 06/12/2007] [Revised: 08/10/2007] [Accepted: 08/17/2007] [Indexed: 10/22/2022]
Abstract
A novel repeat sequence with a conserved secondary structure is described from two nonadjacent introns of the ATP synthase beta-subunit gene in sea stars of the order Forcipulatida (Echinodermata: Asteroidea). The repeat is present in both introns of all forcipulate sea stars examined, which suggests that it is an ancient feature of this gene (with an approximate age of 200 Mya). Both stem and loop regions show high levels of sequence constraint when compared to flanking nonrepetitive intronic regions. The repeat was also detected in (1) the family Pterasteridae, order Velatida and (2) the family Korethrasteridae, order Velatida. The repeat was not detected in (1) the family Echinasteridae, order Spinulosida, (2) the family Astropectinidae, order Paxillosida, (3) the family Solasteridae, order Velatida, or (4) the family Goniasteridae, order Valvatida. The repeat lacks similarity to published sequences in unrestricted GenBank searches, and there are no significant open reading frames in the repeat or in the flanking intron sequences. Comparison via parametric bootstrapping to a published phylogeny based on 4.2 kb of nuclear and mitochondrial sequence for a subset of these species allowed the null hypothesis of a congruent phylogeny to be rejected for each repeat, when compared separately to the published phylogeny. In contrast, the flanking nonrepetitive sequences in each intron yielded separate phylogenies that were each congruent with the published phylogeny. In four species, the repeat in one or both introns has apparently experienced gene conversion. The two introns also show a correlated pattern of nucleotide substitutions, even after excluding the putative cases of gene conversion.
Affiliation(s)
- David W Foltz
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803-1715, USA.
|
23
|
Chen JM, Cooper DN, Chuzhanova N, Férec C, Patrinos GP. Gene conversion: mechanisms, evolution and human disease. Nat Rev Genet 2007; 8:762-75. [PMID: 17846636 DOI: 10.1038/nrg2193] [Citation(s) in RCA: 455] [Impact Index Per Article: 26.8] [Indexed: 01/07/2023]
Abstract
Gene conversion, one of the two mechanisms of homologous recombination, involves the unidirectional transfer of genetic material from a 'donor' sequence to a highly homologous 'acceptor'. Considerable progress has been made in understanding the molecular mechanisms that underlie gene conversion, its formative role in human genome evolution and its implications for human inherited disease. Here we assess current thinking about how gene conversion occurs, explore the key part it has played in fashioning extant human genes, and carry out a meta-analysis of gene-conversion events that are known to have caused human genetic disease.
|
24
|
Johnson ME, Cheng Z, Morrison VA, Scherer S, Ventura M, Gibbs RA, Green ED, Eichler EE. Recurrent duplication-driven transposition of DNA during hominoid evolution. Proc Natl Acad Sci U S A 2006; 103:17626-31. [PMID: 17101969 PMCID: PMC1693797 DOI: 10.1073/pnas.0605426103] [Citation(s) in RCA: 82] [Impact Index Per Article: 4.6] [Received: 07/01/2006] [Indexed: 12/13/2022] Open
Abstract
The underlying mechanism by which the interspersed pattern of human segmental duplications has evolved is unknown. Based on a comparative analysis of primate genomes, we show that a particular segmental duplication (LCR16a) has been the source locus for the formation of the majority of intrachromosomal duplication blocks on human chromosome 16. We provide evidence that this particular segment has been active independently in each great ape and human lineage at different points during evolution. Euchromatic sequences that flank sites of LCR16a integration are frequently lineage-specific duplications. This process has mobilized duplication blocks (15-200 kb in size) to new genomic locations in each species. Breakpoint analysis of lineage-specific insertions suggests coordinated deletion of repeat-rich DNA at the target site, in some cases deleting genes in that species. Our data support a model of duplication where the probability that a segment of DNA becomes duplicated is determined by its proximity to core duplicons, such as LCR16a.
Affiliation(s)
- Matthew E. Johnson
  - Department of Genome Sciences and the
  - Department of Genetics and Center for Human Genetics, Case Western Reserve School of Medicine and University Hospitals of Cleveland, Cleveland, OH 44106
- Ze Cheng
  - Department of Genome Sciences and the
- V. Anne Morrison
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195
- Steven Scherer
  - Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030
- Mario Ventura
  - Sezione di Genetica, Dipartimento di Anatomia Patologica e di Genetica, University of Bari, 70126 Bari, Italy
- Richard A. Gibbs
  - Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030
- Evan E. Eichler
  - Department of Genome Sciences and the
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195
|
25
|
Lindsay SJ, Khajavi M, Lupski JR, Hurles ME. A chromosomal rearrangement hotspot can be identified from population genetic variation and is coincident with a hotspot for allelic recombination. Am J Hum Genet 2006; 79:890-902. [PMID: 17033965 PMCID: PMC1698570 DOI: 10.1086/508709] [Citation(s) in RCA: 83] [Impact Index Per Article: 4.6] [Received: 05/01/2006] [Accepted: 08/22/2006] [Indexed: 01/15/2023] Open
Abstract
Insights into the origins of structural variation and the mutational mechanisms underlying genomic disorders would be greatly improved by a genomewide map of hotspots of nonallelic homologous recombination (NAHR). Moreover, our understanding of sequence variation within the duplicated sequences that are substrates for NAHR lags far behind that of sequence variation within the single-copy portion of the genome. Perhaps the best-characterized NAHR hotspot lies within the 24-kb-long Charcot-Marie-Tooth disease type 1A (CMT1A)-repeats (REPs) that sponsor deletions and duplications that cause peripheral neuropathies. We investigated structural and sequence diversity within the CMT1A-REPs, both within and between species. We discovered a high frequency of retroelement insertions, accelerated sequence evolution after duplication, extensive paralogous gene conversion, and a greater than twofold enrichment of SNPs in humans relative to the genome average. We identified an allelic recombination hotspot underlying the known NAHR hotspot, which suggests that the two processes are intimately related. Finally, we used our data to develop a novel method for inferring the location of an NAHR hotspot from sequence variation within segmental duplications and applied it to identify a putative NAHR hotspot within the LCR22 repeats that sponsor velocardiofacial syndrome deletions. We propose that a large-scale project to map sequence variation within segmental duplications would reveal a wealth of novel chromosomal-rearrangement hotspots.
Affiliation(s)
- Sarah J Lindsay
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
|
26
|
Bailey JA, Eichler EE. Primate segmental duplications: crucibles of evolution, diversity and disease. Nat Rev Genet 2006; 7:552-64. [PMID: 16770338 DOI: 10.1038/nrg1895] [Citation(s) in RCA: 441] [Impact Index Per Article: 24.5] [Indexed: 12/25/2022]
Abstract
Compared with other mammals, the genomes of humans and other primates show an enrichment of large, interspersed segmental duplications (SDs) with high levels of sequence identity. Recent evidence has begun to shed light on the origin of primate SDs, pointing to a complex interplay of mechanisms and indicating that distinct waves of duplication took place during primate evolution. There is also evidence for a strong association between duplication, genomic instability and large-scale chromosomal rearrangements. Exciting new findings suggest that SDs have not only created novel primate gene families, but might have also influenced current human genic and phenotypic variation on a previously unappreciated scale. A growing number of examples link natural human genetic variation of these regions to susceptibility to common disease.
Affiliation(s)
- Jeffrey A Bailey
- Department of Pathology, Case Western University School of Medicine and University Hospitals of Cleveland, Ohio 44106, USA
|
27
|
She X, Liu G, Ventura M, Zhao S, Misceo D, Roberto R, Cardone MF, Rocchi M, Green ED, Archidiacano N, Eichler EE. A preliminary comparative analysis of primate segmental duplications shows elevated substitution rates and a great-ape expansion of intrachromosomal duplications. Genome Res 2006; 16:576-83. [PMID: 16606706 PMCID: PMC1457043 DOI: 10.1101/gr.4949406] [Citation(s) in RCA: 75] [Impact Index Per Article: 4.2] [Indexed: 11/25/2022]
Abstract
Compared with other sequenced animal genomes, human segmental duplications appear larger, more interspersed, and disproportionately represented as high-sequence identity alignments. Global sequence divergence estimates of human duplications have suggested an expansion relatively recently during hominoid evolution. Based on primate comparative sequence analysis of 37 unique duplication-transition regions, we establish a molecular clock for their divergence that shows a significant increase in their effective substitution rate when compared with unique genomic sequence. Fluorescent in situ hybridization (FISH) analyses from 1053 random nonhuman primate BACs indicate that great-ape species have been enriched for interspersed segmental duplications compared with representative Old World and New World monkeys. These findings support computational analyses that show a 12-fold excess of recent (>98%) intrachromosomal duplications when compared with duplications between nonhomologous chromosomes. These architectural shifts in genomic structure and elevated substitution rates have important implications for the emergence of new genes, gene-expression differences, and structural variation among humans and great apes.
Affiliation(s)
- Xinwei She
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Ge Liu
  - Department of Genetics, Case Western Reserve University, Cleveland, Ohio 44106, USA
  - Bovine Functional Genomics Laboratory, US Department of Agriculture, Beltsville, Maryland 20705, USA
- Mario Ventura
  - Department of Genetics and Microbiology, University of Bari, 70126 Bari, Italy
- Shaying Zhao
  - The Institute for Genomic Research, Rockville, Maryland 20850, USA
- Doriana Misceo
  - Department of Genetics and Microbiology, University of Bari, 70126 Bari, Italy
- Roberta Roberto
  - Department of Genetics and Microbiology, University of Bari, 70126 Bari, Italy
- Mariano Rocchi
  - Department of Genetics and Microbiology, University of Bari, 70126 Bari, Italy
- Eric D. Green
  - Genome Technology Branch and NIH Intramural Sequencing Center, National Human Genome Research Institute, Bethesda, Maryland 20892, USA
- Evan E. Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
  - Howard Hughes Medical Institute, Seattle, Washington 98195, USA
  - Corresponding author; fax (206) 685-7301
|