1. Versoza CJ, Weiss S, Johal R, La Rosa B, Jensen JD, Pfeifer SP. Novel Insights into the Landscape of Crossover and Noncrossover Events in Rhesus Macaques (Macaca mulatta). Genome Biol Evol 2024; 16:evad223. [PMID: 38051960 PMCID: PMC10773715 DOI: 10.1093/gbe/evad223] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/04/2023] [Revised: 11/04/2023] [Accepted: 11/28/2023] [Indexed: 12/07/2023]
Abstract
Meiotic recombination landscapes differ greatly between distantly and closely related taxa, populations, individuals, sexes, and even within genomes; however, the factors driving this variation are yet to be well elucidated. Here, we directly estimate contemporary crossover rates and, for the first time, noncrossover rates in rhesus macaques (Macaca mulatta) from four three-generation pedigrees comprising 32 individuals. We further compare these results with historical, demography-aware, linkage disequilibrium-based recombination rate estimates. From paternal meioses in the pedigrees, 165 crossover events with a median resolution of 22.3 kb were observed, corresponding to a male autosomal map length of 2,357 cM, approximately 15% longer than an existing linkage map based on human microsatellite loci. In addition, 85 noncrossover events with a mean tract length of 155 bp were identified, similar to the tract lengths observed in the only other two primates in which noncrossovers have been studied to date, humans and baboons. Consistent with observations in other placental mammals with PRDM9-directed recombination, crossover (and to a lesser extent noncrossover) events in rhesus macaques clustered in intergenic regions and toward the chromosomal ends in males (a pattern in broad agreement with the historical, sex-averaged recombination rate estimates), and evidence of GC-biased gene conversion was observed at noncrossover sites.
Affiliation(s)
- Cyril J Versoza: School of Life Sciences, Arizona State University, Tempe, AZ, USA; Center for Evolution and Medicine, Arizona State University, Tempe, AZ, USA
- Sarah Weiss: School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Ravneet Johal: School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Bruno La Rosa: School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Jeffrey D Jensen: School of Life Sciences, Arizona State University, Tempe, AZ, USA; Center for Evolution and Medicine, Arizona State University, Tempe, AZ, USA
- Susanne P Pfeifer: School of Life Sciences, Arizona State University, Tempe, AZ, USA; Center for Evolution and Medicine, Arizona State University, Tempe, AZ, USA
2. Tetrad analysis in plants and fungi finds large differences in gene conversion rates but no GC bias. Nat Ecol Evol 2017; 2:164-173. [PMID: 29158556 PMCID: PMC5733138 DOI: 10.1038/s41559-017-0372-7] [Citation(s) in RCA: 43] [Impact Index Per Article: 6.1] [Received: 04/05/2017] [Accepted: 10/09/2017] [Indexed: 11/29/2022]
Abstract
GC-favoring gene conversion enables fixation of deleterious alleles, disturbs tests of natural selection and potentially explains both the evolution of recombination as well as the commonly reported intra-genomic correlation between G+C content and recombination rate. In addition, gene conversion disturbs linkage disequilibrium, potentially affecting the ability to detect causative variants. However, the importance and generality of these effects is unresolved, not simply because direct analyses are technically challenging but also because prior within- and between-species discrepant results can be hard to appraise owing to methodological differences. Here we report results of methodologically uniform whole-genome sequencing of all tetrad products in Saccharomyces, Neurospora, Chlamydomonas and Arabidopsis. The proportion of polymorphic markers converted varies over three orders of magnitude between species (from 2% of markers converted in yeast to only ~0.005% in the two plants) with at least 87.5% of the variance in per tetrad conversion rates being between-species. This is largely owing to differences in recombination rate and median tract length. Despite three of the species showing a positive GC-recombination correlation, there is no significant net AT->GC conversion bias in any, despite relatively high resolution in the two taxa (Saccharomyces and Neurospora) with relatively common gene conversion. The absence of a GC bias means: 1) that there should be no presumption that gene conversion is GC biased, nor 2) that a GC-recombination correlation necessarily implies biased gene conversion, 3) that Ka/Ks tests should be unaffected in these species and 4) it is unlikely that gene conversion explains the evolution of recombination.
3. Yin J. Hypothesis testing of meiotic recombination rates from population genetic data. BMC Genet 2014; 15:122. [PMID: 25433522 PMCID: PMC4267743 DOI: 10.1186/s12863-014-0122-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2014] [Accepted: 10/28/2014] [Indexed: 11/10/2022]
Abstract
Background: Meiotic recombination, one of the central biological processes studied in population genetics, comes in two known forms: crossovers and gene conversions. A number of previous studies have shown that when one of these two events is nonexistent in the genealogical model, the point estimation of the corresponding recombination rate by population genetic methods tends to be inflated. Therefore, it has become necessary to obtain statistical evidence from population genetic data about whether one of the two recombination events is absent. Results: In this paper, we formulate this problem in a hypothesis testing framework and devise a testing procedure based on the likelihood ratio test (LRT). However, because the null value (i.e., zero) lies on the boundary of the parameter space, the regularity conditions for the large-sample approximation to the distribution of the LRT statistic do not apply. In turn, the standard chi-squared approximation is inaccurate. To address this critical issue, we propose a parametric bootstrap procedure to obtain an approximate p-value for the observed test statistic. Coalescent simulations are conducted to show that our approach yields accurate null p-values that closely follow the theoretical prediction while the estimated alternative p-values tend to concentrate closer to zero. Finally, the method is demonstrated on a real biological data set from the telomere of the X chromosome of African Drosophila melanogaster. Conclusions: Our methodology provides a necessary complement to the existing procedures of estimating meiotic recombination rates from population genetic data.
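The parametric bootstrap idea in this abstract can be illustrated on a toy problem that is unrelated to the paper's coalescent model: testing H0: mu = 0 against mu > 0 for normal data with unit variance, where the null value also sits on the boundary of the parameter space. The model, the function names, and the replicate count are assumptions chosen for illustration; this is a minimal sketch, not the author's procedure.

```python
import random


def lrt_statistic(sample):
    """LRT statistic for H0: mu = 0 vs. H1: mu >= 0, with N(mu, 1) data.

    The constrained MLE is max(mean, 0), so the statistic is
    n * max(mean, 0)^2 and equals zero whenever the sample mean is negative.
    """
    n = len(sample)
    mean = sum(sample) / n
    return n * max(mean, 0.0) ** 2


def bootstrap_p_value(sample, n_boot=2000, seed=0):
    """Approximate the p-value by simulating data sets under the null model."""
    rng = random.Random(seed)
    observed = lrt_statistic(sample)
    n = len(sample)
    exceed = 0
    for _ in range(n_boot):
        # In this toy model the null is fully specified (mu = 0, variance 1),
        # so no nuisance parameters need to be estimated before simulating.
        null_sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if lrt_statistic(null_sample) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_boot + 1)
```

In this toy setting about half of the simulated null statistics are exactly zero, which is why a plain chi-squared reference distribution would be miscalibrated; the bootstrap reads the p-value from the simulated null distribution instead.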
Affiliation(s)
- Junming Yin: Department of Management Information Systems, Eller College of Management, University of Arizona, Tucson, 85721, USA.
4. Contrasted patterns of crossover and non-crossover at Arabidopsis thaliana meiotic recombination hotspots. PLoS Genet 2013; 9:e1003922. [PMID: 24244190 PMCID: PMC3828143 DOI: 10.1371/journal.pgen.1003922] [Citation(s) in RCA: 76] [Impact Index Per Article: 6.9] [Received: 07/04/2013] [Accepted: 09/11/2013] [Indexed: 11/25/2022]
Abstract
The vast majority of meiotic recombination events (crossovers (COs) and non-crossovers (NCOs)) cluster in narrow hotspots surrounded by large regions devoid of recombinational activity. Here, using a new molecular approach in plants, called “pollen-typing”, we detected and characterized hundreds of CO and NCO molecules in two different hotspot regions in Arabidopsis thaliana. This analysis revealed that COs are concentrated in regions of a few kilobases where their rates reach up to 50 times the genome average. The hotspots themselves tend to cluster in regions less than 8 kilobases in size with overlapping CO distribution. Non-crossover (NCO) events also occurred in the two hotspots but at very different levels (local CO/NCO ratios of 1/1 and 30/1) and their track lengths were quite small (a few hundred base pairs). We also showed that the ZMM protein MSH4 plays a role in CO formation and somewhat unexpectedly we also found that it is involved in the generation of NCOs but with a different level of effect. Finally, factors acting in cis and in trans appear to shape the rate and distribution of COs at meiotic recombination hotspots. During meiosis, genomes are reshuffled by recombination between homologous chromosomes. Reciprocal recombination events called crossovers are clustered in several kilobase-wide regions called hotspots, where their frequency is greatly enhanced compared to adjacent regions. Our understanding of hotspot organization is based on analyses performed in only a few species and rules differ between species. 
For the first time, hundreds of recombination events were analyzed in Arabidopsis thaliana, revealing several new features: (i) crossovers are concentrated in hotspots where their rate reaches up to 50 times the genome average; (ii) non-crossover events (also called gene conversions not associated with crossovers) also occur in hotspots but at very different levels; and (iii) in the absence of the recombination protein MSH4, the crossover rate is dramatically reduced (70 times less than the wild-type level) and the crossover distribution within a hotspot is also largely modified; unexpectedly, the non-crossover rate was also altered (15% of the wild-type level at a hotspot). Finally, we showed that factors acting in cis and in trans may influence the level and distribution of crossovers at and between hotspots.
5. Padhukasahasram B, Rannala B. Meiotic gene-conversion rate and tract length variation in the human genome. Eur J Hum Genet 2013:ejhg201330. [PMID: 23443031 DOI: 10.1038/ejhg.2013.30] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 09/17/2012] [Revised: 12/17/2012] [Accepted: 01/10/2013] [Indexed: 01/11/2023]
Abstract
Meiotic recombination occurs in the form of two different mechanisms called crossing-over and gene-conversion and both processes have an important role in shaping genetic variation in populations. Although variation in crossing-over rates has been studied extensively using sperm-typing experiments, pedigree studies and population genetic approaches, our knowledge of variation in gene-conversion parameters (ie, rates and mean tract lengths) remains far from complete. To explore variability in population gene-conversion rates and its relationship to crossing-over rate variation patterns, we have developed and validated using coalescent simulations a comprehensive Bayesian full-likelihood method that can jointly infer crossing-over and gene-conversion rates as well as tract lengths from population genomic data under general variable rate models with recombination hotspots. Here, we apply this new method to SNP data from multiple human populations and attempt to characterize for the first time the fine-scale variation in gene-conversion parameters along the human genome. We find that the estimated ratio of gene-conversion to crossing-over rates varies considerably across genomic regions as well as between populations. However, there is a great degree of uncertainty associated with such estimates. We also find substantial evidence for variation in the mean conversion tract length. The estimated tract lengths did not show any negative relationship with the local heterozygosity levels in our analysis. European Journal of Human Genetics advance online publication, 27 February 2013; doi:10.1038/ejhg.2013.30.
Affiliation(s)
- Badri Padhukasahasram: [1] Center for Health Policy and Health Services Research, Henry Ford Health System, Detroit, MI, USA; [2] Genome Center and Department of Evolution and Ecology, University of California, Davis, Davis, CA, USA
- Bruce Rannala: Genome Center and Department of Evolution and Ecology, University of California, Davis, Davis, CA, USA
6. Great majority of recombination events in Arabidopsis are gene conversion events. Proc Natl Acad Sci U S A 2012; 109:20992-7. [PMID: 23213238 DOI: 10.1073/pnas.1211827110] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.9] [Indexed: 11/18/2022]
Abstract
The evolutionary importance of meiosis may not solely be associated with allelic shuffling caused by crossing-over but also have to do with its more immediate effects such as gene conversion. Although estimates of the crossing-over rate are often well resolved, the gene conversion rate is much less clear. In Arabidopsis, for example, next-generation sequencing approaches suggest that the two rates are about the same, which contrasts with indirect measures, these suggesting an excess of gene conversion. Here, we provide analysis of this problem by sequencing 40 F(2) Arabidopsis plants and their parents. Small gene conversion tracts, with biased gene conversion content, represent over 90% (probably nearer 99%) of all recombination events. The rate of alteration of protein sequence caused by gene conversion is over 600 times that caused by mutation. Finally, our analysis reveals recombination hot spots and unexpectedly high recombination rates near centromeres. This may be responsible for the previously unexplained pattern of high genetic diversity near Arabidopsis centromeres.
7. Ross KA. Evidence for somatic gene conversion and deletion in bipolar disorder, Crohn's disease, coronary artery disease, hypertension, rheumatoid arthritis, type-1 diabetes, and type-2 diabetes. BMC Med 2011; 9:12. [PMID: 21291537 PMCID: PMC3048570 DOI: 10.1186/1741-7015-9-12] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Received: 10/15/2010] [Accepted: 02/03/2011] [Indexed: 12/19/2022]
Abstract
BACKGROUND: During gene conversion, genetic information is transferred unidirectionally between highly homologous but non-allelic regions of DNA. While germ-line gene conversion has been implicated in the pathogenesis of some diseases, somatic gene conversion has remained technically difficult to investigate on a large scale. METHODS: A novel analysis technique is proposed for detecting the signature of somatic gene conversion from SNP microarray data. The Wellcome Trust Case Control Consortium has gathered SNP microarray data for two control populations and cohorts for bipolar disorder (BD), cardiovascular disease (CAD), Crohn's disease (CD), hypertension (HT), rheumatoid arthritis (RA), type-1 diabetes (T1D) and type-2 diabetes (T2D). Using the new analysis technique, the seven disease cohorts are analyzed to identify cohort-specific SNPs at which conversion is predicted. The quality of the predictions is assessed by identifying known disease associations for genes in the homologous duplicons, and comparing the frequency of such associations with background rates. RESULTS: Of 28 disease/locus pairs meeting stringent conditions, 22 show various degrees of disease association, compared with only 8 of 70 in a mock study designed to measure the background association rate (P < 10^-9). Additional candidate genes are identified using less stringent filtering conditions. In some cases, somatic deletions appear likely. RA has a distinctive pattern of events relative to other diseases. Similarities in patterns are apparent between BD and HT. CONCLUSIONS: The associations derived represent the first evidence that somatic gene conversion could be a significant causative factor in each of the seven diseases. The specific genes provide potential insights about disease mechanisms, and are strong candidates for further study.
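The comparison of 22 of 28 pairs against 8 of 70 pairs can be checked with a small calculation. The abstract does not say which test produced the reported significance level, so the one-sided Fisher exact test below is an assumption, and the function name is hypothetical; it is a sketch of one plausible reading of the counts, not a reconstruction of the author's analysis.

```python
from math import comb


def hypergeom_upper_tail(k, n_draws, n_success, n_total):
    """P(X >= k) for X ~ Hypergeometric(n_total, n_success, n_draws)."""
    upper = min(n_draws, n_success)
    denom = comb(n_total, n_draws)
    numer = sum(
        comb(n_success, x) * comb(n_total - n_success, n_draws - x)
        for x in range(k, upper + 1)
    )
    return numer / denom


# Counts quoted in the abstract: 22 of 28 pairs associated in the real study,
# 8 of 70 in the mock study.
associated_total = 22 + 8   # 30 associated pairs overall
pairs_total = 28 + 70       # 98 pairs overall

# One-sided Fisher exact test (an assumed choice of test): probability of
# seeing 22 or more associated pairs among the 28 real-study pairs if
# association were unrelated to study membership.
p_one_sided = hypergeom_upper_tail(22, 28, associated_total, pairs_total)
print(f"one-sided Fisher exact p = {p_one_sided:.2e}")
```

Under this reading the tail probability falls below the threshold quoted in the abstract; a different test or a two-sided alternative would give a somewhat different value.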
Affiliation(s)
- Kenneth Andrew Ross: Department of Computer Science, Columbia University, New York, NY 10027, USA.
8. Clark AG, Wang X, Matise T. Contrasting methods of quantifying fine structure of human recombination. Annu Rev Genomics Hum Genet 2010; 11:45-64. [PMID: 20690817 DOI: 10.1146/annurev-genom-082908-150031] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Indexed: 11/09/2022]
Abstract
There has been considerable excitement over the ability to construct linkage maps based only on genome-wide genotype data for single nucleotide polymorphic sites (SNPs) in a population sample. These maps, which are derived from estimates of linkage disequilibrium (LD), rely on population genetics theory to relate the decay of LD to the local rate of recombination, but other population processes also come into play. Here we contrast these LD maps to the classically derived, pedigree-based human recombination maps. The LD maps have a level of resolution greatly exceeding that of the pedigree maps, and at this fine scale, sperm typing allows a means of validation. While at a gross level both the pedigree maps and the sperm typing methods generally agree with LD maps, there are significant local differences between them, and the fact that these maps measure different genetic features should be remembered when using them for other genetic inferences.
Affiliation(s)
- Andrew G Clark: Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA.
9. Schulz A, Fischer C, Chang-Claude J, Beckmann L. Entropy-supported marker selection and Mantel statistics for haplotype sharing analysis. Genet Epidemiol 2010; 34:354-63. [DOI: 10.1002/gepi.20491] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/11/2022]
10. Yin J, Jordan MI, Song YS. Joint estimation of gene conversion rates and mean conversion tract lengths from population SNP data. Bioinformatics 2009; 25:i231-9. [PMID: 19477993 PMCID: PMC2687983 DOI: 10.1093/bioinformatics/btp229] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 11/14/2022]
Abstract
Motivation: Two known types of meiotic recombination are crossovers and gene conversions. Although they leave behind different footprints in the genome, it is a challenging task to tease apart their relative contributions to the observed genetic variation. In particular, for a given population SNP dataset, the joint estimation of the crossover rate, the gene conversion rate and the mean conversion tract length is widely viewed as a very difficult problem. Results: In this article, we devise a likelihood-based method using an interleaved hidden Markov model (HMM) that can jointly estimate the aforementioned three parameters fundamental to recombination. Our method significantly improves upon a recently proposed method based on a factorial HMM. We show that modeling overlapping gene conversions is crucial for improving the joint estimation of the gene conversion rate and the mean conversion tract length. We test the performance of our method on simulated data. We then apply our method to analyze real biological data from the telomere of the X chromosome of Drosophila melanogaster, and show that the ratio of the gene conversion rate to the crossover rate for the region may not be nearly as high as previously claimed. Availability: A software implementation of the algorithms discussed in this article is available at http://www.cs.berkeley.edu/~yss/software.html. Contact: yss@eecs.berkeley.edu
Affiliation(s)
- Junming Yin: Computer Science Division and Department of Statistics, University of California, Berkeley, CA, USA
11. [The contribution of gene conversion at the same chromosome to the HLA diversity]. YI CHUAN = HEREDITAS 2008; 30:1411-6. [PMID: 19073548 DOI: 10.3724/sp.j.1005.2008.01411] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
Abstract
HLA is the most polymorphic gene family in human genome, which is imperative for human to face numerous heterogeneous bio-molecules. Previous studies on the formation of HLA polymorphism have been focused on gene crossover. Here, we investigated the contribution of gene conversion, which is an important mechanism to generate polymorphism at shaping different patterns of HLA-DRB genes. Analysis of all known HLA-DRB haplotypes and alleles demonstrated that this was a highly polymorphic gene family. Using Ester Betran's algorithm, 32 gene conversion regions were identified. The minimal conversion tract was as short as 2 bp, and the maximum interval between two furthest SNPs was 204 bp. Moreover, gene conversion occurred more frequently in certain regions (71-75, 18-221) of various alleles, suggesting that these segments were conversion hotspots. Further analysis showed that the conversion regions of 71-75 and 205-217 appeared to correlate with populations of Oriental and Caucasian, respectively, indicating that conversion hotspots might be population specific.
12.
Abstract
Simulation of genomic sequences under the coalescent with recombination has conventionally been impractical for regions beyond tens of megabases. This work presents an algorithm, implemented as the program MaCS (Markovian Coalescent Simulator), that can efficiently simulate haplotypes under any arbitrary model of population history. We present several metrics comparing the performance of MaCS with other available simulation programs. Practical usage of MaCS is demonstrated through a comparison of measures of linkage disequilibrium between generated program output and real genotype data from populations considered to be structured.
13. Verrelli BC, Lewis CM, Stone AC, Perry GH. Different selective pressures shape the molecular evolution of color vision in chimpanzee and human populations. Mol Biol Evol 2008; 25:2735-43. [PMID: 18832077 DOI: 10.1093/molbev/msn220] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Indexed: 01/01/2023]
Abstract
A population genetic analysis of the long-wavelength opsin (OPN1LW, "red") color vision gene in a global sample of 236 human nucleotide sequences had previously discovered nine amino acid replacement single nucleotide polymorphisms, which were found at high frequencies in both African and non-African populations and associated with an unusual haplotype diversity. Although this pattern of nucleotide diversity is consistent with balancing selection, it has been argued that a recombination "hot spot" or gene conversion within and between X-linked color vision genes alone may explain these patterns. The current analysis investigates a closely related primate with trichromatism to determine whether color vision gene amino acid polymorphism and signatures of adaptive evolution are characteristic of humans alone. Our population sample of 56 chimpanzee (Pan troglodytes) OPN1LW sequences shows three singleton amino acid polymorphisms and no unusual recombination or linkage disequilibrium patterns across the approximately 5.5-kb region analyzed. Our comparative population genetic approach shows that the patterns of OPN1LW variation in humans and chimpanzees are consistent with positive and purifying selection within the two lineages, respectively. Although the complex role of color vision has been greatly documented in primate evolution in general, it is surprising that trichromatism has followed very different selective trajectories even between humans and our closest relatives.
Affiliation(s)
- Brian C Verrelli: Center for Evolutionary Functional Genomics, The Biodesign Institute and School of Life Sciences, Arizona State University, Tempe, AZ, USA.
14. High-resolution mapping of meiotic crossovers and non-crossovers in yeast. Nature 2008; 454:479-85. [PMID: 18615017 DOI: 10.1038/nature07135] [Citation(s) in RCA: 461] [Impact Index Per Article: 28.8] [Received: 03/10/2008] [Accepted: 05/30/2008] [Indexed: 11/08/2022]
Abstract
Meiotic recombination has a central role in the evolution of sexually reproducing organisms. The two recombination outcomes, crossover and non-crossover, increase genetic diversity, but have the potential to homogenize alleles by gene conversion. Whereas crossover rates vary considerably across the genome, non-crossovers and gene conversions have only been identified in a handful of loci. To examine recombination genome wide and at high spatial resolution, we generated maps of crossovers, crossover-associated gene conversion and non-crossover gene conversion using dense genetic marker data collected from all four products of fifty-six yeast (Saccharomyces cerevisiae) meioses. Our maps reveal differences in the distributions of crossovers and non-crossovers, showing more regions where either crossovers or non-crossovers are favoured than expected by chance. Furthermore, we detect evidence for interference between crossovers and non-crossovers, a phenomenon previously only known to occur between crossovers. Up to 1% of the genome of each meiotic product is subject to gene conversion in a single meiosis, with detectable bias towards GC nucleotides. To our knowledge the maps represent the first high-resolution, genome-wide characterization of the multiple outcomes of recombination in any organism. In addition, because non-crossover hotspots create holes of reduced linkage within haplotype blocks, our results stress the need to incorporate non-crossovers into genetic linkage analysis.
15. Song YS, Ding Z, Gusfield D, Langley CH, Wu Y. Algorithms to distinguish the role of gene-conversion from single-crossover recombination in the derivation of SNP sequences in populations. J Comput Biol 2008; 14:1273-86. [PMID: 18047424 DOI: 10.1089/cmb.2007.0096] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Indexed: 11/12/2022]
Abstract
Meiotic recombination is a fundamental biological event and one of the principal evolutionary forces responsible for shaping genetic variation within species. In addition to its fundamental role, recombination is central to several critical applied problems. The most important example is "association mapping" in populations, which is widely hoped to help find genes that influence genetic diseases (Carlson et al., 2004; Clark, 2003). Hence, a great deal of recent attention has focused on problems of inferring the historical derivation of sequences in populations when both mutations and recombinations have occurred. In the algorithms literature, most of that recent work has been directed to single-crossover recombination. However, gene-conversion is an important, and more common, form of (two-crossover) recombination which has been much less investigated in the algorithms literature. In this paper, we explicitly incorporate gene-conversion into discrete methods to study historical recombination. We are concerned with algorithms for identifying and locating the extent of historical crossing-over and gene-conversion (along with single-nucleotide mutation), and problems of constructing full putative histories of those events. The novel technical issues concern the incorporation of gene-conversion into recently developed discrete methods (Myers and Griffiths, 2003; Song et al., 2005) that compute lower and upper-bound information on the amount of needed recombination without gene-conversion. We first examine the most natural extension of the lower bound methods from Myers and Griffiths (2003), showing that the extension can be computed efficiently, but that this extension can only yield weak lower bounds. We then develop additional ideas that lead to higher lower bounds, and show how to solve, via integer-linear programming, a more biologically realistic version of the lower bound problem. 
We also show how to compute effective upper bounds on the number of needed single-crossovers and gene-conversions, along with explicit networks showing a putative history of mutations, single-crossovers and gene-conversions. Both lower and upper bound methods can handle data with missing entries, and the upper bound method can be used to infer missing entries with high accuracy. We validate the significance of these methods by showing that they can be effectively used to distinguish simulation-derived sequences generated without gene-conversion from sequences that were generated with gene-conversion. We apply the methods to recently studied sequences of Arabidopsis thaliana, identifying many more regions in the sequences than were previously identified (Plagnol et al., 2006), where gene-conversion may have played a significant role. Demonstration software is available at www.csif.cs.ucdavis.edu/~gusfield.
Affiliation(s)
- Yun S Song: Department of Computer Science, University of California, Davis, CA 95616, USA
16. Chen JM, Cooper DN, Chuzhanova N, Férec C, Patrinos GP. Gene conversion: mechanisms, evolution and human disease. Nat Rev Genet 2007; 8:762-75. [PMID: 17846636 DOI: 10.1038/nrg2193] [Citation(s) in RCA: 449] [Impact Index Per Article: 26.4] [Indexed: 01/07/2023]
Abstract
Gene conversion, one of the two mechanisms of homologous recombination, involves the unidirectional transfer of genetic material from a 'donor' sequence to a highly homologous 'acceptor'. Considerable progress has been made in understanding the molecular mechanisms that underlie gene conversion, its formative role in human genome evolution and its implications for human inherited disease. Here we assess current thinking about how gene conversion occurs, explore the key part it has played in fashioning extant human genes, and carry out a meta-analysis of gene-conversion events that are known to have caused human genetic disease.
17. Ouyang C, Krontiris TG. Identification and functional significance of SNPs underlying conserved haplotype frameworks across ethnic populations. Pharmacogenet Genomics 2006; 16:667-82. [PMID: 16906021 DOI: 10.1097/01.fpc.0000220569.82842.9b] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/25/2022]
Abstract
BACKGROUND: The study of genetic variation will promote our understanding of the differential predisposition to common diseases and variation in drug responses of individuals and ethnic populations. Such genetic variation is intrinsically structured into blocks of haplotypes in populations. Therefore, a comprehensive haplotype map based on the most abundant form of genetic variation, single nucleotide polymorphisms, will be useful. At the present time, however, our knowledge of the similarities and differences of haplotype structure among different ancestral populations is still inadequate. METHODS: To determine whether common underlying haplotype patterns existed across ethnic populations, we analyzed data derived from African and European Americans for twenty-two genes spanning a total of 516 kb and the HapMap ENCODE data across 500 kb on chromosome 2p16.3 from three major world populations. RESULTS AND CONCLUSIONS: We observed that strong pairwise linkage disequilibrium (LD) between SNPs selected from populations having African ancestry was highly conserved across other non-African populations. Common haplotypes described by these LD-selected SNPs demonstrated a simple evolutionary structure with up to three major frameworks, which were likely ancestral backgrounds upon which more recent mutations have been superimposed. Also, haplotype block boundaries defined in populations having African ancestry revealed completely concordant recombinant haplotypes across all populations, providing a consistent definition of block structure. Finally, a large fraction of regulatory polymorphisms described in the literature appeared to tag these conserved haplotype frameworks, strongly suggesting their significance for disease association and pharmacogenetic studies.
Affiliation(s)
- Ching Ouyang
- Division of Molecular Medicine, Beckman Research Institute of the City of Hope, Duarte, California 91010, USA
18
Song YS, Ding Z, Gusfield D, Langley CH, Wu Y. Algorithms to Distinguish the Role of Gene-Conversion from Single-Crossover Recombination in the Derivation of SNP Sequences in Populations. Lecture Notes in Computer Science 2006. [DOI: 10.1007/11732990_20] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 02/11/2023]
19
Jones R, Pembrey M, Golding J, Herrick D. The search for genotype/phenotype associations and the phenome scan. Paediatr Perinat Epidemiol 2005; 19:264-75. [PMID: 15958149 DOI: 10.1111/j.1365-3016.2005.00664.x] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.5] [Indexed: 11/29/2022]
Abstract
All the approaches to the search for genotype/phenotype associations have their share of problems. Comparing the genome scan and candidate gene approaches, the former makes fewer assumptions at the genetic level or about mechanism but has greater statistical difficulties while the latter partially solves the statistical problem but makes more assumptions at both genetic and mechanistic levels. Among current difficulties is a lack of information about the nature of gene variant/phenotype associations: the frequency with which different classes of gene or sequence are involved; the type of genetic variation most commonly involved; the appropriate genetic models to apply to analysis. The overarching problem is that of multiple testing, one solution to which is to integrate genetic information to create a smaller number of compound variables. At the other end of the scale, decisions about the level of complexity at which to pitch the identification of phenotypes also affect the multiple testing problem: whether to pitch them at the level of disease outcomes, or at any of the multiple levels of intermediate phenotypes or traits. The third issue is how best to deal with gene/gene or gene/environment interactions, or whether to ignore them. Only as more genotype/phenotype associations emerge, by whatever means, will the numbers of results allow these questions to be answered. We describe here a new approach to genotype/phenotype association studies, the phenome scan, in which dense phenotypic information in human cohorts is scanned for associations with individual genetic variants. We believe that this approach can generate data that will be useful in answering generic questions about genotype/phenotype associations as well as in discovering novel ones.
Affiliation(s)
- Richard Jones
- ALSPAC, Department of Community-Based Medicine, University of Bristol, Bristol, UK.
20
Nebert DW, Vesell ES. Advances in pharmacogenomics and individualized drug therapy: exciting challenges that lie ahead. Eur J Pharmacol 2004; 500:267-80. [PMID: 15464039 DOI: 10.1016/j.ejphar.2004.07.031] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.1] [Accepted: 07/01/2004] [Indexed: 12/16/2022]
Abstract
Between the 1930s and 1990s, several dozen predominantly monogenic, high-penetrance disorders involving pharmacogenetics were described, fueling the crusade that gene-drug interactions are quite simple. Then, in 1990, the Human Genome Project was established; in 1995, the term pharmacogenomics was introduced; finally, the complexities of determining an unequivocal phenotype, as well as an unequivocal genotype, have recently become apparent. Since 1965, more than 1000 reviews on this topic have painted an overly optimistic picture, suggesting that the advent of individualized drug therapy used by the practicing physician is fast approaching. For many reasons listed here, however, we emphasize that these high expectations must be tempered. We now realize that the nucleotide sequence of the genome represents only a starting point from which we must proceed to a more difficult stage: knowledge of the function encoded and how this affects the phenotype. To achieve individualized drug therapy, a high level of accuracy and precision is required of any clinical test proposed in human patients. Finally, we suggest that metabonomics, perhaps in combination with proteomics, might complement genomics in eventually helping us to achieve individualized drug therapy.
Affiliation(s)
- Daniel W Nebert
- Division of Human Genetics, Department of Pediatrics and Molecular Developmental Biology, University of Cincinnati Medical Center, P.O. Box 670056, Cincinnati OH 45267-0056, USA.
21
Verrelli BC, Tishkoff SA. Signatures of selection and gene conversion associated with human color vision variation. Am J Hum Genet 2004; 75:363-75. [PMID: 15252758 PMCID: PMC1182016 DOI: 10.1086/423287] [Citation(s) in RCA: 81] [Impact Index Per Article: 4.1] [Received: 03/08/2004] [Accepted: 06/10/2004] [Indexed: 11/03/2022] Open
Abstract
Trichromatic color vision in humans results from the combination of red, green, and blue photopigment opsins. Although color vision genes have been the targets of active molecular and psychophysical research on color vision abnormalities, little is known about patterns of normal genetic variation in these genes among global human populations. The current study presents nucleotide sequence analyses and tests of neutrality for a 5.5-kb region of the X-linked long-wave "red" opsin gene (OPN1LW) in 236 individuals from ethnically diverse human populations. Our analysis of the recombination landscape across OPN1LW reveals an unusual haplotype structure associated with amino acid replacement variation in exon 3 that is consistent with gene conversion. Compared with the absence of OPN1LW amino acid replacement fixation since divergence from chimpanzee, the human population exhibits a significant excess of high-frequency OPN1LW replacements. Our results suggest that subtle changes in L-cone opsin wavelength absorption may have been adaptive during human evolution.
Affiliation(s)
- Brian C Verrelli
- Department of Biology, University of Maryland, College Park 20742, USA