Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Kaplan N, Hudson RR. The use of sample genealogies for studying a selectively neutral m-loci model with recombination. Theor Popul Biol 1985;28:382-96. [PMID: 4071443 DOI: 10.1016/0040-5809(85)90036-x] [Citation(s) in RCA: 51] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/08/2023]

For:	Kaplan N, Hudson RR. The use of sample genealogies for studying a selectively neutral m-loci model with recombination. Theor Popul Biol 1985;28:382-96. [PMID: 4071443 DOI: 10.1016/0040-5809(85)90036-x] [Citation(s) in RCA: 51] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/08/2023]

Number

Cited by Other Article(s)

Ray DD, Flagel L, Schrider DR. IntroUNET: Identifying introgressed alleles via semantic segmentation. PLoS Genet 2024;20:e1010657. [PMID: 38377104 PMCID: PMC10906877 DOI: 10.1371/journal.pgen.1010657] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/06/2023] [Revised: 03/01/2024] [Accepted: 01/29/2024] [Indexed: 02/22/2024] Open

Abstract

A growing body of evidence suggests that gene flow between closely related species is a widespread phenomenon. Alleles that introgress from one species into a close relative are typically neutral or deleterious, but sometimes confer a significant fitness advantage. Given the potential relevance to speciation and adaptation, numerous methods have therefore been devised to identify regions of the genome that have experienced introgression. Recently, supervised machine learning approaches have been shown to be highly effective for detecting introgression. One especially promising approach is to treat population genetic inference as an image classification problem, and feed an image representation of a population genetic alignment as input to a deep neural network that distinguishes among evolutionary models (i.e. introgression or no introgression). However, if we wish to investigate the full extent and fitness effects of introgression, merely identifying genomic regions in a population genetic alignment that harbor introgressed loci is insufficient-ideally we would be able to infer precisely which individuals have introgressed material and at which positions in the genome. Here we adapt a deep learning algorithm for semantic segmentation, the task of correctly identifying the type of object to which each individual pixel in an image belongs, to the task of identifying introgressed alleles. Our trained neural network is thus able to infer, for each individual in a two-population alignment, which of those individual's alleles were introgressed from the other population. We use simulated data to show that this approach is highly accurate, and that it can be readily extended to identify alleles that are introgressed from an unsampled "ghost" population, performing comparably to a supervised learning method tailored specifically to that task. Finally, we apply this method to data from Drosophila, showing that it is able to accurately recover introgressed haplotypes from real data. This analysis reveals that introgressed alleles are typically confined to lower frequencies within genic regions, suggestive of purifying selection, but are found at much higher frequencies in a region previously shown to be affected by adaptive introgression. Our method's success in recovering introgressed haplotypes in challenging real-world scenarios underscores the utility of deep learning approaches for making richer evolutionary inferences from genomic data.

Collapse

Ray DD, Flagel L, Schrider DR. IntroUNET: identifying introgressed alleles via semantic segmentation. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.02.07.527435. [PMID: 36865105 PMCID: PMC9979274 DOI: 10.1101/2023.02.07.527435] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Indexed: 06/18/2023]

Abstract

Collapse

Wakeley J, Fan WT(L, Koch E, Sunyaev S. Recurrent mutation in the ancestry of a rare variant. Genetics 2023;224:iyad049. [PMID: 36967220 PMCID: PMC10324944 DOI: 10.1093/genetics/iyad049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/30/2023] [Revised: 01/30/2023] [Accepted: 03/08/2023] [Indexed: 03/28/2023] Open

Baumdicker F, Bisschop G, Goldstein D, Gower G, Ragsdale AP, Tsambos G, Zhu S, Eldon B, Ellerman EC, Galloway JG, Gladstein AL, Gorjanc G, Guo B, Jeffery B, Kretzschmar WW, Lohse K, Matschiner M, Nelson D, Pope NS, Quinto-Cortés CD, Rodrigues MF, Saunack K, Sellinger T, Thornton K, van Kemenade H, Wohns AW, Wong Y, Gravel S, Kern AD, Koskela J, Ralph PL, Kelleher J. Efficient ancestry and mutation simulation with msprime 1.0. Genetics 2021;220:6460344. [PMID: 34897427 PMCID: PMC9176297 DOI: 10.1093/genetics/iyab229] [Citation(s) in RCA: 91] [Impact Index Per Article: 30.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2021] [Accepted: 12/03/2021] [Indexed: 11/13/2022] Open

Affiliation(s)

Franz Baumdicker Cluster of Excellence "Controlling Microbes to Fight Infections", Mathematical and Computational Population Genetics, University of Tübingen, 72076 Tübingen, Germany
Gertjan Bisschop Institute of Evolutionary Biology,The University of Edinburgh, EH9 3FL, UK
Daniel Goldstein Khoury College of Computer Sciences, Northeastern University, MA 02115, USA.,No affiliation
Graham Gower Lundbeck GeoGenetics Centre, Globe Institute, University of Copenhagen, 1350 Copenhagen K, Denmark
Aaron P Ragsdale Department of Integrative Biology, University of Wisconsin-Madison, WI 53706, USA
Georgia Tsambos Melbourne Integrative Genomics, School of Mathematics and Statistics, University of Melbourne, Victoria, 3010, Australia
Sha Zhu Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, OX3 7LF, UK
Bjarki Eldon Leibniz Institute for Evolution and Biodiversity Science,Museum für Naturkunde Berlin, 10115, Germany
E Castedo Ellerman Fresh Pond Research Institute, Cambridge, MA 02140, USA
Jared G Galloway Institute of Ecology and Evolution, Department of Biology, University of Oregon, OR 97403-5289, USA.,Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA 98102, USA
Ariella L Gladstein Department of Genetics, University of North Carolina at Chapel Hill, NC 27599-7264, USA.,Embark Veterinary, Inc., Boston, MA 02111, USA
Gregor Gorjanc The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, EH25 9RG, UK
Bing Guo Institute for Genome Sciences,University of Maryland School of Medicine, Baltimore, MD, 21201, USA
Ben Jeffery Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, OX3 7LF, UK
Warren W Kretzschmar Center for Hematology and Regenerative Medicine, Karolinska Institute, 141 83 Huddinge, Sweden
Konrad Lohse Institute of Evolutionary Biology,The University of Edinburgh, EH9 3FL, UK
Michael Matschiner Natural History Museum, University of Oslo, Blindern 0318 Oslo, Norway
Dominic Nelson Department of Human Genetics, McGill University, Montréal, QC H3A 0C7, Canada
Nathaniel S Pope Department of Entomology, Pennsylvania State University, PA 16802, USA
Consuelo D Quinto-Cortés National Laboratory of Genomics for Biodiversity (LANGEBIO), Unit of Advanced Genomics, CINVESTAV, Irapuato, Mexico
Murillo F Rodrigues Institute of Ecology and Evolution, Department of Biology, University of Oregon, OR 97403-5289, USA
Kumar Saunack IIT Bombay, Powai, Mumbai 400 076, Maharashtra, India
Thibaut Sellinger Professorship for Population Genetics, Department of Life Science Systems, Technical University of Munich, 85354 Freising, Germany
Kevin Thornton Ecology and Evolutionary Biology, University of California, Irvine, CA 92697, USA
Hugo van Kemenade No affiliation
Anthony W Wohns Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, OX3 7LF, UK.,Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Yan Wong Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, OX3 7LF, UK
Simon Gravel Department of Human Genetics, McGill University, Montréal, QC H3A 0C7, Canada
Andrew D Kern Institute of Ecology and Evolution, Department of Biology, University of Oregon, OR 97403-5289, USA
Jere Koskela Department of Statistics, University of Warwick, CV4 7AL, UK
Peter L Ralph Institute of Ecology and Evolution, Department of Biology, University of Oregon, OR 97403-5289, USA.,Department of Mathematics, University of Oregon, OR 97403-5289 USA
Jerome Kelleher Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, OX3 7LF, UK

Collapse

Wakeley J. Developments in coalescent theory from single loci to chromosomes. Theor Popul Biol 2020;133:56-64. [DOI: 10.1016/j.tpb.2020.02.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/19/2019] [Revised: 02/19/2020] [Accepted: 02/26/2020] [Indexed: 10/24/2022]

Okii D, Badji A, Odong T, Talwana H, Tukamuhabwa P, Male A, Mukankusi C, Gepts P. Recombination fraction and genetic linkage among key disease resistance genes (Co-4² /Phg-2 and Co-5/"P.ult") in common bean. ACTA ACUST UNITED AC 2019;18:AJB-18-29-819. [PMID: 33281892 PMCID: PMC7672375 DOI: 10.5897/ajb2019.16776] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2019] [Accepted: 06/26/2019] [Indexed: 10/31/2022]

Kelleher J, Etheridge AM, McVean G. Efficient Coalescent Simulation and Genealogical Analysis for Large Sample Sizes. PLoS Comput Biol 2016;12:e1004842. [PMID: 27145223 PMCID: PMC4856371 DOI: 10.1371/journal.pcbi.1004842] [Citation(s) in RCA: 328] [Impact Index Per Article: 41.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2015] [Accepted: 03/02/2016] [Indexed: 01/23/2023] Open

Abstract

A central challenge in the analysis of genetic variation is to provide realistic genome simulation across millions of samples. Present day coalescent simulations do not scale well, or use approximations that fail to capture important long-range linkage properties. Analysing the results of simulations also presents a substantial challenge, as current methods to store genealogies consume a great deal of space, are slow to parse and do not take advantage of shared structure in correlated trees. We solve these problems by introducing sparse trees and coalescence records as the key units of genealogical analysis. Using these tools, exact simulation of the coalescent with recombination for chromosome-sized regions over hundreds of thousands of samples is possible, and substantially faster than present-day approximate methods. We can also analyse the results orders of magnitude more quickly than with existing methods.

Our understanding of the distribution of genetic variation in natural populations has been driven by mathematical models of the underlying biological and demographic processes. A key strength of such coalescent models is that they enable efficient simulation of data we might see under a variety of evolutionary scenarios. However, current methods are not well suited to simulating genome-scale data sets on hundreds of thousands of samples, which is essential if we are to understand the data generated by population-scale sequencing projects. Similarly, processing the results of large simulations also presents researchers with a major challenge, as it can take many days just to read the data files. In this paper we solve these problems by introducing a new way to represent information about the ancestral process. This new representation leads to huge gains in simulation speed and storage efficiency so that large simulations complete in minutes and the output files can be processed in seconds.

Collapse

The SMC' is a highly accurate approximation to the ancestral recombination graph. Genetics 2015;200:343-55. [PMID: 25786855 DOI: 10.1534/genetics.114.173898] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/17/2014] [Accepted: 03/12/2015] [Indexed: 11/18/2022] Open

Wakeley J. Coalescent theory has many new branches. Theor Popul Biol 2013;87:1-4. [PMID: 23747657 DOI: 10.1016/j.tpb.2013.06.001] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]

HIV-1 continues to replicate and evolve in patients with natural control of HIV infection. J Virol 2010;84:12971-81. [PMID: 20926564 DOI: 10.1128/jvi.00387-10] [Citation(s) in RCA: 88] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open

RoyChoudhury A, Wakeley J. Sufficiency of the number of segregating sites in the limit under finite-sites mutation. Theor Popul Biol 2010;78:118-22. [DOI: 10.1016/j.tpb.2010.05.003] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2010] [Revised: 05/17/2010] [Accepted: 05/19/2010] [Indexed: 10/19/2022]

Verhoeven KJF, Casella G, McIntyre LM. Epistasis: obstacle or advantage for mapping complex traits? PLoS One 2010;5:e12264. [PMID: 20865037 PMCID: PMC2928725 DOI: 10.1371/journal.pone.0012264] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/15/2009] [Accepted: 04/19/2010] [Indexed: 01/22/2023] Open

Coalescent simulation of intracodon recombination. Genetics 2009;184:429-37. [PMID: 19933876 DOI: 10.1534/genetics.109.109736] [Citation(s) in RCA: 62] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Abstract

The coalescent with recombination is a very useful tool in molecular population genetics. Under this framework, genealogies often represent the evolution of the substitution unit, and because of this, the few coalescent algorithms implemented for the simulation of coding sequences force recombination to occur only between codons. However, it is clear that recombination is expected to occur most often within codons. Here we have developed an algorithm that can evolve coding sequences under an ancestral recombination graph that represents the genealogies at each nucleotide site, thereby allowing for intracodon recombination. The algorithm is a modification of Hudson's coalescent in which, in addition to keeping track of events occurring in the ancestral material that reaches the sample, we need to keep track of events occurring in ancestral material that does not reach the sample but that is produced by intracodon recombination. We are able to show that at typical substitution rates the number of nonsynonymous changes induced by intracodon recombination is small and that intracodon recombination does not generally result in inflated estimates of the overall nonsynonymous/synonymous substitution ratio (omega). On the other hand, recombination can bias the estimation of omega at particular codons, resulting in apparent rate variation among sites and in the spurious identification of positively selected sites. Importantly, in this case, allowing for variable synonymous rates across sites greatly reduces the false-positive rate and recovers statistical power. Finally, coalescent simulations with intracodon recombination could be used to better represent the evolution of nuclear coding genes or fast-evolving pathogens such as HIV-1.We have implemented this algorithm in a computer program called NetRecodon, freely available at http://darwin.uvigo.es.

Collapse

Ancestral population genomics: the coalescent hidden Markov model approach. Genetics 2009;183:259-74. [PMID: 19581452 DOI: 10.1534/genetics.109.103010] [Citation(s) in RCA: 70] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Eriksson A, Mahjani B, Mehlig B. Sequential Markov coalescent algorithms for population models with demographic structure. Theor Popul Biol 2009;76:84-91. [PMID: 19433100 DOI: 10.1016/j.tpb.2009.05.002] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/19/2009] [Revised: 05/04/2009] [Accepted: 05/04/2009] [Indexed: 10/24/2022]

Yawata M, Yawata N, Draghi M, Little AM, Partheniou F, Parham P. Roles for HLA and KIR polymorphisms in natural killer cell repertoire selection and modulation of effector function. ACTA ACUST UNITED AC 2006;203:633-45. [PMID: 16533882 PMCID: PMC2118260 DOI: 10.1084/jem.20051884] [Citation(s) in RCA: 436] [Impact Index Per Article: 24.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/28/2023]

Eriksson A, Mehlig B. Gene-history correlation and population structure. Phys Biol 2005;1:220-8. [PMID: 16204842 DOI: 10.1088/1478-3967/1/4/004] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]

Zhang K, Sun F. Assessing the power of tag SNPs in the mapping of quantitative trait loci (QTL) with extremal and random samples. BMC Genet 2005;6:51. [PMID: 16236175 PMCID: PMC1274312 DOI: 10.1186/1471-2156-6-51] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2005] [Accepted: 10/19/2005] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Recent studies have indicated that the human genome could be divided into regions with low haplotype diversity interspersed with regions of high haplotype diversity. In regions of low haplotype diversity, a small fraction of SNPs (tag SNPs) are sufficient to account for most of the haplotype diversity of the human genome. These tag SNPs can be extremely useful for testing the association of a marker locus with a qualitative or quantitative trait locus in that it may not be necessary to genotype all the SNPs. When tag SNPs are used to reduce the genotyping effort in association studies, it is important to know how much power is lost. It is also important to know how much power is gained when tag SNPs instead of the same number of randomly chosen SNPs are used.

RESULTS

We design a simulation study to tackle these problems for a variety of quantitative association tests using either case-parent samples or unrelated population samples. First, the samples are generated based on the quantitative trait model with the assumption of either an extremal sampling scheme or a random sampling scheme. Second, a small number of samples are selected to determine the haplotype blocks and the tag SNPs. Third, the statistical power of the tests is evaluated using four kinds of data: (1) all the SNPs and the corresponding haplotypes, (2) the tag SNPs and the corresponding haplotypes, (3) the same number of evenly spaced SNPs with minor allele frequency greater than a threshold and the corresponding haplotypes, (4) the same number of randomly chosen SNPs and their corresponding haplotypes.

CONCLUSION

Our results suggest that in most situations genotyping efforts can be significantly reduced by using tag SNPs for mapping the QTL in association studies without much loss of power, which is consistent with previous studies on association mapping of qualitative traits. For all situations considered, two-locus haplotype analysis using tag SNPs are more powerful than those using the same number of randomly selected SNPs, but the degree of such power differences depends upon the sampling scheme and the population history.

Collapse

Takebayashi N, Newbigin E, Uyenoyama MK. Maximum-likelihood estimation of rates of recombination within mating-type regions. Genetics 2005;167:2097-109. [PMID: 15342543 PMCID: PMC1471000 DOI: 10.1534/genetics.103.021535] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Wakeley J. Recent trends in population genetics: more data! More math! Simple models? J Hered 2004;95:397-405. [PMID: 15388767 DOI: 10.1093/jhered/esh062] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Verhoeven KJF, Simonsen KL. Genomic haplotype blocks may not accurately reflect spatial variation in historic recombination intensity. Mol Biol Evol 2004;22:735-40. [PMID: 15563716 DOI: 10.1093/molbev/msi058] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022] Open

Eriksson A, Mehlig B. On the effect of fluctuating recombination rates on the decorrelation of gene histories in the human genome. Genetics 2004;169:1175-8. [PMID: 15520270 PMCID: PMC1449132 DOI: 10.1534/genetics.103.018002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Zhang K, Qin ZS, Liu JS, Chen T, Waterman MS, Sun F. Haplotype block partitioning and tag SNP selection using genotype data and their applications to association studies. Genome Res 2004;14:908-16. [PMID: 15078859 PMCID: PMC479119 DOI: 10.1101/gr.1837404] [Citation(s) in RCA: 125] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]

Uyenoyama MK, Takebayashi N. A simple method for computing exact probabilities of mutation numbers. Theor Popul Biol 2004;65:271-84. [PMID: 15066423 DOI: 10.1016/j.tpb.2003.12.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2003] [Indexed: 10/26/2022]

Kim S, Zhang K, Sun F. Detecting susceptibility genes in case-control studies using set association. BMC Genet 2003;4 Suppl 1:S9. [PMID: 14975077 PMCID: PMC1866530 DOI: 10.1186/1471-2156-4-s1-s9] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Fearnhead P. Consistency of estimators of the population-scaled recombination rate. Theor Popul Biol 2003;64:67-79. [PMID: 12804872 DOI: 10.1016/s0040-5809(03)00041-8] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]

Wakeley J, Lessard S. Theory of the effects of population structure and sampling on patterns of linkage disequilibrium applied to genomic data from humans. Genetics 2003;164:1043-53. [PMID: 12871914 PMCID: PMC1462626 DOI: 10.1093/genetics/164.3.1043] [Citation(s) in RCA: 38] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Wiuf C, Posada D. A coalescent model of recombination hotspots. Genetics 2003;164:407-17. [PMID: 12750351 PMCID: PMC1462539 DOI: 10.1093/genetics/164.1.407] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Broughton RE, Harrison RG. Nuclear gene genealogies reveal historical, demographic and selective factors associated with speciation in field crickets. Genetics 2003;163:1389-401. [PMID: 12702683 PMCID: PMC1462531 DOI: 10.1093/genetics/163.4.1389] [Citation(s) in RCA: 81] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Zhang K, Calabrese P, Nordborg M, Sun F. Haplotype block structure and its applications to association studies: power and study designs. Am J Hum Genet 2002;71:1386-94. [PMID: 12439824 PMCID: PMC378580 DOI: 10.1086/344780] [Citation(s) in RCA: 207] [Impact Index Per Article: 9.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2002] [Accepted: 09/16/2002] [Indexed: 11/04/2022] Open

Abstract

Recent studies have shown that the human genome has a haplotype block structure, such that it can be divided into discrete blocks of limited haplotype diversity. In each block, a small fraction of single-nucleotide polymorphisms (SNPs), referred to as "tag SNPs," can be used to distinguish a large fraction of the haplotypes. These tag SNPs can potentially be extremely useful for association studies, in that it may not be necessary to genotype all SNPs; however, this depends on how much power is lost. Here we develop a simulation study to quantitatively assess the power loss for a variety of study designs, including case-control designs and case-parental control designs. First, a number of data sets containing case-parental or case-control samples are generated on the basis of a disease model. Second, a small fraction of case and control individuals in each data set are genotyped at all the loci, and a dynamic programming algorithm is used to determine the haplotype blocks and the tag SNPs based on the genotypes of the sampled individuals. Third, the statistical power of tests was evaluated on the basis of three kinds of data: (1) all of the SNPs and the corresponding haplotypes, (2) the tag SNPs and the corresponding haplotypes, and (3) the same number of randomly chosen SNPs as the number of tag SNPs and the corresponding haplotypes. We study the power of different association tests with a variety of disease models and block-partitioning criteria. Our study indicates that the genotyping efforts can be significantly reduced by the tag SNPs, without much loss of power. Depending on the specific haplotype block-partitioning algorithm and the disease model, when the identified tag SNPs are only 25% of all the SNPs, the power is reduced by only 4%, on average, compared with a power loss of approximately 12% when the same number of randomly chosen SNPs is used in a two-locus haplotype analysis. When the identified tag SNPs are approximately 14% of all the SNPs, the power is reduced by approximately 9%, compared with a power loss of approximately 21% when the same number of randomly chosen SNPs is used in a two-locus haplotype analysis. Our study also indicates that haplotype-based analysis can be much more powerful than marker-by-marker analysis.

Collapse

McVean GAT. A genealogical interpretation of linkage disequilibrium. Genetics 2002;162:987-91. [PMID: 12399406 PMCID: PMC1462283 DOI: 10.1093/genetics/162.2.987] [Citation(s) in RCA: 142] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

Fearnhead P, Donnelly P. Approximate likelihood methods for estimating local recombination rates. J R Stat Soc Series B Stat Methodol 2002. [DOI: 10.1111/1467-9868.00355] [Citation(s) in RCA: 64] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]

Discussion on the meeting on 'Statistical modelling and analysis of genetic data'. J R Stat Soc Series B Stat Methodol 2002. [DOI: 10.1111/1467-9868.00359] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]

Reich DE, Schaffner SF, Daly MJ, McVean G, Mullikin JC, Higgins JM, Richter DJ, Lander ES, Altshuler D. Human genome sequence variation and the influence of gene history, mutation and recombination. Nat Genet 2002;32:135-42. [PMID: 12161752 DOI: 10.1038/ng947] [Citation(s) in RCA: 235] [Impact Index Per Article: 10.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Wakeley J, Nielsen R, Liu-Cordero SN, Ardlie K. The discovery of single-nucleotide polymorphisms--and inferences about human demographic history. Am J Hum Genet 2001;69:1332-47. [PMID: 11704929 PMCID: PMC1235544 DOI: 10.1086/324521] [Citation(s) in RCA: 133] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2001] [Accepted: 09/24/2001] [Indexed: 11/03/2022] Open

Abstract

A method of historical inference that accounts for ascertainment bias is developed and applied to single-nucleotide polymorphism (SNP) data in humans. The data consist of 84 short fragments of the genome that were selected, from three recent SNP surveys, to contain at least two polymorphisms in their respective ascertainment samples and that were then fully resequenced in 47 globally distributed individuals. Ascertainment bias is the deviation, from what would be observed in a random sample, caused either by discovery of polymorphisms in small samples or by locus selection based on levels or patterns of polymorphism. The three SNP surveys from which the present data were derived differ both in their protocols for ascertainment and in the size of the samples used for discovery. We implemented a Monte Carlo maximum-likelihood method to fit a subdivided-population model that includes a possible change in effective size at some time in the past. Incorrectly assuming that ascertainment bias does not exist causes errors in inference, affecting both estimates of migration rates and historical changes in size. Migration rates are overestimated when ascertainment bias is ignored. However, the direction of error in inferences about changes in effective population size (whether the population is inferred to be shrinking or growing) depends on whether either the numbers of SNPs per fragment or the SNP-allele frequencies are analyzed. We use the abbreviation "SDL," for "SNP-discovered locus," in recognition of the genomic-discovery context of SNPs. When ascertainment bias is modeled fully, both the number of SNPs per SDL and their allele frequencies support a scenario of growth in effective size in the context of a subdivided population. If subdivision is ignored, however, the hypothesis of constant effective population size cannot be rejected. An important conclusion of this work is that, in demographic or other studies, SNP data are useful only to the extent that their ascertainment can be modeled.

Collapse

Posada D, Crandall KA. Evaluation of methods for detecting recombination from DNA sequences: computer simulations. Proc Natl Acad Sci U S A 2001;98:13757-62. [PMID: 11717435 PMCID: PMC61114 DOI: 10.1073/pnas.241370698] [Citation(s) in RCA: 1074] [Impact Index Per Article: 46.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2001] [Indexed: 11/18/2022] Open

Fearnhead P, Donnelly P. Estimating recombination rates from population genetic data. Genetics 2001;159:1299-318. [PMID: 11729171 PMCID: PMC1461855 DOI: 10.1093/genetics/159.3.1299] [Citation(s) in RCA: 223] [Impact Index Per Article: 9.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

McIntyre LM, Martin ER, Simonsen KL, Kaplan NL. Circumventing multiple testing: a multilocus Monte Carlo approach to testing for association. Genet Epidemiol 2000;19:18-29. [PMID: 10861894 DOI: 10.1002/1098-2272(200007)19:1<18::aid-gepi2>3.0.co;2-y] [Citation(s) in RCA: 81] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/01/2023]

Wiuf C. A coalescence approach to gene conversion. Theor Popul Biol 2000;57:357-67. [PMID: 10900188 DOI: 10.1006/tpbi.2000.1462] [Citation(s) in RCA: 20] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]

Wall JD. A comparison of estimators of the population recombination rate. Mol Biol Evol 2000;17:156-63. [PMID: 10666715 DOI: 10.1093/oxfordjournals.molbev.a026228] [Citation(s) in RCA: 122] [Impact Index Per Article: 5.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

Wiuf C, Hein J. Recombination as a point process along sequences. Theor Popul Biol 1999;55:248-59. [PMID: 10366550 DOI: 10.1006/tpbi.1998.1403] [Citation(s) in RCA: 104] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]

Griffiths RC. The time to the ancestor along sequences with recombination. Theor Popul Biol 1999;55:137-44. [PMID: 10329513 DOI: 10.1006/tpbi.1998.1390] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]

Wiuf C, Hein J. The ancestry of a sample of sequences subject to recombination. Genetics 1999;151:1217-28. [PMID: 10049937 PMCID: PMC1460527 DOI: 10.1093/genetics/151.3.1217] [Citation(s) in RCA: 42] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Simonsen KL, Churchill GA. A Markov Chain Model of Coalescence with Recombination. Theor Popul Biol 1997;52:43-59. [PMID: 9356323 DOI: 10.1006/tpbi.1997.1307] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/05/2023]

An Ancestral Recombination Graph. PROGRESS IN POPULATION GENETICS AND HUMAN EVOLUTION 1997. [DOI: 10.1007/978-1-4757-2609-1_16] [Citation(s) in RCA: 117] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/04/2022]

Griffiths RC, Marjoram P. Ancestral inference from samples of DNA sequences with recombination. J Comput Biol 1996;3:479-502. [PMID: 9018600 DOI: 10.1089/cmb.1996.3.479] [Citation(s) in RCA: 251] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/03/2023] Open

Griffiths RC. Which locus has the oldest allele? J Math Biol 1991;29:763-77. [PMID: 1940668 DOI: 10.1007/bf00160191] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]

Ethier SN, Griffiths RC. On the two-locus sampling distribution. J Math Biol 1990. [DOI: 10.1007/bf00168175] [Citation(s) in RCA: 47] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]