1
|
Genetic analysis of complex traits via Bayesian variable selection: the utility of a mixture of uniform priors. Genet Res (Camb) 2011; 93:303-18. [PMID: 21767461 DOI: 10.1017/s0016672311000164] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Indexed: 02/02/2023]
Abstract
A new estimation-based Bayesian variable selection approach is presented for genetic analysis of complex traits based on linear or logistic regression. By assigning a mixture of uniform priors (MU) to genetic effects, the approach provides an intuitive way of specifying hyperparameters controlling the selection of multiple influential loci. It aims at avoiding the difficulty of interpreting assumptions made in the specifications of priors. The method is compared in two real datasets with two other approaches, stochastic search variable selection (SSVS) and a re-formulation of Bayes B utilizing indicator variables and adaptive Student's t-distributions (IAt). The Markov Chain Monte Carlo (MCMC) sampling performance of the three methods is evaluated using the publicly available software OpenBUGS (model scripts are provided in the Supplementary material). The sensitivity of MU to the specification of hyperparameters is assessed in one of the data examples.
|
2
|
Schulz A, Fischer C, Chang-Claude J, Beckmann L. Entropy-supported marker selection and Mantel statistics for haplotype sharing analysis. Genet Epidemiol 2010; 34:354-63. [DOI: 10.1002/gepi.20491] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/11/2022]
|
3
|
Tsai MY, Hsiao CK, Wen SH. A Bayesian spatial multimarker genetic random-effect model for fine-scale mapping. Ann Hum Genet 2008; 72:658-69. [PMID: 18573105 DOI: 10.1111/j.1469-1809.2008.00459.x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/29/2022]
Abstract
Multiple markers in linkage disequilibrium (LD) are usually used to localize the disease gene location. These markers may contribute to the disease etiology simultaneously. In contrast to the single-locus tests, we propose a genetic random effects model that accounts for the dependence between loci via their spatial structures. In this model, the locus-specific random effects measure not only the genetic disease risk, but also the correlations between markers. In other words, the model incorporates this relation in both mean and covariance structures, and the variance components play important roles. We consider two different settings for the spatial relations. The first is our proposal, relative distance function (RDF), which is intuitive in the sense that markers nearby are likely to correlate with each other. The second setting is a common exponential decay function (EDF). Under each setting, the inference of the genetic parameters is fully Bayesian with Markov chain Monte Carlo (MCMC) sampling. We demonstrate the validity and the utility of the proposed approach with two real datasets and simulation studies. The analyses show that the proposed model with either one of two spatial correlations performs better as compared with the single locus analysis. In addition, under the RDF model, a more precise estimate for the disease locus can be obtained even when the candidate markers are fairly dense. In all simulations, the inference under the true model provides unbiased estimates of the genetic parameters, and the model with the spatial correlation structure does lead to greater confidence interval coverage probabilities.
Affiliation(s)
- M-Y Tsai
- Institute of Statistics and Information Science, College of Science, National Changhua University of Education
|
4
|
Wang K, Abbott D. A principal components regression approach to multilocus genetic association studies. Genet Epidemiol 2008; 32:108-18. [PMID: 17849491 DOI: 10.1002/gepi.20266] [Citation(s) in RCA: 111] [Impact Index Per Article: 6.9] [Indexed: 11/06/2022]
Abstract
With the rapid development of modern genotyping technology, it is becoming commonplace to genotype densely spaced genetic markers such as single nucleotide polymorphisms (SNPs) along the genome. This development has inspired a strong interest in using multiple markers located in the target region for the detection of association. We introduce a principal components (PCs) regression method for candidate gene association studies where multiple SNPs from the candidate region tend to be correlated. In this approach, the total variance in the original genotype scores is decomposed into parts that correspond to uncorrelated PCs. The PCs with the largest variances are then used as regressors in a multiple regression. Simulation studies suggest that this approach can have higher power than some popular methods. An application to CHI3L2 gene expression data confirms a significant association between CHI3L2 gene expression level and SNPs from this gene that has been previously reported by others.
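The principal components regression idea in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `pc_regression`, the 80% variance threshold for choosing how many PCs to keep, and the overall F statistic are all assumptions made here for the example.

```python
import numpy as np

def pc_regression(genotypes, phenotype, var_explained=0.8):
    """Regress a phenotype on the leading PCs of SNP genotype scores.

    Hypothetical helper: the 0.8 variance threshold is an assumed choice.
    """
    X = np.asarray(genotypes, dtype=float)
    y = np.asarray(phenotype, dtype=float)
    n = len(y)

    # Center each SNP column so the PCs describe variance around the mean.
    Xc = X - X.mean(axis=0)

    # SVD of the centered matrix; squared singular values are
    # proportional to the variances of the principal components.
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    frac = np.cumsum(s**2) / np.sum(s**2)

    # Smallest number of PCs whose cumulative variance reaches the threshold.
    k = min(int(np.searchsorted(frac, var_explained)) + 1, len(s))

    # PC scores are mutually uncorrelated by construction.
    scores = U[:, :k] * s[:k]

    # Ordinary least squares of the phenotype on an intercept plus k PCs.
    design = np.column_stack([np.ones(n), scores])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)

    rss = np.sum((y - design @ beta) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - rss / tss

    # Overall F statistic for the k PC regressors (k and n - k - 1 df).
    f_stat = (r2 / k) / ((1.0 - r2) / (n - k - 1))
    return k, beta, r2, f_stat
```

Because correlated SNPs load onto a few leading PCs, the regression uses `k` degrees of freedom rather than one per SNP, which is the motivation the abstract gives for the approach.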
Affiliation(s)
- Kai Wang
- Department of Biostatistics, College of Public Health, The University of Iowa, Iowa City, IA 52242, USA.
|
5
|
Wang T, Elston RC. Improved power by use of a weighted score test for linkage disequilibrium mapping. Am J Hum Genet 2007; 80:353-60. [PMID: 17236140 PMCID: PMC1785334 DOI: 10.1086/511312] [Citation(s) in RCA: 94] [Impact Index Per Article: 5.5] [Received: 09/19/2006] [Accepted: 11/21/2006] [Indexed: 01/06/2023]
Abstract
Association studies offer an exciting approach to finding underlying genetic variants of complex human diseases. However, identification of genetic variants still includes difficult challenges, and it is important to develop powerful new statistical methods. Currently, association methods may depend on single-locus analysis--that is, analysis of the association of one locus, which is typically a single-nucleotide polymorphism (SNP), at a time--or on multilocus analysis, in which multiple SNPs are used to allow extraction of maximum information about linkage disequilibrium (LD). It has been shown that single-locus analysis may have low power because a single SNP often has limited LD information. Multilocus analysis, which is more informative, can be performed on the basis of either haplotypes or genotypes. It may lose power because of the often large number of degrees of freedom involved. The ideal method must make full use of important information from multiple loci but avoid increasing the degrees of freedom. Therefore, we propose a method to capture information from multiple SNPs but with the use of fewer degrees of freedom. When a set of SNPs in a block are correlated because of LD, we might expect that the genotype variation among the different phenotypic groups would extend across all the SNPs, and this information could be compressed into the low-frequency components of a Fourier transform. Therefore, we develop a test based on weighted Fourier transformation coefficients, with more weight given to the low-frequency components. Our simulation results demonstrate the validity and substantially higher power of the proposed method compared with other common methods. This method provides an additional tool to existing methods for identification of causative genetic variants underlying complex diseases.
Affiliation(s)
- Tao Wang
- Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH 44106, USA
|
6
|
Dupuis J. Effect of linkage disequilibrium between markers in linkage and association analyses. Genet Epidemiol 2007; 31 Suppl 1:S139-48. [DOI: 10.1002/gepi.20291] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/18/2023]
|
7
|
Shugart YY, Chen L, Li R, Beaty T. Family-based linkage disequilibrium tests using general pedigrees. Methods Mol Biol 2007; 376:141-149. [PMID: 17984543 DOI: 10.1007/978-1-59745-389-9_10] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/25/2023]
Abstract
Linkage disequilibrium (LD) mapping has been established as a promising approach to identifying disease genes. The presence of a disease gene located near a marker locus may cause LD between the marker and the disease loci. In LD mapping, we assume that some of the affected individuals may have a common ancestor carrying the mutation and that mutation carriers are likely to share alleles at the marker loci close to the disease gene. This chapter reviews the concept of LD mapping and outlines the advantages and disadvantages of two LD mapping approaches capable of handling general pedigrees: the family-based association test (FBAT) and pseudomarker. In summary, the pseudomarker statistical approach and the FBAT approach are both expected to offer reasonable statistical power to detect genes underlying complex traits. However, when the pedigree structure is more complicated, or when the number of informative families is limited, the pseudomarker approach is anticipated to outperform FBAT.
Affiliation(s)
- Yin Yao Shugart
- Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
|
8
|
Sillanpää MJ, Bhattacharjee M. Association mapping of complex trait loci with context-dependent effects and unknown context variable. Genetics 2006; 174:1597-611. [PMID: 17028339 PMCID: PMC1667093 DOI: 10.1534/genetics.106.061275] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 05/24/2006] [Accepted: 08/28/2006] [Indexed: 11/18/2022]
Abstract
A novel method for Bayesian analysis of genetic heterogeneity and multilocus association in random population samples is presented. The method is valid for quantitative and binary traits as well as for multiallelic markers. In the method, individuals are stochastically assigned into two etiological groups that can have both their own, and possibly different, subsets of trait-associated (disease-predisposing) loci or alleles. The method is favorable especially in situations when etiological models are stratified by factors that are unknown or went unmeasured, that is, if genetic heterogeneity is due to, for example, unknown gene x environment or gene x gene interactions. Additionally, a heterogeneity structure for the phenotype does not need to follow the structure of the general population; it can have a distinct selection history. The performance of the method is illustrated with a simulated example of gene x environment interaction (quantitative trait with loosely linked markers) and compared to the results of single-group analysis in the presence of missing data. Additionally, example analyses with previously analyzed cystic fibrosis and type 2 diabetes data sets (binary traits with closely linked markers) are presented. The implementation (written in WinBUGS) is freely available for research purposes from http://www.rni.helsinki.fi/~mjs/.
|
9
|
Song PXK, Gao X, Liu R, Le W. Nonparametric inference for local extrema with application to oligonucleotide microarray data in yeast genome. Biometrics 2006; 62:545-54. [PMID: 16918919 DOI: 10.1111/j.1541-0420.2005.00501.x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 12/01/2022]
Abstract
Identifying local extrema of expression profiles is one primary objective in some cDNA microarray experiments. To study the replication dynamics of the yeast genome, for example, local peaks of hybridization intensity profiles correspond to putative replication origins. We propose a nonparametric kernel smoothing (NKS) technique to detect local hybridization intensity extrema across chromosomes. The novelty of our approach is that we base our inference procedures on equilibrium points, namely those locations at which the first derivative of the intensity curve is zero. The proposed smoothing technique provides both point and interval estimation for the location of local extrema. Also, this technique can be used to test for the hypothesis of either one or multiple suspected locations being the true equilibrium points. We illustrate the proposed method on a microarray data set from an experiment designed to study the replication origins in the yeast genome, in that the locations of autonomous replication sequence (ARS) elements are identified through the equilibrium points of the smoothed intensity profile curve. Our method found a few ARS elements that were not detected by the current smoothing methods such as the Fourier convolution smoothing.
Affiliation(s)
- Peter X-K Song
- Department of Statistics and Actuarial Science, University of Waterloo, 200 University Avenue W., Waterloo, Ontario N2L 3G1, Canada.
|
10
|
Sevon P, Toivonen H, Ollikainen V. TreeDT: tree pattern mining for gene mapping. IEEE/ACM Trans Comput Biol Bioinform 2006; 3:174-85. [PMID: 17048403 DOI: 10.1109/tcbb.2006.28] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 05/12/2023]
Abstract
We describe TreeDT, a novel association-based gene mapping method. Given a set of disease-associated haplotypes and a set of control haplotypes, TreeDT predicts likely locations of a disease susceptibility gene. TreeDT extracts, essentially in the form of haplotype trees, information about historical recombinations in the population: A haplotype tree constructed at a given chromosomal location is an estimate of the genealogy of the haplotypes. TreeDT constructs these trees for all locations on the given haplotypes and performs a novel disequilibrium test on each tree: Is there a small set of subtrees with relatively high proportions of disease-associated chromosomes, suggesting shared genetic history for those and a likely disease gene location? We give a detailed description of TreeDT and the tree disequilibrium tests, we analyze the algorithm formally, and we evaluate its performance experimentally on both simulated and real data sets. Experimental results demonstrate that TreeDT has high accuracy on difficult mapping tasks and comparisons to other methods (EATDT, HPM, TDT) show that TreeDT is very competitive.
Affiliation(s)
- Petteri Sevon
- Department of Computer Science, PO Box 68, FI-00014 University of Helsinki, Finland.
|
11
|
Freimer NB, Sabatti C. Guidelines for association studies in Human Molecular Genetics. Hum Mol Genet 2005; 14:2481-3. [PMID: 16037069 DOI: 10.1093/hmg/ddi251] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.1] [Indexed: 12/24/2022]
Affiliation(s)
- Nelson B Freimer
- Department of Psychiatry, UCLA Center for Neurobehavioral Genetics, Semel Institute for Neuroscience and Human Behavior, Los Angeles, CA 90095-1761, USA.
|
12
|
Bardel C, Danjean V, Hugot JP, Darlu P, Génin E. On the use of haplotype phylogeny to detect disease susceptibility loci. BMC Genet 2005; 6:24. [PMID: 15904492 PMCID: PMC1173100 DOI: 10.1186/1471-2156-6-24] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.9] [Received: 01/21/2005] [Accepted: 05/18/2005] [Indexed: 12/19/2022]
Abstract
BACKGROUND: The cladistic approach proposed by Templeton has been presented as promising for the study of the genetic factors involved in common diseases. This approach allows the joint study of multiple markers within a gene by considering haplotypes and grouping them in nested clades. The idea is to search for clades with an excess of cases as compared to the whole sample and to identify the mutations defining these clades as potential candidate disease susceptibility sites. However, the performance of this approach for the study of the genetic factors involved in complex diseases has never been studied. RESULTS: In this paper, we propose a new method to perform such a cladistic analysis and we estimate its power through simulations. We show that under models where the susceptibility to the disease is caused by a single genetic variant, the cladistic test is neither really more powerful to detect an association nor really more efficient to localize the susceptibility site than individual SNP testing. However, when two interacting sites are responsible for the disease, the cladistic analysis greatly improves the probability of finding the two susceptibility sites. The impact of the linkage disequilibrium and of the tree characteristics on the efficiency of the cladistic analysis is also discussed. An application on a real data set concerning the CARD15 gene and Crohn disease shows that the method can successfully identify the three variant sites that are involved in the disease susceptibility. CONCLUSION: The use of phylogenies to group haplotypes is especially interesting to pinpoint the sites that are likely to be involved in disease susceptibility among the different markers identified within a gene.
Affiliation(s)
- Claire Bardel
- Unité de recherche en Génétique Épidémiologique et structure des populations humaines, INSERM U535, Villejuif, France
- Vincent Danjean
- Laboratoire Bordelais de Recherche en Informatique, UMR 5800, Bordeaux, France
- Jean-Pierre Hugot
- Programme Avenir, INSERM U458, hôpital Robert Debré, AP-HP, Paris, France
- Fondation Jean Dausset, Paris, France
- Pierre Darlu
- Unité de recherche en Génétique Épidémiologique et structure des populations humaines, INSERM U535, Villejuif, France
- Emmanuelle Génin
- Unité de recherche en Génétique Épidémiologique et structure des populations humaines, INSERM U535, Villejuif, France
|
13
|
Gupta PK, Rustgi S, Kulwal PL. Linkage disequilibrium and association studies in higher plants: present status and future prospects. Plant Mol Biol 2005; 57:461-85. [PMID: 15821975 DOI: 10.1007/s11103-005-0257-z] [Citation(s) in RCA: 284] [Impact Index Per Article: 14.9] [Received: 06/01/2004] [Accepted: 01/04/2005] [Indexed: 05/19/2023]
Abstract
During the last two decades, DNA-based molecular markers have been extensively utilized for a variety of studies in both plant and animal systems. One of the major uses of these markers is the construction of genome-wide molecular maps and the genetic analysis of simple and complex traits. However, these studies are generally based on linkage analysis in mapping populations, thus placing serious limitations on using molecular markers for genetic analysis in a variety of plant systems. Therefore, alternative approaches have been suggested, and one of these approaches makes use of linkage disequilibrium (LD)-based association analysis. Although this approach of association analysis has already been used for studies on genetics of complex traits (including different diseases) in humans, its use in plants has just started. In the present review, we first define and distinguish between LD and association mapping, and then briefly describe various measures of LD and the two methods of its depiction. We then give a list of different factors that affect LD without discussing them, and also discuss the current issues of LD research in plants. Later, we also describe the various uses of LD in plant genomics research and summarize the present status of LD research in different plant genomes. In the end, we discuss briefly the future prospects of LD research in plants, and give a list of software tools that are useful in LD research, which is available as electronic supplementary material (ESM).
Affiliation(s)
- Pushpendra K Gupta
- Molecular Biology Laboratory, Department of Genetics & Plant Breeding, Ch. Charan Singh University, Meerut 250 004 (UP), India.
|
14
|
Zhang X, Roeder K, Wallstrom G, Devlin B. Integration of association statistics over genomic regions using Bayesian adaptive regression splines. Hum Genomics 2005; 1:20-9. [PMID: 15601530 PMCID: PMC3525002 DOI: 10.1186/1479-7364-1-1-20] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Indexed: 11/29/2022]
Abstract
In the search for genetic determinants of complex disease, two approaches to association analysis are most often employed, testing single loci or testing a small group of loci jointly via haplotypes for their relationship to disease status. It is still debatable which of these approaches is more favourable, and under what conditions. The former has the advantage of simplicity but suffers severely when alleles at the tested loci are not in linkage disequilibrium (LD) with liability alleles; the latter should capture more of the signal encoded in LD, but is far from simple. The complexity of haplotype analysis could be especially troublesome for association scans over large genomic regions, which, in fact, are becoming the standard design. For these reasons, the authors have been evaluating statistical methods that bridge the gap between single-locus and haplotype-based tests. In this article, they present one such method, which uses non-parametric regression techniques embodied by Bayesian adaptive regression splines (BARS). For a set of markers falling within a common genomic region and a corresponding set of single-locus association statistics, the BARS procedure integrates these results into a single test by examining the class of smooth curves consistent with the data. The non-parametric BARS procedure generally finds no signal when no liability allele exists in the tested region (i.e., it achieves the specified size of the test) and it is sensitive enough to pick up signals when a liability allele is present. The BARS procedure provides a robust and potentially powerful alternative to classical tests of association, diminishes the multiple testing problem inherent in those tests and can be applied to a wide range of data types, including genotype frequencies estimated from pooled samples.
Affiliation(s)
- Xiaohua Zhang
- Department of Statistics, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Kathryn Roeder
- Department of Statistics, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Garrick Wallstrom
- Department of Statistics, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Bernie Devlin
- Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, PA 15213, USA
|
15
|
Roeder K, Bacanu SA, Sonpar V, Zhang X, Devlin B. Analysis of single-locus tests to detect gene/disease associations. Genet Epidemiol 2005; 28:207-19. [PMID: 15637715 DOI: 10.1002/gepi.20050] [Citation(s) in RCA: 83] [Impact Index Per Article: 4.4] [Indexed: 11/06/2022]
Abstract
A goal of association analysis is to determine whether variation in a particular candidate region or gene is associated with liability to complex disease. To evaluate such candidates, ubiquitous Single Nucleotide Polymorphisms (SNPs) are useful. It is critical, however, to select a set of SNPs that are in substantial linkage disequilibrium (LD) with all other polymorphisms in the region. Whether there is an ideal statistical framework to test such a set of 'tag SNPs' for association is unknown. Compared to tests for association based on frequencies of haplotypes, recent evidence suggests tests for association based on linear combinations of the tag SNPs (Hotelling T(2) test) are more powerful. Following this logical progression, we asked whether single-locus tests would prove generally more powerful than the regression-based tests. We answer this question by investigating four inferential procedures: the maximum of a series of test statistics corrected for multiple testing by the Bonferroni procedure, T(B), or by permutation of case-control status, T(P); a procedure that tests the maximum of a smoothed curve fitted to the series of test statistics, T(S); and the Hotelling T(2) procedure, which we call T(R). These procedures are evaluated by simulating data like that from human populations, including realistic levels of LD and realistic effects of alleles conferring liability to disease. We find that power depends on the correlation structure of SNPs within a gene, the density of tag SNPs, and the placement of the liability allele. The clearest pattern emerges between power and the number of SNPs selected. When a large fraction of the SNPs within a gene are tested, and multiple SNPs are highly correlated with the liability allele, T(S) has better power. Using a SNP selection scheme that optimizes power but also requires a substantial number of SNPs to be genotyped (roughly 10-20 SNPs per gene), power of T(P) is generally superior to that for the other procedures, including T(R). Finally, when a SNP selection procedure that targets a minimal number of SNPs per gene is applied, the average performances of T(P) and T(R) are indistinguishable.
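The two maximum-statistic procedures named in this abstract, T(B) and T(P), can be illustrated with a short sketch. This is only a rough reading of the idea, under assumptions not stated in the abstract: the per-SNP statistic is taken to be N times the squared correlation between genotype score and case status (a trend-type statistic referred to a 1-df chi-square), every SNP column is assumed polymorphic, and the function names are hypothetical.

```python
import math
import numpy as np

def trend_stats(G, status):
    """Per-SNP statistic N * r^2, with r the genotype/status correlation.

    Assumed statistic for illustration; columns of G must vary.
    """
    G = np.asarray(G, dtype=float)
    s = np.asarray(status, dtype=float)
    Gc = G - G.mean(axis=0)
    sc = s - s.mean()
    num = (Gc * sc[:, None]).sum(axis=0)
    den = np.sqrt((Gc**2).sum(axis=0) * (sc**2).sum())
    return len(s) * (num / den) ** 2

def max_stat_pvalues(G, status, n_perm=999, seed=0):
    """Bonferroni and permutation p-values for the maximum single-SNP statistic."""
    rng = np.random.default_rng(seed)
    obs = trend_stats(G, status)
    t_max = float(obs.max())
    m = obs.size

    # Upper tail of a 1-df chi-square: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_single = math.erfc(math.sqrt(t_max / 2.0))
    p_bonf = min(1.0, m * p_single)          # Bonferroni over m SNPs

    # Permute case-control labels and recompute the maximum each time.
    exceed = 0
    for _ in range(n_perm):
        perm_max = trend_stats(G, rng.permutation(status)).max()
        exceed += int(perm_max >= t_max)
    p_perm = (1 + exceed) / (n_perm + 1)
    return t_max, p_bonf, p_perm
```

The permutation version keeps the correlation among SNPs intact while breaking their link to disease status, so it is typically less conservative than Bonferroni when the SNPs are in strong LD.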
Affiliation(s)
- Kathryn Roeder
- Department of Statistics, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA.
|
16
|
Sillanpää MJ, Bhattacharjee M. Bayesian association-based fine mapping in small chromosomal segments. Genetics 2005; 169:427-39. [PMID: 15371355 PMCID: PMC1448870 DOI: 10.1534/genetics.104.032680] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.6] [Received: 06/22/2004] [Accepted: 09/16/2004] [Indexed: 11/18/2022]
Abstract
A Bayesian method for fine mapping is presented, which deals with multiallelic markers (with two or more alleles), unknown phase, missing data, multiple causal variants, and both continuous and binary phenotypes. We consider small chromosomal segments spanned by a dense set of closely linked markers and putative genes only at marker points. In the phenotypic model, locus-specific indicator variables are used to control inclusion in or exclusion from marker contributions. To account for covariance between consecutive loci and to control fluctuations in association signals along a candidate region we introduce a joint prior for the indicators that depends on genetic or physical map distances. The potential of the method, including posterior estimation of trait-associated loci, their effects, linkage disequilibrium pattern due to close linkage of loci, and the age of a causal variant (time to most recent common ancestor), is illustrated with the well-known cystic fibrosis and Friedreich ataxia data sets by assuming that haplotypes were not available. In addition, simulation analysis with large genetic distances is shown. Estimation of model parameters is based on Markov chain Monte Carlo (MCMC) sampling and is implemented using WinBUGS. The model specification code is freely available for research purposes from http://www.rni.helsinki.fi/~mjs/.
Affiliation(s)
- Mikko J Sillanpää
- Rolf Nevanlinna Institute, University of Helsinki, FIN-00014 Helsinki, Finland.
|
17
|
Frisch A, Colombo R, Michaelovsky E, Karpati M, Goldman B, Peleg L. Origin and spread of the 1278insTATC mutation causing Tay-Sachs disease in Ashkenazi Jews: genetic drift as a robust and parsimonious hypothesis. Hum Genet 2004; 114:366-76. [PMID: 14727180 DOI: 10.1007/s00439-003-1072-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Received: 08/13/2003] [Accepted: 11/29/2003] [Indexed: 11/30/2022]
Abstract
The 1278insTATC is the most prevalent beta-hexosaminidase A (HEXA) gene mutation causing Tay-Sachs disease (TSD), one of the four lysosomal storage diseases (LSDs) occurring at elevated frequencies among Ashkenazi Jews (AJs). To investigate the genetic history of this mutation in the AJ population, a conserved haplotype (D15S981:175-D15S131:240-D15S1050:284-D15S197:144-D15S188:418) was identified in 1278insTATC chromosomes from 55 unrelated AJ individuals (15 homozygotes and 40 heterozygotes for the TSD mutation), suggesting the occurrence of a common founder. When two methods were used for analysis of linkage disequilibrium (LD) between flanking polymorphic markers and the disease locus and for the study of the decay of LD over time, the estimated age of the insertion was found to be 40 ± 12 generations (95% confidence interval: 30-50 generations), so that the most recent common ancestor of the mutation-bearing chromosomes would date to the 8th-9th century. This corresponds with the demographic expansion of AJs in central Europe, following the founding of the Ashkenaz settlement in the early Middle Ages. The results are consistent with the geographic distribution of the main TSD mutation, 1278insTATC being more common in central Europe, and with the coalescent times of mutations causing two other LSDs, Gaucher disease and mucolipidosis type IV. Evidence for the absence of a determinant positive selection (heterozygote advantage) over the mutation is provided by a comparison between the estimated age of 1278insTATC and the probability of the current AJ frequency of the mutant allele as a function of its age, calculated by use of a branching-process model. Therefore, the founder effect in a rapidly expanding population arising from a bottleneck provides a robust parsimonious hypothesis explaining the spread of 1278insTATC-linked TSD in AJ individuals.
Affiliation(s)
- Amos Frisch
- Felsenstein Medical Research Center, Rabin Medical Center, 49100, Petah Tikva, Israel.
|
18
|
|
19
|
Hsu FC, Liang KY, Beaty TH. Multipoint linkage disequilibrium mapping approach: incorporating evidence of linkage and linkage disequilibrium from unlinked region. Genet Epidemiol 2003; 25:1-13. [PMID: 12813722 DOI: 10.1002/gepi.10241] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 11/10/2022]
Abstract
Gene mapping for complex diseases is still a challenge in genetic studies. For family-based studies, the single-locus methods for detecting linkage and linkage disequilibrium (LD) one at a time may not capture the assumed interaction between multiple causal genes efficiently. We propose a multipoint LD approach for assessing the evidence of linkage and LD in a targeted chromosomal region by incorporating evidence from an unlinked region using the case-parent trio design. The paternal and maternal preferential transmission statistics defined in Liang et al. ([2001] Am. J. Hum. Genet. 68:937-950) are the primary statistics for this approach. Our generalized estimating equation (GEE) method builds on a model using the expected preferential transmission statistic from the targeted region conditional on this same statistic from the unlinked region. The major assumption is that there is no more than one trait locus in both the targeted region and unlinked region. The map position of an unobserved trait locus and its confidence interval can be calculated. Finally, we apply this approach to the African-American families drawn from the Collaborative Study on the Genetics of Asthma (CSGA). Previous analysis using this GEE approach developed by Liang et al. ([2001] Am. J. Hum. Genet. 68:937-950) suggested strong evidence of linkage and LD on chromosome 11, but only marginal evidence on chromosome 8. While conditioning on marker D11S937 on chromosome 11, a separate trait locus on chromosome 8 was estimated at tau(2) empty set = 11.67 cM, with a 95% confidence interval of (8.75, 14.59), and the test statistic shows significant evidence of linkage and LD (P-value=0.0198) in this region of chromosome 8.
Affiliation(s)
- Fang-Chi Hsu
- Section on Biostatistics, Department of Public Health Sciences, Wake Forest University School of Medicine, Winston-Salem, North Carolina 21205, USA
|
20
|
Conti DV, Witte JS. Hierarchical modeling of linkage disequilibrium: genetic structure and spatial relations. Am J Hum Genet 2003; 72:351-63. [PMID: 12525994 PMCID: PMC379228 DOI: 10.1086/346117] [Citation(s) in RCA: 59] [Impact Index Per Article: 2.8] [Received: 07/02/2002] [Accepted: 11/04/2002] [Indexed: 11/03/2022] Open
Abstract
Linkage disequilibrium (LD) mapping offers much promise for the positional cloning of disease-causing genes. However, conventional estimates of LD may fluctuate substantially across contiguous genomic regions, because of population-specific phenomena such as mutation, genetic drift, population structure, and variations in allele frequencies. This fluctuation makes it difficult to interpret patterns of LD and distinguish where a causal gene is located. To address this issue, we propose hierarchical modeling of LD (HLD) for fine-scale mapping. This approach incorporates information on haplotype block structure and chromosomal spatial relations to refine the pattern of LD, increasing the ability to localize disease genes. Here, we present a framework for HLD, a simulation study assessing the performance of HLD under various scenarios, and an application of HLD to existing data. This work demonstrates that hierarchical modeling of linkage disequilibrium is a valuable and flexible approach for fine-scale mapping.
Affiliation(s)
- David V Conti
- Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH, 44106, USA
|
21
|
Hsu FC, Liang KY, Beaty TH, Barnes KC. Unified sampling approach for multipoint linkage disequilibrium mapping of qualitative and quantitative traits. Genet Epidemiol 2002; 22:298-312. [PMID: 11984863 DOI: 10.1002/gepi.0194] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022]
Abstract
Rapid development in biotechnology has enhanced the opportunity to deal with multipoint gene mapping for complex diseases, and association studies using quantitative traits have recently generated much attention. Unlike the conventional hypothesis-testing approach for fine mapping, we propose a unified multipoint method to localize a gene controlling a quantitative trait. We first calculate the sample size needed to detect linkage and linkage disequilibrium (LD) for a quantitative trait, categorized by decile, under three different modes of inheritance. Our results show that sampling trios of offspring and their parents from either extremely low (EL) or extremely high (EH) probands provides greater statistical power than sampling in the intermediate range. We next propose a unified sampling approach for multipoint LD mapping, where the goal is to estimate the map position (τ) of a trait locus and to calculate a confidence interval along with its sampling uncertainty. Our method builds upon a model for an expected preferential transmission statistic at an arbitrary locus conditional on the sampling scheme, such as sampling from EL and EH probands. This approach is valid regardless of the underlying genetic model. The one major assumption for this model is that no more than one quantitative trait locus (QTL) is linked to the region being mapped. Finally, we illustrate the proposed method using family data on total serum IgE levels collected in multiplex asthmatic families from Barbados. An unobserved QTL appears to be located at τ = 41.93 cM with 95% confidence interval of (40.84, 43.02) through the 20-cM region framed by markers D12S1052 and D12S1064 on chromosome 12. The test statistic shows strong evidence of linkage and LD (chi-square statistic = 18.39 with 2 df, P-value = 0.0001).
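The reported P-value can be checked by hand, because a chi-square variable with 2 degrees of freedom is an exponential variable with mean 2, so its upper-tail probability has the closed form exp(-x/2). The sketch below applies that identity to the statistic quoted in the abstract; the function name `chi2_sf_2df` is hypothetical, and the check assumes the P-value is the plain upper tail of the reference distribution.

```python
import math


def chi2_sf_2df(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    # With 2 df the chi-square density is (1/2) * exp(-x / 2),
    # so the survival function integrates to exp(-x / 2).
    return math.exp(-x / 2.0)


# Statistic quoted in the abstract: 18.39 with 2 df.
p = chi2_sf_2df(18.39)
print(f"P = {p:.4f}")
```

Rounded to four decimals, this agrees with the P-value of 0.0001 given in the abstract.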
Affiliation(s)
- Fang-Chi Hsu
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
|
22
|
Abstract
We illustrate how homozygosity of haplotypes can be used to measure the level of disequilibrium between two or more markers. An excess of either homozygosity or heterozygosity signals a departure from the gametic phase equilibrium: We describe the specific form of dependence that is associated with high (low) homozygosity and derive various linkage disequilibrium measures. They feature a clear biological interpretation, can be used to construct tests, and are standardized to allow comparison across loci and populations. They are particularly advantageous to measure linkage disequilibrium between highly polymorphic markers.
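One way to make the idea concrete: under gametic phase equilibrium each two-marker haplotype frequency is the product of its allele frequencies, so haplotype homozygosity factorizes into the product of the single-marker homozygosities. The Python sketch below compares observed haplotype homozygosity with that equilibrium expectation. It is an illustration of the general principle only; the function names, the plug-in frequency estimates, and the toy data are assumptions, and the difference computed here is not claimed to be one of the standardized measures the authors derive.

```python
from collections import Counter


def homozygosity(freqs):
    """Probability that two independently drawn copies carry the same type."""
    return sum(f * f for f in freqs)


def excess_homozygosity(haplotypes):
    """Observed two-marker haplotype homozygosity minus its equilibrium expectation.

    `haplotypes` is a list of (allele_at_marker_1, allele_at_marker_2) tuples.
    Frequencies are simple sample proportions (plug-in estimates).
    """
    n = len(haplotypes)
    hap_freqs = [count / n for count in Counter(haplotypes).values()]
    m1_freqs = [count / n for count in Counter(a for a, _ in haplotypes).values()]
    m2_freqs = [count / n for count in Counter(b for _, b in haplotypes).values()]
    observed = homozygosity(hap_freqs)
    # Under equilibrium h_ij = p_i * q_j, so sum(h_ij^2) = sum(p_i^2) * sum(q_j^2).
    expected = homozygosity(m1_freqs) * homozygosity(m2_freqs)
    return observed - expected


# Made-up toy samples: complete association versus independence.
linked = [("A", "1")] * 50 + [("B", "2")] * 50
independent = [("A", "1"), ("A", "2"), ("B", "1"), ("B", "2")] * 25
print(excess_homozygosity(linked))
print(excess_homozygosity(independent))
```

In the toy data the completely associated sample shows an excess of 0.25, while the independent sample shows none, matching the statement that excess homozygosity signals departure from equilibrium.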
Affiliation(s)
- Chiara Sabatti
- Department of Human Genetics and Statistics, University of California, Los Angeles, California 90095-7088, USA.
|
23
|
Lazzeroni LC. Allele sharing and allelic association I: sib pair tests with increased power. Genet Epidemiol 2002; 22:328-44. [PMID: 11984865 DOI: 10.1002/gepi.0185] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/12/2022]
Abstract
Affected sib pair data contain information about allele sharing and allelic association. Either of these features can point to the presence of a risk-related gene. This study introduces the elliptical sib pair test, a generalization of traditional sib pair tests. The proposed test can be implemented using any of three strategies, the choice of which depends on the anticipated combination of sharing and association. The elliptical sib pair test can achieve substantial gains in power relative to traditional tests for likely alternative hypotheses at little or no cost for other alternatives. The proposed test is valid under most models of genetic risk, disease etiology, and genotype-haplotype distributions. This study also provides new insight into the trade-off between tests of allelic association and tests of allele sharing.
Affiliation(s)
- Laura C Lazzeroni
- Division of Biostatistics, Department of Health Research and Policy, Stanford University School of Medicine, Stanford, California 94305, USA
|
24
|
Abstract
Linkage disequilibrium has become important in the context of gene mapping. We argue that to understand the pattern of association between alleles at different loci, and of DNA sequence polymorphism in general, it is useful first to consider the underlying genealogy of the chromosomes. The stochastic process known as the coalescent is a convenient way to model such genealogies, and in this paper we set out the theory behind the coalescent and its implications for understanding linkage disequilibrium.
Affiliation(s)
- Magnus Nordborg
- Molecular and Computational Biology, University of Southern California, 835 W 37th St, SHS172, Los Angeles, CA 90089-134, USA.
|
25
|
Schaid DJ, Rowland CM, Tines DE, Jacobson RM, Poland GA. Score tests for association between traits and haplotypes when linkage phase is ambiguous. Am J Hum Genet 2002; 70:425-34. [PMID: 11791212 PMCID: PMC384917 DOI: 10.1086/338688] [Citation(s) in RCA: 1469] [Impact Index Per Article: 66.8] [Received: 09/14/2001] [Accepted: 11/14/2001] [Indexed: 01/08/2023] Open
Abstract
A key step toward the discovery of a gene related to a trait is the finding of an association between the trait and one or more haplotypes. Haplotype analyses can also provide critical information regarding the function of a gene; however, when unrelated subjects are sampled, haplotypes are often ambiguous because of unknown linkage phase of the measured sites along a chromosome. A popular method of accounting for this ambiguity in case-control studies uses a likelihood that depends on haplotype frequencies, so that the haplotype frequencies can be compared between the cases and controls; however, this traditional method is limited to a binary trait (case vs. control), and it does not provide a method of testing the statistical significance of specific haplotypes. To address these limitations, we developed new methods of testing the statistical association between haplotypes and a wide variety of traits, including binary, ordinal, and quantitative traits. Our methods allow adjustment for nongenetic covariates, which may be critical when analyzing genetically complex traits. Furthermore, our methods provide several different global tests for association, as well as haplotype-specific tests, which give a meaningful advantage in attempts to understand the roles of many different haplotypes. The statistics can be computed rapidly, making it feasible to evaluate the associations between many haplotypes and a trait. To illustrate the use of our new methods, they are applied to a study of the association of haplotypes (composed of genes from the human-leukocyte-antigen complex) with humoral immune response to measles vaccination. Limited simulations are also presented to demonstrate the validity of our methods, as well as to provide guidelines on how our methods could be used.
Affiliation(s)
- Daniel J Schaid
- Department of Health Sciences Research, Mayo Clinic/Foundation, Rochester, MN 55905, USA.
|
26
|
Liu JS, Sabatti C, Teng J, Keats BJ, Risch N. Bayesian analysis of haplotypes for linkage disequilibrium mapping. Genome Res 2001; 11:1716-24. [PMID: 11591648 PMCID: PMC311130 DOI: 10.1101/gr.194801] [Citation(s) in RCA: 93] [Impact Index Per Article: 4.0] [Indexed: 11/24/2022]
Abstract
Haplotype analysis of disease chromosomes can help identify probable historical recombination events and localize disease mutations. Most available analyses use only marginal and pairwise allele frequency information. We have developed a Bayesian framework that utilizes full haplotype information to overcome various complications such as multiple founders, unphased chromosomes, data contamination, and incomplete marker data. A stochastic model is used to describe the dependence structure among several variables characterizing the observed haplotypes, for example, the ancestral haplotypes and their ages, mutation rate, recombination events, and the location of the disease mutation. An efficient Markov chain Monte Carlo algorithm was developed for computing the estimates of the quantities of interest. The method is shown to perform well in both real data sets (cystic fibrosis data and Friedreich ataxia data) and simulated data sets. The program that implements the proposed method, BLADE, as well as the two real datasets, can be obtained from http://www.fas.harvard.edu/~junliu/TechRept/01folder/diseq_prog.tar.gz.
Affiliation(s)
- J S Liu
- Department of Statistics, Harvard University, Cambridge, Massachusetts 02138, USA.
|
27
|
McInnes LA, Service SK, Reus VI, Barnes G, Charlat O, Jawahar S, Lewitzky S, Yang Q, Duong Q, Spesny M, Araya C, Araya X, Gallegos A, Meza L, Molina J, Ramirez R, Mendez R, Silva S, Fournier E, Batki SL, Mathews CA, Neylan T, Glatt CE, Escamilla MA, Luo D, Gajiwala P, Song T, Crook S, Nguyen JB, Roche E, Meyer JM, Leon P, Sandkuijl LA, Freimer NB, Chen H. Fine-scale mapping of a locus for severe bipolar mood disorder on chromosome 18p11.3 in the Costa Rican population. Proc Natl Acad Sci U S A 2001; 98:11485-90. [PMID: 11572994 PMCID: PMC58756 DOI: 10.1073/pnas.191519098] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.3] [Received: 10/31/2000] [Indexed: 11/18/2022] Open
Abstract
We have searched for genes predisposing to bipolar disorder (BP) by studying individuals with the most extreme form of the affected phenotype, BP-I, ascertained from the genetically isolated population of the Central Valley of Costa Rica (CVCR). The results of a previous linkage analysis on two extended CVCR BP-I pedigrees, CR001 and CR004, and of linkage disequilibrium (LD) analyses of a CVCR population sample of BP-I patients implicated a candidate region on 18p11.3. We further investigated this region by creating a physical map and developing 4 new microsatellite and 26 single-nucleotide polymorphism markers for typing in the pedigree and population samples. We report the results of fine-scale association analyses in the population sample, as well as evaluation of haplotypes in pedigree CR001. Our results suggest a candidate region containing six genes but also highlight the complexities of LD mapping of common disorders.
Affiliation(s)
- L A McInnes
- Neurogenetics Laboratory, Center for Neurobiology and Psychiatry, and Department of Psychiatry, University of California, San Francisco, CA 94143, USA
|
28
|
Luo ZW, Wu CI. Modeling linkage disequilibrium between a polymorphic marker locus and a locus affecting complex dichotomous traits in natural populations. Genetics 2001; 158:1785-800. [PMID: 11514462 PMCID: PMC1461768 DOI: 10.1093/genetics/158.4.1785] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022] Open
Abstract
Linkage disequilibrium is an important topic in evolutionary and population genetics. An issue yet to be settled is the theory required to extend the linkage disequilibrium analysis to complex traits. In this study, we present theoretical analysis and methods for detecting or estimating linkage disequilibrium (LD) between a polymorphic marker locus and any one of the loci affecting a complex dichotomous trait on the basis of samples randomly or selectively collected from natural populations. Statistical properties of these methods were investigated and their powers were compared analytically or by use of Monte Carlo simulations. The results show that the disequilibrium may be detected with a power of 80% by using phenotypic records and marker genotype when both the trait and marker variants are common (30%) and the LD is relatively high (40-100% of the theoretical maximum). The maximum-likelihood approach provides accurate estimates of the model parameters as well as detection of linkage disequilibrium. The likelihood method is preferred for its higher power and reliability in parameter estimation. The approaches developed in this article are also compared to those for analyzing a continuously distributed quantitative trait. It is shown that a larger sample size is required for the dichotomous trait model to obtain the same level of power in detecting linkage disequilibrium as the continuous trait analysis. Potential use of these estimates in mapping the trait locus is also discussed.
Affiliation(s)
- Z W Luo
- School of Biosciences, The University of Birmingham, Edgbaston, Birmingham B15 2TT, England.
|
29
|
Durst R, Colombo R, Shpitzen S, Avi LB, Friedlander Y, Wexler R, Raal FJ, Marais DA, Defesche JC, Mandelshtam MY, Kotze MJ, Leitersdorf E, Meiner V. Recent origin and spread of a common Lithuanian mutation, G197del LDLR, causing familial hypercholesterolemia: positive selection is not always necessary to account for disease incidence among Ashkenazi Jews. Am J Hum Genet 2001; 68:1172-88. [PMID: 11309683 PMCID: PMC1226098 DOI: 10.1086/320123] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.7] [Received: 02/05/2001] [Accepted: 03/15/2001] [Indexed: 11/03/2022] Open
Abstract
G197del is the most prevalent LDL receptor (LDLR) mutation causing familial hypercholesterolemia (FH) in Ashkenazi Jew (AJ) individuals. The purpose of this study was to determine the origin, age, and population distribution of G197del, as well as to explore environmental and genetic effects on disease expression. Index cases from Israel (n=46), South Africa (n=24), Russia (n=7), The Netherlands (n=1), and the United States (n=1) were enlisted. All trace their ancestry to Lithuania. A highly conserved haplotype (D19S221:104-D19S865:208-D19S413:74) was identified in G197del chromosomes, suggesting the occurrence of a common founder. When two methods were used for analysis of linkage disequilibrium (LD) between flanking polymorphic markers and the disease locus and for the study of the decay of LD over time, the estimated age of the deletion was found to be 20 ± 7 generations (the 95% confidence interval is 15-26 generations), so that the most recent common ancestor of the mutation-bearing chromosomes would date to the 14th century. This corresponds with the founding of the Jewish community of Lithuania (1338 a.d.), as well as with the great demographic expansion of AJ individuals in eastern Europe, which followed this settlement. The penetrance of mutation-linked severe hypercholesterolemia is high (94% of heterozygotes have a baseline concentration of LDL cholesterol (LDL-C) that is >160 mg/dl), and no significant differences in the mean baseline lipid level of G197del carriers from different countries were found. Polymorphisms of apolipoprotein E and of scavenger-receptor class B type I were observed to have minor effects on the plasma lipid profile. With respect to determinative genetic influences on the biochemical phenotype, there is no evidence that could support the possibility of a selective evolutionary metabolic advantage. Therefore, the founder effect in a rapidly expanding population from a limited number of families remains a simple, parsimonious hypothesis explaining the spread of G197del-LDLR-linked FH in AJ individuals.
Affiliation(s)
- Ronen Durst
- Roberto Colombo
- Shoshi Shpitzen
- Liat Ben Avi
- Yechiel Friedlander
- Roni Wexler
- Frederick J. Raal
- David A. Marais
- Joep C. Defesche
- Michail Y. Mandelshtam
- Maritha J. Kotze
- Eran Leitersdorf
- Vardiella Meiner
- Division of Medicine and the Center for Research, Prevention and Treatment of Atherosclerosis and Department of Human Genetics, Hadassah University Hospital, and Department of Social Medicine, School of Public Health, Hebrew University, Jerusalem; Human Biology and Genetics Research Unit, Department of Psychology, Catholic University of the Sacred Heart, Milan; Carbohydrate and Lipid Metabolism Research Unit, Department of Medicine, University of the Witwatersrand, Johannesburg; Cape Heart Centre Lipid Laboratory, Faculty of Health Sciences, University of Cape Town, Cape Town; Department of Vascular Medicine, Academic Medical Center, University of Amsterdam, Amsterdam; Department of Molecular Genetics, Institute for Experimental Medicine, St. Petersburg Academy, St. Petersburg; and Division of Human Genetics and The Cape Heart Research Group, Faculty of Health Sciences, University of Stellenbosch, Tygerberg, South Africa
|
30
|
Liang KY, Hsu FC, Beaty TH, Barnes KC. Multipoint linkage-disequilibrium-mapping approach based on the case-parent trio design. Am J Hum Genet 2001; 68:937-50. [PMID: 11254451 PMCID: PMC1275648 DOI: 10.1086/319504] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.8] [Received: 11/13/2000] [Accepted: 02/13/2001] [Indexed: 11/03/2022] Open
Abstract
In the present study we propose a multipoint approach, for the mapping of genes, that is based on the case-parent trio design. We first derive an expression for the expected preferential-allele-transmission statistics for transmission, from either parent to an affected child, for an arbitrary location within a chromosomal region demarcated by several genetic markers. No assumption about genetic mechanism is needed in this derivation, beyond the assumption that no more than one disease gene lies in the region framed by the markers. When one builds on this representation, the way in which one may maximize the genetic information from multiple markers becomes obvious. This proposed method differs from the popular transmission/disequilibrium test (TDT) approach for fine mapping, in the following ways: First, in contrast with the TDT approach, all markers contribute information, regardless of whether the parents are heterozygous at any one marker, and incomplete trio data can be utilized in our approach. Second, rather than performing the TDT at each marker separately, we propose a single test statistic that follows a χ² distribution with 1 df, under the null hypothesis of no linkage or linkage disequilibrium to the region. Third, in the presence of linkage evidence, we offer a means to estimate the location of the disease locus along with its sampling uncertainty. We illustrate the proposed method with data from a family study of asthma, conducted in Barbados.
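For context on the per-marker TDT that the abstract contrasts itself with, the classical single-marker statistic is the McNemar-type quantity (b - c)² / (b + c), referred to a χ² distribution with 1 df, where b and c count transmissions of each allele from heterozygous parents. The Python sketch below computes that classical statistic; it is not the multipoint statistic proposed in the paper, and the function name `tdt` and the counts are made up for illustration.

```python
import math


def tdt(b, c):
    """Classical single-marker TDT statistic and its 1-df chi-square P-value.

    b: heterozygous parents transmitting the allele of interest
    c: heterozygous parents transmitting the other allele
    """
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    stat = (b - c) ** 2 / (b + c)
    # For 1 df, P(chi-square > x) = 2 * (1 - Phi(sqrt(x))) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Made-up transmission counts for a single marker.
stat, p = tdt(b=60, c=40)
print(f"TDT = {stat:.2f}, P = {p:.4f}")
```

Only heterozygous parents enter this statistic, which is the limitation the abstract says the multipoint approach avoids by letting every marker contribute.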
Affiliation(s)
- K Y Liang
- Department of Biostatistics, Johns Hopkins University, Baltimore, MD 21205, USA.
|
31
|
Rate of decay in admixture linkage disequilibrium and its implication in gene mapping. ACTA ACUST UNITED AC 2001. [DOI: 10.1007/bf03183263] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 10/19/2022]
|
32
|
Abstract
The past decade produced several proposals for fine-scale gene mapping using linkage disequilibrium data. The suggested methods fall into two main groups, those that rely on pairwise statistics and those that rely on haplotypes. This paper reviews each strategy's development from a chronological perspective.
Affiliation(s)
- L C Lazzeroni
- Biostatistics Division, Department of Health Research and Policy, Stanford University, Stanford, California 94305, USA.
|
33
|
Bourgain C, Génin E, Holopainen P, Mustalahti K, Mäki M, Partanen J, Clerget-Darpoux F. Use of closely related affected individuals for the genetic study of complex diseases in founder populations. Am J Hum Genet 2001; 68:154-159. [PMID: 11102286 PMCID: PMC1234909 DOI: 10.1086/316933] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.5] [Received: 09/14/2000] [Accepted: 11/06/2000] [Indexed: 11/04/2022] Open
Abstract
We propose a method, the maximum identity length contrast (MILC) statistic, to locate genetic risk factors for complex diseases in founder populations. The MILC approach compares the identity length of parental haplotypes that are transmitted to affected offspring with the identity length of those that are not transmitted to affected offspring. Initially, the statistical properties of the method were assessed using randomly selected affected individuals with unknown relationship. Because both nuclear families with multiple affected sibs and large pedigrees are often available in founder populations, we performed simulations to investigate the properties of the MILC statistic in the presence of closely related affected individuals. The simulation showed that the use of closely related affected individuals greatly enhances the power of the statistic. For a given sample size and type I error, the use of affected sib pairs, instead of affected individuals randomly selected from the population, could increase the power by a factor of two. This increase was related to an increase of kinship-coefficient contrast between haplotype groups when closely related individuals were considered. The MILC approach allows the simultaneous use of affected individuals from a founder population and affected individuals with any kind of relationship, close or remote. We used the MILC approach to analyze the role of HLA in celiac disease and showed that the effect of HLA may be detected with the MILC approach by typing only 11 affected individuals, who were part of a single large Finnish pedigree.
Affiliation(s)
- C Bourgain
- Unité de Recherche d'Epidémiologie Génétique, INSERM U535, 94276 Le Kremlin-Bicêtre Cedex France.
|
34
|
Colombo R, Bignamini AA, Carobene A, Sasaki J, Tachikawa M, Kobayashi K, Toda T. Age and origin of the FCMD 3'-untranslated-region retrotransposal insertion mutation causing Fukuyama-type congenital muscular dystrophy in the Japanese population. Hum Genet 2000; 107:559-67. [PMID: 11153909 DOI: 10.1007/s004390000421] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.0] [Indexed: 10/27/2022]
Abstract
Fukuyama-type congenital muscular dystrophy (FCMD), an autosomal recessive disorder with a high prevalence in the Japanese population, is characterised by severe muscular dystrophy associated with brain malformation (cortical dysgenesis) and mental retardation. In Japan, 87% of FCMD-bearing chromosomes carry a 3-kb retrotransposal insertion of tandemly repeated sequences within the disease gene recently identified on chromosome 9q31, and most of them share a common founder haplotype. FCMD is the first human disease known to be caused primarily by an ancient retrotransposal integration. By applying two methods for the study of linkage disequilibrium between flanking polymorphic markers and the disease locus, and of its decay over time, the age of the insertion mutation causing FCMD in Japanese patients is calculated to be approximately 102 generations (95% confidence interval: 86-117 g), or slightly less. The estimated age dates the most recent common ancestor of the mutation-bearing chromosomes back to the time (or a few centuries before) the Yayoi people started migrating to Japan from the Korean peninsula. This finding makes the molecular population genetics of FCMD understandable in the context of Japan's history and the founder effect consistent with the prevalent theory on the origins of the modern Japanese population.
Affiliation(s)
- R Colombo
- Department of Psychology, Catholic University of the Sacred Heart, Milan, Italy.
|
35
|
Abstract
Linkage analysis and association studies, two major approaches for genetic studies of human diseases, are useful for mapping genes that are highly penetrant, but both use only part of the information that is available for mapping disease genes. Therefore, they provide limited utility when used alone. In this report, we present combined linkage and linkage disequilibrium mapping (LLDM), which simultaneously utilizes linkage and linkage disequilibrium information for mapping human disease genes. Compared with the existing linkage analysis and association study methods, this method has several advantages: 1) it has high statistical power by a joint analysis of linkage and linkage disequilibrium for localizing disease susceptibility loci; 2) it unifies the theory of linkage analysis and linkage disequilibrium mapping; and 3) it retains the general framework for linkage analysis and, hence, can be easily incorporated into the existing software for linkage analysis. The proposed LLDM is applied to familial hemophagocytic lymphohistiocytosis (FHL) disease.
Affiliation(s)
- M Xiong
- Human Genetics Center, University of Texas, Houston Health Science Center, Houston, Texas 77225, USA.
|
36
|
Abstract
We briefly review the major contribution of biometrics to genetics over the last century (population genetic models, familial correlations, segregation analysis, and gene mapping) and current areas of active research and then speculate about what problems will be tackled in the next century.
|
37
|
Morris AP, Whittaker JC, Balding DJ. Bayesian fine-scale mapping of disease loci, by hidden Markov models. Am J Hum Genet 2000; 67:155-69. [PMID: 10835299 PMCID: PMC1287074 DOI: 10.1086/302956] [Citation(s) in RCA: 66] [Impact Index Per Article: 2.8] [Received: 01/27/2000] [Accepted: 04/20/2000] [Indexed: 11/03/2022] Open
Abstract
We present a new multilocus method for the fine-scale mapping of genes contributing to human diseases. The method is designed for use with multiple biallelic markers, in particular single-nucleotide polymorphisms, for which high-density genetic maps will soon be available. We model disease-marker association in a candidate region via a hidden Markov process and allow for correlation between linked marker loci. Using Markov chain Monte Carlo simulation methods, we obtain posterior distributions of model parameter estimates including disease-gene location and the age of the disease-predisposing mutation. In addition, we allow for heterogeneity in recombination rates, across the candidate region, to account for recombination hot and cold spots. We also obtain, for the ancestral marker haplotype, a posterior distribution that is unique to our method and that, unlike maximum-likelihood estimation, can properly account for uncertainty. We apply the method to data for cystic fibrosis and Huntington disease, for which mutations in disease genes have already been identified. The new method performs well compared with existing multilocus mapping methods.
Affiliation(s)
- A P Morris
- Department of Applied Statistics, University of Reading, United Kingdom.
|
38
|
Toivonen HTT, Onkamo P, Vasko K, Ollikainen V, Sevon P, Mannila H, Herr M, Kere J. Data mining applied to linkage disequilibrium mapping. Am J Hum Genet 2000; 67:133-45. [PMID: 10848493 PMCID: PMC1287071 DOI: 10.1086/302954] [Citation(s) in RCA: 83] [Impact Index Per Article: 3.5] [Received: 01/25/2000] [Accepted: 05/04/2000] [Indexed: 11/03/2022] Open
Abstract
We introduce a new method for linkage disequilibrium mapping: haplotype pattern mining (HPM). The method, inspired by data mining methods, is based on discovery of recurrent patterns. We define a class of useful haplotype patterns in genetic case-control data and use the algorithm for finding disease-associated haplotypes. The haplotypes are ordered by their strength of association with the phenotype, and all haplotypes exceeding a given threshold level are used for prediction of disease susceptibility-gene location. The method is model-free, in the sense that it does not require (and is unable to utilize) any assumptions about the inheritance model of the disease. The statistical model is nonparametric. The haplotypes are allowed to contain gaps, which improves the method's robustness to mutations and to missing and erroneous data. Experimental studies with simulated microsatellite and SNP data show that the method has good localization power in data sets with large degrees of phenocopies and with lots of missing and erroneous data. The power of HPM is roughly identical for marker maps at a density of 3 single-nucleotide polymorphisms/cM or 1 microsatellite/cM. The capacity to handle high proportions of phenocopies makes the method promising for complex disease mapping. An example of correct disease susceptibility-gene localization with HPM is given with real marker data from families from the United Kingdom affected by type 1 diabetes. The method is extendable to include environmental covariates or phenotype measurements or to find several genes simultaneously.
Affiliation(s)
- Hannu T. T. Toivonen, Päivi Onkamo, Kari Vasko, Vesa Ollikainen, Petteri Sevon, Heikki Mannila, Mathias Herr, Juha Kere
- Nokia Research Center; Rolf Nevanlinna Institute, Finnish Genome Center, and Department of Computer Science, University of Helsinki; and Helsinki University of Technology, Helsinki; and Wellcome Trust Centre for Molecular Mechanisms in Disease, Department of Medical Genetics, University of Cambridge, Cambridge, United Kingdom
|
39
|
MacLean CJ, Martin RB, Sham PC, Wang H, Straub RE, Kendler KS. The trimmed-haplotype test for linkage disequilibrium. Am J Hum Genet 2000; 66:1062-75. [PMID: 10712218 PMCID: PMC1288142 DOI: 10.1086/302796] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.2] [Indexed: 11/03/2022] Open
Abstract
Single-marker linkage-disequilibrium (LD) methods cannot fully describe disequilibrium in an entire chromosomal region surrounding a disease allele. With the advent of myriad tightly linked microsatellite markers, we have an opportunity to extend LD analysis from single markers to multiple-marker haplotypes. Haplotype analysis has increased statistical power to disclose the presence of a disease locus in situations where it correctly reflects the historical process involved. For maximum efficiency, evidence of LD ought to come not just from a single haplotype, which may well be rare, but in addition from many similar haplotypes that could have descended from the same ancestral founder but have been trimmed in succeeding generations. We present such an analysis, called the "trimmed-haplotype method." We focus on chromosomal regions that are small enough that disequilibrium in significant portions of them may have been preserved in some pedigrees and yet that contain enough markers to minimize coincidental occurrence of the haplotype in the absence of a disease allele: perhaps regions 1-2 cM in length. In general, we could have no idea what haplotype an ancestral founder carried generations ago, nor do we usually have a precise chromosomal location for the disease-susceptibility locus. Therefore, we must search through all possible haplotypes surrounding multiple locations. Since such repeated testing obliterates the sampling distribution of the test, we employ bootstrap methods to calculate significance levels. Trimmed-haplotype analysis is performed on family data in which genotypes have been assembled into haplotypes. It can be applied either to conventional parent-affected-offspring triads or to multiplex pedigrees. We present a method for summarizing the LD evidence, in any pedigree, that can be employed in trimmed-haplotype analysis as well as in other methods.
Affiliation(s)
- C J MacLean
- Virginia Institute for Psychiatric and Behavioral Genetics, Medical College of Virginia, Virginia Commonwealth University, Richmond, VA, 23298, USA.
|
40
|
Jorde LB, Watkins WS, Kere J, Nyman D, Eriksson AW. Gene mapping in isolated populations: new roles for old friends? Hum Hered 2000; 50:57-65. [PMID: 10545758 DOI: 10.1159/000022891] [Citation(s) in RCA: 54] [Impact Index Per Article: 2.3] [Indexed: 11/19/2022] Open
Abstract
Population isolates are increasingly being used in attempts to map genes underlying complex diseases. To further explore the utility of isolates for this purpose, we explore linkage disequilibrium patterns in polymorphisms from two regions (VWF and NF1) in three isolated populations from Finland. At the NF1 locus, the Finnish populations have greater pairwise disequilibrium than populations from Africa, Asia, or northern Europe. However, populations from 'New Finland' and 'Old Finland' do not differ in their disequilibrium levels at either the NF1 or the VWF locus. In addition, disequilibrium patterns and haplotype diversity do not differ between a sample from the Aland Islands, Finland, and a collection of outbred Centre d'Etude du Polymorphisme Humain families. These results show that linkage disequilibrium patterns sometimes differ among populations with different histories and founding dates, but some putative isolated populations may not significantly differ from larger admixed populations. We discuss factors that should be considered when using isolated populations in gene-mapping studies.
Affiliation(s)
- L B Jorde
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, Utah 84112, USA
|
41
|
Abstract
Statistical genetic mapping methods are powerful tools for finding genes that contribute to complex human traits. Mapping methods combine knowledge of the biological mechanisms of inheritance and the randomness inherent in those mechanisms to locate, with increasing precision, trait genes on the human genome. We provide an overview of the two major classes of mapping methods, genetic linkage analysis and linkage disequilibrium analysis, and related concepts of genetic inheritance.
Affiliation(s)
- J M Olson
- Department of Epidemiology and Biostatistics, Rammelkamp Center for Education and Research, MetroHealth Campus, Case Western Reserve University, Cleveland, Ohio 44109, USA.
|
42
|
Abstract
Linkage disequilibrium mapping exploits the fact that at genetic markers close enough to a disease locus on a particular chromosome, we expect to find an association between the disease and marker alleles. Furthermore, the magnitude of the association is expected to follow a unimodal curve when plotted against location, with the peak at the disease location. In practice, for real data, we usually see deviations from such a curve due to other influences such as evolutionary variability, mutation, and selection. Here we propose fitting a quadratic curve to data of this nature, estimating the location of the disease locus by the point at which the curve is maximum. A key feature of our method is the use of transformations of both location and disequilibrium, so that departures from a unimodal curve are incorporated by fitting the curve not to the original location and disequilibrium values but to the transformed values. In addition, we estimate the covariances between the disequilibrium values at linked loci using either a multinomial approximation or a bootstrap procedure. The location estimate from our method is the ratio of two quantities that, in large samples, are normally distributed, and so we use Fieller's theorem to obtain a confidence interval for the disease gene location. We successfully apply our method to data from several published studies in which the true disease gene location is known.
Affiliation(s)
- H J Cordell
- Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, Ohio 44109, USA.
|
43
|
Abstract
A generalization of the transmission/disequilibrium test to detect association between polymorphic markers and discrete or quantitative traits is discussed, with particular emphasis on marker haplotypes formed by several adjacent loci. Furthermore, strategies for testing haplotype association, using methods from spatial statistics, are developed. This approach compares the "similarity" of transmitted and untransmitted haplotypes, with the aim of determining the regions where there is greater similarity within the transmitted set. This arises from the fact that, although the original haplotypes carrying the mutation will be broken down by recombination, there may be a subset of markers near the mutation that are common to many of the recombinant haplotypes. Thus, by examination of each marker in turn and by measurement of the average size of the region shared identically by state in the transmitted and untransmitted haplotypes, it may be possible to detect regions of linkage disequilibrium that encompass the susceptibility gene.
Affiliation(s)
- D Clayton
- MRC Biostatistics Unit, Institute of Public Health, Cambridge, United Kingdom.
|
44
|
McPeek MS, Strahs A. Assessment of linkage disequilibrium by the decay of haplotype sharing, with application to fine-scale genetic mapping. Am J Hum Genet 1999; 65:858-75. [PMID: 10445904 PMCID: PMC1378001 DOI: 10.1086/302537] [Citation(s) in RCA: 151] [Impact Index Per Article: 6.0] [Indexed: 11/03/2022] Open
Abstract
Linkage disequilibrium (LD) is of great interest for gene mapping and the study of population history. We propose a multilocus model for LD, based on the decay of haplotype sharing (DHS). The DHS model is most appropriate when the LD in which one is interested is due to the introduction of a variant on an ancestral haplotype, with recombinations in succeeding generations resulting in preservation of only a small region of the ancestral haplotype around the variant. This is generally the scenario of interest for gene mapping by LD. The DHS parameter is a measure of LD that can be interpreted as the expected genetic distance to which the ancestral haplotype is preserved, or, equivalently, 1/(time in generations to the ancestral haplotype). The method allows for multiple origins of alleles and for mutations, and it takes into account missing observations and ambiguities in haplotype determination, via a hidden Markov model. Whereas most commonly used measures of LD apply to pairs of loci, the DHS measure is designed for application to the densely mapped haplotype data that are increasingly available. The DHS method explicitly models the dependence among multiple tightly linked loci on a chromosome. When the assumptions about population structure are sufficiently tractable, the estimate of LD is obtained by maximum likelihood. For more-complicated models of population history, we find means and covariances based on the model and solve a quasi-score estimating equation. Simulations show that this approach works extremely well both for estimation of LD and for fine mapping. We apply the DHS method to published data sets for cystic fibrosis and progressive myoclonus epilepsy.
Affiliation(s)
- M S McPeek
- Department of Statistics, University of Chicago, Chicago, IL 60637, USA.
|
45
|
Service SK, Lang DW, Freimer NB, Sandkuijl LA. Linkage-disequilibrium mapping of disease genes by reconstruction of ancestral haplotypes in founder populations. Am J Hum Genet 1999; 64:1728-38. [PMID: 10330361 PMCID: PMC1377917 DOI: 10.1086/302398] [Citation(s) in RCA: 84] [Impact Index Per Article: 3.4] [Indexed: 11/04/2022] Open
Abstract
Linkage disequilibrium (LD) mapping may be a powerful means for genome screening to identify susceptibility loci for common diseases. A new statistical approach for detection of LD around a disease gene is presented here. This method compares the distribution of haplotypes in affected individuals versus that expected for individuals descended from a common ancestor who carried a mutation of the disease gene. Simulations demonstrate that this method, which we term "ancestral haplotype reconstruction" (AHR), should be powerful for genome screening of phenotypes characterized by a high degree of etiologic heterogeneity, even with currently available marker maps. AHR is best suited to application in isolated populations where affected individuals are relatively recently descended (< approximately 25 generations) from a common disease mutation-bearing founder.
Affiliation(s)
- S K Service
- Neurogenetics Laboratory and Center for Neurobiology and Psychiatry, Department of Psychiatry, University of California, San Francisco, CA 94143, USA.
|
46
|
Terwilliger JD, Weiss KM. Linkage disequilibrium mapping of complex disease: fantasy or reality? Curr Opin Biotechnol 1998; 9:578-94. [PMID: 9889136 DOI: 10.1016/s0958-1669(98)80135-3] [Citation(s) in RCA: 210] [Impact Index Per Article: 8.1] [Indexed: 10/27/2022]
Abstract
In the past year, data about the level and nature of linkage disequilibrium between alleles of tightly linked SNPs have started to become available. Furthermore, increasing evidence of allelic heterogeneity at the loci predisposing to complex disease has been observed, which has led to initial attempts to develop methods of linkage disequilibrium detection allowing for this difficulty. It has also become more obvious that we will need to think carefully about the types of populations we need to analyze in an attempt to identify these elusive genes, and it is becoming clear that we need to carefully re-evaluate the prognosis of the current paradigm with regard to its robustness to the types of problems that are likely to exist.
Affiliation(s)
- J D Terwilliger
- Columbia University Department of Psychiatry Columbia and Genome Center 60, Haven Avenue #15-C New York NY 10032 USA. joseph.
|
47
|
de la Chapelle A, Wright FA. Linkage disequilibrium mapping in isolated populations: the example of Finland revisited. Proc Natl Acad Sci U S A 1998; 95:12416-23. [PMID: 9770501 PMCID: PMC22846 DOI: 10.1073/pnas.95.21.12416] [Citation(s) in RCA: 165] [Impact Index Per Article: 6.3] [Accepted: 07/17/1998] [Indexed: 01/26/2023] Open
Abstract
Linkage disequilibrium analysis can provide high resolution in the mapping of disease genes because it incorporates information on recombinations that have occurred during the entire period from the mutational event to the present. A circumstance particularly favorable for high-resolution mapping is when a single founding mutation segregates in an isolated population. We review here the population structure of Finland in which a small founder population some 100 generations ago has expanded into 5.1 million people today. Among the 30-odd autosomal recessive disorders that are more prevalent in Finland than elsewhere, several appear to have segregated for this entire period in the "panmictic" southern Finnish population. Linkage disequilibrium analysis has allowed precise mapping and determination of genetic distances at the 0.1-cM level in several of these disorders. Estimates of genetic distance have proven accurate, but previous calculations of the confidence intervals were too small because sampling variation was ignored. In the north and east of Finland the population can be viewed as having been "founded" only after 1500. Disease mutations that have undergone such a founding bottleneck only 20 or so generations ago exhibit linkage disequilibrium and haplotype sharing over long genetic distances (5-15 cM). These features have been successfully exploited in the mapping and cloning of many genes. We review the statistical issues of fine mapping by linkage disequilibrium and suggest that improved methodologies may be necessary to map diseases of complex etiology that may have arisen from multiple founding mutations.
Affiliation(s)
- A de la Chapelle
- Human Cancer Genetics Program, Comprehensive Cancer Center, Ohio State University, 420 West 12th Avenue, Columbus, OH 43210-1214, USA.
|