1
Cobat A, Abel L, Alcaïs A. The Maximum-Likelihood-Binomial method revisited: a robust approach for model-free linkage analysis of quantitative traits in large sibships. Genet Epidemiol 2011; 35:46-56. [PMID: 21181896 DOI: 10.1002/gepi.20548] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/07/2022]
Abstract
Model-free linkage analysis methods, based on identity-by-descent allele sharing, are commonly used for complex trait analysis. The Maximum-Likelihood-Binomial (MLB) approach, which is based on the hypothesis that parental alleles are binomially distributed among affected sibs, is particularly popular. An extension of this method to quantitative traits (QT) has been proposed (MLB-QTL), based on the introduction of a latent binary variable capturing information about the linkage between the QT and the marker. Interestingly, the MLB-QTL method does not require the decomposition of sibships into constituent sibpairs and requires no prior assumption about the distribution of the QT. We propose a new formulation of the MLB method for quantitative traits (nMLB-QTL) that explicitly takes advantage of the independence of paternal and maternal allele transmission under the null hypothesis of no linkage. Simulation studies under H₀ showed that the nMLB-QTL method generated very consistent type I errors. Furthermore, simulations under the alternative hypothesis showed that the nMLB-QTL method was slightly but systematically more powerful than the MLB-QTL method, whatever the genetic model, residual correlation, ascertainment strategy, and sibship size considered. Finally, the power of the nMLB-QTL method is illustrated by a chromosome-wide linkage scan for a quantitative endophenotype of leprosy infection. Overall, the nMLB-QTL method is a robust, powerful, and flexible approach for detecting linkage with quantitative phenotypes, particularly in studies of non-Gaussian phenotypes in large sibships.
Affiliation(s)
- Aurelie Cobat
- Laboratory of Human Genetics of Infectious Diseases, Necker Branch, Institut National de la Santé et de la Recherche Médicale, Paris, France
2
Lebrec JJP, Putter H, Houwing-Duistermaat JJ, van Houwelingen HC. Influence of genotyping error in linkage mapping for complex traits--an analytic study. BMC Genet 2008; 9:57. [PMID: 18721489 PMCID: PMC2533351 DOI: 10.1186/1471-2156-9-57] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 02/29/2008] [Accepted: 08/25/2008] [Indexed: 11/21/2022] Open
Abstract
Background: Despite the current trend towards large epidemiological studies of unrelated individuals, linkage studies in families are still widely used as tools for disease gene mapping. The use of single-nucleotide polymorphism (SNP) array technology in genotyping of family data has the potential to provide more informative linkage data. Nevertheless, SNP array data are not immune to genotyping error which, as has been suggested in the past, could dramatically affect the evidence for linkage, especially in selective designs such as affected sib pair (ASP) designs. The influence of genotyping error on selective designs for continuous traits has not yet been assessed. Results: We use the identity-by-descent (IBD) regression-based paradigm for linkage testing to analytically quantify the effect of simple genotyping error models under specific selection schemes for sibling pairs. We show, for example, that in extremely concordant (EC) designs, genotyping error leads to decreased power, whereas it leads to increased type I error in extremely discordant (ED) designs. Perhaps surprisingly, the effect of genotyping error on inference is most severe in designs where selection is least extreme. We suggest a genomic control for genotyping errors via a simple modification of the intercept in the regression for linkage. Conclusion: This study extends earlier findings: genotyping error can substantially affect type I error and power in selective designs for continuous traits. Designs involving both EC and ED sib pairs are fairly immune to genotyping error. When those designs are not feasible, the simple genomic control strategy that we suggest offers the potential to deliver more robust inference, especially if genotyping is carried out by SNP array technology.
Affiliation(s)
- Jérémie J P Lebrec
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Postzone S-05-P, PO Box 9600 2300 RC Leiden, The Netherlands.
3
Bhattacharjee S, Kuo CL, Mukhopadhyay N, Brock GN, Weeks DE, Feingold E. Robust score statistics for QTL linkage analysis. Am J Hum Genet 2008; 82:567-82. [PMID: 18304491 DOI: 10.1016/j.ajhg.2007.11.012] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 08/03/2007] [Revised: 10/22/2007] [Accepted: 11/29/2007] [Indexed: 10/22/2022] Open
Abstract
The traditional variance components approach for quantitative trait locus (QTL) linkage analysis is sensitive to violations of normality and fails for selected sampling schemes. Recently, a number of new methods have been developed for QTL mapping in humans. Most of the new methods are based on score statistics or regression-based statistics and are expected to be relatively robust to non-normality of the trait distribution and also to selected sampling, at least in terms of type I error. Whereas the theoretical development of these statistics is more or less complete, some practical issues concerning their implementation still need to be addressed. Here we study some of these issues such as the choice of denominator variance estimates, weighting of pedigrees, effect of parameter misspecification, effect of non-normality of the trait distribution, and effect of incorporating dominance. We present a comprehensive discussion of the theoretical properties of various denominator variance estimates and of the weighting issue and then perform simulation studies for nuclear families to compare the methods in terms of power and robustness. Based on our analytical and simulation results, we provide general guidelines regarding the choice of appropriate QTL mapping statistics in practical situations.
4
Dupuis J, Siegmund DO, Yakir B. A unified framework for linkage and association analysis of quantitative traits. Proc Natl Acad Sci U S A 2007; 104:20210-5. [PMID: 18077372 PMCID: PMC2154410 DOI: 10.1073/pnas.0707138105] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 07/30/2007] [Indexed: 11/18/2022] Open
Abstract
We give a unified treatment of the statistical foundations of population based association mapping and of family based linkage mapping of quantitative traits in humans. A central ingredient in the unification involves the efficient score statistic. The discussion focuses on generalized linear models with an additional illustration of the Cox (proportional hazards) model for age of onset data. We give analytic expressions for noncentrality parameters and show how they give qualitative insight into the loss of power that occurs if the scientist's assumed genetic model differs from nature's "true" genetic model. Issues to be studied in detail in the future development of this approach are discussed.
Affiliation(s)
- Josée Dupuis
- Department of Biostatistics, Boston University School of Public Health, Boston, MA 02118, USA.
5
Zheng G, Ghosh K, Chen Z, Li Z. Extreme Rank Selections for Linkage Analysis of Quantitative Trait Loci Using Selected Sib-Pairs. Ann Hum Genet 2006; 70:857-66. [PMID: 17044861 DOI: 10.1111/j.1469-1809.2006.00268.x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/26/2022]
Abstract
It is well known that linkage analysis using simple random sib-pairs has relatively low power for detecting quantitative trait loci with small genetic effects. The power can be substantially increased by using samples selected based on their trait values. Usually, samples that are obtained by truncation selection consist of random samples from a truncated trait distribution. In this article we propose an alternative method using extreme ranks for linkage analysis with selected sib-pairs. This approach approximates the truncation selection. With similar screening sizes and the same sample size of selected sib-pairs, the extreme rank selection and truncation method have similar power performance, both of which are substantially more powerful than when using random sib-pairs. Simulation results on the comparison of powers between the truncation selection and the extreme rank selection and/or random selection for linkage analysis are reported.
Affiliation(s)
- G Zheng
- Office of Biostatistics Research, National Heart, Lung and Blood Institute, 6701 Rockledge Drive, Bethesda, MD 20892, USA.
6
Szatkiewicz JP, Feingold E. QTL mapping with discordant and concordant sibling pairs: new statistics and new design strategies. Genet Epidemiol 2005; 28:326-40. [PMID: 15662636 DOI: 10.1002/gepi.20065] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/08/2022]
Abstract
The term "extreme discordant and concordant" (EDAC) sampling has been used to describe a variety of strategies for quantitative trait locus mapping using sibling pairs sampled from the corners of the bivariate trait distribution. The principle of the design is to gain efficiency by genotyping only the most informative of the available sibling pairs. EDAC-type designs have been studied in a number of papers, and have been applied in a few others. This literature is somewhat out of date, however, because there are many new statistics that are appropriate for EDAC data. With newer statistics, the power of EDAC designs can be improved. Moreover, the relative power of different designs must be re-evaluated, because the newer statistics improve the power of some designs more than others. That is, there is a circular relationship between design and statistic choices. In this report, we review a number of available design and statistic choices for EDAC studies, and use simulation to show what statistics are most powerful for each design. We then use those more powerful statistics to suggest strategies for making design choices among various EDAC and non-EDAC designs that use sibling pairs. We find that when genotyping must be minimized, an EDAC design with predominantly discordant pairs is the best choice, and when a balance of genotyping and phenotyping effort must be achieved, single proband ascertainment can do better. We also show that moderately selected samples (as opposed to very extreme samples) can be an efficient choice for many studies.
Affiliation(s)
- Jin P Szatkiewicz
- Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pennsylvania, USA
7
Majumder PP, Ghosh S. Mapping quantitative trait loci in humans: achievements and limitations. J Clin Invest 2005; 115:1419-24. [PMID: 15931376 PMCID: PMC1137003 DOI: 10.1172/jci24757] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Indexed: 11/17/2022] Open
Abstract
Recent advances in statistical methods and genomic technologies have ushered in a new era in mapping clinically important quantitative traits. However, many refinements and novel statistical approaches are required to enable greater successes in this mapping. The possible impact of recent findings pertaining to the structure of the human genome on efforts to map quantitative traits is as yet unclear.
8
Chen WM, Broman KW, Liang KY. Power and robustness of linkage tests for quantitative traits in general pedigrees. Genet Epidemiol 2005; 28:11-23. [PMID: 15493059 DOI: 10.1002/gepi.20034] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 11/09/2022]
Abstract
There are numerous statistical methods for quantitative trait linkage analysis in human studies. An ideal such method would have high power to detect genetic loci contributing to the trait, would be robust to non-normality in the phenotype distribution, would be appropriate for general pedigrees, would allow the incorporation of environmental covariates, and would be appropriate in the presence of selective sampling. We recently described a general framework for quantitative trait linkage analysis, based on generalized estimating equations, for which many current methods are special cases. This procedure is appropriate for general pedigrees and easily accommodates environmental covariates. In this report, we use computer simulations to investigate the power and robustness of a variety of linkage test statistics built upon our general framework. We also propose two novel test statistics that take account of higher moments of the phenotype distribution, in order to accommodate non-normality. These new linkage tests are shown to have high power and to be robust to non-normality. While we have not yet examined the performance of our procedures in the context of selective sampling via computer simulations, the proposed tests satisfy all of the other qualities of an ideal quantitative trait linkage analysis method.
Affiliation(s)
- Wei-Min Chen
- Department of Biostatistics, Johns Hopkins University, Baltimore, Maryland, USA.
9
Abstract
BACKGROUND: Early lifetime exposure to dietary or supplementary vitamin D has been predicted to be a risk factor for later allergy. Twin studies suggest that response to vitamin D exposure might be influenced by genetic factors. As these effects are primarily mediated through the vitamin D receptor (VDR), single base variants in this gene may be risk factors for asthma or allergy. RESULTS: 951 individuals from 224 pedigrees with at least 2 asthmatic children were analyzed for 13 SNPs in the VDR. There was no preferential transmission to children with asthma. In their unaffected sibs, however, one allele in the 5' region was 0.5-fold under-transmitted (p = 0.049), while two other alleles in the 3' terminal region were 2-fold over-transmitted (p = 0.013 and 0.018). An association was also seen with bronchial hyperreactivity against methacholine and with specific immunoglobulin E serum levels. CONCLUSION: The transmission disequilibrium in unaffected sibs of otherwise multiple-affected families seems to be a powerful statistical test. A preferential transmission of vitamin D receptor variants to children with asthma could not be confirmed, but the results raise the possibility of a protective effect for unaffected children.
Affiliation(s)
- Matthias Wjst
- Gruppe Molekulare Epidemiologie, Institut für Epidemiologie, GSF - Forschungszentrum für Umwelt und Gesundheit, Ingolstädter Landstrasse 1, D-85758 Neuherberg / Munich, Germany.
10
Szatkiewicz JP, Feingold E. A powerful and robust new linkage statistic for discordant sibling pairs. Am J Hum Genet 2004; 75:906-9. [PMID: 15368196 PMCID: PMC1182121 DOI: 10.1086/425523] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 05/25/2004] [Accepted: 08/25/2004] [Indexed: 11/03/2022] Open
Abstract
Previously, Szatkiewicz and colleagues evaluated the performance of a wide variety of statistics for quantitative-trait-locus linkage, using discordant sibling pairs. They found that the most powerful statistics, in general, were a score statistic and a "composite statistic." However, whereas these two statistics have equal power under ideal conditions, each has limitations that reduce its power in certain circumstances. The score statistic depends on estimates of trait parameters and can lose a lot of power if those estimates are incorrect. The composite statistic is not sensitive to trait-parameter estimates but does depend on arbitrary weights that must be chosen on the basis of the ascertainment scheme. In this report, we elucidate the algebraic relationship between the score and composite statistics and then use that relationship to suggest a new statistic that combines the best properties of both. We call our new statistic the "robust discordant pair" (RDP) statistic. We report simulation studies to show that the RDP statistic does, indeed, have all of the strengths and none of the weaknesses of the score and composite statistics.
Affiliation(s)
- Jin P. Szatkiewicz
- Departments of Biostatistics and Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh
- Eleanor Feingold
- Departments of Biostatistics and Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh
11
Yu X, Knott SA, Visscher PM. Theoretical and empirical power of regression and maximum-likelihood methods to map quantitative trait loci in general pedigrees. Am J Hum Genet 2004; 75:17-26. [PMID: 15152343 PMCID: PMC1182003 DOI: 10.1086/421845] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 01/29/2004] [Accepted: 04/07/2004] [Indexed: 11/03/2022] Open
Abstract
Both theoretical calculations and simulation studies have been used to compare and contrast the statistical power of methods for mapping quantitative trait loci (QTLs) in simple and complex pedigrees. A widely used approach in such studies is to derive or simulate the expected mean test statistic under the alternative hypothesis of a segregating QTL and to equate a larger mean test statistic with larger power. In the present study, we show that, even when the test statistic under the null hypothesis of no linkage follows a known asymptotic distribution (the standard being χ²), it cannot be assumed that the distribution under the alternative hypothesis is noncentral χ². Hence, mean test statistics cannot be used to indicate power differences, and a comparison between methods that is based on simulated average test statistics may lead to the wrong conclusion. We illustrate this important finding, through simulations and analytical derivations, for a recently proposed new regression method for the analysis of general pedigrees to map quantitative trait loci. We show that this regression method is not necessarily more powerful or computationally more efficient than a maximum-likelihood variance-component approach. We advocate the use of empirical power to compare trait-mapping methods.
Affiliation(s)
- Xijiang Yu
- School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
12
Lebrec J, Putter H, Houwelingen JCV. Score test for detecting linkage to complex traits in selected samples. Genet Epidemiol 2004; 27:97-108. [PMID: 15305326 DOI: 10.1002/gepi.20012] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Indexed: 11/05/2022]
Abstract
We present a unified approach to selection and linkage analysis of selected samples, for both quantitative and dichotomous complex traits. It is based on the score test for the variance attributable to the trait locus and applies to general pedigrees. The method is equivalent to regressing excess IBD sharing on a function of the traits. It is shown that when population parameters for the trait are known, such inversion does not entail any loss of information. For dichotomous traits, pairs of pedigree members of different phenotypic nature (e.g., affected sib pairs and discordant sib pairs) can easily be combined as well as populations with different trait prevalences.
Affiliation(s)
- J Lebrec
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, University of Leiden, PO Box 9604, Leiden, The Netherlands.
13
T. Cuenco K, Szatkiewicz JP, Feingold E. Recent advances in human quantitative-trait-locus mapping: comparison of methods for selected sibling pairs. Am J Hum Genet 2003; 73:863-73. [PMID: 12970847 PMCID: PMC1180608 DOI: 10.1086/378589] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Received: 03/28/2003] [Accepted: 07/22/2003] [Indexed: 11/03/2022] Open
Abstract
During the past few years, there has been a great deal of new work on methods for mapping quantitative-trait loci by use of sibling pairs and sibships. There are several new methods based on linear regression, as well as several more that are based on score statistics. In theory, most of the new methods should be relatively robust to violations of distributional assumptions and to selected sampling, but, in practice, there has been little evaluation of how the methods perform on selected samples. We survey most of the new regression-based statistics and score statistics and propose a few minor variations on the score statistics. We use simulation to evaluate the type I error and the power of all of the statistics, considering (a) population samples of sibling pairs and (b) sibling pairs ascertained on the basis of at least one sibling with a trait value in the top 10% of the distribution. Most of the statistics have correct type I error for selected samples. The statistics proposed by Xu et al. and by Sham and Purcell are generally the most powerful, along with one of our score statistic variants. Even among the methods that are most powerful for "nice" data, some are more robust than others to non-Gaussian trait models and/or misspecified trait parameters.
Affiliation(s)
- Karen T. Cuenco
- Departments of Human Genetics and Biostatistics, University of Pittsburgh, Pittsburgh
- Jin P. Szatkiewicz
- Departments of Human Genetics and Biostatistics, University of Pittsburgh, Pittsburgh
- Eleanor Feingold
- Departments of Human Genetics and Biostatistics, University of Pittsburgh, Pittsburgh