51
Cheema J, Dicks J. Computational approaches and software tools for genetic linkage map estimation in plants. Brief Bioinform 2010; 10:595-608. [PMID: 19933208 DOI: 10.1093/bib/bbp045] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.4] [Indexed: 11/13/2022]
Abstract
Genetic maps are an important component within the plant biologist's toolkit, underpinning crop plant improvement programs. The estimation of plant genetic maps is a conceptually simple yet computationally complex problem, growing ever more so with the development of inexpensive, high-throughput DNA markers. The challenge for bioinformaticians is to develop analytical methods and accompanying software tools that can cope with datasets of differing sizes, from tens to thousands of markers, that can incorporate the expert knowledge that plant biologists typically use when developing their maps, and that facilitate user-friendly approaches to achieving these goals. Here, we aim to give a flavour of computational approaches for genetic map estimation, discussing briefly many of the key concepts involved, and describing a selection of software tools that employ them. This review is intended both for plant geneticists as an introduction to software tools with which to estimate genetic maps, and for bioinformaticians as an introduction to the underlying computational approaches.
Affiliation(s)
- Jitender Cheema
- John Innes Centre, Norwich Research Park, Colney, Norwich, NR4 7UH, UK
52
A study on the mapping of quantitative trait loci in advanced populations derived from two inbred lines. Genet Res (Camb) 2009; 91:85-99. [PMID: 19393125 DOI: 10.1017/s0016672309000081] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/05/2022]
Abstract
In genetic and biological studies, the F2 population is one of the most popular and commonly used experimental populations mainly because it can be readily produced and its genome structure possesses several niceties that allow for productive investigation. These niceties include the equivalence between the proportion of recombinants and recombination rates, the capability of providing a complete set of three genotypes for every locus and an analytically attractive first-order Markovian property. Recently, there has been growing interest in using the progeny populations from F2 (advanced populations) because their genomes can be managed to meet specific purposes or can be used to enhance investigative studies. These advanced populations include recombinant inbred populations, advanced intercrossed populations, intermated recombinant inbred populations and immortalized F2 populations. Due to an increased number of meiosis cycles, the genomes of these advanced populations no longer possess the Markovian property and are relatively more complicated and different from the F2 genomes. Although issues related to quantitative trait locus (QTL) mapping using advanced populations have been well documented, these populations are still often investigated in the same way as F2 populations, using a first-order Markovian assumption. Therefore, more effort is needed to address the complexities of these advanced populations in more detail. In this article, we attempt to tackle these issues by first modifying current methods developed under this Markovian assumption to propose an ad hoc method (the Markovian method) and explore its possible problems. We then consider the specific genome structures present in the advanced populations without invoking this assumption to propose a more adequate method (the non-Markovian method) for QTL mapping.
Further, some QTL mapping properties related to the confounding problems that result from ignoring epistasis and to mapping closely linked QTL are derived and investigated across the different populations. Simulations show that the non-Markovian method outperforms the Markovian method, especially in the advanced populations subject to selfing. The results presented here may give some clues to the use of advanced populations for more powerful and precise QTL mapping.
53
Abstract
When building genetic maps, it is necessary to choose from several marker ordering algorithms and criteria, and the choice is not always simple. In this study, we evaluate the efficiency of the algorithms try (TRY), seriation (SER), rapid chain delineation (RCD), recombination counting and ordering (RECORD) and unidirectional growth (UG), as well as the criteria PARF (product of adjacent recombination fractions), SARF (sum of adjacent recombination fractions), SALOD (sum of adjacent LOD scores) and LHMC (likelihood through hidden Markov chains), used with the RIPPLE algorithm for error verification, in the construction of genetic linkage maps. A linkage map of a hypothetical diploid and monoecious plant species was simulated containing one linkage group and 21 markers with a fixed distance of 3 cM between them. In all, 700 F2 populations were randomly simulated with 100 and 400 individuals, with different combinations of dominant and co-dominant markers, as well as 10 and 20% of missing data. The simulations showed that, in the presence of co-dominant markers only, any combination of algorithm and criterion may be used, even for a reduced population size. In the case of a smaller proportion of dominant markers, any of the algorithms and criteria investigated (except SALOD) may be used. In the presence of high proportions of dominant markers and smaller samples (around 100), the probability of repulsion-phase linkage between them increases and, in this case, use of the algorithms TRY and SER combined with RIPPLE under the LHMC criterion would provide better results.
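To make the ordering criteria concrete, here is a minimal sketch of SARF and PARF evaluated on a toy recombination-fraction matrix. It is not the procedure used in the study: the four-marker map, the use of the Haldane map function to generate recombination fractions, the exhaustive search, and all function names are assumptions made for illustration only.

```python
import math
from itertools import permutations


def haldane_rf(d_cm):
    """Recombination fraction for a map distance in cM (Haldane map function, assumed here)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))


def sarf(order, rf):
    """Sum of adjacent recombination fractions along a marker order."""
    return sum(rf[a][b] for a, b in zip(order, order[1:]))


def parf(order, rf):
    """Product of adjacent recombination fractions along a marker order."""
    return math.prod(rf[a][b] for a, b in zip(order, order[1:]))


def best_order_by_sarf(rf):
    """Exhaustive search for the order with the smallest SARF (feasible only for a few markers)."""
    return min(permutations(range(len(rf))), key=lambda order: sarf(order, rf))


# Hypothetical toy map: four markers spaced 3 cM apart.
positions = [0.0, 3.0, 6.0, 9.0]
rf = [[haldane_rf(abs(p - q)) for q in positions] for p in positions]

best = best_order_by_sarf(rf)
print(best, sarf(best, rf), parf(best, rf))
```

With error-free recombination fractions the true order (or its mirror image) gives the smallest SARF; heuristic algorithms such as those compared in the abstract are needed because exhaustive search is impractical beyond a handful of markers.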
54
Abstract
The value of a new crop species is usually judged by the overall performance of multiple traits. Therefore, in most quantitative trait locus (QTL) mapping experiments, researchers tend to collect phenotypic records for multiple traits. Some traits may vary continuously and others may vary in a discrete fashion. Although mapping QTLs jointly for multiple traits is more efficient than mapping QTLs separately for individual traits, the latter is still commonly practised in QTL mapping. This is primarily due to the lack of efficient statistical methods and computer software packages to implement the methods. Mapping multiple QTLs simultaneously in a single multivariate model has not been available, especially when categorical traits are involved. In the present study, we developed a Bayesian method to map QTLs of the entire genome for multiple traits with continuous, discrete or both types of phenotypic distribution. Instead of using the reversible jump Markov chain Monte Carlo (MCMC) for model selection, we adopt a parameter shrinkage approach to estimate the genetic effects of all marker intervals. We demonstrate the method by analysing a set of simulated data with both continuous and discrete traits. We also apply the method to mapping QTLs responsible for multiple disease resistances to the blast fungus of rice. A computer program written in SAS/IML that implements the method is freely available, on request, to academic researchers.
55
Cooper M, van Eeuwijk FA, Hammer GL, Podlich DW, Messina C. Modeling QTL for complex traits: detection and context for plant breeding. Curr Opin Plant Biol 2009; 12:231-40. [PMID: 19282235 DOI: 10.1016/j.pbi.2009.01.006] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.8] [Received: 11/13/2008] [Revised: 01/17/2009] [Accepted: 01/19/2009] [Indexed: 05/21/2023]
Abstract
The genetic architecture of a trait is defined by the set of genes contributing to genetic variation within a reference population of genotypes together with information on their location in the genome and the effects of their alleles on traits, including intra-locus and inter-locus interactions, environmental dependencies, and pleiotropy. Accumulated evidence from trait mapping studies emphasizes that plant breeders work within a trait genetic complexity continuum. Some traits show a relatively simple genetic architecture while others, such as grain yield, have a complex architecture. An important advance is that we now have empirical genetic models of trait genetic architecture obtained from mapping studies (multi-QTL models including various genetic effects that may vary in relation to environmental factors) to ground theoretical investigations on the merits of alternative breeding strategies. Such theoretical studies indicate that as the genetic complexity of traits increases the opportunities for realizing benefits from molecular enhanced breeding strategies increase. To realize these potential benefits and enable the plant breeder to increase rate of genetic gain for complex traits it is anticipated that the empirical genetic models of trait genetic architecture used for predicting trait variation will need to incorporate the effects of genetic interactions and be interpreted within a genotype-environment-management framework for the target agricultural production system.
Affiliation(s)
- Mark Cooper
- Pioneer Hi-Bred International, 7250 NW 62nd Ave, Johnston, IA 50131, United States.
56
Yi N, Banerjee S. Hierarchical generalized linear models for multiple quantitative trait locus mapping. Genetics 2009; 181:1101-13. [PMID: 19139143 PMCID: PMC2651046 DOI: 10.1534/genetics.108.099556] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.6] [Received: 12/09/2008] [Accepted: 01/06/2009] [Indexed: 11/18/2022]
Abstract
We develop hierarchical generalized linear models and computationally efficient algorithms for genomewide analysis of quantitative trait loci (QTL) for various types of phenotypes in experimental crosses. The proposed models can fit a large number of effects, including covariates, main effects of numerous loci, and gene-gene (epistasis) and gene-environment (G x E) interactions. The key to the approach is the use of continuous prior distribution on coefficients that favors sparseness in the fitted model and facilitates computation. We develop a fast expectation-maximization (EM) algorithm to fit models by estimating posterior modes of coefficients. We incorporate our algorithm into the iteratively weighted least squares for classical generalized linear models as implemented in the package R. We propose a model search strategy to build a parsimonious model. Our method takes advantage of the special correlation structure in QTL data. Simulation studies demonstrate reasonable power to detect true effects, while controlling the rate of false positives. We illustrate with three real data sets and compare our method to existing methods for multiple-QTL mapping. Our method has been implemented in our freely available package R/qtlbim (www.qtlbim.org), providing a valuable addition to our previous Markov chain Monte Carlo (MCMC) approach.
Affiliation(s)
- Nengjun Yi
- Department of Biostatistics, Section on Statistical Genetics, University of Alabama, Birmingham, Alabama 35294-0022, USA.
57
Bayesian QTL mapping for multiple families derived from crossing a set of inbred lines to a reference line. Heredity (Edinb) 2009; 102:497-505. [PMID: 19209187 DOI: 10.1038/hdy.2009.6] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/08/2022]
Abstract
In some crop species, germplasm collections consisting of a large number of accessions that include traditional landraces, modern cultivars and wild species have recently been established. Such collections are regarded as useful stocks of genes for breeding programs. However, to efficiently utilize these collections for plant breeding, understanding genetic variation in agronomic traits at the QTL level between the accessions is indispensable. One effective way to extract the actual QTL information included in these collections is to perform QTL analysis jointly for multiple families derived from crossing some accessions of the collection with a single reference line such as a standard commercial variety. We developed a Bayesian method for jointly analyzing QTL in such interconnected multiple families derived from a set of inbred lines crossed to the reference line, to detect QTL segregating between any of the inbred lines and the reference line. In this study, we considered multiple recombinant inbred lines, each of which was derived from crossing each of the inbred lines to the reference line. The method was evaluated through the use of simulated data sets for its efficiency in detecting QTL and identifying families segregating at each QTL.
58
Zhu C, Zhang YM, Guo Z. Mapping quantitative trait loci for binary trait in the F2:3 design. J Genet 2009; 87:201-7. [PMID: 19147904 DOI: 10.1007/s12041-008-0033-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
In the analysis of inheritance of quantitative traits with low heritability, an F(2:3) design that genotypes plants in F(2) and phenotypes plants in F(2:3) progeny is often used in plant genetics. Although statistical approaches for mapping quantitative trait loci (QTL) in the F(2:3) design have been well developed, those for binary traits of biological interest and economic importance are seldom addressed. In this study, an attempt was made to map binary trait loci (BTL) in the F(2:3) design. The fundamental idea was as follows: the F(2) plants were genotyped, the binary trait was scored for every plant in each F(2:3) progeny, and these binary trait values and the marker genotype information were used to detect BTL under the penetrance and liability models. The proposed method was verified by a series of Monte Carlo simulation experiments. The results showed that maximum likelihood approaches under the penetrance and liability models provide accurate estimates of the effects and locations of BTL with high statistical power, even under low heritability. Moreover, the penetrance model is as efficient as the liability model, and the F(2:3) design is more efficient than the classical F(2) design, even when only a single progeny is collected from each F(2:3) family. With the maximum likelihood approaches under the penetrance and liability models developed in this study, binary traits can be mapped in the F(2:3) design in the same way as quantitative traits.
Affiliation(s)
- Chengsong Zhu
- Section on Statistical Genomics, State Key Laboratory of Crop Genetics and Germplasm Enhancement/National Center for Soybean Improvement, College of Agriculture, Nanjing Agricultural University, Nanjing 210095, People's Republic of China
59
Guo Z, Nelson JC. Multiple-trait quantitative trait locus mapping with incomplete phenotypic data. BMC Genet 2008; 9:82. [PMID: 19061502 PMCID: PMC2639387 DOI: 10.1186/1471-2156-9-82] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 03/28/2008] [Accepted: 12/05/2008] [Indexed: 11/10/2022]
Abstract
Background: Conventional multiple-trait quantitative trait locus (QTL) mapping methods must discard cases (individuals) with incomplete phenotypic data, thereby sacrificing other phenotypic and genotypic information contained in the discarded cases. Under standard assumptions about the missing-data mechanism, it is possible to exploit these cases. Results: We present an expectation-maximization (EM) algorithm, derived for recombinant inbred and F2 genetic models but extensible to any mating design, that supports conventional hypothesis tests for QTL main effect, pleiotropy, and QTL-by-environment interaction in multiple-trait analyses with missing phenotypic data. We evaluate its performance by simulations and illustrate with a real-data example. Conclusion: The EM method affords improved QTL detection power and precision of QTL location and effect estimation in comparison with case deletion or imputation methods. It may be incorporated into any least-squares or likelihood-maximization QTL-mapping approach.
Affiliation(s)
- Zhigang Guo
- Department of Plant Pathology, Kansas State University, Manhattan, Kansas 66506, USA.
60
Mathews KL, Malosetti M, Chapman S, McIntyre L, Reynolds M, Shorter R, van Eeuwijk F. Multi-environment QTL mixed models for drought stress adaptation in wheat. Theor Appl Genet 2008; 117:1077-91. [PMID: 18696042 DOI: 10.1007/s00122-008-0846-8] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.5] [Received: 10/30/2007] [Accepted: 07/09/2008] [Indexed: 05/18/2023]
Abstract
Many quantitative trait loci (QTL) detection methods ignore QTL-by-environment interaction (QEI) and are limited in accommodation of error and environment-specific variance. This paper outlines a mixed model approach using a recombinant inbred spring wheat population grown in six drought stress trials. Genotype estimates for yield, anthesis date and height were calculated using the best design and spatial effects model for each trial. Parsimonious factor analytic models best captured the variance-covariance structure, including genetic correlations, among environments. The 1RS.1BL rye chromosome translocation (from one parent) which decreased progeny yield by 13.8 g m(-2) was explicitly included in the QTL model. Simple interval mapping (SIM) was used in a genome-wide scan for significant QTL, where QTL effects were fitted as fixed environment-specific effects. All significant environment-specific QTL were subsequently included in a multi-QTL model and evaluated for main and QEI effects with non-significant QEI effects being dropped. QTL effects (either consistent or environment-specific) included eight yield, four anthesis, and six height QTL. One yield QTL co-located (or was linked) to an anthesis QTL, while another co-located with a height QTL. In the final multi-QTL model, only one QTL for yield (6 g m(-2)) was consistent across environments (no QEI), while the remaining QTL had significant QEI effects (average size per environment of 5.1 g m(-2)). Compared to single trial analyses, the described framework allowed explicit modelling and detection of QEI effects and incorporation of additional classification information about genotypes.
Affiliation(s)
- Ky L Mathews
- CSIRO Plant Industry, Queensland Biosciences Precinct, 306 Carmody Rd, St Lucia, QLD 4067, Australia
61
Abstract
Of the many statistical methods developed to date for quantitative trait locus (QTL) analysis, only a limited subset is available in public software allowing their exploration, comparison and practical application by researchers. We have developed QGene 4.0, a plug-in platform that allows execution and comparison of a variety of modern QTL-mapping methods and supports third-party addition of new ones. The software accommodates line-cross mating designs consisting of any arbitrary sequence of selfing, backcrossing, intercrossing and haploid-doubling steps; includes map, population, and trait simulators; and is scriptable. Availability: Software and documentation are available at http://coding.plantpath.ksu.edu/qgene. Source code is available on request.
Affiliation(s)
- Roby Joehanes
- Department of Plant Pathology, Kansas State University, Manhattan, KS 66506, USA
62
Zhang YM, Gai J. Methodologies for segregation analysis and QTL mapping in plants. Genetica 2008; 136:311-8. [PMID: 18726162 DOI: 10.1007/s10709-008-9313-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Received: 08/07/2008] [Accepted: 08/11/2008] [Indexed: 12/01/2022]
Abstract
Most characters of biological interest and economic importance are quantitative traits. To uncover the genetic architecture of quantitative traits, two approaches have become popular in China. One is the establishment of an analytical model for mixed major-gene plus polygenes inheritance and the other the discovery of quantitative trait locus (QTL). Here we review our progress employing these two approaches. First, we proposed joint segregation analysis of multiple generations for mixed major-gene plus polygenes inheritance. Second, we extended the multilocus method of Lander and Green (1987), Jiang and Zeng (1997) to a more generalized approach. Our methodology handles distorted, dominant and missing markers, including the effect of linked segregation distortion loci on the estimation of map distance. Finally, we developed several QTL mapping methods. In the Bayesian shrinkage estimation (BSE) method, we suggested a method to test the significance of QTL effects and studied the effect of the prior distribution of the variance of QTL effect on QTL mapping. To reduce running time, a penalized maximum likelihood method was adopted. To mine novel genes in crop inbred lines generated in the course of normal crop breeding work, three methods were introduced. If a well-documented genealogical history of the lines is available, two-stage variance component analysis and multi-QTL Haseman-Elston regression were suggested; if unavailable, multiple loci in silico mapping was proposed.
Affiliation(s)
- Yuan-Ming Zhang
- Section on Statistical Genomics, State Key Laboratory of Crop Genetics and Germplasm Enhancement and National Center for Soybean Improvement, College of Agriculture, Nanjing Agricultural University, Nanjing, 210095, China.
63
Han L, Xu S. A Fisher scoring algorithm for the weighted regression method of QTL mapping. Heredity (Edinb) 2008; 101:453-64. [PMID: 18698336 DOI: 10.1038/hdy.2008.78] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.9] [Indexed: 11/09/2022]
Abstract
An improved weighted least squares (LS) method for quantitative trait loci (QTL) mapping using the estimating equation (EE) algorithm was developed recently. The method is more efficient than both the LS and the weighted LS methods and slightly less efficient than the mixture model maximum likelihood (ML) method. The iteration process of the EE algorithm is implicit. We developed a Fisher scoring algorithm for the weighted LS method. The iteration process is explicit and easy to program. In addition, the method automatically provides an approximate variance-covariance matrix for the estimated QTL parameters as a by-product of the iteration process. As a consequence, a W-test statistic can be used for testing the significance of QTL. To compare the Fisher scoring algorithm with the expectation maximization (EM)-based ML method, we also developed a slightly simplified method to compute the variance-covariance matrix of the estimated parameters under the EM algorithm.
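For orientation, the generic form of a Fisher scoring iteration can be written as below. This is the textbook update, not the paper's QTL-specific equations: the score vector $U$, the expected information matrix $\mathcal{I}$, and the reading of the "W-test" as a Wald-type statistic for $H_0\colon \theta = 0$ are assumptions introduced here for illustration.

```latex
\theta^{(t+1)} = \theta^{(t)} + \mathcal{I}\bigl(\theta^{(t)}\bigr)^{-1}\, U\bigl(\theta^{(t)}\bigr),
\qquad
\widehat{\operatorname{Var}}(\hat{\theta}) \approx \mathcal{I}(\hat{\theta})^{-1},
\qquad
W = \hat{\theta}^{\top}\, \widehat{\operatorname{Var}}(\hat{\theta})^{-1}\, \hat{\theta}
```

Because the inverse information matrix is computed at every step, the variance-covariance estimate is available at convergence without extra work, which is consistent with the by-product described in the abstract.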
Affiliation(s)
- L Han
- Department of Botany and Plant Science, University of California, Riverside, CA 92521, USA
64
Banerjee S, Yandell BS, Yi N. Bayesian quantitative trait loci mapping for multiple traits. Genetics 2008; 179:2275-89. [PMID: 18689903 PMCID: PMC2516097 DOI: 10.1534/genetics.108.088427] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.5] [Received: 02/22/2008] [Accepted: 06/15/2008] [Indexed: 11/18/2022]
Abstract
Most quantitative trait loci (QTL) mapping experiments typically collect phenotypic data on multiple correlated complex traits. However, there is a lack of a comprehensive genomewide mapping strategy for correlated traits in the literature. We develop Bayesian multiple-QTL mapping methods for correlated continuous traits using two multivariate models: one that assumes the same genetic model for all traits, the traditional multivariate model, and the other known as the seemingly unrelated regression (SUR) model that allows different genetic models for different traits. We develop computationally efficient Markov chain Monte Carlo (MCMC) algorithms for performing joint analysis. We conduct extensive simulation studies to assess the performance of the proposed methods and to compare with the conventional single-trait model. Our methods have been implemented in the freely available package R/qtlbim (http://www.qtlbim.org), which greatly facilitates the general usage of the Bayesian methodology for unraveling the genetic architecture of complex traits.
Affiliation(s)
- Samprit Banerjee
- Departments of Biostatistics, Section on Statistical Genetics, University of Alabama, Birmingham, AL 35294, USA
65
Sillanpää MJ, Noykova N. Hierarchical modeling of clinical and expression quantitative trait loci. Heredity (Edinb) 2008; 101:271-84. [DOI: 10.1038/hdy.2008.58] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
66
Abstract
Quantitative trait locus (QTL) mapping aims to identify molecular markers or genomic loci that influence the variation of complex traits. The problem is complicated by the fact that QTL data usually contain a large number of markers across the entire genome, most of which have little or no effect on the phenotype. In this article, we propose several Bayesian hierarchical models for mapping multiple QTL that simultaneously fit and estimate all possible genetic effects associated with all markers. The proposed models use prior distributions for the genetic effects that are scale mixtures of normal distributions with mean zero and variances distributed to give each effect a high probability of being near zero. We consider two types of priors for the variances, exponential and scaled inverse-χ² distributions, which result in a Bayesian version of the popular least absolute shrinkage and selection operator (LASSO) model and the well-known Student's t model, respectively. Unlike most applications where fixed values are preset for hyperparameters in the priors, we treat all hyperparameters as unknowns and estimate them along with other parameters. Markov chain Monte Carlo (MCMC) algorithms are developed to simulate the parameters from the posteriors. The methods are illustrated using well-known barley data.
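The two scale-mixture constructions named in the abstract correspond to standard identities, sketched here in generic notation. The symbols $\beta_j$ (effect of marker $j$), $\sigma_j^2$ (its prior variance), $\lambda$, $\nu$ and $s^2$ are assumed for illustration and need not match the paper's parameterization.

```latex
\beta_j \mid \sigma_j^2 \sim N\!\left(0, \sigma_j^2\right)

% Exponential mixing density gives a Laplace (double-exponential) marginal: the Bayesian LASSO prior
p\!\left(\sigma_j^2\right) = \frac{\lambda^2}{2}\exp\!\left(-\frac{\lambda^2 \sigma_j^2}{2}\right)
\;\Longrightarrow\;
p(\beta_j) = \frac{\lambda}{2}\exp\!\left(-\lambda\,\lvert\beta_j\rvert\right)

% Scaled inverse-chi-square mixing density gives a Student's t marginal
\sigma_j^2 \sim \text{Scaled-Inv-}\chi^2\!\left(\nu, s^2\right)
\;\Longrightarrow\;
\beta_j \sim t_{\nu}\!\left(0, s^2\right)
```

Both marginals concentrate mass near zero while keeping heavier tails than a normal prior, which is what allows most marker effects to be shrunk strongly while a few large effects survive.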
67
He XH, Zhang YM. Mapping epistatic quantitative trait loci underlying endosperm traits using all markers on the entire genome in a random hybridization design. Heredity (Edinb) 2008; 101:39-47. [PMID: 18461088 DOI: 10.1038/hdy.2008.23] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 01/24/2023]
Abstract
Triploid endosperm is of great economic importance owing to its nutritious quality. Mapping endosperm trait loci (ETL) can provide an efficient way to genetically improve grain quality. However, most triploid ETL mapping methods do not produce unbiased estimates of the two dominant effects of ETL. A random hybridization design is an alternative method that may be used to overcome this problem. However, epistasis has an important role in the dissection of genetic architecture for complex traits. In this study, therefore, an attempt was made to map epistatic ETL (eETL) under a triploid genetic model of endosperm traits in a random hybridization design. The endosperm trait means of random hybrid lines, together with known marker genotype information from their corresponding parental F(2) plants, were used to estimate, efficiently and without bias, the positions and all of the effects of eETL using a penalized maximum likelihood method. The method proposed in this article was verified by a series of Monte Carlo simulation experiments. Results from the simulated studies show that the proposed method provides accurate estimates of eETL parameters with a low false-positive rate and a relatively short running time. This new method enables us to map triploid eETL in the same way as diploid quantitative traits.
Affiliation(s)
- X-H He
- Section on Statistical Genomics, State Key Laboratory of Crop Genetics and Germplasm Enhancement, National Center for Soybean Improvement, Nanjing Agricultural University, Nanjing, China
68
Yi N, Shriner D. Advances in Bayesian multiple quantitative trait loci mapping in experimental crosses. Heredity (Edinb) 2008; 100:240-52. [PMID: 17987056 PMCID: PMC5003624 DOI: 10.1038/sj.hdy.6801074] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Indexed: 02/05/2023]
Abstract
Many complex human diseases and traits of biological and/or economic importance are determined by interacting networks of multiple quantitative trait loci (QTL) and environmental factors. Mapping QTL is critical for understanding the genetic basis of complex traits, and for ultimate identification of genes responsible. A variety of sophisticated statistical methods for QTL mapping have been developed. Among these developments, the evolution of Bayesian approaches for multiple QTL mapping over the past decade has been remarkable. Bayesian methods can jointly infer the number of QTL, their genomic positions and their genetic effects. Here, we review recently developed and still developing Bayesian methods and associated computer software for mapping multiple QTL in experimental crosses. We compare and contrast these methods to clearly describe the relationships among different Bayesian methods. We conclude this review by highlighting some areas of future research.
Affiliation(s)
- N Yi
- Section on Statistical Genetics, Department of Biostatistics, University of Alabama at Birmingham, Birmingham, AL 35294-0022, USA.
69
Abstract
Under a hypothesis that the host-parasite interaction system is governed by genome-for-genome interaction, we propose a genetic model that integrates genetic information from both the host and parasite genomes. The model can be used for mapping quantitative trait loci (QTL) conferring the interaction between host and parasite and detecting interactions among these QTL. A one-dimensional genome-scan strategy is used to map QTL in both the host and parasite genomes simultaneously conditioned on selected pairs of markers controlling the background genetic variation; a two-dimensional genome-scan procedure is conducted to search for epistasis within the host and parasite genomes and interspecific QTL-by-QTL interactions between the host and parasite genomes. A permutation test is adopted to calculate the empirical threshold to control the experimentwise false-positive rate of detected QTL and QTL interactions. Monte Carlo simulations were conducted to examine the reliability and the efficiency of the proposed models and methods. Simulation results illustrated that our methods could provide reasonable estimates of the parameters and adequate powers for detecting QTL and QTL-by-QTL interactions.
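The permutation step mentioned in the abstract can be illustrated with a minimal sketch. It does not reproduce the host-parasite model: the test statistic (largest absolute difference in mean phenotype between two genotype classes), the simulated backcross-style data, and all names and parameter values are hypothetical choices made for illustration.

```python
import math
import random


def max_marker_stat(genotypes, phenotypes):
    """Genome-wide maximum of |mean(class 1) - mean(class 0)| over all markers."""
    best = 0.0
    for m in range(len(genotypes[0])):
        g0 = [y for g, y in zip(genotypes, phenotypes) if g[m] == 0]
        g1 = [y for g, y in zip(genotypes, phenotypes) if g[m] == 1]
        if g0 and g1:  # skip markers where one genotype class is empty
            best = max(best, abs(sum(g1) / len(g1) - sum(g0) / len(g0)))
    return best


def permutation_threshold(genotypes, phenotypes, n_perm=200, alpha=0.05, seed=0):
    """Empirical (1 - alpha) quantile of the genome-wide maximum under shuffled phenotypes."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype association
        maxima.append(max_marker_stat(genotypes, shuffled))
    maxima.sort()
    idx = min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)
    return maxima[idx]


# Hypothetical data: 100 individuals, 5 markers, a large effect planted at marker 0.
rng = random.Random(1)
genotypes = [[rng.randint(0, 1) for _ in range(5)] for _ in range(100)]
phenotypes = [10.0 * g[0] + rng.gauss(0.0, 1.0) for g in genotypes]

observed = max_marker_stat(genotypes, phenotypes)
threshold = permutation_threshold(genotypes, phenotypes)
print(observed > threshold)
```

Taking the maximum over all markers within each permutation is what controls the experimentwise, rather than per-marker, false-positive rate.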
|
70
|
Zheng X, Wu JG, Lou XY, Xu HM, Shi CH. The QTL analysis on maternal and endosperm genome and their environmental interactions for characters of cooking quality in rice (Oryza sativa L.). TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2008; 116:335-42. [PMID: 17989953 DOI: 10.1007/s00122-007-0671-5] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/23/2007] [Accepted: 10/23/2007] [Indexed: 05/21/2023]
Abstract
Investigations to identify quantitative trait loci (QTLs) governing cooking quality traits including amylose content, gel consistency and gelatinization temperature (expressed by the alkali spread value) were conducted using a set of 241 RIL populations derived from an elite hybrid cross of "Zhenshan 97" x "Minghui 63" and their reciprocal backcross BC1F1 and BC2F1 populations in two environments. QTLs and QTL x environment interactions were analyzed using the genetic model with endosperm and maternal effects and environmental interaction effects on quantitative traits of seed in cereal crops. The results suggested that a total of seven QTLs were associated with cooking quality of rice, which were subsequently mapped to chromosomes 1, 4 and 6. Six of these QTLs were also found to have environmental interaction effects.
Affiliation(s)
- X Zheng
- Department of Agronomy, College of Agriculture and Biotechnology, Zhejiang University, 310029, Hangzhou, People's Republic of China
|
71
|
Yu J, Holland JB, McMullen MD, Buckler ES. Genetic design and statistical power of nested association mapping in maize. Genetics 2008; 178:539-51. [PMID: 18202393 PMCID: PMC2206100 DOI: 10.1534/genetics.107.074245] [Citation(s) in RCA: 599] [Impact Index Per Article: 35.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/05/2007] [Accepted: 11/06/2007] [Indexed: 12/24/2022] Open
Abstract
We investigated the genetic and statistical properties of the nested association mapping (NAM) design currently being implemented in maize (26 diverse founders and 5000 distinct immortal genotypes) to dissect the genetic basis of complex quantitative traits. The NAM design simultaneously exploits the advantages of both linkage analysis and association mapping. We demonstrated the power of NAM for high-power cost-effective genome scans through computer simulations based on empirical marker data and simulated traits with different complexities. With common-parent-specific (CPS) markers genotyped for the founders and the progenies, the inheritance of chromosome segments nested within two adjacent CPS markers was inferred through linkage. Genotyping the founders with additional high-density markers enabled the projection of genetic information, capturing linkage disequilibrium information, from founders to progenies. With 5000 genotypes, 30-79% of the simulated quantitative trait loci (QTL) were precisely identified. By integrating genetic design, natural diversity, and genomics technologies, this new complex trait dissection strategy should greatly facilitate endeavors to link molecular variation with phenotypic variation for various complex traits.
Affiliation(s)
- Jianming Yu
- Institute for Genomic Diversity, Cornell University, Ithaca, New York 14853-2703, USA
|
72
|
Sillanpää MJ, Hoti F. Mapping quantitative trait loci from a single-tail sample of the phenotype distribution including survival data. Genetics 2007; 177:2361-77. [PMID: 18073434 PMCID: PMC2219510 DOI: 10.1534/genetics.107.081299] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2007] [Accepted: 10/05/2007] [Indexed: 02/04/2023] Open
Abstract
A new effective Bayesian quantitative trait locus (QTL) mapping approach for the analysis of single-tail selected samples of the phenotype distribution is presented. The approach extends the affected-only tests to single-tail sampling with quantitative traits such as the log-normal survival time or censored/selected traits. A great benefit of the approach is that it enables the utilization of multiple-QTL models, is easy to incorporate into different data designs (experimental and outbred populations), and can potentially be extended to epistatic models. In inbred lines, the method exploits the fact that the parental mating type and the linkage phases (haplotypes) are known by definition. In outbred populations, two-generation data are needed, for example, selected offspring and one of the parents (the sires) in breeding material. The idea is to statistically (computationally) generate a fully complementary, maximally dissimilar, observation for each offspring in the sample. Bayesian data augmentation is then used to sample the space of possible trait values for the pseudoobservations. The benefits of the approach are illustrated using simulated data sets and a real data set on the survival of F(2) mice following infection with Listeria monocytogenes.
Affiliation(s)
- Mikko J Sillanpää
- Department of Mathematics and Statistics, University of Helsinki, Finland.
|
73
|
Li H, Huang Z, Gai J, Wu S, Zeng Y, Li Q, Wu R. A conceptual framework for mapping quantitative trait Loci regulating ontogenetic allometry. PLoS One 2007; 2:e1245. [PMID: 18043752 PMCID: PMC2080758 DOI: 10.1371/journal.pone.0001245] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/07/2006] [Accepted: 10/17/2007] [Indexed: 11/19/2022] Open
Abstract
Although ontogenetic changes in body shape and its associated allometry have been studied for over a century, essentially nothing is known about their underlying genetic and developmental mechanisms. One of the reasons for this ignorance is the unavailability of a conceptual framework to formulate the experimental design for data collection and statistical models for data analyses. We developed a framework model for unraveling the genetic machinery for ontogenetic changes of allometry. The model incorporates the mathematical aspects of ontogenetic growth and allometry into a maximum likelihood framework for quantitative trait locus (QTL) mapping. As a quantitative platform, the model allows for the testing of a number of biologically meaningful hypotheses to explore the pleiotropic basis of the QTL that regulate ontogeny and allometry. Simulation studies and real data analysis of a live example in soybean have been performed to investigate the statistical behavior of the model and validate its practical utilization. The statistical model proposed will help to study the genetic architecture of complex phenotypes and, therefore, gain better insights into the mechanistic regulation for developmental patterns and processes in organisms.
Affiliation(s)
- Hongying Li
- Department of Statistics, University of Florida, Gainesville, Florida, United States of America
- Zhongwen Huang
- National Center for Soybean Improvement, Nanjing Agricultural University, Nanjing, Jiangsu, People’s Republic of China
- Department of Agronomy, Henan Institute of Science and Technology, Xinxiang, Henan, People’s Republic of China
- Junyi Gai
- National Center for Soybean Improvement, Nanjing Agricultural University, Nanjing, Jiangsu, People’s Republic of China
- Song Wu
- Department of Statistics, University of Florida, Gainesville, Florida, United States of America
- Yanru Zeng
- School of Forestry and Biotechnology, Zhejiang Forestry University, Lin’an, Zhejiang, People’s Republic of China
- Qin Li
- Department of Statistics, University of Florida, Gainesville, Florida, United States of America
- Rongling Wu
- Department of Statistics, University of Florida, Gainesville, Florida, United States of America
|
74
|
Boer MP, Wright D, Feng L, Podlich DW, Luo L, Cooper M, van Eeuwijk FA. A mixed-model quantitative trait loci (QTL) analysis for multiple-environment trial data using environmental covariables for QTL-by-environment interactions, with an example in maize. Genetics 2007; 177:1801-13. [PMID: 17947443 PMCID: PMC2147942 DOI: 10.1534/genetics.107.071068] [Citation(s) in RCA: 111] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2007] [Accepted: 08/28/2007] [Indexed: 11/18/2022] Open
Abstract
Complex quantitative traits of plants as measured on collections of genotypes across multiple environments are the outcome of processes that depend in intricate ways on genotype and environment simultaneously. For a better understanding of the genetic architecture of such traits as observed across environments, genotype-by-environment interaction should be modeled with statistical models that use explicit information on genotypes and environments. The modeling approach we propose explains genotype-by-environment interaction by differential quantitative trait locus (QTL) expression in relation to environmental variables. We analyzed grain yield and grain moisture for an experimental data set composed of 976 F(5) maize testcross progenies evaluated across 12 environments in the U.S. corn belt during 1994 and 1995. The strategy we used was based on mixed models and started with a phenotypic analysis of multi-environment data, modeling genotype-by-environment interactions and associated genetic correlations between environments, while taking into account intraenvironmental error structures. The phenotypic mixed models were then extended to QTL models via the incorporation of marker information as genotypic covariables. A majority of the detected QTL showed significant QTL-by-environment interactions (QEI). The QEI were further analyzed by including environmental covariates into the mixed model. Most QEI could be understood as differential QTL expression conditional on longitude or year, both consequences of temperature differences during critical stages of the growth.
Affiliation(s)
- Martin P Boer
- Biometris, Wageningen UR, Wageningen, 6708 PD, The Netherlands.
|
75
|
Zhu C, Wang F, Wang J, Li G, Zhang H, Zhang Y. Reconstruction of linkage maps in the distorted segregation populations of backcross, doubled haploid and recombinant inbred lines. Chin Sci Bull 2007. [DOI: 10.1007/s11434-007-0244-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
|
76
|
Yang J, Zhu J, Williams RW. Mapping the genetic architecture of complex traits in experimental populations. Bioinformatics 2007; 23:1527-36. [PMID: 17459962 DOI: 10.1093/bioinformatics/btm143] [Citation(s) in RCA: 254] [Impact Index Per Article: 14.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]
Abstract
SUMMARY: Understanding how interactions among sets of genes affect diverse phenotypes is having a greater impact on biomedical research, agriculture and evolutionary biology. Mapping and characterizing the isolated effects of a single quantitative trait locus (QTL) is a first step, but we also need to assemble networks of QTLs and define non-additive interactions (epistasis) together with a host of potential environmental modulators. In this article, we present a full-QTL model with which to explore the genetic architecture of complex traits in multiple environments. Our model includes the effects of multiple QTLs, epistasis, QTL-by-environment interactions and epistasis-by-environment interactions. A new mapping strategy, including marker interval selection, detection of marker interval interactions and genome scans, is used to evaluate putative locations of multiple QTLs and their interactions. All the mapping procedures are performed in the framework of the mixed linear model, which is flexible enough to model environmental factors whether fixed or random effects are assumed. An F-statistic based on Henderson method III is used for hypothesis tests. This method is less computationally demanding than the corresponding likelihood ratio test. In each of the mapping procedures, permutation testing is used to control the genome-wide false positive rate, and model selection is used to reduce ghost peaks in the F-statistic profile. Parameters of the full-QTL model are estimated using a Bayesian method via Gibbs sampling. Monte Carlo simulations help define the reliability and efficiency of the method. Two real-world phenotypes (BXD mouse olfactory bulb weight data and rice yield data) are used as exemplars to demonstrate our methods. AVAILABILITY: A software package is freely available at http://ibi.zju.edu.cn/software/qtlnetwork
Affiliation(s)
- Jian Yang
- Institute of Bioinformatics, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, PR China
|
77
|
Wang X, Hu Z, Wang W, Li Y, Zhang YM, Xu C. A mixture model approach to the mapping of QTL controlling endosperm traits with bulked samples. Genetica 2007; 132:59-70. [PMID: 17427035 DOI: 10.1007/s10709-007-9149-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/13/2005] [Accepted: 03/29/2007] [Indexed: 11/29/2022]
Abstract
Endosperm traits are of triploid inheritance and have become a focus of breeding effort because of their close relation to grain quality. Current methods for mapping quantitative trait loci (QTL) underlying endosperm traits are restricted to the use of the phenotypes of single grain samples as the input data set, which are often not available in practice due to the small size of the cereal seeds. This paper proposed a statistical model for one specially tailored mapping strategy, where the marker genotypes are obtained from the maternal plants in the segregating population and the phenotypic responses are replaced by the trait means of composite endosperm samples pooled from each plant. It should therefore be more practical and have wide applicability in mapping endosperm traits. The method was implemented by fitting the phenotypic means of endosperms to a Gaussian mixture model. Both exact and approximate Expectation-Maximization algorithms were proposed to estimate the model parameters. The presence of the QTL was determined by likelihood ratio test statistics. Statistical power and other properties of the new method were investigated and compared to the current single-seed method under a variety of scenarios through simulation studies. The simulations suggest that a reasonable sample size should be used to ensure reliable results. The proposed method was also applied to simulated genome data for further evaluation. As an illustration, a real maize data set was analyzed to find the loci responsible for the popping expansion volume.
Affiliation(s)
- Xuefeng Wang
- Jiangsu Provincial Key Laboratory of Crop Genetics and Physiology; Key Laboratory of Plant Functional Genomics of Ministry of Education, Yangzhou University, Yangzhou 225009, China
|
78
|
Zhu C, Wang C, Zhang YM. Modeling segregation distortion for viability selection. I. Reconstruction of linkage maps with distorted markers. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2007; 114:295-305. [PMID: 17119913 DOI: 10.1007/s00122-006-0432-x] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/28/2005] [Accepted: 10/14/2006] [Indexed: 05/10/2023]
Abstract
Molecular markers have been widely used to map quantitative trait loci (QTL). QTL mapping partly relies on accurate linkage maps. The non-Mendelian segregation of markers, which affects not only the estimation of genetic distance between two markers but also the order of markers on the same linkage group, is usually observed in QTL analysis. However, these distorted markers are often ignored in the real data analysis of QTL mapping so that some important information may be lost. In this paper, we developed a multipoint approach based on a hidden Markov chain model to reconstruct linkage maps given a specified gene order while simultaneously making use of distorted, dominant and missing markers in an F(2) population. The new method was compared with the methods in the MapManager and Mapmaker programs, respectively, and verified by a series of Monte Carlo simulation experiments along with a working example. Results showed that the adjusted linkage maps can be used for further QTL or segregation distortion locus (SDL) analysis unless there is strong evidence that all markers show normal Mendelian segregation.
Affiliation(s)
- Chengsong Zhu
- Section on Statistical Genomics, State Key Laboratory of Crop Genetics and Germplasm Enhancement/National Center for Soybean Improvement, College of Agriculture, Nanjing Agricultural University, Nanjing, 210095, People's Republic of China
|
79
|
Liu M, Lu W, Shao Y. Mixture cure model with an application to interval mapping of quantitative trait loci. LIFETIME DATA ANALYSIS 2006; 12:421-40. [PMID: 17063400 DOI: 10.1007/s10985-006-9025-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/18/2005] [Accepted: 09/18/2006] [Indexed: 05/12/2023]
Abstract
When censored time-to-event data are used to map quantitative trait loci (QTL), the existence of nonsusceptible subjects entails extra challenges. If the heterogeneous susceptibility is ignored or inappropriately handled, we may either fail to detect the responsible genetic factors or find spuriously significant locations. In this article, an interval mapping method based on parametric mixture cure models is proposed, which takes nonsusceptible subjects into consideration. The proposed model can be used to detect the QTL that are responsible for differential susceptibility and/or time-to-event trait distribution. In particular, we propose a likelihood-based testing procedure with genome-wide significance levels calculated using a resampling method. The performance of the proposed method and the importance of considering the heterogeneous susceptibility are demonstrated by simulation studies and an application to survival data from an experiment on mice infected with Listeria monocytogenes.
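The structure of a mixture cure model can be sketched in a few lines. The example below assumes the standard form in which a fraction of subjects is nonsusceptible and never experiences the event, and it assumes a Weibull distribution for the susceptible fraction; the abstract says only that the models are parametric, so the Weibull choice, the function names, and the parameter layout are illustrative assumptions.

```python
import math


def susceptible_survival(t, shape, scale):
    """Weibull survival function for susceptible subjects (assumed parametric form)."""
    return math.exp(-((t / scale) ** shape))


def population_survival(t, p_susceptible, shape, scale):
    """Mixture cure survival: nonsusceptible subjects never experience the event."""
    cured = 1.0 - p_susceptible
    return cured + p_susceptible * susceptible_survival(t, shape, scale)


def qtl_mixture_survival(t, genotype_probs, params_by_genotype):
    """Survival for an individual whose QTL genotype is unobserved.

    genotype_probs: genotype -> conditional probability given flanking markers.
    params_by_genotype: genotype -> (p_susceptible, shape, scale), hypothetical values.
    """
    return sum(prob * population_survival(t, *params_by_genotype[genotype])
               for genotype, prob in genotype_probs.items())
```

Letting `p_susceptible` differ between genotypes corresponds to a QTL acting on susceptibility, while letting `shape` or `scale` differ corresponds to a QTL acting on the time-to-event distribution among susceptible subjects.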
Affiliation(s)
- Mengling Liu
- Division of Biostatistics, School of Medicine, New York University, New York, NY 10016, USA.
|
80
|
A statistical framework for genome-wide scanning and testing of imprinted quantitative trait loci. J Theor Biol 2006; 244:115-26. [PMID: 16959270 DOI: 10.1016/j.jtbi.2006.07.009] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/03/2006] [Revised: 06/28/2006] [Accepted: 07/11/2006] [Indexed: 10/24/2022]
Abstract
Non-equivalent expression of alleles at a locus results in genomic imprinting. In this article, a statistical framework for genome-wide scanning and testing of imprinted quantitative trait loci (iQTL) underlying complex traits is developed based on experimental crosses of inbred line species in backcross populations. The joint likelihood function is composed of four component likelihood functions, each derived from one of four backcross families. The proposed approach models the genomic imprinting effect as a probability measure with which one can test the degree of imprinting. Simulation results show that the model is robust for identifying iQTL with various degrees of imprinting, ranging from no imprinting through partial imprinting to complete imprinting. Under various simulation scenarios, the proposed model shows consistent parameter estimation with reasonable precision and high power in testing iQTL. When a QTL shows a Mendelian effect, the proposed model also outperforms the traditional Mendelian model. An extension to incorporate maternal effects is also given. The developed model, built within the maximum likelihood framework and implemented with the EM algorithm, provides a quantitative framework for testing and estimating iQTL involved in the genetic control of complex traits.
|
81
|
|
82
|
Malosetti M, Visser RGF, Celis-Gamboa C, van Eeuwijk FA. QTL methodology for response curves on the basis of non-linear mixed models, with an illustration to senescence in potato. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2006; 113:288-300. [PMID: 16791695 DOI: 10.1007/s00122-006-0294-2] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/05/2006] [Accepted: 04/19/2006] [Indexed: 05/02/2023]
Abstract
The improvement of quantitative traits in plant breeding will in general benefit from a better understanding of the genetic basis underlying their development. In this paper, a QTL mapping strategy is presented for modelling the development of phenotypic traits over time. Traditionally, crop growth models are used to study development. We propose an integration of crop growth models and QTL models within the framework of non-linear mixed models. We illustrate our approach with a QTL model for leaf senescence in a diploid potato cross. Assuming a logistic progression of senescence in time, two curve parameters are modelled, slope and inflection point, as a function of QTLs. The final QTL model for our example data contained four QTLs, of which two affected the position of the inflection point, one the senescence progression-rate, and a final one both inflection point and rate.
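The idea of letting QTLs act on the parameters of a logistic curve can be sketched as below. This covers only the fixed-effect part of such a model; the random effects of the non-linear mixed model are omitted, and the QTL names, genotype codes, baseline values, and effect sizes are hypothetical numbers chosen for illustration.

```python
import math


def logistic_senescence(t, inflection, rate, upper=1.0):
    """Logistic progression of senescence over time t."""
    return upper / (1.0 + math.exp(-rate * (t - inflection)))


def curve_parameters(genotypes, baseline, qtl_effects):
    """Shift the curve parameters by the effects of each QTL.

    genotypes: QTL name -> genotype code (for example 0 or 1).
    baseline: (inflection, rate) for the reference genotype.
    qtl_effects: QTL name -> (effect on inflection, effect on rate).
    """
    inflection, rate = baseline
    for name, code in genotypes.items():
        d_inflection, d_rate = qtl_effects.get(name, (0.0, 0.0))
        inflection += d_inflection * code
        rate += d_rate * code
    return inflection, rate


# Hypothetical effects: q1 moves the inflection point, q2 the rate, q3 both.
EXAMPLE_EFFECTS = {"q1": (6.0, 0.0), "q2": (0.0, 0.03), "q3": (-4.0, 0.02)}
```

For instance, `curve_parameters({"q1": 1, "q3": 1}, (80.0, 0.10), EXAMPLE_EFFECTS)` moves the inflection point from day 80 to day 82 and raises the rate from 0.10 to about 0.12, and the resulting pair can be passed to `logistic_senescence` to draw that genotype's curve.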
Affiliation(s)
- M Malosetti
- C.T. de Wit Graduate School for Production Ecology and Resource Conservation (PE and RC), Laboratory of Plant Breeding, Wageningen University, P.O. Box 386, 6700 AJ, Wageningen, The Netherlands.
|
83
|
Li J, Wang S, Zeng ZB. Multiple-interval mapping for ordinal traits. Genetics 2006; 173:1649-63. [PMID: 16585135 PMCID: PMC1526652 DOI: 10.1534/genetics.105.054619] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/12/2005] [Accepted: 03/22/2006] [Indexed: 11/18/2022] Open
Abstract
Many statistical methods have been developed to map multiple quantitative trait loci (QTL) in experimental cross populations. Among these methods, multiple-interval mapping (MIM) can map QTL with epistasis simultaneously. However, the previous implementation of MIM is for continuously distributed traits. In this study we extend MIM to ordinal traits on the basis of a threshold model. The method inherits the properties and advantages of MIM and can fit a model of multiple QTL effects and epistasis on the underlying liability score. We study a number of statistical issues associated with the method, such as the efficiency and stability of maximization and model selection. We also use computer simulation to study the performance of the method and compare it to other alternative approaches. The method has been implemented in QTL Cartographer to facilitate its general usage for QTL mapping data analysis on binary and ordinal traits.
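A threshold model links an unobserved, normally distributed liability to the observed ordinal categories. The sketch below assumes a unit-variance normal liability whose mean is built from additive QTL effects plus pairwise epistatic terms; the function names and argument layout are illustrative and are not taken from QTL Cartographer.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def liability_mean(mu, effects, genotypes, epistasis=None):
    """Mean liability: mu + additive QTL effects + optional pairwise epistasis.

    effects and genotypes are equal-length sequences; epistasis maps an
    index pair (j, k) to an interaction effect.
    """
    mean = mu + sum(a * x for a, x in zip(effects, genotypes))
    for (j, k), w in (epistasis or {}).items():
        mean += w * genotypes[j] * genotypes[k]
    return mean


def category_probabilities(mean, thresholds):
    """Probability of each ordinal category given the mean liability."""
    cuts = [-math.inf] + sorted(thresholds) + [math.inf]
    cdf = [normal_cdf(c - mean) for c in cuts]
    return [cdf[k + 1] - cdf[k] for k in range(len(cuts) - 1)]
```

With a single threshold the model reduces to the binary case, and a likelihood for multiple QTL can be assembled by multiplying the probability of each individual's observed category.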
Affiliation(s)
- Jian Li
- Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina 27695, USA
|
84
|
Abstract
Quantitative traits whose phenotypic values change over time are called longitudinal traits. Genetic analyses of longitudinal traits can be conducted using any of the following approaches: (1) treating the phenotypic values at different time points as repeated measurements of the same trait and analyzing the trait under the repeated measurements framework, (2) treating the phenotypes measured from different time points as different traits and analyzing the traits jointly on the basis of the theory of multivariate analysis, and (3) fitting a growth curve to the phenotypic values across time points and analyzing the fitted parameters of the growth trajectory under the theory of multivariate analysis. The third approach has been used in QTL mapping for longitudinal traits by fitting the data to a logistic growth trajectory. This approach applies only to the particular S-shaped growth process. In practice, a longitudinal trait may show a trajectory of any shape. We demonstrate that one can describe a longitudinal trait with orthogonal polynomials, which are sufficiently general for fitting any shaped curve. We develop a mixed-model methodology for QTL mapping of longitudinal traits and a maximum-likelihood method for parameter estimation and statistical tests. The expectation-maximization (EM) algorithm is applied to search for the maximum-likelihood estimates of parameters. The method is verified with simulated data and demonstrated with experimental data from a pseudobackcross family of Populus (poplar) trees.
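One common family of orthogonal polynomials for this purpose is the Legendre polynomials, evaluated on measurement times rescaled to the interval [-1, 1]. The abstract does not name the family, so that choice is an assumption here, as are the function names; the sketch builds only the polynomial basis, not the mixed model or the EM estimation.

```python
def rescale_times(times):
    """Map measurement times linearly onto [-1, 1]."""
    t_min, t_max = min(times), max(times)
    return [2.0 * (t - t_min) / (t_max - t_min) - 1.0 for t in times]


def legendre_values(x, order):
    """Legendre polynomials P_0..P_order at x, via Bonnet's recurrence."""
    values = [1.0]
    if order >= 1:
        values.append(x)
    for k in range(1, order):
        # (k + 1) P_{k+1}(x) = (2k + 1) x P_k(x) - k P_{k-1}(x)
        values.append(((2 * k + 1) * x * values[k] - k * values[k - 1]) / (k + 1))
    return values


def design_matrix(times, order):
    """One row per time point, one column per polynomial order."""
    return [legendre_values(x, order) for x in rescale_times(times)]


def trajectory(coefficients, x):
    """Value of a curve given by Legendre coefficients at rescaled time x."""
    basis = legendre_values(x, len(coefficients) - 1)
    return sum(c * p for c, p in zip(coefficients, basis))
```

A curve of arbitrary shape is then described by a short coefficient vector, so a QTL effect on the whole trajectory becomes an effect on those coefficients rather than on the parameters of one fixed growth law.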
Affiliation(s)
- Runqing Yang
- School of Agriculture and Biology, Shanghai Jiaotong University, People's Republic of China
|
85
|
Lum PY, Chen Y, Zhu J, Lamb J, Melmed S, Wang S, Drake TA, Lusis AJ, Schadt EE. Elucidating the murine brain transcriptional network in a segregating mouse population to identify core functional modules for obesity and diabetes. J Neurochem 2006; 97 Suppl 1:50-62. [PMID: 16635250 DOI: 10.1111/j.1471-4159.2006.03661.x] [Citation(s) in RCA: 76] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
Abstract
Complex biological systems are best modeled as highly modular, fluid systems exhibiting a plasticity that allows them to adapt to a vast array of changing conditions. Here we highlight several novel network-based approaches to elucidate genetic networks underlying complex traits. These integrative genomic approaches combine large-scale genotypic and gene expression results in segregating mouse populations to reconstruct reliable genetic networks underlying complex traits such as disease or drug response. We apply these novel approaches to one of the most extensive surveys of gene expression studies ever undertaken in whole brain in a segregating mouse population. More than 23,000 genes were monitored in whole brain samples from more than 300 mice derived from an F2 intercross population and genotyped at over 1200 SNP markers uniformly spread over the entire genome. We explore the topological properties of the brain transcriptional network and highlight different approaches to inferring causal associations among genes by integrating genotypic and expression data. We demonstrate the utility of these approaches by identifying and experimentally validating brain gene expression traits predicted to respond to a strong expression quantitative trait locus (eQTL) for the pituitary tumor-transforming 1 gene (Pttg1) that coincides with the physical location of this gene (a cis eQTL). We identify core functional modules making up the brain transcriptional network in mice that are coherent for core biological processes associated with metabolic disease traits including obesity and diabetes.
Affiliation(s)
- Pek Yee Lum
- Rosetta Inpharmatics, LLC, Merck & Co., Inc., Seattle, WA 98109, USA
|
86
|
Hoti F, Sillanpää MJ. Bayesian mapping of genotype x expression interactions in quantitative and qualitative traits. Heredity (Edinb) 2006; 97:4-18. [PMID: 16670709 DOI: 10.1038/sj.hdy.6800817] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022] Open
Abstract
A novel Bayesian gene mapping method, which can simultaneously utilize both molecular marker and gene expression data, is introduced. The approach enables a quantitative or qualitative phenotype to be expressed as a linear combination of the marker genotypes, gene expression levels, and possible genotype x gene expression interactions. The interaction data, given as marker-gene pairs, contain possible in cis and in trans effects obtained from earlier allelic expression studies, genetical genomics studies, biological hypotheses, or known pathways. The method is presented for an inbred line cross design and can be easily generalized to handle other types of populations and designs. The model selection is based on the use of effect-specific variance components combined with Jeffreys' non-informative prior; the method operates by adaptively shrinking marker, expression, and interaction effects toward zero so that non-negligible effects are expected to occur only at very few positions. The estimation of the model parameters and the handling of missing genotype or expression data are performed via Markov chain Monte Carlo sampling. The potential of the method including heritability estimation is presented using simulated examples and novel summary statistics. The method is also applied to a real yeast data set with known pathways.
Affiliation(s)
- F Hoti
- Department of Mathematics and Statistics, Rolf Nevanlinna Institute, University of Helsinki, FIN-00014 Helsinki, Finland
|
87
|
Vargas M, van Eeuwijk FA, Crossa J, Ribaut JM. Mapping QTLs and QTL x environment interaction for CIMMYT maize drought stress program using factorial regression and partial least squares methods. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2006; 112:1009-23. [PMID: 16538513 DOI: 10.1007/s00122-005-0204-z] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/15/2005] [Accepted: 12/14/2005] [Indexed: 05/07/2023]
Abstract
The study of QTL x environment interaction (QEI) is important for understanding genotype x environment interaction (GEI) in many quantitative traits. For modeling GEI and QEI, factorial regression (FR) models form a powerful class of models. In FR models, covariables (contrasts) defined on the levels of the genotypic and/or environmental factor(s) are used to describe main effects and interactions. In FR models for QTL expression, considerable numbers of genotypic covariables can occur, as for each putative QTL an additional covariable needs to be introduced. For large numbers of genotypic and/or environmental covariables, least squares estimation breaks down and partial least squares (PLS) estimation procedures become an attractive alternative. In this paper we develop methodology for analyzing QEI by FR for estimating effects and locations of QTLs and QEI and interpreting QEI in terms of environmental variables. A randomization test for the main effects of QTLs and QEI is presented. A population of F2 derived F3 families was evaluated in eight environments differing in drought stress and soil nitrogen content, and the traits yield and anthesis-silking interval (ASI) were measured. For grain yield, chromosomes 1 and 10 showed significant QEI, whereas in chromosomes 3 and 8 only main effect QTLs were observed. For ASI, QTL main effects were observed on chromosomes 1, 2, 6, 8, and 10, whereas QEI was observed only on chromosome 8. The assessment of the QEI at chromosome 1 for grain yield showed that the QTL main effect explained 35.8% of the QTL + QEI variability, while QEI explained 64.2%. Minimum temperature during flowering time explained 77.6% of the QEI. The QEI analysis at chromosome 10 showed that the QTL main effect explained 59.8% of the QTL + QEI variability, while QEI explained 40.2%. Maximum temperature during flowering time explained 23.8% of the QEI.
Results of this study show the possibilities of using FR for mapping QTL and for dissecting QEI in terms of environmental variables. PLS regression is efficient in accounting for background noise produced by other QTLs.
Affiliation(s)
- Mateo Vargas
- Universidad Autónoma Chapingo, Chapingo, Edo. Mexico, Mexico
|
88
|
Abstract
Researchers in the field of molecular ecology and evolution require versatile and low-cost genetic typing methods. The AFLP (amplified fragment length polymorphism) method was introduced 10 years ago and shows many features that fulfil these requirements. With good quality genomic DNA at hand, it is relatively easy to generate anonymous multilocus DNA profiles in most species and the start-up time before data can be generated is often less than a week. Built-in dynamic, yet simple modifications make it possible to find a protocol suitable to the genome size of the species and to screen thousands of loci in hundreds of individuals for a relatively low cost. Until now, the method has primarily been applied in studies of plants, bacteria and fungi, with a strong bias towards economically important cultivated species and their pests. In this review we identify a number of research areas in the study of wild species of animals where the AFLP method, presently very much underused, should be a very valuable tool. These aspects include classical problems such as studies of population genetic structure and phylogenetic reconstructions, and also new challenges such as finding markers for genes governing adaptations in wild populations and modifications of the protocol that make it possible to measure expression variation of multiple genes (cDNA-AFLP) and the distribution of DNA methylation. We hope this review will help molecular ecologists to identify when AFLP is likely to be superior to other more established methods, such as microsatellites, SNP (single nucleotide polymorphism) analyses and multigene DNA sequencing.
Affiliation(s)
- Staffan Bensch
- Department of Animal Ecology, Ecology Building, Lund University, S-223 62 Lund, Sweden.
|
89
|
Yi N, Yandell BS, Churchill GA, Allison DB, Eisen EJ, Pomp D. Bayesian model selection for genome-wide epistatic quantitative trait loci analysis. Genetics 2005; 170:1333-44. [PMID: 15911579 PMCID: PMC1451197 DOI: 10.1534/genetics.104.040386] [Citation(s) in RCA: 103] [Impact Index Per Article: 5.2] [Received: 12/29/2004] [Accepted: 04/04/2005] [Indexed: 11/18/2022] Open
Abstract
The problem of identifying complex epistatic quantitative trait loci (QTL) across the entire genome continues to be a formidable challenge for geneticists. The complexity of genome-wide epistatic analysis results mainly from the number of QTL being unknown and the number of possible epistatic effects being huge. In this article, we use a composite model space approach to develop a Bayesian model selection framework for identifying epistatic QTL for complex traits in experimental crosses from two inbred lines. By placing a liberal constraint on the upper bound of the number of detectable QTL we restrict attention to models of fixed dimension, greatly simplifying calculations. Indicators specify which main and epistatic effects of putative QTL are included. We detail how to use prior knowledge to bound the number of detectable QTL and to specify prior distributions for indicators of genetic effects. We develop a computationally efficient Markov chain Monte Carlo (MCMC) algorithm using the Gibbs sampler and Metropolis-Hastings algorithm to explore the posterior distribution. We illustrate the proposed method by detecting new epistatic QTL for obesity in a backcross of CAST/Ei mice onto M16i.
Affiliation(s)
- Nengjun Yi
- Department of Biostatistics, University of Alabama, Birmingham 35294, USA.
|
90
|
Schadt EE, Lamb J, Yang X, Zhu J, Edwards S, Guhathakurta D, Sieberts SK, Monks S, Reitman M, Zhang C, Lum PY, Leonardson A, Thieringer R, Metzger JM, Yang L, Castle J, Zhu H, Kash SF, Drake TA, Sachs A, Lusis AJ. An integrative genomics approach to infer causal associations between gene expression and disease. Nat Genet 2005; 37:710-7. [PMID: 15965475 PMCID: PMC2841396 DOI: 10.1038/ng1589] [Citation(s) in RCA: 726] [Impact Index Per Article: 36.3] [Received: 02/28/2005] [Accepted: 05/09/2005] [Indexed: 02/07/2023]
Abstract
A key goal of biomedical research is to elucidate the complex network of gene interactions underlying complex traits such as common human diseases. Here we detail a multistep procedure for identifying potential key drivers of complex traits that integrates DNA-variation and gene-expression data with other complex trait data in segregating mouse populations. Ordering gene expression traits relative to one another and relative to other complex traits is achieved by systematically testing whether variations in DNA that lead to variations in relative transcript abundances statistically support an independent, causative or reactive function relative to the complex traits under consideration. We show that this approach can predict transcriptional responses to single gene-perturbation experiments using gene-expression data in the context of a segregating mouse population. We also demonstrate the utility of this approach by identifying and experimentally validating the involvement of three new genes in susceptibility to obesity.
Affiliation(s)
- Eric E Schadt
- Rosetta Inpharmatics, Seattle, Washington 98109, USA.
|
91
|
Abstract
Viability selection will change gene frequencies of loci controlling fitness. Consequently, the frequencies of marker loci linked to the viability loci will also change. In genetic mapping, the change of marker allelic frequencies is reflected by the departure from Mendelian segregation ratio. The non-Mendelian segregation of markers has been used to map viability loci along the genome. However, current methods have not been able to detect the amount of selection (s) and the degree of dominance (h) simultaneously. We developed a method to detect both s and h using an F2 mating design under the classical fitness model. We also developed a quantitative genetics model for viability selection by proposing a continuous liability controlling the viability of individuals. With the liability model, mapping viability loci has been formulated as mapping quantitative trait loci. As a result, nongenetic systematic environmental effects can be easily incorporated into the model and subsequently separated from the genetic effects of the viability loci. The quantitative genetic model has been verified with a series of Monte Carlo simulation experiments.
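The link between viability selection and distorted segregation described in this abstract can be illustrated with a small sketch. It assumes one common parameterization of the classical fitness model, with relative fitnesses 1, 1 - hs and 1 - s for the three F2 genotypes; the genotype labels, the function name and the choice of which homozygote is selected against are illustrative assumptions, not details taken from the paper.

```python
def f2_frequencies_after_selection(s, h):
    """Expected F2 genotype frequencies (AA, Aa, aa) after viability selection.

    Assumed parameterization (not taken from the paper): relative
    fitnesses are 1 for AA, 1 - h*s for Aa and 1 - s for aa.
    """
    mendelian = (0.25, 0.5, 0.25)          # 1:2:1 ratio before selection
    fitness = (1.0, 1.0 - h * s, 1.0 - s)  # relative viabilities
    weighted = [p * w for p, w in zip(mendelian, fitness)]
    total = sum(weighted)                  # mean fitness of the F2 population
    if total <= 0:
        raise ValueError("mean fitness must be positive")
    return tuple(x / total for x in weighted)


# With s = 0 the Mendelian ratio is recovered; a recessive lethal
# (s = 1, h = 0) removes the aa class entirely.
print(f2_frequencies_after_selection(0.0, 0.5))
print(f2_frequencies_after_selection(1.0, 0.0))
```

Because the three frequencies depend on both s and h, observed genotype counts at a locus carry information about the two parameters jointly, which is the situation the abstract describes for the F2 design.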
Affiliation(s)
- L Luo
- Department of Botany and Plant Sciences, University of California, Riverside, CA 92521, USA
|
92
|
Abstract
Many disease resistance traits in plants have a polygenic background and the disease phenotypes are modified by environmental factors. As a consequence, the phenotypic values usually show a quantitative variation. The phenotypes of such disease traits, however, are often measured in discrete but ordered categories. These traits are called ordinal traits. In terms of disease resistance, they are called quantitative resistance traits, as opposed to qualitative resistance traits, and are controlled by the quantitative resistance loci (QRL). Classical quantitative trait locus mapping methods are not optimal for ordinal trait analysis because the assumption of normal distribution is violated. Methods for mapping binary trait loci are not suitable either because there are more than two categories in ordinal traits. We developed a maximum likelihood method to map these QRL. The method is implemented via a multicycle expectation-conditional-maximization (ECM) algorithm under the threshold model, where we can estimate both the QRL effects and the thresholds that link the disease liability and the categorical phenotype. The method is verified in simulated data under various combinations of the parameters. An SAS program is available to implement the multicycle ECM algorithm. The program can be downloaded from our website at www.statgen.ucr.edu.
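As a rough illustration of the threshold model mentioned in this abstract, the sketch below computes the probability of each ordered category from a normally distributed liability. It assumes a liability with unit residual variance and fixed, known thresholds; the function names and this standardization are assumptions made for the example, and the sketch covers only the category probabilities, not the ECM estimation procedure.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def category_probabilities(liability_mean, thresholds):
    """Probability of each ordinal category under a threshold model.

    Assumption for this sketch: liability ~ Normal(liability_mean, 1),
    and category k is observed when the liability falls between
    consecutive thresholds.
    """
    cuts = [-math.inf] + sorted(thresholds) + [math.inf]
    cdf = [normal_cdf(c - liability_mean) for c in cuts]
    return [upper - lower for lower, upper in zip(cdf, cdf[1:])]


# Three ordered categories separated by two thresholds; a genotype that
# raises the liability mean shifts probability toward the top category.
print(category_probabilities(0.0, [-0.5, 0.5]))
print(category_probabilities(1.0, [-0.5, 0.5]))
```

In a mapping context the liability mean would depend on the genotype at the putative locus, so genotypes differ in their category probabilities even though only the discrete score is observed.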
Affiliation(s)
- C Xu
- Department of Botany and Plant Sciences, University of California, Riverside, CA 92521, USA
|
93
|
Abstract
In this article, a unified Markov chain Monte Carlo (MCMC) framework is proposed to identify multiple quantitative trait loci (QTL) for complex traits in experimental designs, based on a composite space representation of the problem that has fixed dimension. The proposed unified approach includes the existing Bayesian QTL mapping methods using reversible jump MCMC algorithm as special cases. We also show that a variety of Bayesian variable selection methods using Gibbs sampling can be applied to the composite model space for mapping multiple QTL. The unified framework not only results in some new algorithms, but also gives useful insight into some of the important factors governing the performance of Gibbs sampling and reversible jump for mapping multiple QTL. Finally, we develop strategies to improve the performance of MCMC algorithms.
Affiliation(s)
- Nengjun Yi
- Section on Statistical Genetics, Department of Biostatistics, University of Alabama, Birmingham, 35294-0022, USA.
|
94
|
Zhang YM, Mao Y, Xie C, Smith H, Luo L, Xu S. Mapping quantitative trait loci using naturally occurring genetic variance among commercial inbred lines of maize (Zea mays L.). Genetics 2005; 169:2267-75. [PMID: 15716509 PMCID: PMC1449576 DOI: 10.1534/genetics.104.033217] [Citation(s) in RCA: 82] [Impact Index Per Article: 4.1] [Received: 07/01/2004] [Accepted: 01/07/2005] [Indexed: 11/18/2022] Open
Abstract
Many commercial inbred lines are available in crops. A large amount of genetic variation is preserved among these lines. The genealogical history of the inbred lines is usually well documented. However, quantitative trait loci (QTL) responsible for the genetic variances among the lines are largely unexplored due to lack of statistical methods. In this study, we show that the pedigree information of the lines along with the trait values and marker information can be used to map QTL without the need of further crossing experiments. We develop a Monte Carlo method to estimate locus-specific identity-by-descent (IBD) matrices. These IBD matrices are further incorporated into a mixed-model equation for variance component analysis. QTL variance is estimated and tested at every putative position of the genome. The actual QTL are detected by scanning the entire genome. Applying this new method to a well-documented pedigree of maize (Zea mays L.) that consists of 404 inbred lines, we mapped eight QTL for the maize male flowering trait, growing degree day heat units to pollen shedding (GDUSHD). These detected QTL contributed >80% of the variance observed among the inbred lines. The QTL were then used to evaluate all the inbred lines using the best linear unbiased prediction (BLUP) technique. Superior lines were selected according to the estimated QTL allelic values, a technique called marker-assisted selection (MAS). The MAS procedure implemented via BLUP may be routinely used by breeders to select superior lines and line combinations for development of new cultivars.
Affiliation(s)
- Yuan-Ming Zhang
- Department of Botany and Plant Sciences, University of California, Riverside, 92521-0124, USA
|
95
|
Zhang M, Montooth KL, Wells MT, Clark AG, Zhang D. Mapping multiple Quantitative Trait Loci by Bayesian classification. Genetics 2005; 169:2305-18. [PMID: 15520261 PMCID: PMC1449613 DOI: 10.1534/genetics.104.034181] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.6] [Received: 07/30/2004] [Accepted: 11/01/2004] [Indexed: 12/13/2022] Open
Abstract
We developed a classification approach to multiple quantitative trait loci (QTL) mapping built upon a Bayesian framework that incorporates the important prior information that most genotypic markers are not cotransmitted with a QTL or their QTL effects are negligible. The genetic effect of each marker is modeled using a three-component mixture prior with a class for markers having negligible effects and separate classes for markers having positive or negative effects on the trait. The posterior probability of a marker's classification provides a natural statistic for evaluating credibility of identified QTL. This approach performs well, especially with a large number of markers but a relatively small sample size. A heat map to visualize the results is proposed so as to allow investigators to be more or less conservative when identifying QTL. We validated the method using a well-characterized data set for barley heading values from the North American Barley Genome Mapping Project. Application of the method to a new data set revealed sex-specific QTL underlying differences in glucose-6-phosphate dehydrogenase enzyme activity between two Drosophila species. A simulation study demonstrated the power of this approach across levels of trait heritability and when marker data were sparse.
Affiliation(s)
- Min Zhang
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14853, USA
|
96
|
Abstract
Quantitative trait loci (QTL) mapping has been used in a number of evolutionary studies to investigate the genetic basis of adaptation by mapping individual QTL that explain the differences between differentiated populations and by estimating their effects and interaction in the mapping population. This analysis can provide clues about the evolutionary history of populations and causes of the population differentiation. QTL mapping analysis methods and associated computer programs provide us with tools for such an inference on the genetic basis and architecture of quantitative trait variation in a mapping population. Current methods have the capability to separate and localize multiple QTL and estimate their effects and interaction on a quantitative trait. More recent methods have been targeted to provide a comprehensive inference on the overall genetic architecture of multiple traits in a number of environments. This development is important for evolutionary studies on the genetic basis of multiple trait variation, genotype by environment interaction, host-parasite interaction, and also microarray gene expression QTL analysis.
Affiliation(s)
- Zhao-Bang Zeng
- Department of Statistics, Bioinformatics Research Center, North Carolina State University, Raleigh, NC 27695-7566, USA.
|
97
|
Abstract
In plants and laboratory animals, QTL mapping is commonly performed using F(2) or BC individuals derived from the cross of two inbred lines. Typical QTL mapping statistics assume that each F(2) individual is genotyped for the markers and phenotyped for the trait. For plant traits with low heritability, it has been suggested to use the average phenotypic values of F(3) progeny derived from selfing F(2) plants in place of the F(2) phenotype itself. All F(3) progeny derived from the same F(2) plant belong to the same F(2:3) family, denoted by F(2:3). If the size of each F(2:3) family (the number of F(3) progeny) is sufficiently large, the average value of the family will represent the genotypic value of the F(2) plant, and thus the power of QTL mapping may be significantly increased. The strategy of using F(2) marker genotypes and F(3) average phenotypes for QTL mapping in plants is quite similar to the daughter design of QTL mapping in dairy cattle. We study the fundamental principle of the plant version of the daughter design and develop a new statistical method to map QTL under this F(2:3) strategy. We also propose to combine both the F(2) phenotypes and the F(2:3) average phenotypes to further increase the power of QTL mapping. The statistical method developed in this study differs from published ones in that the new method fully takes advantage of the mixture distribution for F(2:3) families of heterozygous F(2) plants. Incorporation of this new information has significantly increased the statistical power of QTL detection relative to the classical F(2) design, even if only a single F(3) progeny is collected from each F(2:3) family. The mixture model is developed on the basis of a single-QTL model and implemented via the EM algorithm. Substantial computer simulation was conducted to demonstrate the improved efficiency of the mixture model. Extension of the mixture model to multiple QTL analysis is developed using a Bayesian approach. 
The computer program performing the Bayesian analysis of the simulated data is available to users for real data analysis.
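The mixture distribution for F2:3 families of heterozygous F2 plants can be sketched for the simplest case named in the abstract, a single F3 progeny per family under a single-QTL model. The code assumes normally distributed residuals and genotypic values mu + a, mu + d and mu - a for QQ, Qq and qq; these symbols, the genotype labels and the function names are illustrative assumptions, not the paper's notation.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of Normal(mean, sd**2) evaluated at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


# Genotype distribution of one F3 plant given its selfed F2 parent.
# Homozygous parents breed true; a heterozygous parent segregates 1:2:1.
F3_GIVEN_F2 = {
    "QQ": {"QQ": 1.0},
    "Qq": {"QQ": 0.25, "Qq": 0.5, "qq": 0.25},
    "qq": {"qq": 1.0},
}


def f3_density(y, f2_genotype, mu, a, d, sd):
    """Density of a single F3 phenotype y given the F2 parent's QTL genotype.

    Assumed genotypic values (illustrative): QQ = mu + a, Qq = mu + d,
    qq = mu - a, with normal residuals of standard deviation sd.
    """
    means = {"QQ": mu + a, "Qq": mu + d, "qq": mu - a}
    return sum(
        weight * normal_pdf(y, means[genotype], sd)
        for genotype, weight in F3_GIVEN_F2[f2_genotype].items()
    )


# For a heterozygous F2 parent the F3 phenotype follows a three-component
# mixture rather than a single normal distribution.
print(f3_density(0.0, "Qq", mu=0.0, a=1.0, d=0.0, sd=1.0))
```

When the F2 genotype itself is unobserved, a likelihood of this kind would additionally be weighted by the conditional probabilities of the F2 genotypes given the flanking markers, which this sketch leaves out.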
Affiliation(s)
- Yuan-Ming Zhang
- Department of Botany and Plant Sciences, University of California, Riverside, California 92521, USA
|
98
|
van Eeuwijk FA, Malosetti M, Yin X, Struik PC, Stam P. Statistical models for genotype by environment data: from conventional ANOVA models to eco-physiological QTL models. Aust J Agric Res 2005. [DOI: 10.1071/ar05153] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.0] [Indexed: 11/23/2022]
Abstract
To study the performance of genotypes under different growing conditions, plant breeders evaluate their germplasm in multi-environment trials. These trials produce genotype × environment data. We present statistical models for the analysis of such data that differ in the extent to which additional genetic, physiological, and environmental information is incorporated into the model formulation. The simplest model in our exposition is the additive 2-way analysis of variance model, without genotype × environment interaction, and with parameters whose interpretation depends strongly on the set of included genotypes and environments. The most complicated model is a synthesis of a multiple quantitative trait locus (QTL) model and an eco-physiological model to describe a collection of genotypic response curves. Between those extremes, we discuss linear-bilinear models, whose parameters can only indirectly be related to genetic and physiological information, and factorial regression models that allow direct incorporation of explicit genetic, physiological, and environmental covariables on the levels of the genotypic and environmental factors. Factorial regression models are also very suitable for the modelling of QTL main effects and QTL × environment interaction. Our conclusion is that statistical and physiological models can be fruitfully combined for the study of genotype × environment interaction.
|
99
|
Abstract
Joint mapping for multiple quantitative traits has shed new light on genetic mapping by pinpointing pleiotropic effects and close linkage. Joint mapping also can improve statistical power of QTL detection. However, such a joint mapping procedure has not been available for discrete traits. Most disease resistance traits are measured as one or more discrete characters. These discrete characters are often correlated. Joint mapping for multiple binary disease traits may provide an opportunity to explore pleiotropic effects and increase the statistical power of detecting disease loci. We develop a maximum-likelihood method for mapping multiple binary traits. We postulate a set of multivariate normal disease liabilities, each contributing to the phenotypic variance of one disease trait. The underlying liabilities are linked to the binary phenotypes through some underlying thresholds. The new method actually maps loci for the variation of multivariate normal liabilities. As a result, we are able to take advantage of existing methods of joint mapping for quantitative traits. We treat the multivariate liabilities as missing values so that an expectation-maximization (EM) algorithm can be applied here. We also extend the method to joint mapping for both discrete and continuous traits. Efficiency of the method is demonstrated using simulated data. We also apply the new method to a set of real data and detect several loci responsible for blast resistance in rice.
Affiliation(s)
- Chenwu Xu
- Department of Botany and Plant Sciences, University of California, Riverside, California 92521, USA
|
100
|
Abstract
In plants and laboratory animals, QTL mapping is commonly performed using F2 or BC individuals derived from the cross of two inbred lines. Typical QTL mapping statistics assume that each F2 individual is genotyped for the markers and phenotyped for the trait. For plant traits with low heritability, it has been suggested to use the average phenotypic values of F3 progeny derived from selfing F2 plants in place of the F2 phenotype itself. All F3 progeny derived from the same F2 plant belong to the same F2:3 family, denoted by F2:3. If the size of each F2:3 family (the number of F3 progeny) is sufficiently large, the average value of the family will represent the genotypic value of the F2 plant, and thus the power of QTL mapping may be significantly increased. The strategy of using F2 marker genotypes and F3 average phenotypes for QTL mapping in plants is quite similar to the daughter design of QTL mapping in dairy cattle. We study the fundamental principle of the plant version of the daughter design and develop a new statistical method to map QTL under this F2:3 strategy. We also propose to combine both the F2 phenotypes and the F2:3 average phenotypes to further increase the power of QTL mapping. The statistical method developed in this study differs from published ones in that the new method fully takes advantage of the mixture distribution for F2:3 families of heterozygous F2 plants. Incorporation of this new information has significantly increased the statistical power of QTL detection relative to the classical F2 design, even if only a single F3 progeny is collected from each F2:3 family. The mixture model is developed on the basis of a single-QTL model and implemented via the EM algorithm. Substantial computer simulation was conducted to demonstrate the improved efficiency of the mixture model. Extension of the mixture model to multiple QTL analysis is developed using a Bayesian approach. 
The computer program performing the Bayesian analysis of the simulated data is available to users for real data analysis.
Affiliation(s)
- Yuan-Ming Zhang
- Department of Botany and Plant Sciences, University of California, Riverside, California 92521
- Shizhong Xu
- Department of Botany and Plant Sciences, University of California, Riverside, California 92521
|