151
Rothammer S, Kremer PV, Bernau M, Fernandez-Figares I, Pfister-Schär J, Medugorac I, Scholz AM. Genome-wide QTL mapping of nine body composition and bone mineral density traits in pigs. Genet Sel Evol 2014; 46:68. [PMID: 25359100 PMCID: PMC4210560 DOI: 10.1186/s12711-014-0068-2] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 04/10/2014] [Accepted: 09/19/2014] [Indexed: 12/20/2022] Open
Abstract
Background: Since the pig is one of the most important livestock animals worldwide, mapping loci that are associated with economically important traits and/or traits that influence animal welfare is extremely relevant for efficient future pig breeding. Therefore, the purpose of this study was a genome-wide mapping of quantitative trait loci (QTL) associated with nine body composition and bone mineral traits: absolute (Fat, Lean) and percentage (FatPC, LeanPC) fat and lean mass, live weight (Weight), soft tissue X-ray attenuation coefficient (R), absolute (BMC) and percentage (BMCPC) bone mineral content and bone mineral density (BMD).
Methods: Data on the nine traits investigated were obtained by dual-energy X-ray absorptiometry for 551 pigs that were between 160 and 200 days old. In addition, all pigs were genotyped using Illumina’s PorcineSNP60 Genotyping BeadChip. Based on these data, a genome-wide combined linkage and linkage disequilibrium analysis was conducted. Thus, we used 44 611 sliding windows that each consisted of 20 adjacent single nucleotide polymorphisms (SNPs). For the middle of each sliding window a variance component analysis was carried out using ASReml. The underlying mixed linear model included random QTL and polygenic effects, with fixed effects of sex, housing, season and age.
Results: Using a Bonferroni-corrected genome-wide significance threshold of P < 0.001, significant peaks were identified for all traits except BMCPC. Overall, we identified 72 QTL on 16 chromosomes, of which 24 were significantly associated with one trait only and the remaining with more than one trait. For example, a QTL on chromosome 2 included the highest peak across the genome for four traits (Fat, FatPC, LeanPC and R). The nearby gene, ZNF608, is known to be associated with body mass index in humans and involved in starvation in Drosophila, which makes it an extremely good candidate gene for this QTL.
Conclusions: Our QTL mapping approach identified 72 QTL, some of which confirmed results of previous studies in pigs. However, we also detected significant associations that have not been published before and were able to identify a number of new and promising candidate genes, such as ZNF608.
Electronic supplementary material: The online version of this article (doi:10.1186/s12711-014-0068-2) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Ivica Medugorac
  - Chair of Animal Genetics and Husbandry, Ludwig-Maximilians-University Munich, Veterinärstrasse 13, Munich, 80539, Germany.
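The sliding-window layout described in the Methods of entry 151 (windows of 20 adjacent SNPs, each analysed at its middle) can be sketched as follows. This is only an illustration: the function name `sliding_windows`, the step of one SNP, and the treatment of all SNPs as one ordered sequence (ignoring chromosome boundaries) are assumptions, since the abstract does not state them.

```python
def sliding_windows(n_snps, size=20, step=1):
    """Yield (start, end, mid) for windows of `size` adjacent SNPs.

    start is inclusive, end is exclusive, and mid is the position halfway
    between the first and last SNP of the window, in index units.
    The step of 1 is an assumption, not taken from the abstract.
    """
    for start in range(0, n_snps - size + 1, step):
        end = start + size
        mid = start + (size - 1) / 2
        yield start, end, mid


# Toy example with 25 ordered SNPs: 25 - 20 + 1 = 6 windows
windows = list(sliding_windows(n_snps=25))
print(len(windows))
print(windows[0], windows[-1])
```

With a step of one SNP, a run of n ordered SNPs yields n - 20 + 1 windows, so the window count stays close to the marker count.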
152
Abstract
The application of quantitative genetics in plant and animal breeding has largely focused on additive models, which may also capture dominance and epistatic effects. Partitioning genetic variance into its additive and nonadditive components using pedigree-based best linear unbiased predictor (P-BLUP) models is difficult with most commonly available family structures. However, the availability of dense panels of molecular markers makes possible the use of additive- and dominance-realized genomic relationships for the estimation of variance components and the prediction of genetic values with a genomic best linear unbiased predictor (G-BLUP). We evaluated height data from a multifamily population of the tree species Pinus taeda with a systematic series of models accounting for additive, dominance, and first-order epistatic interactions (additive by additive, dominance by dominance, and additive by dominance), using either pedigree- or marker-based information. We show that, compared with the pedigree, use of realized genomic relationships in marker-based models yields a substantially more precise separation of additive and nonadditive components of genetic variance. We conclude that the marker-based relationship matrices in a model including additive and nonadditive effects performed better, improving breeding value prediction. Moreover, our results suggest that, for tree height in this population, the additive and nonadditive components of genetic variance are similar in magnitude. This novel result improves our current understanding of the genetic control and architecture of a quantitative trait and should be considered when developing breeding strategies.
153
Genetic Basis of Complex Genetic Disease: The Contribution of Disease Heterogeneity to Missing Heritability. Curr Epidemiol Rep 2014. [DOI: 10.1007/s40471-014-0023-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 01/10/2023]
154
Abstract
Parasite burden varies widely between individuals within a population, and can covary with multiple aspects of individual phenotype. Here we investigate the sources of variation in faecal strongyle egg counts, and their association with body weight and a suite of haematological measures, in a cohort of indigenous zebu calves in Western Kenya, using relatedness matrices reconstructed from single nucleotide polymorphism (SNP) genotypes. Strongyle egg count was heritable ($h^2$ = 23·9%, s.e. = 11·8%) and we also found heritability of white blood cell counts (WBC) ($h^2$ = 27·6%, s.e. = 10·6%). All the traits investigated showed negative phenotypic covariances with strongyle egg count throughout the first year: high worm counts were associated with low values of WBC, red blood cell count, total serum protein and absolute eosinophil count. Furthermore, calf body weight at 1 week old was a significant predictor of strongyle eggs per gram of faeces (EPG) at 16–51 weeks, with smaller calves having a higher strongyle egg count later in life. Our results indicate a genetic basis to strongyle EPG in this population, and also reveal consistently strong negative associations between strongyle infection and other important aspects of the multivariate phenotype.
155
Auvray B, McEwan JC, Newman SAN, Lee M, Dodds KG. Genomic prediction of breeding values in the New Zealand sheep industry using a 50K SNP chip. J Anim Sci 2014; 92:4375-89. [PMID: 25149326 DOI: 10.2527/jas.2014-7801] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Indexed: 01/04/2023] Open
Abstract
The aim of genomic prediction is to predict breeding value from genomic data. We describe the development of genomic prediction equations and accuracies for molecular breeding values (MBV) for industry use, focusing on the methodology used to deal with predictions for the New Zealand sheep population structure. This is made up of a mixture of pure and crossbred animals, but principally Romney based. In particular, we used pedigree-based EBV for 8 traits (weaning weight as a direct effect, weaning weight as a maternal effect, live weight at 8 mo, live weight at 12 mo, greasy fleece weight at 12 mo, lamb fleece weight, adult fleece weight, and number of lambs born) and Illumina OvineSNP50 BeadChip genotypes from 13,420 animals to investigate BLUP with different genomic relationship matrices (GRM) based on SNP markers and to investigate varying sets of older animals (training sets) to predict the MBV of younger animals (validation sets). The GRM tested included modifications to account for allele frequency differences between breeds, rescaling so that the mean GRM is equal to the mean of the traditional pedigree numerator relationship matrix A, and combining of the GRM with A using a convex combination with a weight estimated by maximizing a conditional restricted likelihood. We found that these modifications were beneficial and recommend using a breed-adjusted GRM combined with A. Training data sets with Romney, Coopworth, and Perendale animals all together usually predicted better than using just a pure breed training data set for all traits. But predictions for the breed Perendale were more accurate with a Perendale training set for 3 of the 8 traits. We concluded that using a mixed-breed training set for all combinations of traits and breeds was best but advise that increasing the number of Perendale animals genotyped should be a priority to increase the MBV accuracies obtained for that breed.
Affiliation(s)
- B Auvray
  - Animal Productivity Group, AgResearch Limited, Mosgiel 9053, New Zealand
- J C McEwan
  - Animal Productivity Group, AgResearch Limited, Mosgiel 9053, New Zealand
- S-A N Newman
  - Animal Productivity Group, AgResearch Limited, Mosgiel 9053, New Zealand
- M Lee
  - Animal Productivity Group, AgResearch Limited, Mosgiel 9053, New Zealand
- K G Dodds
  - Animal Productivity Group, AgResearch Limited, Mosgiel 9053, New Zealand
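The convex combination of a genomic relationship matrix with the pedigree matrix A mentioned in entry 155 can be illustrated with a minimal sketch. The function name `blend_relationships`, the toy matrices, and the fixed weight are hypothetical; in particular, the abstract estimates the weight by maximizing a conditional restricted likelihood, which is not attempted here, and it does not state which matrix receives the weight w.

```python
def blend_relationships(G, A, w):
    """Return w*G + (1 - w)*A element-wise for square matrices as nested lists.

    Assumption: w is applied to the genomic matrix G and (1 - w) to the
    pedigree matrix A; the abstract does not specify this.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1] for a convex combination")
    if len(G) != len(A) or any(len(g) != len(a) for g, a in zip(G, A)):
        raise ValueError("G and A must have the same shape")
    return [[w * g + (1.0 - w) * a for g, a in zip(g_row, a_row)]
            for g_row, a_row in zip(G, A)]


# Toy example: two half-sibs (pedigree relationship 0.25), made-up genomic values
A = [[1.0, 0.25], [0.25, 1.0]]
G = [[1.02, 0.31], [0.31, 0.97]]
H = blend_relationships(G, A, w=0.9)
print(H)
```

Because the weights sum to one, each blended element lies between the corresponding genomic and pedigree values.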
156
Boison S, Neves H, Pérez O’Brien A, Utsunomiya Y, Carvalheiro R, da Silva M, Sölkner J, Garcia J. Imputation of non-genotyped individuals using genotyped progeny in Nellore, a Bos indicus cattle breed. Livest Sci 2014. [DOI: 10.1016/j.livsci.2014.05.033] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 01/08/2023]
157
158
Toro MA, Villanueva B, Fernández J. Genomics applied to management strategies in conservation programmes. Livest Sci 2014. [DOI: 10.1016/j.livsci.2014.04.020] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 01/11/2023]
159
Eu-ahsunthornwattana J, Miller EN, Fakiola M, Jeronimo SMB, Blackwell JM, Cordell HJ. Comparison of methods to account for relatedness in genome-wide association studies with family-based data. PLoS Genet 2014; 10:e1004445. [PMID: 25033443 PMCID: PMC4102448 DOI: 10.1371/journal.pgen.1004445] [Citation(s) in RCA: 83] [Impact Index Per Article: 8.3] [Received: 09/20/2013] [Accepted: 05/02/2014] [Indexed: 11/23/2022] Open
Abstract
Approaches based on linear mixed models (LMMs) have recently gained popularity for modelling population substructure and relatedness in genome-wide association studies. In the last few years, a bewildering variety of different LMM methods/software packages have been developed, but it is not always clear how (or indeed whether) any newly-proposed method differs from previously-proposed implementations. Here we compare the performance of several LMM approaches (and software implementations, including EMMAX, GenABEL, FaST-LMM, Mendel, GEMMA and MMM) via their application to a genome-wide association study of visceral leishmaniasis in 348 Brazilian families comprising 3626 individuals (1972 genotyped). The implementations differ in precise details of methodology implemented and through various user-chosen options such as the method and number of SNPs used to estimate the kinship (relatedness) matrix. We investigate sensitivity to these choices and the success (or otherwise) of the approaches in controlling the overall genome-wide error-rate for both real and simulated phenotypes. We compare the LMM results to those obtained using traditional family-based association tests (based on transmission of alleles within pedigrees) and to alternative approaches implemented in the software packages MQLS, ROADTRIPS and MASTOR. We find strong concordance between the results from different LMM approaches, and all are successful in controlling the genome-wide error rate (except for some approaches when applied naively to longitudinal data with many repeated measures). We also find high correlation between LMMs and alternative approaches (apart from transmission-based approaches when applied to SNPs with small or non-existent effects). We conclude that LMM approaches perform well in comparison to competing approaches. 
Given their strong concordance, in most applications, the choice of precise LMM implementation cannot be based on power/type I error considerations but must instead be based on considerations such as speed and ease-of-use.

Author summary: Recently, statistical approaches known as linear mixed models (LMMs) have become popular for analysing data from genome-wide association studies. In the last few years, a bewildering variety of different LMM methods/software packages have been developed, but it has not always been clear how (or indeed whether) any newly-proposed method differs from previously-proposed implementations. Here we compare the performance of several different LMM approaches (and software implementations) via their application to a genome-wide association study of visceral leishmaniasis in 348 Brazilian families comprising 3626 individuals. We also compare the LMM results to those obtained using alternative analysis methods. Overall, we find strong concordance between the results from the different LMM approaches and high correlation between the results from LMMs and most alternative approaches. We conclude that LMM approaches perform well in comparison to competing approaches and, in most applications, the precise LMM implementation will not be too important, and can be chosen on the basis of speed or convenience.
Affiliation(s)
- Jakris Eu-ahsunthornwattana
  - Institute of Genetic Medicine, Newcastle University, International Centre for Life, Newcastle upon Tyne, United Kingdom
  - Division of Medical Genetics, Department of Internal Medicine, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Ratchathevi, Bangkok, Thailand
- E. Nancy Miller
  - Cambridge Institute for Medical Research, University of Cambridge School of Clinical Medicine, Addenbrooke's Hospital, Cambridge, United Kingdom
- Michaela Fakiola
  - Cambridge Institute for Medical Research, University of Cambridge School of Clinical Medicine, Addenbrooke's Hospital, Cambridge, United Kingdom
- Selma M. B. Jeronimo
  - Department of Biochemistry, Center for Biosciences, Universidade Federal do Rio Grande do Norte, Natal, Brazil
- Jenefer M. Blackwell
  - Cambridge Institute for Medical Research, University of Cambridge School of Clinical Medicine, Addenbrooke's Hospital, Cambridge, United Kingdom
  - Telethon Institute for Child Health Research, Centre for Child Health Research, The University of Western Australia, Subiaco, Western Australia, Australia
- Heather J. Cordell
  - Institute of Genetic Medicine, Newcastle University, International Centre for Life, Newcastle upon Tyne, United Kingdom
160
Wang H, Misztal I, Legarra A. Differences between genomic-based and pedigree-based relationships in a chicken population, as a function of quality control and pedigree links among individuals. J Anim Breed Genet 2014; 131:445-51. [PMID: 25039816 DOI: 10.1111/jbg.12109] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 12/05/2013] [Accepted: 06/24/2014] [Indexed: 12/01/2022]
Abstract
This work studied differences between expected (calculated from pedigree) and realized (genomic, from markers) relationships in a real population, the influence of quality control on these differences, and their fit to current theory. Data included 4940 pure line chickens across five generations genotyped for 57,636 SNP. Pedigrees (5762 animals) were available for all five generations, with the pedigree starting at the first generation. Three levels of quality control were used. With no quality control, the mean difference between realized and expected relationships for different types of relationships was ≤ 0.04 with standard deviation ≤ 0.10. With strong quality control (call rate ≥ 0.9, parent-progeny conflicts, minor allele frequency and use of only autosomal chromosomes), these values fell to ≤ 0.02 and ≤ 0.04, respectively. While the maximum difference was 1.02 with the complete data, it was only 0.18 with the latest three generations of genotypes (but including all pedigrees). Variation of expected minus realized relationships agreed with theoretical developments and suggests an effective number of loci of 70 for this population. When the pedigree is complete and as deep as the genotypes, the standard deviation of the difference between the expected and realized relationships is around 0.04 across all relationship categories combined. A standard deviation of differences larger than 0.10 suggests poor quality control, mistakes in pedigree recording or genotype labelling, or insufficient depth of pedigree.
Affiliation(s)
- H Wang
  - Genus Plc, Hendersonville, TN, USA
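The rule of thumb in entry 160 (a standard deviation of the differences above 0.10 points to data problems) lends itself to a small diagnostic. The sketch below is illustrative only: the function name `relationship_diagnostics`, the dictionary keys, and the toy relationship values are made up, and only the 0.10 threshold comes from the abstract.

```python
from statistics import mean, stdev


def relationship_diagnostics(expected, realized, sd_threshold=0.10):
    """Summarise realized minus expected relationships for matched pairs.

    The sign convention does not affect the standard deviation.
    sd_threshold=0.10 follows the rule of thumb quoted in the abstract.
    """
    diffs = [r - e for e, r in zip(expected, realized)]
    sd = stdev(diffs)  # sample standard deviation
    return {"mean_diff": mean(diffs), "sd_diff": sd, "flag": sd > sd_threshold}


# Made-up pedigree (expected) and genomic (realized) relationships for five pairs
expected = [0.50, 0.25, 0.25, 0.125, 0.0]
realized = [0.52, 0.22, 0.27, 0.10, 0.03]
report = relationship_diagnostics(expected, realized)
print(report)
```

A flagged result would only suggest where to look (quality control, pedigree errors, sample labelling, or pedigree depth); it does not say which of these is responsible.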
161
Abstract
The use of genetically isolated populations can empower next-generation association studies. In this review, we discuss the advantages of this approach and review study design and analytical considerations of genetic association studies focusing on isolates. We cite successful examples of using population isolates in association studies and outline potential ways forward.
162
Bosse M, Megens HJ, Madsen O, Frantz LAF, Paudel Y, Crooijmans RPMA, Groenen MAM. Untangling the hybrid nature of modern pig genomes: a mosaic derived from biogeographically distinct and highly divergent Sus scrofa populations. Mol Ecol 2014; 23:4089-102. [PMID: 24863459 DOI: 10.1111/mec.12807] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.9] [Received: 02/21/2014] [Revised: 05/20/2014] [Accepted: 05/21/2014] [Indexed: 01/05/2023]
Abstract
The merging of populations after an extended period of isolation and divergence is a common phenomenon, in natural settings as well as due to human interference. Individuals with such hybrid origins contain genomes that essentially form a mosaic of different histories and demographies. Pigs are an excellent model species to study hybridization because European and Asian wild boars diverged ~1.2 Mya, and pigs were domesticated independently in Europe and Asia. During the Industrial Revolution in England, pigs were imported from China to improve the local pigs. This study utilizes the latest genomics tools to identify the origin of haplotypes in European domesticated pigs that are descendant from Asian and European populations. Our results reveal fine-scale haplotype structure representing different ancient demographic events, as well as a mosaic composition of those distinct histories due to recently introgressed haplotypes in the pig genome. As a consequence, nucleotide diversity in the genome of European domesticated pigs is higher when at least one haplotype of Asian origin is present, and haplotype length correlates negatively with recombination frequency and nucleotide diversity. Another consequence is that the inference of past effective population size is influenced by the background of the haplotypes in an individual, but we demonstrate that by careful sorting based on the origin of haplotypes, both distinct demographic histories can be reconstructed. Future detailed mapping of the genomic distribution of variation will enable a targeted approach to increase genetic diversity of captive and wild populations, thus facilitating conservation efforts in the near future.
Affiliation(s)
- Mirte Bosse
  - Animal Breeding and Genomics Centre, Wageningen University, De Elst 1 Zodiac, P.O. Box 338, Wageningen, 6708WD, the Netherlands
163
Anche MT, de Jong MCM, Bijma P. On the definition and utilization of heritable variation among hosts in reproduction ratio R0 for infectious diseases. Heredity (Edinb) 2014; 113:364-74. [PMID: 24824286 DOI: 10.1038/hdy.2014.38] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Received: 12/05/2013] [Revised: 03/18/2014] [Accepted: 03/21/2014] [Indexed: 01/30/2023] Open
Abstract
Infectious diseases have a major role in evolution by natural selection and pose a worldwide concern in livestock. Understanding quantitative genetics of infectious diseases, therefore, is essential both for understanding the consequences of natural selection and for designing artificial selection schemes in agriculture. The basic reproduction ratio, R0, is the key parameter determining risk and severity of infectious diseases. Genetic improvement for control of infectious diseases in host populations should therefore aim at reducing R0. This requires definitions of breeding value and heritable variation for R0, and understanding of mechanisms determining response to selection. This is challenging, as R0 is an emergent trait arising from interactions among individuals in the population. Here we show how to define breeding value and heritable variation for R0 for genetically heterogeneous host populations. Furthermore, we identify mechanisms determining utilization of heritable variation for R0. Using indirect genetic effects, next-generation matrices and a SIR (Susceptible, Infected and Recovered) model, we show that an individual's breeding value for R0 is a function of its own allele frequencies for susceptibility and infectivity and of population average susceptibility and infectivity. When interacting individuals are unrelated, selection for individual disease status captures heritable variation in susceptibility only, yielding limited response in R0. With related individuals, however, there is a secondary selection process, which also captures heritable variation in infectivity and additional variation in susceptibility, yielding substantially greater response. This shows that genetic variation in susceptibility represents an indirect genetic effect. As a consequence, response in R0 increased substantially when interacting individuals were genetically related.
Affiliation(s)
- M T Anche
  - [1] Animal Breeding and Genomics Centre, Wageningen Institute of Animal Sciences (WIAS), Wageningen University, Wageningen, The Netherlands [2] Quantitative Veterinary Epidemiology Group, Wageningen Institute of Animal Sciences (WIAS), Wageningen University, Wageningen, The Netherlands
- M C M de Jong
  - Quantitative Veterinary Epidemiology Group, Wageningen Institute of Animal Sciences (WIAS), Wageningen University, Wageningen, The Netherlands
- P Bijma
  - Animal Breeding and Genomics Centre, Wageningen Institute of Animal Sciences (WIAS), Wageningen University, Wageningen, The Netherlands
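Entry 163 relates R0 to host susceptibility and infectivity through next-generation matrices. A minimal sketch of that idea, under a strong simplifying assumption, is given below: transmission from host type j to host type i is taken to be proportional to the frequency and susceptibility of type i times the infectivity of type j, scaled by a constant `c` (for example a contact rate times the mean infectious period). The function names, the constant `c`, and the numbers are hypothetical and are not taken from the paper.

```python
def next_generation_matrix(freq, susceptibility, infectivity, c):
    """K[i][j]: expected new infections of type i caused by one infected of type j.

    Assumes separable transmission: c * freq[i] * susceptibility[i] * infectivity[j].
    """
    n = len(freq)
    return [[c * freq[i] * susceptibility[i] * infectivity[j] for j in range(n)]
            for i in range(n)]


def dominant_eigenvalue(K, iterations=100):
    """Approximate the largest eigenvalue of a non-negative matrix by power iteration."""
    n = len(K)
    v = [1.0] * n
    value = 0.0
    for _ in range(iterations):
        w = [sum(K[i][j] * v[j] for j in range(n)) for i in range(n)]
        value = max(abs(x) for x in w)
        if value == 0.0:
            return 0.0
        v = [x / value for x in w]  # rescale so the largest entry is 1
    return value


# Two host types with made-up frequencies, susceptibilities and infectivities
K = next_generation_matrix(freq=[0.6, 0.4], susceptibility=[1.0, 0.5],
                           infectivity=[1.0, 2.0], c=2.0)
r0 = dominant_eigenvalue(K)
print(r0)
```

Under this separable assumption the matrix has rank one, so its dominant eigenvalue equals c times the frequency-weighted sum of susceptibility times infectivity; an infection can spread in a fully susceptible population only when R0 exceeds 1.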
164
Beaulieu J, Doerksen T, Clément S, MacKay J, Bousquet J. Accuracy of genomic selection models in a large population of open-pollinated families in white spruce. Heredity (Edinb) 2014; 113:343-52. [PMID: 24781808 DOI: 10.1038/hdy.2014.36] [Citation(s) in RCA: 82] [Impact Index Per Article: 8.2] [Received: 09/12/2013] [Revised: 03/16/2014] [Accepted: 03/21/2014] [Indexed: 12/16/2022] Open
Abstract
Genomic selection (GS) is of interest in breeding because of its potential for predicting the genetic value of individuals and increasing genetic gains per unit of time. To date, very few studies have reported empirical results of GS potential in the context of large population sizes and long breeding cycles such as for boreal trees. In this study, we assessed the effectiveness of marker-aided selection in an undomesticated white spruce (Picea glauca (Moench) Voss) population of large effective size using a GS approach. A discovery population of 1694 trees representative of 214 open-pollinated families from 43 natural populations was phenotyped for 12 wood and growth traits and genotyped for 6385 single-nucleotide polymorphisms (SNPs) mined in 2660 gene sequences. GS models were built to predict estimated breeding values using all the available SNPs or SNP subsets of the largest absolute effects, and they were validated using various cross-validation schemes. The accuracy of genomic estimated breeding values (GEBVs) varied from 0.327 to 0.435 when the training and the validation data sets shared half-sibs, which was on average 90% of the accuracy achieved with traditionally estimated breeding values. The trend was the same for validation across sites. As expected, the accuracy of GEBVs obtained after cross-validation with individuals of unknown relatedness was lower, at about half of the accuracy achieved when half-sibs were present. We showed that with the marker densities used in the current study, predictions with low to moderate accuracy could be obtained within a large undomesticated population of related individuals, potentially resulting in larger gains per unit of time with GS than with the traditional approach.
Affiliation(s)
- J Beaulieu
  - [1] Natural Resources Canada, Canadian Wood Fibre Centre, Québec, Québec, Canada [2] Canada Research Chair in Forest and Environmental Genomics and Institute for Systems and Integrative Biology, Université Laval, Québec, Québec, Canada
- T Doerksen
  - [1] Natural Resources Canada, Canadian Wood Fibre Centre, Québec, Québec, Canada [2] Canada Research Chair in Forest and Environmental Genomics and Institute for Systems and Integrative Biology, Université Laval, Québec, Québec, Canada
- S Clément
  - Natural Resources Canada, Canadian Wood Fibre Centre, Québec, Québec, Canada
- J MacKay
  - Canada Research Chair in Forest and Environmental Genomics and Institute for Systems and Integrative Biology, Université Laval, Québec, Québec, Canada
- J Bousquet
  - Canada Research Chair in Forest and Environmental Genomics and Institute for Systems and Integrative Biology, Université Laval, Québec, Québec, Canada
165
Chen GB. Estimating heritability of complex traits from genome-wide association studies using IBS-based Haseman-Elston regression. Front Genet 2014; 5:107. [PMID: 24817879 PMCID: PMC4012219 DOI: 10.3389/fgene.2014.00107] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Received: 10/23/2013] [Accepted: 04/10/2014] [Indexed: 11/13/2022] Open
Abstract
Exploring heritability of complex traits is a central focus of statistical genetics. Among various previously proposed methods to estimate heritability, variance component methods are advantageous when estimating heritability using markers. Due to the high-dimensional nature of data obtained from genome-wide association studies (GWAS) in which genetic architecture is often unknown, the most appropriate heritability estimator model is often unclear. The Haseman–Elston (HE) regression is a variance component method that was initially only proposed for linkage studies. However, this study presents a theoretical basis for a modified HE that models linkage disequilibrium for a quantitative trait, and consequently can be used for GWAS. After replacing identical by descent (IBD) scores with identity by state (IBS) scores, we applied the IBS-based HE regression to single-marker association studies (scenario I) and estimated the variance component using multiple markers (scenario II). In scenario II, we discuss the circumstances in which the HE regression and the mixed linear model are equivalent; the disparity between these two methods is observed when a covariance component exists for the additive variance. When we extended the IBS-based HE regression to case-control studies in a subsequent simulation study, we found that it provided a nearly unbiased estimate of heritability, more precise than that estimated via the mixed linear model. Thus, for the case-control scenario, the HE regression is preferable. GEnetic Analysis Repository (GEAR; http://sourceforge.net/p/gbchen/wiki/GEAR/) software implemented the HE regression method and is freely available.
Affiliation(s)
- Guo-Bo Chen
  - Queensland Brain Institute, The University of Queensland St. Lucia, QLD, Australia
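The core idea behind the Haseman-Elston regression in entry 165, with identity-by-state (IBS) scores in place of identity-by-descent scores, can be sketched as an ordinary least-squares regression of squared pairwise phenotype differences on pairwise IBS sharing. This is a generic illustration in the spirit of the classic method, not the estimator derived in the paper or implemented in GEAR; the function names, the 0/1/2 genotype coding, and the toy data are assumptions.

```python
from itertools import combinations


def ibs_score(g1, g2):
    """Mean number of alleles shared identical by state across markers (0 to 2).

    Genotypes are assumed to be coded as 0, 1 or 2 copies of one allele.
    """
    return sum(2 - abs(a - b) for a, b in zip(g1, g2)) / len(g1)


def he_regression_slope(genotypes, phenotypes):
    """OLS slope of squared phenotype differences on IBS scores over all pairs."""
    xs, ys = [], []
    for i, j in combinations(range(len(phenotypes)), 2):
        xs.append(ibs_score(genotypes[i], genotypes[j]))
        ys.append((phenotypes[i] - phenotypes[j]) ** 2)
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("IBS scores do not vary across pairs")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy data: three individuals, two markers, made-up phenotypes
genotypes = [[0, 0], [0, 1], [2, 2]]
phenotypes = [1.0, 1.5, 3.0]
slope = he_regression_slope(genotypes, phenotypes)
print(slope)
```

A negative slope means that pairs sharing more alleles tend to have more similar phenotypes; turning such a slope into a heritability estimate needs the scaling worked out in the paper, which is not reproduced here.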
166
Lee JJ, Chow CC. Conditions for the validity of SNP-based heritability estimation. Hum Genet 2014; 133:1011-22. [PMID: 24744256 DOI: 10.1007/s00439-014-1441-5] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 01/09/2014] [Accepted: 03/28/2014] [Indexed: 01/05/2023]
Abstract
The heritability of a trait ($h^2$) is the proportion of its population variance caused by genetic differences, and estimates of this parameter are important for interpreting the results of genome-wide association studies (GWAS). In recent years, researchers have adopted a novel method for estimating a lower bound on heritability directly from GWAS data that uses realized genetic similarities between nominally unrelated individuals. The quantity estimated by this method is purported to be the contribution to heritability that could in principle be recovered from association studies employing the given panel of SNPs ($h^2_{\mathrm{SNP}}$). Thus far, the validity of this approach has mostly been tested empirically. Here, we provide a mathematical explication and show that the method should remain a robust means of obtaining $h^2_{\mathrm{SNP}}$ under circumstances wider than those under which it has so far been derived.
Affiliation(s)
- James J Lee
  - Department of Psychology, University of Minnesota Twin Cities, Minneapolis, MN, 55455, USA
167
Lee JJ, Chow CC. Conditions for the validity of SNP-based heritability estimation. Hum Genet 2014. [DOI: 10.1007/s00439-014-1441-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/29/2022]
168
Berry DP, Coffey MP, Pryce JE, de Haas Y, Løvendahl P, Krattenmacher N, Crowley JJ, Wang Z, Spurlock D, Weigel K, Macdonald K, Veerkamp RF. International genetic evaluations for feed intake in dairy cattle through the collation of data from multiple sources. J Dairy Sci 2014; 97:3894-905. [PMID: 24731627 DOI: 10.3168/jds.2013-7548] [Citation(s) in RCA: 87] [Impact Index Per Article: 8.7] [Received: 10/01/2013] [Accepted: 02/28/2014] [Indexed: 11/19/2022]
Abstract
Feed represents a large proportion of the variable costs in dairy production systems. The omission of feed intake measures explicitly from national dairy cow breeding objectives is predominantly due to a lack of information from which to make selection decisions. However, individual cow feed intake data are available in different countries, mostly from research or nucleus herds. None of these data sets is sufficiently large on its own to generate accurate genetic evaluations. In the current study, we collate data from 10 populations in 9 countries and estimate genetic parameters for dry matter intake (DMI). A total of 224,174 test-day records from 10,068 parity 1 to 5 records of 6,957 cows were available, as well as records from 1,784 growing heifers. Random regression models were fit to the lactating cow test-day records and predicted feed intake at 70 d postcalving was extracted from these fitted profiles. The random regression model included a fixed polynomial regression for each lactation separately, as well as herd-year-season of calving and experimental treatment as fixed effects; random effects fit in the model included individual animal deviation from the fixed regression for each parity as well as mean herd-specific deviations from the fixed regression. Predicted DMI at 70 d postcalving was used as the phenotype for the subsequent genetic analyses undertaken using an animal repeatability model. The heritability estimate of predicted cow feed intake at 70 d postcalving was 0.34 across the entire data set and varied, within population, from 0.08 to 0.52. Repeatability of feed intake across lactations was 0.66. Heritability of feed intake in the growing heifers was 0.20 to 0.34 in the 2 populations with heifer data. The genetic correlation between feed intake in lactating cows and growing heifers was 0.67.
A combined pedigree and genomic relationship matrix was used to improve linkages between populations for the estimation of genetic correlations of DMI in lactating cows; genotype information was available on 5,429 of the animals. Populations were categorized as North America, grazing, other low input, and high input European Union. Albeit associated with large standard errors, genetic correlation estimates for DMI between populations varied from 0.14 to 0.84 but were stronger (0.76 to 0.84) between the populations representative of high-input production systems. Genetic correlations with the grazing populations were weak to moderate, varying from 0.14 to 0.57. Genetic evaluations for DMI can be undertaken using data collated from international populations; however, genotype-by-environment interactions with grazing production systems need to be considered.
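The heritability (0.34) and repeatability (0.66) quoted above can be related through the usual variance decomposition of an animal repeatability model. The following is a sketch under assumptions: the symbols $\sigma_a^2$ (additive genetic), $\sigma_{pe}^2$ (permanent environmental), $\sigma_e^2$ (residual) and $\sigma_p^2$ (phenotypic) are not taken from the abstract, and any further random effects the authors may have fitted are ignored.

```latex
\begin{align*}
\sigma_p^2 &= \sigma_a^2 + \sigma_{pe}^2 + \sigma_e^2, \\
h^2 &= \frac{\sigma_a^2}{\sigma_p^2} = 0.34, \qquad
r = \frac{\sigma_a^2 + \sigma_{pe}^2}{\sigma_p^2} = 0.66, \\
\frac{\sigma_{pe}^2}{\sigma_p^2} &= r - h^2 = 0.66 - 0.34 = 0.32 .
\end{align*}
```

Read this way, roughly a third of the phenotypic variance in predicted intake would be attributable to permanent, non-additive differences between cows; this is an implied value under the stated assumptions, not a figure reported in the abstract.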
Affiliation(s)
- D P Berry
- Animal & Grassland Research and Innovation Centre, Teagasc, Moorepark, Co. Cork, Ireland.
- M P Coffey
- Animal and Veterinary Sciences, Scotland's Rural College (SRUC), Easter Bush Campus, Midlothian EH25 9RG, United Kingdom
- J E Pryce
- Department of Environment and Primary Industries & Dairy Futures Cooperative Research Centre (CRC), Agribio, 5 Ring Road, La Trobe University, Bundoora 3083, Australia
- Y de Haas
- Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, 8200 AB Lelystad, the Netherlands
- P Løvendahl
- Department of Molecular Biology and Genetics, Aarhus University, DK-8830 Tjele, Denmark
- N Krattenmacher
- Institute of Animal Breeding and Husbandry, Christian-Albrechts-University, D-24118 Kiel, Germany
- J J Crowley
- Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, Alberta T6G 2P5, Canada
- Z Wang
- Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, Alberta T6G 2P5, Canada
- D Spurlock
- Department of Animal Science, Iowa State University, Ames 50011
- K Weigel
- Department of Dairy Science, University of Wisconsin, Madison 53706
- K Macdonald
- DairyNZ, Private Bag 3221, Hamilton 3248, New Zealand
- R F Veerkamp
- Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, 8200 AB Lelystad, the Netherlands
|
169
|
Christensen OF, Madsen P, Nielsen B, Su G. Genomic evaluation of both purebred and crossbred performances. Genet Sel Evol 2014; 46:23. [PMID: 24666469 PMCID: PMC3994295 DOI: 10.1186/1297-9686-46-23] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Received: 07/05/2013] [Accepted: 02/24/2014] [Indexed: 11/14/2022] Open
Abstract
Background For a two-breed crossbreeding system, Wei and van der Werf presented a model for genetic evaluation using information from both purebred and crossbred animals. The model provides breeding values for both purebred and crossbred performances. Genomic evaluation incorporates marker genotypes into a genetic evaluation system. Among popular methods are the so-called single-step methods, in which marker genotypes are incorporated into a traditional animal model by using a combined relationship matrix that extends the marker-based relationship matrix to non-genotyped animals. However, a single-step method for genomic evaluation of both purebred and crossbred performances has not yet been developed. Results An extension of the Wei and van der Werf model that incorporates genomic information is presented. The extension consists of four steps: (1) the Wei and van der Werf model is reformulated using two partial relationship matrices for the two breeds; (2) marker-based partial relationship matrices are constructed; (3) marker-based partial relationship matrices are adjusted to be compatible with pedigree-based partial relationship matrices; and (4) combined partial relationship matrices are constructed using information from both pedigree and marker genotypes. The extension of the Wei and van der Werf model can be implemented using software that allows inverse covariance matrices in sparse format as input. Conclusions A method for genomic evaluation of both purebred and crossbred performances was developed for a two-breed crossbreeding system. The method allows information from crossbred animals to be incorporated in a coherent manner for such crossbreeding systems.
Affiliation(s)
- Ole F Christensen
- Aarhus University, Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Blichers Allé 20, P.O. Box 50, DK-8830 Tjele, Denmark.
|
170
|
Santure AW, De Cauwer I, Robinson MR, Poissant J, Sheldon BC, Slate J. Genomic dissection of variation in clutch size and egg mass in a wild great tit (Parus major) population. Mol Ecol 2014; 22:3949-62. [PMID: 23889544 DOI: 10.1111/mec.12376] [Citation(s) in RCA: 88] [Impact Index Per Article: 8.8] [Received: 12/06/2012] [Revised: 02/10/2013] [Accepted: 02/20/2013] [Indexed: 01/01/2023]
Abstract
Clutch size and egg mass are life history traits that have been extensively studied in wild bird populations, as life history theory predicts a negative trade-off between them, either at the phenotypic or at the genetic level. Here, we analyse the genomic architecture of these heritable traits in a wild great tit (Parus major) population, using three marker-based approaches: chromosome partitioning, quantitative trait locus (QTL) mapping and a genome-wide association study (GWAS). The variance explained by each great tit chromosome scales with predicted chromosome size, no location in the genome contains genome-wide significant QTL, and no individual SNPs are associated with a large proportion of phenotypic variation, all of which may suggest that variation in both traits is due to many loci of small effect, located across the genome. There is no evidence that any regions of the genome contribute significantly to both traits, which, combined with a small, nonsignificant negative genetic covariance between the traits, suggests the absence of genetic constraints on the independent evolution of these traits. Our findings support the hypothesis that variation in life history traits in natural populations is likely to be determined by many loci of small effect spread throughout the genome, which are subject to continued input of variation by mutation and migration, although we cannot exclude the possibility of an additional input of major effect genes influencing either trait.
Affiliation(s)
- Anna W Santure
- Department of Animal and Plant Sciences, University of Sheffield, Sheffield, S10 2TN, UK.
|
171
|
Marker-based estimates of relatedness and inbreeding coefficients: an assessment of current methods. J Evol Biol 2014; 27:518-30. [DOI: 10.1111/jeb.12315] [Citation(s) in RCA: 108] [Impact Index Per Article: 10.8] [Received: 07/20/2013] [Revised: 12/03/2013] [Accepted: 12/05/2013] [Indexed: 12/17/2022]
|
172
|
Li Y, Guo G. Data quality control in social surveys using genetic information. Biodemography and Social Biology 2014; 60:212-228. [PMID: 25343368 PMCID: PMC6642059 DOI: 10.1080/19485565.2014.953029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2023]
Abstract
This article introduces a novel way of taking advantage of genetic data in social surveys for the purposes of data quality control. Genetic information could detect and repair data issues such as missing data, reporting errors, differences in measures of the same variable, and flawed data. Using data from two surveys, the College Roommate Study (ROOM) and the National Longitudinal Study of Adolescent Health (Add Health), we show that proportion identical by descent score (a measure of genetic relationships) can identify "misreported" and unreported sibling type and detect misrepresented participants, bio-ancestry score (a measure of ancestral population memberships) can repair and recover missing race and discrepancies among different measures of self-reported race, and sex chromosomal information may help cross-check self-reported sex. This article represents an initial effort to utilize genetic data for the purposes of data quality control. As genetic data become increasingly available, researchers may explore more approaches to improving data quality.
Affiliation(s)
- Yi Li
- Department of Sociology, University of North Carolina at Chapel Hill
- Guang Guo
- Department of Sociology, University of North Carolina at Chapel Hill
- Carolina Population Center, University of North Carolina at Chapel Hill
- Carolina Center for Genome Sciences, University of North Carolina at Chapel Hill
|
173
|
Hill WG. Applications of population genetics to animal breeding, from Wright, Fisher and Lush to genomic prediction. Genetics 2014; 196:1-16. [PMID: 24395822 PMCID: PMC3872177 DOI: 10.1534/genetics.112.147850] [Citation(s) in RCA: 84] [Impact Index Per Article: 8.4] [Received: 08/07/2013] [Accepted: 10/18/2013] [Indexed: 11/18/2022] Open
Abstract
Although animal breeding was practiced long before the science of genetics and the relevant disciplines of population and quantitative genetics were known, breeding programs have mainly relied on simply selecting and mating the best individuals on their own or relatives' performance. This is based on sound quantitative genetic principles, developed and expounded by Lush, who attributed much of his understanding to Wright, and formalized in Fisher's infinitesimal model. Analysis at the level of individual loci and gene frequency distributions has had relatively little impact. Now with access to genomic data, a revolution in which molecular information is being used to enhance response with "genomic selection" is occurring. The predictions of breeding value still utilize multiple loci throughout the genome and, indeed, are largely compatible with additive and specifically infinitesimal model assumptions. I discuss some of the history and genetic issues as applied to the science of livestock improvement, which has had and continues to have major spin-offs into ideas and applications in other areas.
Affiliation(s)
- William G. Hill
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh EH9 3JT, United Kingdom
|
174
|
Rothammer S, Seichter D, Förster M, Medugorac I. A genome-wide scan for signatures of differential artificial selection in ten cattle breeds. BMC Genomics 2013; 14:908. [PMID: 24359457 PMCID: PMC3878089 DOI: 10.1186/1471-2164-14-908] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.2] [Received: 01/23/2013] [Accepted: 12/16/2013] [Indexed: 11/11/2022] Open
Abstract
Background Since the times of domestication, cattle have been continually shaped by the influence of humans. Relatively recent history, including breed formation and the still enduring enormous improvement of economically important traits, is expected to have left distinctive footprints of selection within the genome. The purpose of this study was to map genome-wide selection signatures in ten cattle breeds and thus improve the understanding of the genome response to strong artificial selection and support the identification of the underlying genetic variants of favoured phenotypes. We analysed 47,651 single nucleotide polymorphisms (SNP) using Cross Population Extended Haplotype Homozygosity (XP-EHH). Results We set the significance thresholds using the maximum XP-EHH values of two essentially artificially unselected breeds and found up to 229 selection signatures per breed. Through a confirmation process we verified selection for three distinct phenotypes typical for one breed (polledness in Galloway, double muscling in Blanc-Bleu Belge and red coat colour in Red Holstein cattle). Moreover, we detected six genes strongly associated with known QTL for beef or dairy traits (TG, ABCG2, DGAT1, GH1, GHR and the Casein Cluster) within selection signatures of at least one breed. A literature search for genes lying in outstanding signatures revealed further promising candidate genes. However, in concordance with previous genome-wide studies, we also detected a substantial number of signatures without any yet known gene content. Conclusions These results show the power of XP-EHH analyses in cattle to discover promising candidate genes and raise the hope of identifying phenotypically important variants in the near future. The finding of plausible functional candidates in some short signatures supports this hope. 
For instance, MAP2K6 is the only annotated gene of two signatures detected in Galloway and Gelbvieh cattle and is already known to be associated with carcass weight, back fat thickness and marbling score in Korean beef cattle. Based on the confirmation process and literature search we deduce that XP-EHH is able to uncover numerous artificial selection targets in subpopulations of domesticated animals.
Affiliation(s)
- Ivica Medugorac
- Chair of Animal Genetics and Husbandry, Ludwig-Maximilians-University Munich, Veterinärstr. 13, 80539 Munich, Germany.
|
175
|
Rowe SJ, Rowlatt A, Davies G, Harris SE, Porteous DJ, Liewald DC, McNeill G, Starr JM, Deary IJ, Tenesa A. Complex variation in measures of general intelligence and cognitive change. PLoS One 2013; 8:e81189. [PMID: 24349040 PMCID: PMC3865348 DOI: 10.1371/journal.pone.0081189] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 02/13/2013] [Accepted: 10/20/2013] [Indexed: 11/18/2022] Open
Abstract
Combining information from multiple SNPs may capture a greater amount of genetic variation than the sum of individual SNP effects and help identify missing heritability. Regions may capture variation from multiple common variants of small effect, multiple rare variants or a combination of both. We describe regional heritability mapping of human cognition. Measures of crystallised (gc) and fluid intelligence (gf) in late adulthood (64-79 years) were available for 1806 individuals genotyped for 549,692 autosomal single nucleotide polymorphisms (SNPs). The same individuals were tested at age 11, giving us the rare opportunity to measure cognitive change across most of their lifespan. 547,750 SNPs ranked by position are divided into 10,908 overlapping regions of 101 SNPs to estimate the genetic variance each region explains, an approach that resembles classical linkage methods. We also estimate the genetic variation explained by individual autosomes and by SNPs within genes. Empirical significance thresholds are estimated separately for each trait from whole genome scans of 500 permuted data sets. The 5% significance threshold for the likelihood ratio test of a single region ranged from 17 to 17.5 for the three traits. This is equivalent to a nominal significance of P < 1.44 × 10^-5 under the expectation of a chi-squared distribution (a mixture of 1 df and 0 df). These thresholds indicate that the distribution of the likelihood ratio test from this type of variance component analysis should be estimated empirically. Furthermore, we show that estimates of variation explained by these regions can be grossly overestimated. After applying permutation thresholds, a region for gf on chromosome 5 spanning the PRRC1 gene is significant at a genome-wide 10% empirical threshold.
Analysis of gene methylation on the temporal cortex provides support for the association of PRRC1 and fluid intelligence (P = 0.004), and provides a prime candidate gene for high throughput sequencing of these uniquely informative cohorts.
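The nominal P-value quoted in the abstract above for a likelihood ratio statistic of about 17.5 can be checked with a short calculation. The sketch below assumes the reference distribution is an equal-weight mixture of chi-squared distributions with 0 and 1 degrees of freedom, the usual asymptotic result when a single variance component is tested at the boundary of its parameter space; the function name is hypothetical and this is not the authors' code.

```python
import math


def mixture_lrt_pvalue(lrt):
    """P-value of a likelihood ratio statistic under an assumed 50:50
    mixture of chi-squared(0 df) and chi-squared(1 df)."""
    if lrt <= 0:
        # The point mass at zero means any statistic <= 0 is uninformative.
        return 1.0
    # Upper tail of chi-squared with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    chi2_1df_tail = math.erfc(math.sqrt(lrt / 2.0))
    # The 0 df component contributes nothing to the tail for lrt > 0.
    return 0.5 * chi2_1df_tail


if __name__ == "__main__":
    for stat in (17.0, 17.5):
        print(f"LRT = {stat}: nominal P = {mixture_lrt_pvalue(stat):.3g}")
```

Under this assumption, a statistic of 17.5 gives a nominal P on the order of 10^-5, in line with the value stated in the abstract.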
Affiliation(s)
- Suzanne J. Rowe
- The Roslin Institute, The University of Edinburgh, Roslin, Scotland, United Kingdom
- Amy Rowlatt
- The Roslin Institute, The University of Edinburgh, Roslin, Scotland, United Kingdom
- Gail Davies
- Centre for Cognitive Ageing and Cognitive Epidemiology, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Sarah E. Harris
- Centre for Cognitive Ageing and Cognitive Epidemiology, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Medical Genetics Section, The University of Edinburgh, Edinburgh, Scotland, United Kingdom
- David J. Porteous
- Centre for Cognitive Ageing and Cognitive Epidemiology, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Medical Genetics Section, The University of Edinburgh, Edinburgh, Scotland, United Kingdom
- David C. Liewald
- Centre for Cognitive Ageing and Cognitive Epidemiology, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Geraldine McNeill
- Institute of Applied Health Sciences, University of Aberdeen, Aberdeen, Scotland, United Kingdom
- John M. Starr
- Centre for Cognitive Ageing and Cognitive Epidemiology, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Alzheimer Scotland Dementia Research Centre, The University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Ian J. Deary
- Centre for Cognitive Ageing and Cognitive Epidemiology, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Department of Psychology, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Albert Tenesa
- The Roslin Institute, The University of Edinburgh, Roslin, Scotland, United Kingdom
- Medical Research Council Human Genetics Unit at the Medical Research Council Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, Scotland, United Kingdom
|
176
|
Stanton-Geddes J, Yoder JB, Briskine R, Young ND, Tiffin P. Estimating heritability using genomic data. Methods Ecol Evol 2013. [DOI: 10.1111/2041-210x.12129] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Indexed: 12/25/2022]
Affiliation(s)
- John Stanton-Geddes
- Department of Plant Biology; University of Minnesota; Saint Paul MN 55108 USA
- Jeremy B. Yoder
- Department of Plant Biology; University of Minnesota; Saint Paul MN 55108 USA
- Roman Briskine
- Department of Computer Science and Engineering; University of Minnesota; Minneapolis MN 55455 USA
- Nevin D. Young
- Department of Plant Biology; University of Minnesota; Saint Paul MN 55108 USA
- Department of Plant Pathology; University of Minnesota; Saint Paul MN 55108 USA
- Peter Tiffin
- Department of Plant Biology; University of Minnesota; Saint Paul MN 55108 USA
|
177
|
Ferenčaković M, Sölkner J, Curik I. Estimating autozygosity from high-throughput information: effects of SNP density and genotyping errors. Genet Sel Evol 2013; 45:42. [PMID: 24168655 PMCID: PMC4176748 DOI: 10.1186/1297-9686-45-42] [Citation(s) in RCA: 166] [Impact Index Per Article: 15.1] [Received: 04/18/2013] [Accepted: 10/13/2013] [Indexed: 12/26/2022] Open
Abstract
BACKGROUND Runs of homozygosity are long, uninterrupted stretches of homozygous genotypes that enable reliable estimation of levels of inbreeding (i.e., autozygosity) based on high-throughput, chip-based single nucleotide polymorphism (SNP) genotypes. While the theoretical definition of runs of homozygosity is straightforward, their empirical identification depends on the type of SNP chip used to obtain the data and on a number of factors, including the number of heterozygous calls allowed to account for genotyping errors. We analyzed how SNP chip density and genotyping errors affect estimates of autozygosity based on runs of homozygosity in three cattle populations, using genotype data from an SNP chip with 777,972 SNPs and a 50 k chip. RESULTS Data from the 50 k chip led to overestimation of the number of runs of homozygosity that are shorter than 4 Mb, since the analysis could not identify heterozygous SNPs that were present on the denser chip. Conversely, data from the denser chip led to underestimation of the number of runs of homozygosity that were longer than 8 Mb, unless the presence of a small number of heterozygous SNP genotypes was allowed within a run of homozygosity. CONCLUSIONS We have shown that SNP chip density and genotyping errors introduce patterns of bias in the estimation of autozygosity based on runs of homozygosity. SNP chips with 50,000 to 60,000 markers are frequently available for livestock species and their information leads to a conservative prediction of autozygosity from runs of homozygosity longer than 4 Mb. Not allowing heterozygous SNP genotypes to be present in a homozygosity run, as has been advocated for human populations, is not adequate for livestock populations because they have much higher levels of autozygosity and therefore longer runs of homozygosity. 
When allowing a small number of heterozygous calls, current software does not differentiate between situations where these calls are adjacent and therefore indicative of an actual break of the run versus those where they are scattered across the length of the homozygous segment. Simple graphical tests that are used in this paper are a current, yet tedious solution.
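The effect of tolerating heterozygous calls inside a run, as discussed in the abstract above, can be illustrated with a toy scan. This is a minimal sketch and not the software evaluated in the paper: the function names, the greedy left-to-right scan, and the parameters `max_het`, `min_snps` and `min_length_bp` are all assumptions made for illustration. Like the tools the authors criticise, it does not distinguish adjacent heterozygous calls from scattered ones.

```python
def find_roh(positions, is_het, max_het=0, min_snps=3, min_length_bp=0):
    """Return (start_bp, end_bp, n_snps) for each run of homozygosity.

    positions: sorted SNP coordinates (bp) on one chromosome of one animal.
    is_het:    parallel booleans, True where the genotype is heterozygous.
    max_het:   heterozygous calls tolerated inside a run (0 = strict).
    """
    runs = []
    n = len(positions)
    i = 0
    while i < n:
        if is_het[i]:
            i += 1  # a run must start on a homozygous call
            continue
        hets = 0
        last_hom = i
        k = i + 1
        while k < n:
            if is_het[k]:
                if hets == max_het:
                    break  # one heterozygous call too many: the run ends
                hets += 1
            else:
                last_hom = k
            k += 1
        # Trailing heterozygous calls are trimmed: the run ends at last_hom.
        n_snps = last_hom - i + 1
        length_bp = positions[last_hom] - positions[i]
        if n_snps >= min_snps and length_bp >= min_length_bp:
            runs.append((positions[i], positions[last_hom], n_snps))
        i = last_hom + 1
    return runs


def f_roh(runs, genome_length_bp):
    """Autozygosity estimate: fraction of the genome covered by runs."""
    return sum(end - start for start, end, _ in runs) / genome_length_bp


# Toy data: ten SNPs spaced 1 Mb apart, with three heterozygous calls.
positions = [i * 1_000_000 for i in range(1, 11)]
is_het = [False, False, False, True, False, False, False, True, True, False]
strict = find_roh(positions, is_het, max_het=0)
lenient = find_roh(positions, is_het, max_het=1)
```

In the toy data, the strict setting splits the segment into two short runs, whereas allowing a single heterozygous call merges them into one longer run, which mirrors the bias pattern the abstract describes for long runs on dense chips.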
Affiliation(s)
- Maja Ferenčaković
- Department of Sustainable Agricultural Systems, Division of Livestock Sciences, University of Natural Resources and Life Sciences Vienna, Gregor Mendel Str. 33, A-1180 Vienna, Austria.
|
178
|
Gauvin H, Moreau C, Lefebvre JF, Laprise C, Vézina H, Labuda D, Roy-Gagnon MH. Genome-wide patterns of identity-by-descent sharing in the French Canadian founder population. Eur J Hum Genet 2013; 22:814-21. [PMID: 24129432 PMCID: PMC4023206 DOI: 10.1038/ejhg.2013.227] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 03/25/2013] [Revised: 08/07/2013] [Accepted: 09/04/2013] [Indexed: 12/16/2022] Open
Abstract
In genetics the ability to accurately describe the familial relationships among a group of individuals can be very useful. Recent statistical tools succeeded in assessing the degree of relatedness up to 6-7 generations with good power using dense genome-wide single-nucleotide polymorphism data to estimate the extent of identity-by-descent (IBD) sharing. It is therefore important to describe genome-wide patterns of IBD sharing for more remote and complex relatedness between individuals, such as that observed in a founder population like Quebec, Canada. Taking advantage of the extended genealogical records of the French Canadian founder population, we first compared different tools to identify regions of IBD in order to best describe genome-wide IBD sharing and its correlation with genealogical characteristics. Results showed that the extent of IBD sharing identified with FastIBD correlates best with relatedness measured using genealogical data. Total length of IBD sharing explained 85% of the genealogical kinship's variance. In addition, we observed significantly higher sharing in pairs of individuals with at least one inbred ancestor compared with those without any. Furthermore, patterns of IBD sharing and average sharing were different across regional populations, consistent with the settlement history of Quebec. Our results suggest that, as expected, the complex relatedness present in founder populations is reflected in patterns of IBD sharing. Using these patterns, it is thus possible to gain insight on the types of distant relationships in a sample from a founder population like Quebec.
Affiliation(s)
- Héloïse Gauvin
- [1] Département de médecine sociale et préventive, Université de Montréal, Montréal, Québec, Canada [2] Centre de recherche, Centre hospitalier universitaire Sainte-Justine, Université de Montréal, Montréal, Québec, Canada
- Claudia Moreau
- Centre de recherche, Centre hospitalier universitaire Sainte-Justine, Université de Montréal, Montréal, Québec, Canada
- Jean-François Lefebvre
- Centre de recherche, Centre hospitalier universitaire Sainte-Justine, Université de Montréal, Montréal, Québec, Canada
- Catherine Laprise
- Département des sciences fondamentales, Université du Québec à Chicoutimi, Chicoutimi, Québec, Canada
- Hélène Vézina
- Département des sciences humaines, Université du Québec à Chicoutimi, Chicoutimi, Québec, Canada
- Damian Labuda
- [1] Centre de recherche, Centre hospitalier universitaire Sainte-Justine, Université de Montréal, Montréal, Québec, Canada [2] Département de pédiatrie, Université de Montréal, Montréal, Québec, Canada
- Marie-Hélène Roy-Gagnon
- [1] Centre de recherche, Centre hospitalier universitaire Sainte-Justine, Université de Montréal, Montréal, Québec, Canada [2] Department of Epidemiology and Community Medicine, University of Ottawa, Ottawa, Ontario, Canada
|
179
|
Glodzik D, Navarro P, Vitart V, Hayward C, McQuillan R, Wild SH, Dunlop MG, Rudan I, Campbell H, Haley C, Wright AF, Wilson JF, McKeigue P. Inference of identity by descent in population isolates and optimal sequencing studies. Eur J Hum Genet 2013; 21:1140-5. [PMID: 23361219 PMCID: PMC3778345 DOI: 10.1038/ejhg.2012.307] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 08/01/2012] [Revised: 12/18/2012] [Accepted: 12/28/2012] [Indexed: 01/24/2023] Open
Abstract
In an isolated population, individuals are likely to share large genetic regions inherited from common ancestors. Identity by descent (IBD) can be inferred from SNP genotypes, which is useful in a number of applications, including identifying genetic variants influencing complex disease risk, and planning efficient cohort-sequencing strategies. We present ANCHAP, a method for detecting IBD in isolated populations. We compare the accuracy of the method against other long-range and local phasing methods, using parent-offspring trios. In our experiments, we show that ANCHAP performs similarly to the other long-range method, but requires an order of magnitude fewer computational resources. A local phasing model is able to achieve similar sensitivity, but only at the cost of higher false discovery rates. In some regions of the genome, the studied individuals share haplotypes particularly often, which hints at the history of the populations studied. We demonstrate the method using SNP genotypes from three isolated island populations, as well as in a cohort of unrelated individuals. In samples from three isolated populations of around 1000 individuals each, an average individual shares a haplotype at a genetic locus with 9-12 other individuals, compared with only 1 individual within the non-isolated population. We describe an application of ANCHAP to optimally choose samples in resequencing studies. We find that with sample sizes of 1000 individuals from an isolated population genotyped using a dense SNP array, and with 20% of these individuals sequenced, 65% of sequences of the unsequenced subjects can be partially inferred.
Affiliation(s)
- Dominik Glodzik
- MRC Institute of Genetics and Molecular Medicine (MRC IGMM), MRC Human Genetics Unit, University of Edinburgh, Western General Hospital, Edinburgh, UK
- Pau Navarro
- MRC Institute of Genetics and Molecular Medicine (MRC IGMM), MRC Human Genetics Unit, University of Edinburgh, Western General Hospital, Edinburgh, UK
- Veronique Vitart
- MRC Institute of Genetics and Molecular Medicine (MRC IGMM), MRC Human Genetics Unit, University of Edinburgh, Western General Hospital, Edinburgh, UK
- Caroline Hayward
- MRC Institute of Genetics and Molecular Medicine (MRC IGMM), MRC Human Genetics Unit, University of Edinburgh, Western General Hospital, Edinburgh, UK
- Ruth McQuillan
- College of Medicine and Veterinary Medicine, Centre for Population Health Sciences, University of Edinburgh, Edinburgh, UK
- Sarah H Wild
- College of Medicine and Veterinary Medicine, Centre for Population Health Sciences, University of Edinburgh, Edinburgh, UK
- Malcolm G Dunlop
- MRC Institute of Genetics and Molecular Medicine (MRC IGMM), MRC Human Genetics Unit, University of Edinburgh, Western General Hospital, Edinburgh, UK
- Igor Rudan
- MRC Institute of Genetics and Molecular Medicine (MRC IGMM), MRC Human Genetics Unit, University of Edinburgh, Western General Hospital, Edinburgh, UK
- Harry Campbell
- MRC Institute of Genetics and Molecular Medicine (MRC IGMM), MRC Human Genetics Unit, University of Edinburgh, Western General Hospital, Edinburgh, UK
- Chris Haley
- MRC Institute of Genetics and Molecular Medicine (MRC IGMM), MRC Human Genetics Unit, University of Edinburgh, Western General Hospital, Edinburgh, UK
- Alan F Wright
- MRC Institute of Genetics and Molecular Medicine (MRC IGMM), MRC Human Genetics Unit, University of Edinburgh, Western General Hospital, Edinburgh, UK
- James F Wilson
- College of Medicine and Veterinary Medicine, Centre for Population Health Sciences, University of Edinburgh, Edinburgh, UK
- Paul McKeigue
- College of Medicine and Veterinary Medicine, Centre for Population Health Sciences, University of Edinburgh, Edinburgh, UK
|
180
|
Lopes MS, Silva FF, Harlizius B, Duijvesteijn N, Lopes PS, Guimarães SE, Knol EF. Improved estimation of inbreeding and kinship in pigs using optimized SNP panels. BMC Genet 2013; 14:92. [PMID: 24063757 PMCID: PMC3849284 DOI: 10.1186/1471-2156-14-92] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Received: 11/16/2012] [Accepted: 09/19/2013] [Indexed: 01/14/2023] Open
Abstract
Background Traditional breeding programs consider an average pairwise kinship between sibs. Based on pedigree information, the relationship matrix is used for genetic evaluations disregarding variation due to Mendelian sampling. Therefore, inbreeding and kinship coefficients are either over- or underestimated, resulting in a reduction of the accuracy of genetic evaluations and of genetic progress. Single nucleotide polymorphisms (SNPs) can be used to estimate pairwise kinship and individual inbreeding more accurately. The aim of this study was to optimize the selection of markers and determine the required number of SNPs for estimation of kinship and inbreeding. Results A total of 1,565 animals from three commercial pig populations were analyzed for 28,740 SNPs from the PorcineSNP60 Beadchip. Mean genomic inbreeding was higher than pedigree-based estimates in lines 2 and 3, but lower in line 1. As expected, a larger variation of genomic kinship estimates was observed for half and full sibs than for pedigree-based kinship, reflecting Mendelian sampling. Genomic kinship between father-offspring pairs was lower (0.23) than the estimate based on pedigree (0.26). Bootstrap analyses using six reduced SNP panels (n = 500, 1000, 1500, 2000, 2500 and 3000) showed that 2,000 SNPs were able to reproduce results very close to those obtained using the full set of unlinked markers (n = 7,984-10,235) with high correlations (inbreeding r > 0.82 and kinship r > 0.96) and low variation between different sets with the same number of SNPs. Conclusions Variation of kinship between sibs due to Mendelian sampling is better captured using genomic information than the pedigree-based method. Therefore, the reduced sets of SNPs could generate more accurate kinship coefficients between sibs than the pedigree-based method. Variation of genomic kinship of father-offspring pairs is recommended as a parameter to determine accuracy of the method rather than correlation with pedigree-based estimates.
Inbreeding and kinship coefficients can be estimated with high accuracy using ≥2,000 unlinked SNPs within all three commercial pig lines evaluated. However, a larger number of SNPs might be necessary in other populations or across lines.
Affiliation(s)
- Marcos S Lopes
- TOPIGS Research Center IPG B.V., P.O. Box 43, 6640 AA Beuningen, the Netherlands.
|
181
|
Mulder HA, Crump RE, Calus MPL, Veerkamp RF. Unraveling the genetic architecture of environmental variance of somatic cell score using high-density single nucleotide polymorphism and cow data from experimental farms. J Dairy Sci 2013; 96:7306-7317. [PMID: 24035025 DOI: 10.3168/jds.2013-6818] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 03/15/2013] [Accepted: 07/18/2013] [Indexed: 11/19/2022]
Abstract
In recent years, it has been shown that not only is the phenotype under genetic control, but also the environmental variance. Very little, however, is known about the genetic architecture of environmental variance. The main objective of this study was to unravel the genetic architecture of the mean and environmental variance of somatic cell score (SCS) by identifying genome-wide associations for mean and environmental variance of SCS in dairy cows and by quantifying the accuracy of genome-wide breeding values. Somatic cell score was used because previous research has shown that the environmental variance of SCS is partly under genetic control and reduction of the variance of SCS by selection is desirable. In this study, we used 37,590 single nucleotide polymorphism (SNP) genotypes and 46,353 test-day records of 1,642 cows at experimental research farms in 4 countries in Europe. We used a genomic relationship matrix in a double hierarchical generalized linear model to estimate genome-wide breeding values and genetic parameters. The estimated mean and environmental variance per cow were used in a Bayesian multi-locus model to identify SNP associated with either the mean or the environmental variance of SCS. Based on the obtained accuracy of genome-wide breeding values, 985 and 541 independent chromosome segments affecting the mean and environmental variance of SCS, respectively, were identified. Using a genomic relationship matrix increased the accuracy of breeding values relative to using a pedigree relationship matrix. In total, 43 SNP were significantly associated with either the mean (22) or the environmental variance of SCS (21). The SNP with the highest Bayes factor was on chromosome 9 (Hapmap31053-BTA-111664), explaining approximately 3% of the genetic variance of the environmental variance of SCS. Other significant SNP explained less than 1% of the genetic variance. 
It can be concluded that fewer genomic regions affect the environmental variance of SCS than the mean of SCS, but genes with large effects seem to be absent for both traits.
Affiliation(s)
- H A Mulder
- Animal Breeding and Genomics Centre, Wageningen University, PO Box 338, 6700 AH Wageningen, the Netherlands; Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, PO Box 65, 8200 AB Lelystad, the Netherlands.
- R E Crump
- Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, PO Box 65, 8200 AB Lelystad, the Netherlands
- M P L Calus
- Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, PO Box 65, 8200 AB Lelystad, the Netherlands
- R F Veerkamp
- Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, PO Box 65, 8200 AB Lelystad, the Netherlands
|
182
|
Fast genomic predictions via Bayesian G-BLUP and multilocus models of threshold traits including censored Gaussian data. G3-GENES GENOMES GENETICS 2013; 3:1511-23. [PMID: 23821618 PMCID: PMC3755911 DOI: 10.1534/g3.113.007096] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 01/19/2023]
Abstract
Because of the increased availability of genome-wide sets of molecular markers along with reduced cost of genotyping large samples of individuals, genomic estimated breeding values have become an essential resource in plant and animal breeding. Bayesian methods for breeding value estimation have proven to be accurate and efficient; however, the ever-increasing data sets are placing heavy demands on the parameter estimation algorithms. Although a commendable number of fast estimation algorithms are available for Bayesian models of continuous Gaussian traits, there is a shortage of corresponding models for discrete or censored phenotypes. In this work, we consider a threshold approach for binary, ordinal, and censored Gaussian observations in Bayesian multilocus association models and Bayesian genomic best linear unbiased prediction, and present a high-speed generalized expectation maximization algorithm for parameter estimation under these models. We demonstrate our method with simulated and real data. Our example analyses suggest that the use of the extra information present in an ordered categorical or censored Gaussian data set, instead of dichotomizing the data into case-control observations, increases the accuracy of genomic breeding values predicted by Bayesian multilocus association models or by Bayesian genomic best linear unbiased prediction. Furthermore, the example analyses indicate that the correct threshold model is more accurate than the directly used Gaussian model with censored Gaussian data, while with binary or ordinal data the superiority of the threshold model could not be confirmed.
|
183
|
Wang K, Hu X, Peng Y. An analytical comparison of the principal component method and the mixed effects model for association studies in the presence of cryptic relatedness and population stratification. Hum Hered 2013; 76:1-9. [PMID: 23921716 DOI: 10.1159/000353345] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 11/19/2022] Open
Abstract
The principal component method and the mixed effects model represent two popular approaches to controlling for population structure and cryptic relatedness in genetic association studies. There are only a handful of studies comparing their performance. These comparisons are typically based on simulations, and the results are therefore limited in their applicability. In this paper, we conduct an analytical comparison of these two approaches in the presence of cryptic relatedness and population structure in terms of their validity and efficiency. In the presence of cryptic relatedness, we show that both methods are valid, but the mixed effects model is more powerful for detecting association. In the presence of population structure, however, we show that both methods can be invalid. The biases and variances of the estimates from the two methods are compared. Examples and simulation studies are provided to demonstrate the conclusions.
|
184
|
Robinson MR, Santure AW, Decauwer I, Sheldon BC, Slate J. Partitioning of genetic variation across the genome using multimarker methods in a wild bird population. Mol Ecol 2013; 22:3963-80. [PMID: 23848161 DOI: 10.1111/mec.12375] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.5] [Received: 01/21/2013] [Revised: 04/11/2013] [Accepted: 04/11/2013] [Indexed: 01/09/2023]
Abstract
The underlying basis of genetic variation in quantitative traits, in terms of the number of causal variants and the size of their effects, is largely unknown in natural populations. The expectation is that complex quantitative trait variation is attributable to many, possibly interacting, causal variants, whose effects may depend upon the sex, age and the environment in which they are expressed. A recently developed methodology in animal breeding derives a value of relatedness among individuals from high-density genomic marker data, to estimate additive genetic variance within livestock populations. Here, we adapt and test the effectiveness of these methods to partition genetic variation for complex traits across genomic regions within ecological study populations where individuals have varying degrees of relatedness. We then apply this approach for the first time to a natural population and demonstrate that genetic variation in wing length in the great tit (Parus major) reflects contributions from multiple genomic regions. We show that a polygenic additive mode of gene action best describes the patterns observed, and we find no evidence of dosage compensation for the sex chromosome. Our results suggest that most of the genomic regions that influence wing length have the same effects in both sexes. We found a limited amount of genetic variance in males that is attributed to regions that have no effects in females, which could facilitate the sexual dimorphism observed for this trait. Although this exploratory work focuses on one complex trait, the methodology is generally applicable to any trait for any laboratory or wild population, paving the way for investigating sex-, age- and environment-specific genetic effects and thus the underlying genetic architecture of phenotype in biological study systems.
Affiliation(s)
- Matthew R Robinson
- Department of Animal and Plant Science, University of Sheffield, Western Bank, Sheffield, S10 2TN, UK.
|
185
|
Sverdlov S, Thompson EA. Correlation between relatives given complete genotypes: from identity by descent to identity by function. Theor Popul Biol 2013; 88:57-67. [PMID: 23851163 DOI: 10.1016/j.tpb.2013.06.004] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/27/2012] [Revised: 04/18/2013] [Accepted: 06/12/2013] [Indexed: 02/06/2023]
Abstract
In classical quantitative genetics, the correlation between the phenotypes of individuals with unknown genotypes and a known pedigree relationship is expressed in terms of probabilities of IBD states. In existing approaches to the inverse problem where genotypes are observed but pedigree relationships are not, dependence between phenotypes is either modeled as Bayesian uncertainty or mapped to an IBD model via inferred relatedness parameters. Neither approach yields a relationship between genotypic similarity and phenotypic similarity with a probabilistic interpretation corresponding to a generative model. We introduce a generative model for diploid allele effect based on the classic infinite allele mutation process. This approach motivates the concept of IBF (Identity by Function). The phenotypic covariance between two individuals given their diploid genotypes is expressed in terms of functional identity states. The IBF parameters define a genetic architecture for a trait without reference to specific alleles or population. Given full genome sequences, we treat a gene-scale functional region, rather than a SNP, as a QTL, modeling patterns of dominance for multiple alleles. Applications demonstrated by simulation include phenotype and effect prediction and association, and estimation of heritability and classical variance components. A simulation case study of the Missing Heritability problem illustrates a decomposition of heritability under the IBF framework into Explained and Unexplained components.
Affiliation(s)
- Serge Sverdlov
- Department of Statistics, University of Washington, Box 354322, Seattle, WA 98195, USA.
|
186
|
Gay L, Siol M, Ronfort J. Pedigree-free estimates of heritability in the wild: promising prospects for selfing populations. PLoS One 2013; 8:e66983. [PMID: 23825602 PMCID: PMC3692515 DOI: 10.1371/journal.pone.0066983] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 03/15/2013] [Accepted: 05/14/2013] [Indexed: 11/19/2022] Open
Abstract
Estimating the genetic variance available for traits informs us about a population's ability to evolve in response to novel selective challenges. In selfing species, theory predicts a loss of genetic diversity that could lead to an evolutionary dead-end, but empirical support remains scarce. Genetic variability in a trait is estimated by correlating the phenotypic resemblance with the proportion of the genome that two relatives share identical by descent ('realized relatedness'). The latter is traditionally predicted from pedigrees (ΦA: expected value) but can also be estimated using molecular markers (average number of alleles shared). Nevertheless, evolutionary biologists, unlike animal breeders, remain cautious about using marker-based relatedness coefficients to study complex phenotypic traits in populations. In this paper, we review published results comparing five different pedigree-free methods and use simulations to test individual-based models (hereafter called animal models) using marker-based relatedness coefficients, with a special focus on the influence of mating systems. Our literature review confirms that Ritland's regression method is unreliable, but suggests that animal models with marker-based estimates of relatedness and genomic selection are promising and that more testing is required. Our simulations show that using molecular markers instead of pedigrees in animal models seriously worsens the estimation of heritability in outcrossing populations, unless a very large number of loci is available. In selfing populations the results are less biased. More generally, populations with high identity disequilibrium (consanguineous or bottlenecked populations) could be propitious for using marker-based animal models, but are also more likely to deviate from the standard assumptions of quantitative genetics models (non-additive variance).
Affiliation(s)
- Laurene Gay
- Diversity and Adaptation of Mediterranean Species, UMR AGAP 1334, Montpellier, France.
|
187
|
Makgahlela ML, Strandén I, Nielsen US, Sillanpää MJ, Mäntysaari EA. The estimation of genomic relationships using breedwise allele frequencies among animals in multibreed populations. J Dairy Sci 2013; 96:5364-75. [PMID: 23769355 DOI: 10.3168/jds.2012-6523] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 12/24/2012] [Accepted: 04/24/2013] [Indexed: 01/07/2023]
Abstract
Different approaches to calculating genomic measures of relationship were explored and compared with pedigree relationships (A) within and across base breeds in a crossbred population, using genotypes for 38,194 loci of 4,106 Nordic Red dairy cattle. Four genomic relationship matrices (G) were calculated using either observed allele frequencies (AF) across breeds or within-breed AF. The G matrices were compared separately when the AF were estimated in the observed and in the base population. Breedwise AF in the current and base population were estimated using linear regression models of individual genotypes on breed composition. Different G matrices were further used to predict direct estimated genomic values using a genomic BLUP model. Higher variability existed in the diagonal elements of G across breeds (standard deviation = 0.06, on average) compared with A (0.01). The use of simple observed AF across base breeds to compute G increased coefficients for individuals in distantly related populations. Estimated breedwise AF reduced differences in coefficients similarly within and across populations. The variability of the current adjusted G matrix decreased from 0.055 to 0.035 when breedwise AF were estimated from the base breed population. The direct estimated genomic values and their validation reliabilities were, however, unaffected by the AF used to compute G when estimated with a genomic BLUP model, due to inclusion of breed means in the model. In multibreed populations, G adjusted with breedwise AF from the founder population may provide more consistency among relationship coefficients between genotyped and ungenotyped individuals in an across-breed single-step evaluation.
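The abstract above does not state the formula used for G. As a rough illustration of why the choice of allele frequencies matters, the following minimal sketch assumes a VanRaden-style matrix, G = ZZ'/(2 Σ p(1 − p)) with Z the genotype counts centered by twice the allele frequency. The function name `genomic_relationship` and the toy genotypes are hypothetical and are not taken from the study.

```python
def genomic_relationship(genotypes, allele_freq=None):
    """VanRaden-style G matrix (an assumed formulation, not the paper's own).

    genotypes: list of per-animal lists of allele counts coded 0, 1 or 2.
    allele_freq: optional per-locus frequencies used for centering and
                 scaling; defaults to the frequencies observed in the data.
    """
    n_animals = len(genotypes)
    n_loci = len(genotypes[0])
    if allele_freq is None:
        allele_freq = [
            sum(row[j] for row in genotypes) / (2.0 * n_animals)
            for j in range(n_loci)
        ]
    # Center each genotype by its expectation, 2p
    centered = [
        [row[j] - 2.0 * allele_freq[j] for j in range(n_loci)]
        for row in genotypes
    ]
    # Scale by the expected genotype variance summed over loci
    scale = 2.0 * sum(p * (1.0 - p) for p in allele_freq)
    return [
        [sum(a * b for a, b in zip(zi, zk)) / scale for zk in centered]
        for zi in centered
    ]


# Hypothetical toy data: 3 animals, 4 loci
toy_genotypes = [[0, 1, 2, 1], [1, 1, 0, 2], [2, 0, 1, 1]]
G_observed = genomic_relationship(toy_genotypes)
# Stand-in for "base population" frequencies (all 0.5 here, purely illustrative)
G_base = genomic_relationship(toy_genotypes, allele_freq=[0.5] * 4)
```

With observed frequencies every row of G sums to zero by construction, whereas supplying a different set of frequencies shifts the coefficients; a breedwise adjustment of the kind the abstract describes would presumably supply frequencies per breed rather than one shared vector.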
Affiliation(s)
- M L Makgahlela
- Department of Agricultural Sciences, University of Helsinki, Helsinki, Finland.
|
188
|
Thompson EA. Identity by descent: variation in meiosis, across genomes, and in populations. Genetics 2013; 194:301-26. [PMID: 23733848 PMCID: PMC3664843 DOI: 10.1534/genetics.112.148825] [Citation(s) in RCA: 191] [Impact Index Per Article: 17.4] [Received: 12/15/2012] [Accepted: 03/10/2013] [Indexed: 01/04/2023] Open
Abstract
Gene identity by descent (IBD) is a fundamental concept that underlies genetically mediated similarities among relatives. Gene IBD is traced through ancestral meioses and is defined relative to founders of a pedigree, or to some time point or mutational origin in the coalescent of a set of extant genes in a population. The random process underlying changes in the patterns of IBD across the genome is recombination, so the natural context for defining IBD is the ancestral recombination graph (ARG), which specifies the complete ancestry of a collection of chromosomes. The ARG determines both the sequence of coalescent ancestries across the chromosome and the extant segments of DNA descending unbroken by recombination from their most recent common ancestor (MRCA). DNA segments IBD from a recent common ancestor have high probability of being of the same allelic type. Non-IBD DNA is modeled as of independent allelic type, but the population frame of reference for defining allelic independence can vary. Whether of IBD, allelic similarity, or phenotypic covariance, comparisons may be made to other genomic regions of the same gametes, or to the same genomic regions in other sets of gametes or diploid individuals. In this review, I present IBD as the framework connecting evolutionary and coalescent theory with the analysis of genetic data observed on individuals. I focus on the high variance of the processes that determine IBD, its changes across the genome, and its impact on observable data.
Affiliation(s)
- Elizabeth A Thompson
- Department of Statistics, University of Washington, Seattle, WA 98195-4322, USA.
|
189
|
Zaitlen N, Kraft P, Patterson N, Pasaniuc B, Bhatia G, Pollack S, Price AL. Using extended genealogy to estimate components of heritability for 23 quantitative and dichotomous traits. PLoS Genet 2013; 9:e1003520. [PMID: 23737753 PMCID: PMC3667752 DOI: 10.1371/journal.pgen.1003520] [Citation(s) in RCA: 260] [Impact Index Per Article: 23.6] [Received: 09/27/2012] [Accepted: 04/06/2013] [Indexed: 12/18/2022] Open
Abstract
Important knowledge about the determinants of complex human phenotypes can be obtained from the estimation of heritability, the fraction of phenotypic variation in a population that is determined by genetic factors. Here, we make use of extensive phenotype data in Iceland, long-range phased genotypes, and a population-wide genealogical database to examine the heritability of 11 quantitative and 12 dichotomous phenotypes in a sample of 38,167 individuals. Most previous estimates of heritability are derived from family-based approaches such as twin studies, which may be biased upwards by epistatic interactions or shared environment. Our estimates of heritability, based on both closely and distantly related pairs of individuals, are significantly lower than those from previous studies. We examine phenotypic correlations across a range of relationships, from siblings to first cousins, and find that the excess phenotypic correlation in these related individuals is predominantly due to shared environment as opposed to dominance or epistasis. We also develop a new method to jointly estimate narrow-sense heritability and the heritability explained by genotyped SNPs. Unlike existing methods, this approach permits the use of information from both closely and distantly related pairs of individuals, thereby reducing the variance of estimates of heritability explained by genotyped SNPs while preventing upward bias. Our results show that common SNPs explain a larger proportion of the heritability than previously thought, with SNPs present on Illumina 300K genotyping arrays explaining more than half of the heritability for the 23 phenotypes examined in this study. Much of the remaining heritability is likely to be due to rare alleles that are not captured by standard genotyping arrays.
Affiliation(s)
- Noah Zaitlen
- Department of Medicine, Lung Biology Center, University of California San Francisco, San Francisco, California, United States of America
- Peter Kraft
- Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts, United States of America
- Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America
- Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Nick Patterson
- Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Bogdan Pasaniuc
- Interdepartmental Program in Bioinformatics Pathology and Laboratory Medicine, University of California Los Angeles, Los Angeles, California, United States of America
- Gaurav Bhatia
- Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts, United States of America
- Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America
- Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Samuela Pollack
- Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts, United States of America
- Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America
- Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Alkes L. Price
- Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts, United States of America
- Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America
- Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
|
190
|
Ralph P, Coop G. The geography of recent genetic ancestry across Europe. PLoS Biol 2013; 11:e1001555. [PMID: 23667324 PMCID: PMC3646727 DOI: 10.1371/journal.pbio.1001555] [Citation(s) in RCA: 192] [Impact Index Per Article: 17.5] [Received: 07/16/2012] [Accepted: 03/27/2013] [Indexed: 01/11/2023] Open
Abstract
A genomic survey of recent genealogical relatedness reveals the close ties of kinship and the impact of events across the past 3,000 years of European history. The recent genealogical history of human populations is a complex mosaic formed by individual migration, large-scale population movements, and other demographic events. Population genomics datasets can provide a window into this recent history, as rare traces of recent shared genetic ancestry are detectable due to long segments of shared genomic material. We make use of genomic data for 2,257 Europeans (in the Population Reference Sample [POPRES] dataset) to conduct one of the first surveys of recent genealogical ancestry over the past 3,000 years at a continental scale. We detected 1.9 million shared long genomic segments, and used the lengths of these to infer the distribution of shared ancestors across time and geography. We find that a pair of modern Europeans living in neighboring populations share around 2–12 genetic common ancestors from the last 1,500 years, and upwards of 100 genetic ancestors from the previous 1,000 years. These numbers drop off exponentially with geographic distance, but since these genetic ancestors are a tiny fraction of common genealogical ancestors, individuals from opposite ends of Europe are still expected to share millions of common genealogical ancestors over the last 1,000 years. There is also substantial regional variation in the number of shared genetic ancestors. For example, there are especially high numbers of common ancestors shared between many eastern populations that date roughly to the migration period (which includes the Slavic and Hunnic expansions into that region). Some of the lowest levels of common ancestry are seen in the Italian and Iberian peninsulas, which may indicate different effects of historical population expansions in these areas and/or more stably structured populations. 
Population genomic datasets have considerable power to uncover recent demographic history, and will allow a much fuller picture of the close genealogical kinship of individuals across the world. Few of us know our family histories more than a few generations back. It is therefore easy to overlook the fact that we are all distant cousins, related to one another via a vast network of relationships. Here we use genome-wide data from European individuals to investigate these relationships over the past 3,000 years, by looking for long stretches of genome that are shared between pairs of individuals through their inheritance from common genetic ancestors. We quantify this ubiquitous recent common ancestry, showing for instance that even pairs of individuals from opposite ends of Europe share hundreds of genetic common ancestors over this time period. Despite this degree of commonality, there are also striking regional differences. Southeastern Europeans, for example, share large numbers of common ancestors that date roughly to the era of the Slavic and Hunnic expansions around 1,500 years ago, while most common ancestors that Italians share with other populations lived longer than 2,500 years ago. The study of long stretches of shared genetic material promises to uncover rich information about many aspects of recent population history.
Affiliation(s)
- Peter Ralph
- Department of Evolution and Ecology & Center for Population Biology, University of California, Davis, California, United States of America
- Graham Coop
- Department of Evolution and Ecology & Center for Population Biology, University of California, Davis, California, United States of America
|
191
|
Powell JE, Henders AK, McRae AF, Kim J, Hemani G, Martin NG, Dermitzakis ET, Gibson G, Montgomery GW, Visscher PM. Congruence of additive and non-additive effects on gene expression estimated from pedigree and SNP data. PLoS Genet 2013; 9:e1003502. [PMID: 23696747 PMCID: PMC3656157 DOI: 10.1371/journal.pgen.1003502] [Citation(s) in RCA: 61] [Impact Index Per Article: 5.5] [Received: 12/20/2012] [Accepted: 03/22/2013] [Indexed: 01/13/2023] Open
Abstract
There is increasing evidence that heritable variation in gene expression underlies genetic variation in susceptibility to disease. Therefore, a comprehensive understanding of the similarity between relatives for transcript variation is warranted--in particular, dissection of phenotypic variation into additive and non-additive genetic factors and shared environmental effects. We conducted a gene expression study in blood samples of 862 individuals from 312 nuclear families containing MZ or DZ twin pairs using both pedigree and genotype information. From a pedigree analysis we show that the vast majority of genetic variation across 17,994 probes is additive, although non-additive genetic variation is identified for 960 transcripts. For 180 of the 960 transcripts with non-additive genetic variation, we identify expression quantitative trait loci (eQTL) with dominance effects in a sample of 339 unrelated individuals and replicate 31% of these associations in an independent sample of 139 unrelated individuals. Over-dominance was detected and replicated for a trans association between rs12313805 and ETV6, located 4 Mb apart on chromosome 12. Surprisingly, only 17 probes exhibit significant levels of common environmental effects, suggesting that environmental and lifestyle factors common to a family do not affect expression variation for most transcripts, at least those measured in blood. Consistent with the genetic architecture of common diseases, gene expression is predominantly additive, but a minority of transcripts display non-additive effects.
Affiliation(s)
- Joseph E Powell
- University of Queensland Diamantina Institute, University of Queensland, Princess Alexandra Hospital, Brisbane, Queensland, Australia.
|
192
|
Garcia-Cortes LA, Legarra A, Chevalet C, Toro MA. Variance and covariance of actual relationships between relatives at one locus. PLoS One 2013; 8:e57003. [PMID: 23451134 PMCID: PMC3579841 DOI: 10.1371/journal.pone.0057003] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 08/29/2012] [Accepted: 01/17/2013] [Indexed: 11/18/2022] Open
Abstract
The relationship between pairs of individuals is an important topic in many areas of population and quantitative genetics. It is usually measured as the proportion of the genome identical by descent shared by the pair and it can be inferred from pedigree information. But there is a variance in actual relationships as a consequence of Mendelian sampling, whose general formula has not been developed. The goal of this work is to develop this general formula for the one-locus situation. We provide simple expressions for the variances and covariances of all actual relationships in an arbitrary complex pedigree. The proposed method relies on the use of the nine identity coefficients and the generalized relationship coefficients; formulas have been checked by computer simulation. Finally, two examples, for a short pedigree of dogs and a long pedigree of sheep, are given.
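To make the idea of variance in actual relationships concrete, here is a minimal sketch of the simplest special case only: non-inbred full sibs from unrelated parents, at a single locus. It enumerates the 16 equally likely transmission patterns exactly rather than applying the paper's general formula based on the nine identity coefficients, which the abstract does not give. The function name `full_sib_relationship_moments` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def full_sib_relationship_moments():
    """Exact mean and variance of the one-locus actual relationship of full sibs.

    Assumes non-inbred, unrelated parents. Each sib receives paternal allele
    0 or 1 and maternal allele 0 or 1, all with probability 1/2, independently.
    """
    values = []
    for pat1, mat1, pat2, mat2 in product((0, 1), repeat=4):
        # Number of parental alleles the two sibs share identical by descent
        ibd_shared = (pat1 == pat2) + (mat1 == mat2)
        # Actual relationship at this locus: 0, 1/2 or 1
        values.append(Fraction(ibd_shared, 2))
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance


mean, variance = full_sib_relationship_moments()
```

Under these assumptions the enumeration returns the pedigree expectation of 1/2 as the mean, together with a one-locus variance of 1/8, which is the Mendelian sampling variation that a pedigree-based coefficient ignores.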
Affiliation(s)
- Miguel Angel Toro
- Departamento de Producción Animal, Universidad Politécnica de Madrid, Madrid, Spain
|
193
|
Pszczola M, Strabel T, van Arendonk JAM, Calus MPL. The impact of genotyping different groups of animals on accuracy when moving from traditional to genomic selection. J Dairy Sci 2012; 95:5412-5421. [PMID: 22916948 DOI: 10.3168/jds.2012-5550] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 03/19/2012] [Accepted: 05/28/2012] [Indexed: 12/13/2022]
Abstract
Compared with traditional selection, the use of genomic information tends to increase the accuracy of estimated breeding values (EBV). The cause of this increase is, however, unknown. To explore this phenomenon, this study investigated whether the increase in accuracy when moving from traditional (AA) to genomic selection (GG) was mainly due to genotyping the reference population (GA) or the evaluated animals (AG). The analysis applied a combined relationship matrix that allows genotyped and ungenotyped animals to be used simultaneously. A simulated data set reflected the dairy cattle population. Four differently designed (i.e., different average relationships within the reference population) small reference populations and 3 heritability levels were considered. The animals in the reference populations had high, moderate, low, and random (RND) relationships. The evaluated animals were juveniles. The small reference populations simulated traits that are difficult or expensive to measure (e.g., methane emission). The accuracy of selection was expressed as the reliability of (genomic) EBV and was predicted based on selection index theory using relationships. Connectedness between the reference populations and evaluated animals was calculated using the prediction error variance. Average (genomic) EBV reliabilities increased with heritability and with a decrease in the average relationship within the reference population. Reliabilities in AA and AG were lower than those in GG and were higher than those in GA (respectively, 0.039, 0.042, 0.052, and 0.048 for RND and a heritability of 0.01). Differences between AA and GA were small. Average connectedness with all animals in the reference population for all scenarios and reference populations ranged from 0.003 to 0.024; it was lowest when the animals were not genotyped (AA; e.g., 0.004 for RND) and highest when all the animals were genotyped (GG; e.g., 0.024 for RND). Differences present across designs of the reference populations were very small. 
Genomic relationships among animals in the reference population might be less important than those for the evaluated animals with no phenotypic observations. Thus, the main origin of the gain in accuracy when using genomic selection is due to genotyping the evaluated animals. However, genotyping only one group of animals will always yield less accurate estimates.
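A reliability prediction of this kind can be illustrated with a small sketch. It assumes the standard selection index form, reliability = c'(R + λI)⁻¹c / a, where R holds the relationships among reference animals, c the relationships between the evaluated animal and each reference animal, a the evaluated animal's relationship with itself, and λ = (1 - h²)/h². The abstract does not spell out the formula, so the function names, the toy relationships and the hand-written solver below are illustrative assumptions, not the authors' implementation.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [A | b], copied so the inputs are not modified.
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def index_reliability(R, c, h2, self_rel=1.0):
    """Reliability of an EBV from selection index theory (assumed form).

    R        relationships among reference animals (list of lists)
    c        relationships between the evaluated animal and each reference animal
    h2       heritability, 0 < h2 <= 1
    self_rel relationship of the evaluated animal with itself (1 + inbreeding)
    """
    if not 0.0 < h2 <= 1.0:
        raise ValueError("h2 must lie in (0, 1]")
    lam = (1.0 - h2) / h2
    n = len(c)
    lhs = [[R[i][j] + (lam if i == j else 0.0) for j in range(n)] for i in range(n)]
    weights = solve(lhs, c)
    return sum(ci * wi for ci, wi in zip(c, weights)) / self_rel


# Hypothetical toy case: two reference animals related to each other by 0.25,
# each related to the evaluated juvenile by 0.25.
R = [[1.0, 0.25], [0.25, 1.0]]
c = [0.25, 0.25]
rel = index_reliability(R, c, h2=0.3)
print(round(rel, 4))
```

Passing pedigree relationships or genomic relationships for R and c corresponds, in spirit, to leaving a group ungenotyped or genotyping it; the function itself does not care which source the values come from.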
Affiliation(s)
- M Pszczola
- Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, 8200 AB Lelystad, the Netherlands; Animal Breeding and Genomics Centre, Wageningen University, 6700 AH Wageningen, the Netherlands; Department of Genetics and Animal Breeding, Poznan University of Life Sciences, Wolynska 33, 60-637 Poznan, Poland.
- T Strabel
- Department of Genetics and Animal Breeding, Poznan University of Life Sciences, Wolynska 33, 60-637 Poznan, Poland
- J A M van Arendonk
- Animal Breeding and Genomics Centre, Wageningen University, 6700 AH Wageningen, the Netherlands
- M P L Calus
- Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, 8200 AB Lelystad, the Netherlands
|
194
|
The effect of linkage disequilibrium and family relationships on the reliability of genomic prediction. Genetics 2012; 193:621-31. [PMID: 23267052 DOI: 10.1534/genetics.112.146290] [Citation(s) in RCA: 128] [Impact Index Per Article: 10.7] [Indexed: 01/30/2023] Open
Abstract
Although the concept of genomic selection relies on linkage disequilibrium (LD) between quantitative trait loci and markers, reliability of genomic predictions is strongly influenced by family relationships. In this study, we investigated the effects of LD and family relationships on reliability of genomic predictions and the potential of deterministic formulas to predict reliability using population parameters in populations with complex family structures. Five groups of selection candidates were simulated by taking different information sources from the reference population into account: (1) allele frequencies, (2) LD pattern, (3) haplotypes, (4) haploid chromosomes, and (5) individuals from the reference population, thereby having real family relationships with reference individuals. Reliabilities were predicted using genomic relationships among 529 reference individuals and their relationships with selection candidates and with a deterministic formula where the number of effective chromosome segments (M(e)) was estimated based on genomic and additive relationship matrices for each scenario. At a heritability of 0.6, reliabilities based on genomic relationships were 0.002 ± 0.0001 (allele frequencies), 0.022 ± 0.001 (LD pattern), 0.018 ± 0.001 (haplotypes), 0.100 ± 0.008 (haploid chromosomes), and 0.318 ± 0.077 (family relationships). At a heritability of 0.1, relative differences among groups were similar. For all scenarios, reliabilities were similar to predictions with a deterministic formula using estimated M(e). So, reliabilities can be predicted accurately using empirically estimated M(e) and level of relationship with reference individuals has a much higher effect on the reliability than linkage disequilibrium per se. Furthermore, accumulated length of shared haplotypes is more important in determining the reliability of genomic prediction than the individual shared haplotype length.
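As a rough illustration of how a deterministic formula turns population parameters into an expected reliability, the sketch below uses the widely cited form r² = N·h² / (N·h² + M(e)), with N the number of reference individuals. The abstract does not state which formula was applied, so this choice is an assumption, and the M(e) value in the example is hypothetical rather than taken from the study.

```python
def expected_reliability(n_ref, h2, m_e):
    """Expected reliability r^2 = N*h2 / (N*h2 + Me) (assumed formula).

    n_ref  number of reference individuals (N)
    h2     heritability of the trait
    m_e    number of effective chromosome segments, M(e)
    """
    if n_ref < 0 or not 0.0 <= h2 <= 1.0 or m_e <= 0:
        raise ValueError("need n_ref >= 0, 0 <= h2 <= 1 and m_e > 0")
    information = n_ref * h2
    return information / (information + m_e)


# 529 reference individuals and h2 = 0.6 come from the abstract;
# the two M(e) values are made up to show the direction of the effect.
for m_e in (1000.0, 10000.0):
    print(m_e, round(expected_reliability(529, 0.6, m_e), 3))
```

A smaller M(e), as expected for candidates closely related to the reference population, gives a higher predicted reliability for the same N and heritability.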
|
195
|
Decker JE, Vasco DA, McKay SD, McClure MC, Rolf MM, Kim J, Northcutt SL, Bauck S, Woodward BW, Schnabel RD, Taylor JF. A novel analytical method, Birth Date Selection Mapping, detects response of the Angus (Bos taurus) genome to selection on complex traits. BMC Genomics 2012; 13:606. [PMID: 23140540 PMCID: PMC3532096 DOI: 10.1186/1471-2164-13-606] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Received: 07/28/2012] [Accepted: 10/31/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Several methods have recently been developed to identify regions of the genome that have been exposed to strong selection. However, recent theoretical and empirical work suggests that polygenic models are required to identify the genomic regions that are more moderately responding to ongoing selection on complex traits. We examine the effects of multi-trait selection on the genome of a population of US registered Angus beef cattle born over a 50-year period representing approximately 10 generations of selection. We present results from the application of a quantitative genetic model, called Birth Date Selection Mapping, to identify signatures of recent ongoing selection. RESULTS We show that US Angus cattle have been systematically selected to alter their mean additive genetic merit for most of the 16 production traits routinely recorded by breeders. Using Birth Date Selection Mapping, we estimate the time-dependency of allele frequency for 44,817 SNP loci using genomic best linear unbiased prediction, generalized least squares, and BayesCπ analyses. Finally, we reconstruct the primary phenotypes that have historically been exposed to selection from a genome-wide analysis of the 16 production traits and gene ontology enrichment analysis. CONCLUSIONS We demonstrate that Birth Date Selection Mapping utilizing mixed models corrects for time-dependent pedigree sampling effects that lead to spurious SNP associations and reveals genomic signatures of ongoing selection on complex traits. Because multiple traits have historically been selected in concert and most quantitative trait loci have small effects, selection has incrementally altered allele frequencies throughout the genome. Two quantitative trait loci of large effect were not the most strongly selected of the loci due to their antagonistic pleiotropic effects on strongly selected phenotypes. Birth Date Selection Mapping may readily be extended to temporally-stratified human or model organism populations.
Affiliation(s)
- Jared E Decker
- Division of Animal Sciences, University of Missouri, Columbia, 65211, USA
|
196
|
Erbe M, Hayes BJ, Matukumalli LK, Goswami S, Bowman PJ, Reich CM, Mason BA, Goddard ME. Improving accuracy of genomic predictions within and between dairy cattle breeds with imputed high-density single nucleotide polymorphism panels. J Dairy Sci 2012; 95:4114-29. [PMID: 22720968 DOI: 10.3168/jds.2011-5019] [Citation(s) in RCA: 397] [Impact Index Per Article: 33.1] [Received: 10/04/2011] [Accepted: 02/27/2012] [Indexed: 11/19/2022]
Abstract
Achieving accurate genomic estimated breeding values for dairy cattle requires a very large reference population of genotyped and phenotyped individuals. Assembling such reference populations has been achieved for breeds such as Holstein, but is challenging for breeds with fewer individuals. An alternative is to use a multi-breed reference population, such that smaller breeds gain some advantage in accuracy of genomic estimated breeding values (GEBV) from information from larger breeds. However, this requires that marker-quantitative trait loci associations persist across breeds. Here, we assessed the gain in accuracy of GEBV in Jersey cattle as a result of using a combined Holstein and Jersey reference population, with either 39,745 or 624,213 single nucleotide polymorphism (SNP) markers. The surrogate used for accuracy was the correlation of GEBV with daughter trait deviations in a validation population. Two methods were used to predict breeding values, either a genomic BLUP (GBLUP_mod), or a new method, BayesR, which used a mixture of normal distributions as the prior for SNP effects, including one distribution that set SNP effects to zero. The GBLUP_mod method scaled both the genomic relationship matrix and the additive relationship matrix to a base at the time the breeds diverged, and regressed the genomic relationship matrix to account for sampling errors in estimating relationship coefficients due to a finite number of markers, before combining the 2 matrices. Although these modifications did result in less biased breeding values for Jerseys compared with an unmodified genomic relationship matrix, BayesR gave the highest accuracies of GEBV for the 3 traits investigated (milk yield, fat yield, and protein yield), with an average increase in accuracy compared with GBLUP_mod across the 3 traits of 0.05 for both Jerseys and Holsteins. 
The advantage was limited for either Jerseys or Holsteins in using 624,213 SNP rather than 39,745 SNP (0.01 for Holsteins and 0.03 for Jerseys, averaged across traits). Even this limited and nonsignificant advantage was only observed when BayesR was used. An alternative panel, which extracted the SNP in the transcribed part of the bovine genome from the 624,213 SNP panel (to give 58,532 SNP), performed better, with an increase in accuracy of 0.03 for Jerseys across traits. This panel captures much of the increased genomic content of the 624,213 SNP panel, with the advantage of a greatly reduced number of SNP effects to estimate. Taken together, using this panel, a combined breed reference and using BayesR rather than GBLUP_mod increased the accuracy of GEBV in Jerseys from 0.43 to 0.52, averaged across the 3 traits.
Affiliation(s)
- M Erbe
- Department of Animal Sciences, Animal Breeding and Genetics Group, Georg-August-University Göttingen, 37075 Göttingen, Germany
|
197
|
Shrinkage estimation of the realized relationship matrix. G3-GENES GENOMES GENETICS 2012; 2:1405-13. [PMID: 23173092 PMCID: PMC3484671 DOI: 10.1534/g3.112.004259] [Citation(s) in RCA: 254] [Impact Index Per Article: 21.2] [Received: 06/22/2012] [Accepted: 09/10/2012] [Indexed: 11/29/2022]
Abstract
The additive relationship matrix plays an important role in mixed model prediction of breeding values. For genotype matrix X (loci in columns), the product XX′ is widely used as a realized relationship matrix, but the scaling of this matrix is ambiguous. Our first objective was to derive a proper scaling such that the mean diagonal element equals 1+f, where f is the inbreeding coefficient of the current population. The result is a formula involving the covariance matrix for sampling genomic loci, which must be estimated with markers. Our second objective was to investigate whether shrinkage estimation of this covariance matrix can improve the accuracy of breeding value (GEBV) predictions with low-density markers. Using an analytical formula for shrinkage intensity that is optimal with respect to mean-squared error, simulations revealed that shrinkage can significantly increase GEBV accuracy in unstructured populations, but only for phenotyped lines; there was no benefit for unphenotyped lines. The accuracy gain from shrinkage increased with heritability, but at high heritability (> 0.6) this benefit was irrelevant because phenotypic accuracy was comparable. These trends were confirmed in a commercial pig population with progeny-test-estimated breeding values. For an anonymous trait where phenotypic accuracy was 0.58, shrinkage increased the average GEBV accuracy from 0.56 to 0.62 (SE < 0.00) when using random sets of 384 markers from a 60K array. We conclude that when moderate-accuracy phenotypes and low-density markers are available for the candidates of genomic selection, shrinkage estimation of the relationship matrix can improve genetic gain.
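To make the scaling question concrete, here is a minimal sketch of one common way to turn a genotype matrix into a realized relationship matrix: centre each locus by twice its allele frequency and divide the cross-product by 2Σp(1 - p). This is a familiar textbook scaling used purely as an illustration; it is an assumption that it resembles the scaling derived in the paper, no shrinkage is applied, and the tiny genotype matrix is made up.

```python
def realized_relationship(X):
    """Realized relationship matrix from genotypes coded 0/1/2.

    X is a list of rows, one per individual, with loci in columns.
    Allele frequencies are estimated from the sample itself.
    """
    n, m = len(X), len(X[0])
    p = [sum(row[j] for row in X) / (2.0 * n) for j in range(m)]
    scale = 2.0 * sum(pj * (1.0 - pj) for pj in p)
    if scale == 0.0:
        raise ValueError("all loci are monomorphic; scaling is undefined")
    # Centre each locus so the columns of W sum to zero.
    W = [[row[j] - 2.0 * p[j] for j in range(m)] for row in X]
    return [[sum(a * b for a, b in zip(wi, wk)) / scale for wk in W] for wi in W]


# Hypothetical genotypes: 3 individuals, 4 loci.
X = [
    [0, 1, 2, 1],
    [1, 1, 0, 2],
    [2, 0, 1, 1],
]
G = realized_relationship(X)
mean_diagonal = sum(G[i][i] for i in range(len(G))) / len(G)
print(round(mean_diagonal, 3))
```

Under this kind of scaling the mean diagonal is read as 1 plus an average inbreeding coefficient relative to the allele frequencies used for centring, which is why the choice of frequencies and of the denominator matters.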
|
198
|
Cussens J, Bartlett M, Jones EM, Sheehan NA. Maximum Likelihood Pedigree Reconstruction Using Integer Linear Programming. Genet Epidemiol 2012; 37:69-83. [DOI: 10.1002/gepi.21686] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.3] [Received: 06/02/2012] [Revised: 08/30/2012] [Accepted: 09/07/2012] [Indexed: 11/10/2022]
Affiliation(s)
- James Cussens
- Department of Computer Science; University of York; York; North Yorkshire; United Kingdom
- Mark Bartlett
- Department of Computer Science; University of York; York; North Yorkshire; United Kingdom
- Elinor M. Jones
- Department of Health Sciences; University of Leicester; Leicester; Leicestershire; United Kingdom
- Nuala A. Sheehan
- Department of Health Sciences; University of Leicester; Leicester; Leicestershire; United Kingdom
|
199
|
Maxa J, Neuditschko M, Russ I, Förster M, Medugorac I. Genome-wide association mapping of milk production traits in Braunvieh cattle. J Dairy Sci 2012; 95:5357-5364. [DOI: 10.3168/jds.2011-4673] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Received: 07/01/2011] [Accepted: 05/28/2012] [Indexed: 11/19/2022]
|
200
|
Zaitlen N, Kraft P. Heritability in the genome-wide association era. Hum Genet 2012; 131:1655-64. [PMID: 22821350 DOI: 10.1007/s00439-012-1199-6] [Citation(s) in RCA: 121] [Impact Index Per Article: 10.1] [Received: 02/29/2012] [Accepted: 06/29/2012] [Indexed: 02/02/2023]
Abstract
Heritability, the fraction of phenotypic variation explained by genetic variation, has been estimated for many phenotypes in a range of populations, organisms, and time points. The recent development of efficient genotyping and sequencing technology has led researchers to attempt to identify the genetic variants responsible for the genetic component of phenotype directly via GWAS. The gap between the phenotypic variance explained by GWAS results and that estimated from classical heritability methods has been termed the "missing heritability problem". In this work, we examine modern methods for estimating heritability, which use the genotype and sequence data directly. We discuss them in the context of classical heritability methods and the missing heritability problem, and describe their implications for understanding the genetic architecture of complex phenotypes.
Affiliation(s)
- Noah Zaitlen
- Department of Epidemiology, Harvard School of Public Health, Boston, MA 02115, USA.
|