Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Masuda Y, Baba T, Suzuki M. Application of supernodal sparse factorization and inversion to the estimation of (co)variance components by residual maximum likelihood. J Anim Breed Genet 2013;131:227-36. [PMID: 24906028 DOI: 10.1111/jbg.12058] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/10/2012] [Accepted: 09/12/2013] [Indexed: 11/29/2022]

For:	Masuda Y, Baba T, Suzuki M. Application of supernodal sparse factorization and inversion to the estimation of (co)variance components by residual maximum likelihood. J Anim Breed Genet 2013;131:227-36. [PMID: 24906028 DOI: 10.1111/jbg.12058] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/10/2012] [Accepted: 09/12/2013] [Indexed: 11/29/2022]

Number

Cited by Other Article(s)

Baba T, Morota G, Kawakami J, Gotoh Y, Oka T, Masuda Y, Brito LF, Cockrum RR, Kawahara T. Longitudinal genome-wide association analysis using a single-step random regression model for height in Japanese Holstein cattle. JDS COMMUNICATIONS 2023;4:363-368. [PMID: 37727246 PMCID: PMC10505781 DOI: 10.3168/jdsc.2022-0347] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 10/22/2022] [Accepted: 03/22/2023] [Indexed: 09/21/2023]

Meyer K. Reducing computational demands of restricted maximum likelihood estimation with genomic relationship matrices. Genet Sel Evol 2023;55:7. [PMID: 36698054 PMCID: PMC9875494 DOI: 10.1186/s12711-023-00781-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/28/2022] [Accepted: 01/12/2023] [Indexed: 01/26/2023] Open

Xavier A, Habier D. A new approach fits multivariate genomic prediction models efficiently. Genet Sel Evol 2022;54:45. [PMID: 35715755 PMCID: PMC9204867 DOI: 10.1186/s12711-022-00730-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2021] [Accepted: 05/13/2022] [Indexed: 12/03/2022] Open

Abstract

Background

Fast, memory-efficient, and reliable algorithms for estimating genomic estimated breeding values (GEBV) for multiple traits and environments are needed to make timely decisions in breeding. Multivariate genomic prediction exploits genetic correlations between traits and environments to increase accuracy of GEBV compared to univariate methods. These genetic correlations are estimated simultaneously with GEBV, because they are specific to year, environment, and management. However, estimating genetic parameters is computationally demanding with restricted maximum likelihood (REML) and Bayesian samplers, and canonical transformations or orthogonalizations cannot be used for unbalanced experimental designs.

Methods

We propose a multivariate randomized Gauss–Seidel algorithm for simultaneous estimation of model effects and genetic parameters. Two previously proposed methods for estimating genetic parameters were combined with a Gauss–Seidel (GS) solver, and were called Tilde-Hat-GS (THGS) and Pseudo-Expectation-GS (PEGS). Balanced and unbalanced experimental designs were simulated to compare runtime, bias and accuracy of GEBV, and bias and standard errors of estimates of heritabilities and genetic correlations of THGS, PEGS, and REML. Models with 10 to 400 response variables, 1279 to 42,034 genetic markers, and 5990 to 1.85 million observations were fitted.

Results

Runtime of PEGS and THGS was a fraction of REML. Accuracies of GEBV were slightly lower than those from REML, but higher than those from the univariate approach, hence THGS and PEGS exploited genetic correlations. For 500 to 600 observations per response variable, biases of estimates of genetic parameters of THGS and PEGS were small, but standard errors of estimates of genetic correlations were higher than for REML. Bias and standard errors decreased as sample size increased. For balanced designs, GEBV and estimates of genetic correlations from THGS were unbiased when only an intercept and eigenvectors of genotype scores were fitted.

Conclusions

THGS and PEGS are fast and memory-efficient algorithms for multivariate genomic prediction for balanced and unbalanced experimental designs. They are scalable for increasing numbers of environments and genetic markers. Accuracy of GEBV was comparable to REML. Estimates of genetic parameters had little bias, but their standard errors were larger than for REML. More studies are needed to evaluate the proposed methods for datasets that contain selection.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12711-022-00730-w.

Collapse

Junqueira VS, Lourenco D, Masuda Y, Cardoso FF, Lopes PS, Silva FFE, Misztal I. Is single-step genomic REML with the algorithm for proven and young more computationally efficient when less generations of data are present? J Anim Sci 2022;100:skac082. [PMID: 35289906 PMCID: PMC9118993 DOI: 10.1093/jas/skac082] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/15/2022] [Accepted: 03/10/2022] [Indexed: 12/04/2022] Open

Abstract

Efficient computing techniques allow the estimation of variance components for virtually any traditional dataset. When genomic information is available, variance components can be estimated using genomic REML (GREML). If only a portion of the animals have genotypes, single-step GREML (ssGREML) is the method of choice. The genomic relationship matrix (G) used in both cases is dense, limiting computations depending on the number of genotyped animals. The algorithm for proven and young (APY) can be used to create a sparse inverse of G (GAPY~-1) with close to linear memory and computing requirements. In ssGREML, the inverse of the realized relationship matrix (H-1) also includes the inverse of the pedigree relationship matrix, which can be dense with a long pedigree, but sparser with short. The main purpose of this study was to investigate whether costs of ssGREML can be reduced using APY with truncated pedigree and phenotypes. We also investigated the impact of truncation on variance components estimation when different numbers of core animals are used in APY. Simulations included 150K animals from 10 generations, with selection. Phenotypes (h2 = 0.3) were available for all animals in generations 1-9. A total of 30K animals in generations 8 and 9, and 15K validation animals in generation 10 were genotyped for 52,890 SNP. Average information REML and ssGREML with G-1 and GAPY~-1 using 1K, 5K, 9K, and 14K core animals were compared. Variance components are impacted when the core group in APY represents the number of eigenvalues explaining a small fraction of the total variation in G. The most time-consuming operation was the inversion of G, with more than 50% of the total time. Next, numerical factorization consumed nearly 30% of the total computing time. On average, a 7% decrease in the computing time for ordering was observed by removing each generation of data. APY can be successfully applied to create the inverse of the genomic relationship matrix used in ssGREML for estimating variance components. To ensure reliable variance component estimation, it is important to use a core size that corresponds to the number of largest eigenvalues explaining around 98% of total variation in G. When APY is used, pedigrees can be truncated to increase the sparsity of H and slightly reduce computing time for ordering and symbolic factorization, with no impact on the estimates.

Collapse

Misztal I, Aguilar I, Lourenco D, Ma L, Steibel JP, Toro M. Emerging issues in genomic selection. J Anim Sci 2021;99:skab092. [PMID: 33773494 PMCID: PMC8186541 DOI: 10.1093/jas/skab092] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2021] [Accepted: 03/26/2021] [Indexed: 12/22/2022] Open

Abstract

Genomic selection (GS) is now practiced successfully across many species. However, many questions remain, such as long-term effects, estimations of genomic parameters, robustness of genome-wide association study (GWAS) with small and large datasets, and stability of genomic predictions. This study summarizes presentations from the authors at the 2020 American Society of Animal Science (ASAS) symposium. The focus of many studies until now is on linkage disequilibrium between two loci. Ignoring higher-level equilibrium may lead to phantom dominance and epistasis. The Bulmer effect leads to a reduction of the additive variance; however, the selection for increased recombination rate can release anew genetic variance. With genomic information, estimates of genetic parameters may be biased by genomic preselection, but costs of estimation can increase drastically due to the dense form of the genomic information. To make the computation of estimates feasible, genotypes could be retained only for the most important animals, and methods of estimation should use algorithms that can recognize dense blocks in sparse matrices. GWASs using small genomic datasets frequently find many marker-trait associations, whereas studies using much bigger datasets find only a few. Most of the current tools use very simple models for GWAS, possibly causing artifacts. These models are adequate for large datasets where pseudo-phenotypes such as deregressed proofs indirectly account for important effects for traits of interest. Artifacts arising in GWAS with small datasets can be minimized by using data from all animals (whether genotyped or not), realistic models, and methods that account for population structure. Recent developments permit the computation of P-values from genomic best linear unbiased prediction (GBLUP), where models can be arbitrarily complex but restricted to genotyped animals only, and single-step GBLUP that also uses phenotypes from ungenotyped animals. Stability was an important part of nongenomic evaluations, where genetic predictions were stable in the absence of new data even with low prediction accuracies. Unfortunately, genomic evaluations for such animals change because all animals with genotypes are connected. A top-ranked animal can easily drop in the next evaluation, causing a crisis of confidence in genomic evaluations. While correlations between consecutive genomic evaluations are high, outliers can have differences as high as 1 SD. A solution to fluctuating genomic evaluations is to base selection decisions on groups of animals. Although many issues in GS have been solved, many new issues that require additional research continue to surface.

Collapse

Junqueira VS, Lopes PS, Lourenco D, Silva FFE, Cardoso FF. Applying the Metafounders Approach for Genomic Evaluation in a Multibreed Beef Cattle Population. Front Genet 2021;11:556399. [PMID: 33424914 PMCID: PMC7793833 DOI: 10.3389/fgene.2020.556399] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2020] [Accepted: 10/29/2020] [Indexed: 11/23/2022] Open

Abstract

Pedigree information is incomplete by nature and commonly not well-established because many of the genetic ties are not known a priori or can be wrong. The genomic era brought new opportunities to assess relationships between individuals. However, when pedigree and genomic information are used simultaneously, which is the case of single-step genomic BLUP (ssGBLUP), defining the genetic base is still a challenge. One alternative to overcome this challenge is to use metafounders, which are pseudo-individuals that describe the genetic relationship between the base population individuals. The purpose of this study was to evaluate the impact of metafounders on the estimation of breeding values for tick resistance under ssGBLUP for a multibreed population composed by Hereford, Braford, and Zebu animals. Three different scenarios were studied: pedigree-based model (BLUP), ssGBLUP, and ssGBLUP with metafounders (ssGBLUPm). In ssGBLUPm, a total of four different metafounders based on breed of origin (i.e., Hereford, Braford, Zebu, and unknown) were included for the animals with missing parents. The relationship coefficient between metafounders was in average 0.54 (ranging from 0.34 to 0.96) suggesting an overlap between ancestor populations. The estimates of metafounder relationships indicate that Hereford and Zebu breeds have a possible common ancestral relationship. Inbreeding coefficients calculated following the metafounder approach had less negative values, suggesting that ancestral populations were large enough and that gametes inherited from the historical population were not identical. Variance components were estimated based on ssGBLUPm, ssGBLUP, and BLUP, but the values from ssGBLUPm were scaled to provide a fair comparison with estimates from the other two models. In general, additive, residual, and phenotypic variance components in the Hereford population were smaller than in Braford across different models. The addition of genomic information increased heritability for Hereford, possibly because of improved genetic relationships. As expected, genomic models had greater predictive ability, with an additional gain for ssGBLUPm over ssGBLUP. The increase in predictive ability was greater for Herefords. Our results show the potential of using metafounders to increase accuracy of GEBV, and therefore, the rate of genetic gain in beef cattle populations with partial levels of missing pedigree information.

Collapse

Selle ML, Steinsland I, Powell O, Hickey JM, Gorjanc G. Spatial modelling improves genetic evaluation in smallholder breeding programs. Genet Sel Evol 2020;52:69. [PMID: 33198636 PMCID: PMC7670695 DOI: 10.1186/s12711-020-00588-w] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2020] [Accepted: 11/03/2020] [Indexed: 01/13/2023] Open

Abstract

BACKGROUND

Breeders and geneticists use statistical models to separate genetic and environmental effects on phenotype. A common way to separate these effects is to model a descriptor of an environment, a contemporary group or herd, and account for genetic relationship between animals across environments. However, separating the genetic and environmental effects in smallholder systems is challenging due to small herd sizes and weak genetic connectedness across herds. We hypothesised that accounting for spatial relationships between nearby herds can improve genetic evaluation in smallholder systems. Furthermore, geographically referenced environmental covariates are increasingly available and could model underlying sources of spatial relationships. The objective of this study was therefore, to evaluate the potential of spatial modelling to improve genetic evaluation in dairy cattle smallholder systems.

METHODS

We performed simulations and real dairy cattle data analysis to test our hypothesis. We modelled environmental variation by estimating herd and spatial effects. Herd effects were considered independent, whereas spatial effects had distance-based covariance between herds. We compared these models using pedigree or genomic data.

RESULTS

The results show that in smallholder systems (i) standard models do not separate genetic and environmental effects accurately, (ii) spatial modelling increases the accuracy of genetic evaluation for phenotyped and non-phenotyped animals, (iii) environmental covariates do not substantially improve the accuracy of genetic evaluation beyond simple distance-based relationships between herds, (iv) the benefit of spatial modelling was largest when separating the genetic and environmental effects was challenging, and (v) spatial modelling was beneficial when using either pedigree or genomic data.

CONCLUSIONS

We have demonstrated the potential of spatial modelling to improve genetic evaluation in smallholder systems. This improvement is driven by establishing environmental connectedness between herds, which enhances separation of genetic and environmental effects. We suggest routine spatial modelling in genetic evaluations, particularly for smallholder systems. Spatial modelling could also have a major impact in studies of human and wild populations.

Collapse

Misztal I, Lourenco D, Legarra A. Current status of genomic evaluation. J Anim Sci 2020;98:skaa101. [PMID: 32267923 PMCID: PMC7183352 DOI: 10.1093/jas/skaa101] [Citation(s) in RCA: 72] [Impact Index Per Article: 18.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2020] [Accepted: 04/07/2020] [Indexed: 12/14/2022] Open

Abstract

Early application of genomic selection relied on SNP estimation with phenotypes or de-regressed proofs (DRP). Chips of 50k SNP seemed sufficient for an accurate estimation of SNP effects. Genomic estimated breeding values (GEBV) were composed of an index with parent average, direct genomic value, and deduction of a parental index to eliminate double counting. Use of SNP selection or weighting increased accuracy with small data sets but had minimal to no impact with large data sets. Efforts to include potentially causative SNP derived from sequence data or high-density chips showed limited or no gain in accuracy. After the implementation of genomic selection, EBV by BLUP became biased because of genomic preselection and DRP computed based on EBV required adjustments, and the creation of DRP for females is hard and subject to double counting. Genomic selection was greatly simplified by single-step genomic BLUP (ssGBLUP). This method based on combining genomic and pedigree relationships automatically creates an index with all sources of information, can use any combination of male and female genotypes, and accounts for preselection. To avoid biases, especially under strong selection, ssGBLUP requires that pedigree and genomic relationships are compatible. Because the inversion of the genomic relationship matrix (G) becomes costly with more than 100k genotyped animals, large data computations in ssGBLUP were solved by exploiting limited dimensionality of genomic data due to limited effective population size. With such dimensionality ranging from 4k in chickens to about 15k in cattle, the inverse of G can be created directly (e.g., by the algorithm for proven and young) at a linear cost. Due to its simplicity and accuracy, ssGBLUP is routinely used for genomic selection by the major chicken, pig, and beef industries. Single step can be used to derive SNP effects for indirect prediction and for genome-wide association studies, including computations of the P-values. Alternative single-step formulations exist that use SNP effects for genotyped or for all animals. Although genomics is the new standard in breeding and genetics, there are still some problems that need to be solved. This involves new validation procedures that are unaffected by selection, parameter estimation that accounts for all the genomic data used in selection, and strategies to address reduction in genetic variances after genomic selection was implemented.

Collapse

Aguilar I, Legarra A, Cardoso F, Masuda Y, Lourenco D, Misztal I. Frequentist p-values for large-scale-single step genome-wide association, with an application to birth weight in American Angus cattle. Genet Sel Evol 2019;51:28. [PMID: 31221101 PMCID: PMC6584984 DOI: 10.1186/s12711-019-0469-3] [Citation(s) in RCA: 82] [Impact Index Per Article: 16.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/03/2019] [Accepted: 05/27/2019] [Indexed: 11/14/2022] Open

Colleau JJ, Palhière I, Rodríguez-Ramilo ST, Legarra A. A fast indirect method to compute functions of genomic relationships concerning genotyped and ungenotyped individuals, for diversity management. Genet Sel Evol 2017;49:87. [PMID: 29191178 PMCID: PMC5709854 DOI: 10.1186/s12711-017-0363-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2017] [Accepted: 11/24/2017] [Indexed: 12/01/2022] Open

Abstract

Background

Pedigree-based management of genetic diversity in populations, e.g., using optimal contributions, involves computation of the \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathbf{Ax}}$$\end{document}Ax type yielding elements (relationships) or functions (usually averages) of relationship matrices. For pedigree-based relationships \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathbf{A}}$$\end{document}A, a very efficient method exists. When all the individuals of interest are genotyped, genomic management can be addressed using the genomic relationship matrix \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathbf{G}}$$\end{document}G; however, to date, the computational problem of efficiently computing \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathbf{Gx}}$$\end{document}Gx has not been well studied. When some individuals of interest are not genotyped, genomic management should consider the relationship matrix \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathbf{H}}$$\end{document}H that combines genotyped and ungenotyped individuals; however, direct computation of \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathbf{Hx}}$$\end{document}Hx is computationally very demanding, because construction of a possibly huge matrix is required. Our work presents efficient ways of computing \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathbf{Gx}}$$\end{document}Gx and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathbf{Hx}}$$\end{document}Hx, with applications on real data from dairy sheep and dairy goat breeding schemes.

Results

For genomic relationships, an efficient indirect computation with quadratic instead of cubic cost is \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathbf{x}} = {\mathbf{Z}}\left( {{\mathbf{Z^{\prime}x}}} \right)/k$$\end{document}x=ZZ′x/k, where Z is a matrix relating animals to genotypes. For the relationship matrix \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathbf{H}}$$\end{document}H, we propose an indirect method based on the difference between vectors \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathbf{Hx}} - {\mathbf{Ax}}$$\end{document}Hx-Ax, which involves computation of \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathbf{Ax}}$$\end{document}Ax and of products such as \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathbf{Gw}}$$\end{document}Gw and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathbf{A}}_{22}^{ - 1} {\mathbf{w}}$$\end{document}A22-1w, where \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathbf{w}}$$\end{document}w is a working vector derived from \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathbf{x}}$$\end{document}x. The latter computation is the most demanding but can be done using sparse Cholesky decompositions of matrix \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathbf{A}}^{ - 1}$$\end{document}A-1, which allows handling very large genomic and pedigree data files. Studies based on simulations reported in the literature show that the trends of average relationships in \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathbf{H}}$$\end{document}H and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathbf{A}}$$\end{document}A differ as genomic selection proceeds. When selection is based on genomic relationships but management is based on pedigree data, the true genetic diversity is overestimated. However, our tests on real data from sheep and goat obtained before genomic selection started do not show this.

Conclusions

We present efficient methods to compute elements and statistics of the genomic relationships \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathbf{G}}$$\end{document}G and of matrix \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathbf{H}}$$\end{document}H that combines ungenotyped and genotyped individuals. These methods should be useful to monitor and handle genomic diversity.

Collapse

Masuda Y, Misztal I, Legarra A, Tsuruta S, Lourenco DAL, Fragomeni BO, Aguilar I. Technical note: Avoiding the direct inversion of the numerator relationship matrix for genotyped animals in single-step genomic best linear unbiased prediction solved with the preconditioned conjugate gradient. J Anim Sci 2017;95:49-52. [PMID: 28177357 DOI: 10.2527/jas.2016.0699] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Masuda Y, Aguilar I, Tsuruta S, Misztal I. Technical note: Acceleration of sparse operations for average-information REML analyses with supernodal methods and sparse-storage refinements. J Anim Sci 2016;93:4670-4. [PMID: 26523559 DOI: 10.2527/jas.2015-9395] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Masuda Y, Misztal I, Tsuruta S, Legarra A, Aguilar I, Lourenco DAL, Fragomeni BO, Lawlor TJ. Implementation of genomic recursions in single-step genomic best linear unbiased predictor for US Holsteins with a large number of genotyped animals. J Dairy Sci 2016;99:1968-1974. [PMID: 26805987 DOI: 10.3168/jds.2015-10540] [Citation(s) in RCA: 52] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/19/2015] [Accepted: 12/01/2015] [Indexed: 11/19/2022]

Abstract

The objectives of this study were to develop and evaluate an efficient implementation in the computation of the inverse of genomic relationship matrix with the recursion algorithm, called the algorithm for proven and young (APY), in single-step genomic BLUP. We validated genomic predictions for young bulls with more than 500,000 genotyped animals in final score for US Holsteins. Phenotypic data included 11,626,576 final scores on 7,093,380 US Holstein cows, and genotypes were available for 569,404 animals. Daughter deviations for young bulls with no classified daughters in 2009, but at least 30 classified daughters in 2014 were computed using all the phenotypic data. Genomic predictions for the same bulls were calculated with single-step genomic BLUP using phenotypes up to 2009. We calculated the inverse of the genomic relationship matrix GAPY(-1) based on a direct inversion of genomic relationship matrix on a small subset of genotyped animals (core animals) and extended that information to noncore animals by recursion. We tested several sets of core animals including 9,406 bulls with at least 1 classified daughter, 9,406 bulls and 1,052 classified dams of bulls, 9,406 bulls and 7,422 classified cows, and random samples of 5,000 to 30,000 animals. Validation reliability was assessed by the coefficient of determination from regression of daughter deviation on genomic predictions for the predicted young bulls. The reliabilities were 0.39 with 5,000 randomly chosen core animals, 0.45 with the 9,406 bulls, and 7,422 cows as core animals, and 0.44 with the remaining sets. With phenotypes truncated in 2009 and the preconditioned conjugate gradient to solve mixed model equations, the number of rounds to convergence for core animals defined by bulls was 1,343; defined by bulls and cows, 2,066; and defined by 10,000 random animals, at most 1,629. With complete phenotype data, the number of rounds decreased to 858, 1,299, and at most 1,092, respectively. Setting up GAPY(-1) for 569,404 genotyped animals with 10,000 core animals took 1.3h and 57 GB of memory. The validation reliability with APY reaches a plateau when the number of core animals is at least 10,000. Predictions with APY have little differences in reliability among definitions of core animals. Single-step genomic BLUP with APY is applicable to millions of genotyped animals.

Collapse

Fragomeni BO, Lourenco DAL, Tsuruta S, Masuda Y, Aguilar I, Misztal I. Use of genomic recursions and algorithm for proven and young animals for single-step genomic BLUP analyses--a simulation study. J Anim Breed Genet 2015;132:340-5. [PMID: 25857518 DOI: 10.1111/jbg.12161] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/12/2014] [Accepted: 03/13/2015] [Indexed: 11/29/2022]

DAIRRy-BLUP: a high-performance computing approach to genomic prediction. Genetics 2014;197:813-22. [PMID: 24736932 DOI: 10.1534/genetics.114.163683] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open