1
Leite NG, Bermann M, Tsuruta S, Misztal I, Lourenco D. Marker effect p-values for single-step GWAS with the algorithm for proven and young in large genotyped populations. Genet Sel Evol 2024; 56:59. [PMID: 39174924 PMCID: PMC11340074 DOI: 10.1186/s12711-024-00925-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Accepted: 07/24/2024] [Indexed: 08/24/2024]
Abstract
BACKGROUND Single-nucleotide polymorphism (SNP) effects can be backsolved from ssGBLUP genomic estimated breeding values (GEBV) and used for genome-wide association studies (ssGWAS). However, obtaining p-values for those SNP effects relies on the inversion of dense matrices, which poses computational limitations in large genotyped populations. In this study, we present a method to approximate SNP p-values for ssGWAS with many genotyped animals. This method relies on the combination of a sparse approximation of the inverse of the genomic relationship matrix ($\mathbf{G}_{APY}^{-1}$) built with the algorithm for proven and young (APY) and an approximation of the prediction error variance of SNP effects which does not require the inversion of the left-hand side (LHS) of the mixed model equations. To test the proposed p-value computing method, we used a reduced genotyped population of 50K genotyped animals and compared the approximated SNP p-values with benchmark p-values obtained with the direct inverse of LHS built with an exact genomic relationship matrix ($\mathbf{G}^{-1}$). Then, we applied the proposed approximation method to obtain SNP p-values for a larger genotyped population composed of 450K genotyped animals. RESULTS The same genomic regions on chromosomes 7 and 20 were identified across all p-value computing methods when using 50K genotyped animals. In terms of computational requirements, obtaining p-values with the proposed approximation reduced the wall-clock time by 38 times and the memory requirement by ten times compared to using the exact inversion of the LHS. When the approximation was applied to a population of 450K genotyped animals, two new significant regions on chromosomes 6 and 14 were uncovered, indicating an increase in GWAS detection power when including more genotypes in the analyses. 
The process of obtaining p-values with the approximation and 450K genotyped individuals took 24.5 wall-clock hours and 87.66GB of memory, which is expected to increase linearly with the addition of noncore genotyped individuals. CONCLUSIONS With the proposed method, obtaining p-values for SNP effects in ssGWAS is computationally feasible in large genotyped populations. The computational cost of obtaining p-values in ssGWAS may no longer be a limitation in extensive populations with many genotyped animals.
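As a rough illustration of the last step described in the abstract above, the sketch below turns SNP effect estimates and the variances of those estimates into two-sided p-values. It assumes the usual normal test statistic (estimate divided by its standard deviation); the function name `snp_p_values` and its arguments are hypothetical, and how the variances are obtained (exactly or through the APY-based approximation) is outside this sketch.

```python
import math


def snp_p_values(snp_effects, effect_variances):
    """Two-sided normal p-values for SNP effect estimates.

    snp_effects      -- iterable of estimated SNP effects (assumed given)
    effect_variances -- iterable of variances of those estimates (assumed given)
    """
    p_values = []
    for a_hat, var in zip(snp_effects, effect_variances):
        # Standardized statistic: estimate over its standard deviation.
        z = a_hat / math.sqrt(var)
        # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)) for a standard normal.
        p_values.append(math.erfc(abs(z) / math.sqrt(2.0)))
    return p_values
```

The list returned here could then be compared against a genome-wide significance threshold; the choice of threshold is not specified in the abstract.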
Affiliation(s)
- Natália Galoro Leite
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA, 30602, USA
- Matias Bermann
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA, 30602, USA
- Shogo Tsuruta
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA, 30602, USA
- Ignacy Misztal
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA, 30602, USA
- Daniela Lourenco
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA, 30602, USA

2
Chen CY, Knap PW, Bhatnagar AS, Tsuruta S, Lourenco D, Misztal I, Holl JW. Genetic parameters for pelvic organ prolapse in purebred and crossbred sows. Front Genet 2024; 15:1441303. [PMID: 39144723 PMCID: PMC11322066 DOI: 10.3389/fgene.2024.1441303] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2024] [Accepted: 07/22/2024] [Indexed: 08/16/2024]
Abstract
This study aimed to investigate genetic parameters for sow pelvic organ prolapse in purebred and crossbred herds. Pelvic organ prolapse was recorded as normal or prolapsed at the individual sow level across 32 purebred and 8 crossbred farms. In total, 75,162 purebred Landrace sows from a single maternal line were recorded between 2018 and 2023, while 18,988 commercial two-way crossbred (Landrace × Large White) sows were available between 2020 and 2023. There were 5,122,005 animals included in the pedigree. Prolapse in purebreds and crossbreds was treated as two different traits in the model. Pedigrees of the crossbred sows were determined based on genotypes through parentage assignment. The average incidence rates were 1.81% and 3.93% for purebreds and crossbreds, respectively. The bivariate model incorporated fixed effects of parity group and region with random effects of contemporary group (farm and mating year and month at the first parity), additive genetic, and residual. Genetic parameter estimates were obtained using BLUPF90+ with the AIREML option. The estimated additive variance was larger in crossbreds than in purebreds. Estimates of heritability on the observed scale were 0.09 (0.006) for purebreds and 0.11 (0.014) for crossbreds, with a genetic correlation of 0.83 using a linear model. The results suggest that including data from crossbreds, which have a higher incidence rate, is beneficial and that selection to reduce prolapse incidence in purebred sow herds would also benefit commercial crossbred sow herds.
Affiliation(s)
- Ching-Yi Chen
  - The Pig Improvement Company, Genus plc, Hendersonville, TN, United States
- Pieter W. Knap
  - The Pig Improvement Company, Genus plc, Isernhagen, Germany
- Adria S. Bhatnagar
  - The Pig Improvement Company, Genus plc, Hendersonville, TN, United States
- Shogo Tsuruta
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA, United States
- Daniela Lourenco
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA, United States
- Ignacy Misztal
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA, United States
- Justin W. Holl
  - The Pig Improvement Company, Genus plc, Hendersonville, TN, United States

3
Zhuo Y, Du H, Diao C, Li W, Zhou L, Jiang L, Jiang J, Liu J. MAGE: metafounders-assisted genomic estimation of breeding value, a novel additive-dominance single-step model in crossbreeding systems. Bioinformatics 2024; 40:btae044. [PMID: 38268487 PMCID: PMC11212483 DOI: 10.1093/bioinformatics/btae044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 01/07/2024] [Accepted: 01/22/2024] [Indexed: 01/26/2024]
Abstract
MOTIVATION Utilizing both purebred and crossbred data in animal genetics is widely recognized as an optimal strategy for enhancing the predictive accuracy of breeding values. In practice, the different genetic backgrounds of several purebred populations and their crossbred offspring populations limit the application of traditional prediction methods. Several studies have attempted to predict crossbred performance via the partial relationship, which divides the data into distinct sub-populations based on the common genetic background, such as one single purebred population and its corresponding crossbred descendants. However, this strategy makes prediction inaccurate because it ignores half of the parental information of crossbred animals. Furthermore, dominance effects, although playing a significant role in crossbreeding systems, cannot be modeled under such a prediction model. RESULTS To overcome this weakness, we developed a novel multi-breed single-step model using metafounders to assess ancestral relationships across diverse breeds under a unified framework. We proposed to use multi-breed dominance combined relationship matrices to model additive and dominance effects simultaneously. Our method provides a straightforward way to evaluate the heterosis of crossbreds and the breeding values of purebred parents efficiently and accurately. We performed simulation and real data analyses to verify the potential of our proposed method. Our proposed model improved prediction accuracy under all scenarios considered compared to commonly used methods. AVAILABILITY AND IMPLEMENTATION The software for implementing our method is available at https://github.com/CAU-TeamLiuJF/MAGE.
Affiliation(s)
- Yue Zhuo
  - State Key Laboratory of Animal Biotech Breeding, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Heng Du
  - State Key Laboratory of Animal Biotech Breeding, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- ChenGuang Diao
  - State Key Laboratory of Animal Biotech Breeding, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- WeiNing Li
  - State Key Laboratory of Animal Biotech Breeding, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Lei Zhou
  - State Key Laboratory of Animal Biotech Breeding, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Li Jiang
  - State Key Laboratory of Animal Biotech Breeding, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- JiCai Jiang
  - Department of Animal Science, North Carolina State University, Raleigh, NC 27695, United States
- JianFeng Liu
  - State Key Laboratory of Animal Biotech Breeding, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China

4
Jang S, Ros-Freixedes R, Hickey JM, Chen CY, Herring WO, Holl J, Misztal I, Lourenco D. Multi-line ssGBLUP evaluation using preselected markers from whole-genome sequence data in pigs. Front Genet 2023; 14:1163626. [PMID: 37252662 PMCID: PMC10213539 DOI: 10.3389/fgene.2023.1163626] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/11/2023] [Accepted: 05/03/2023] [Indexed: 05/31/2023]
Abstract
Genomic evaluations in pigs could benefit from using multi-line data along with whole-genome sequencing (WGS) if the data are large enough to represent the variability across populations. The objective of this study was to investigate strategies to combine large-scale data from different terminal pig lines in a multi-line genomic evaluation (MLE) through single-step GBLUP (ssGBLUP) models while including variants preselected from WGS data. We investigated single-line and multi-line evaluations for five traits recorded in three terminal lines. The number of sequenced animals in each line ranged from 731 to 1,865, with 60k to 104k imputed to WGS. Unknown parent groups (UPG) and metafounders (MF) were explored to account for genetic differences among the lines and improve the compatibility between pedigree and genomic relationships in the MLE. Sequence variants were preselected based on multi-line genome-wide association studies (GWAS) or linkage disequilibrium (LD) pruning. These preselected variant sets were used for ssGBLUP predictions without and with weights from BayesR, and their performance was compared with that of a commercial porcine single-nucleotide polymorphism (SNP) chip. Using UPG and MF in MLE showed small to no gain in prediction accuracy (up to 0.02), depending on the lines and traits, compared to the single-line genomic evaluation (SLE). Likewise, adding selected variants from the GWAS to the commercial SNP chip resulted in a maximum increase of 0.02 in the prediction accuracy, only for average daily feed intake in the most numerous lines. In addition, no benefits were observed when using preselected sequence variants in multi-line genomic predictions. Weights from BayesR did not help improve the performance of ssGBLUP. This study revealed limited benefits of using preselected whole-genome sequence variants for multi-line genomic predictions, even when tens of thousands of animals had imputed sequence data. 
Correctly accounting for line differences with UPG or MF in MLE is essential to obtain predictions similar to SLE; however, the only observed benefit of an MLE is to have comparable predictions across lines. Further investigation into the amount of data and novel methods to preselect whole-genome causative variants in combined populations would be of significant interest.
Affiliation(s)
- Sungbong Jang
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA, United States
- Roger Ros-Freixedes
  - Departament de Ciència Animal, Universitat de Lleida-Agrotecnio-CERCA Center, Lleida, Spain
- John M Hickey
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Ching-Yi Chen
  - The Pig Improvement Company, Genus plc, Hendersonville, TN, United States
- William O Herring
  - The Pig Improvement Company, Genus plc, Hendersonville, TN, United States
- Justin Holl
  - The Pig Improvement Company, Genus plc, Hendersonville, TN, United States
- Ignacy Misztal
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA, United States
- Daniela Lourenco
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA, United States

5
Pocrnic I, Lindgren F, Tolhurst D, Herring WO, Gorjanc G. Optimisation of the core subset for the APY approximation of genomic relationships. Genet Sel Evol 2022; 54:76. [PMID: 36418945 PMCID: PMC9682752 DOI: 10.1186/s12711-022-00767-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2022] [Accepted: 10/31/2022] [Indexed: 11/24/2022]
Abstract
BACKGROUND By entering the era of mega-scale genomics, we are facing many computational issues with standard genomic evaluation models due to their dense data structure and cubic computational complexity. Several scalable approaches have been proposed to address this challenge, such as the Algorithm for Proven and Young (APY). In APY, genotyped animals are partitioned into core and non-core subsets, which induces a sparser inverse of the genomic relationship matrix. This partitioning is often done at random. While APY is a good approximation of the full model, random partitioning can make results unstable, possibly affecting accuracy or even reranking animals. Here we present a stable optimisation of the core subset by choosing animals with the most informative genotype data. METHODS We derived a novel algorithm for optimising the core subset based on a conditional genomic relationship matrix or a conditional single nucleotide polymorphism (SNP) genotype matrix. We compared the accuracy of genomic predictions with different core subsets for simulated and real pig data sets. The core subsets were constructed (1) at random, (2) based on the diagonal of the genomic relationship matrix, (3) at random with weights from (2), or (4) based on the novel conditional algorithm. To understand the different core subset constructions, we visualise the population structure of the genotyped animals with linear Principal Component Analysis and non-linear Uniform Manifold Approximation and Projection. RESULTS All core subset constructions performed equally well when the number of core animals captured most of the variation in the genomic relationships, both in simulated and real data sets. When the number of core animals was not sufficiently large, there was substantial variability in the results with the random construction but no variability with the conditional construction. 
Visualisation of the population structure and chosen core animals showed that the conditional construction spreads core animals across the whole domain of genotyped animals in a repeatable manner. CONCLUSIONS Our results confirm that the size of the core subset in APY is critical. Furthermore, the results show that the core subset can be optimised with the conditional algorithm that achieves an optimal and repeatable spread of core animals across the domain of genotyped animals.
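The core/non-core partitioning described in the abstract above can be sketched as follows. This is a minimal dense NumPy illustration of the commonly cited form of the APY inverse, in which non-core animals are conditioned on the core and their conditional variances are kept only on the diagonal. It is not the authors' implementation; the function name `apy_inverse` and the way the core indices are passed are assumptions, and no core-selection strategy (random, diagonal-based, or conditional) is included.

```python
import numpy as np


def apy_inverse(G, core):
    """Approximate inverse of a genomic relationship matrix G via APY.

    G    -- (n, n) symmetric positive-definite relationship matrix
    core -- indices of the core animals (hypothetical input format)
    """
    n = G.shape[0]
    core = np.asarray(core)
    noncore = np.setdiff1d(np.arange(n), core)

    # Dense inverse is needed only for the core block.
    Gcc_inv = np.linalg.inv(G[np.ix_(core, core)])
    Gcn = G[np.ix_(core, noncore)]
    P = Gcc_inv @ Gcn  # regression of non-core animals on core animals

    # Conditional variance of each non-core animal given the core (diagonal only).
    m = np.diag(G)[noncore] - np.einsum("ij,ij->j", Gcn, P)
    m_inv = 1.0 / m

    G_inv = np.zeros((n, n))
    G_inv[np.ix_(core, core)] = Gcc_inv + (P * m_inv) @ P.T
    G_inv[np.ix_(core, noncore)] = -P * m_inv
    G_inv[np.ix_(noncore, core)] = (-P * m_inv).T
    G_inv[noncore, noncore] = m_inv  # non-core block is diagonal, hence sparse
    return G_inv
```

With a single non-core animal nothing is dropped, so the result should match the direct inverse; with more non-core animals the non-core block stays diagonal, which is the source of the sparsity mentioned in the abstract. A production implementation would store the result in a sparse format rather than a dense array.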
Affiliation(s)
- Ivan Pocrnic
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Easter Bush Campus, Edinburgh, EH25 9RG, UK
- Finn Lindgren
  - School of Mathematics, The University of Edinburgh, The King’s Buildings, Edinburgh, EH9 3FD, UK
- Daniel Tolhurst
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Easter Bush Campus, Edinburgh, EH25 9RG, UK
- William O. Herring
  - Genus PIC, 100 Bluegrass Commons Blvd., Suite 2200, Hendersonville, TN 37075, USA
- Gregor Gorjanc
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Easter Bush Campus, Edinburgh, EH25 9RG, UK

6
Leite NG, Chen CY, Herring WO, Holl J, Tsuruta S, Lourenco D. Leveraging low-density crossbred genotypes to offset crossbred phenotypes and their impact on purebred predictions. J Anim Sci 2022; 100:6780296. [PMID: 36309902 PMCID: PMC9733505 DOI: 10.1093/jas/skac359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2022] [Accepted: 10/27/2022] [Indexed: 12/15/2022]
Abstract
The objectives of this study were to 1) investigate the predictability and bias of genomic breeding values (GEBV) of purebred (PB) sires for crossbred (CB) performance when CB genotypes imputed from a low-density panel are available, 2) assess if the availability of those CB genotypes can be used to partially offset CB phenotypic recording, and 3) investigate the impact of including imputed CB genotypes in genomic analyses when using the algorithm for proven and young (APY). Two pig populations with up to 207,375 PB and 32,893 CB phenotypic records per trait and 138,026 PB and 32,893 CB genotypes were evaluated. PB sires were genotyped for a 50K panel, whereas CB animals were genotyped for a low-density panel of 600 SNP and imputed to 50K. The predictability and bias of GEBV of PB sires for backfat thickness (BFX) and average daily gain (ADGX) recorded on CB animals were assessed when CB genotypes were available or not in the analyses. In the first set of analyses, direct inverses of the genomic relationship matrix (G) were used with phenotypic datasets truncated at different time points. In the next step, we evaluated the APY algorithm with core compositions differing in the CB genotype contributions. After that, the performance of core compositions was compared with an analysis using a random PB core from a purely PB genomic set. The number of rounds to convergence was recorded for all APY analyses. With the direct inverse of G in the first set of analyses, adding CB genotypes imputed from a low-density panel (600 SNP) did not improve predictability or reduce the bias of PB sires' GEBV for CB performance, even for sires with fewer CB progeny phenotypes in the analysis. That indicates that the inclusion of CB genotypes primarily used for inferring pedigree in commercial farms is of no benefit to offset CB phenotyping. 
When CB genotypes were incorporated into APY, a random core composition or a core with no CB genotypes reduced bias and the number of rounds to convergence but did not affect predictability. Still, a PB random core composition from a genomic set with only PB genotypes resulted in the highest predictability and the smallest number of rounds to convergence, although bias increased. Genotyping CB individuals for low-density panels is a valuable identification tool for linking CB phenotypes to pedigree; however, the inclusion of those CB genotypes imputed from a low-density panel (600 SNP) might not benefit genomic predictions for PB individuals or offset CB phenotyping for the evaluated CB performance traits. Further studies will help understand the usefulness of those imputed CB genotypes for traits with lower PB-CB genetic correlations and traits not recorded in the PB environment, such as mortality and disease traits.
Affiliation(s)
- Shogo Tsuruta
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA 30602, USA
- Daniela Lourenco
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA 30602, USA

7
Lozada-Soto EA, Lourenco D, Maltecca C, Fix J, Schwab C, Shull C, Tiezzi F. Genotyping and phenotyping strategies for genetic improvement of meat quality and carcass composition in swine. Genet Sel Evol 2022; 54:42. [PMID: 35672700 PMCID: PMC9171933 DOI: 10.1186/s12711-022-00736-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/07/2021] [Accepted: 05/25/2022] [Indexed: 12/04/2022]
Abstract
Background: Meat quality and composition traits have become valuable in modern pork production; however, genetic improvement has been slow due to high phenotyping costs. Combining genomic information with multi-trait indirect selection based on cheaper indicator traits is an alternative for continued cost-effective genetic improvement. Methods: Data from an ongoing breeding program were used in this study. Phenotypic and genomic information was collected on three-way crossbred and purebred Duroc animals belonging to 28 half-sib families. We applied different methods to assess the value of using purebred and crossbred information (both genomic and phenotypic) to predict expensive-to-record traits measured on crossbred individuals. Estimation of multi-trait variance components set the basis for comparing the different scenarios, together with a fourfold cross-validation approach to validate the phenotyping schemes under four genotyping strategies. Results: The benefit of including genomic information for multi-trait prediction depended on the breeding goal trait, the indicator traits included, and the source of genomic information. While some traits benefitted significantly from genotyping crossbreds (e.g., loin intramuscular fat content, backfat depth, and belly weight), multi-trait prediction was advantageous for some traits even in the absence of genomic information (e.g., loin muscle weight, subjective color, and subjective firmness). Conclusions: Our results show the value of using different sources of phenotypic and genomic information. For most of the traits studied, including crossbred genomic information was more beneficial than performing multi-trait prediction. Thus, we recommend including crossbred individuals in the reference population when these are phenotyped for the breeding objective.
Affiliation(s)
- Daniela Lourenco
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA, 30602, USA
- Christian Maltecca
  - Department of Animal Science, North Carolina State University, Raleigh, NC, 27695, USA
- Justin Fix
  - Acuity Ag Solutions, LLC, Carlyle, IL, 62230, USA
- Clint Schwab
  - Acuity Ag Solutions, LLC, Carlyle, IL, 62230, USA; The Maschhoffs, LLC, Carlyle, IL, 62230, USA
- Caleb Shull
  - The Maschhoffs, LLC, Carlyle, IL, 62230, USA
- Francesco Tiezzi
  - Department of Animal Science, North Carolina State University, Raleigh, NC, 27695, USA; Department of Agriculture, Food, Environment and Forestry (DAGRI), University of Florence, 50144, Florence, Italy

8
Fathoni A, Boonkum W, Chankitisakul V, Duangjinda M. An Appropriate Genetic Approach for Improving Reproductive Traits in Crossbred Thai-Holstein Cattle under Heat Stress Conditions. Vet Sci 2022; 9:163. [PMID: 35448661 PMCID: PMC9031002 DOI: 10.3390/vetsci9040163] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/01/2022] [Revised: 03/19/2022] [Accepted: 03/26/2022] [Indexed: 01/16/2023]
Abstract
Thailand is a tropical country affected by global climate change and has high temperatures and humidity that cause heat stress in livestock. A temperature-humidity index (THI) is required to assess and evaluate heat stress levels in livestock. One of the livestock types in Thailand experiencing heat stress due to extreme climate change is crossbred dairy cattle. Genetic evaluations of heat tolerance in dairy cattle have been carried out for reproductive traits. Heritability values for reproductive traits are generally low (<0.10) because environmental factors heavily influence them. Consequently, genetic improvement for these traits would be slow compared to production traits. Positive and negative genetic correlations were found among reproductive traits and between reproductive and yield traits. Several selection methods for reproductive traits have been introduced, i.e., the traditional method, marker-assisted selection (MAS), and genomic selection (GS). GS is the most promising technique and provides accurate results with a high genetic gain. Single-step genomic BLUP (ssGBLUP) has higher accuracy than the multi-step equivalent for fertility traits or low-heritability traits.
Affiliation(s)
- Akhmad Fathoni
  - Department of Animal Science, Faculty of Agriculture, Khon Kaen University, Khon Kaen 40002, Thailand
  - Department of Animal Breeding and Reproduction, Faculty of Animal Science, Universitas Gadjah Mada, Yogyakarta 55281, Indonesia
- Wuttigrai Boonkum
  - Department of Animal Science, Faculty of Agriculture, Khon Kaen University, Khon Kaen 40002, Thailand
  - Network Center for Animal Breeding and OMICS Research, Khon Kaen University, Khon Kaen 40002, Thailand
- Vibuntita Chankitisakul
  - Department of Animal Science, Faculty of Agriculture, Khon Kaen University, Khon Kaen 40002, Thailand
  - Network Center for Animal Breeding and OMICS Research, Khon Kaen University, Khon Kaen 40002, Thailand
- Monchai Duangjinda
  - Department of Animal Science, Faculty of Agriculture, Khon Kaen University, Khon Kaen 40002, Thailand
  - Network Center for Animal Breeding and OMICS Research, Khon Kaen University, Khon Kaen 40002, Thailand

9
Cesarani A, Lourenco D, Tsuruta S, Legarra A, Nicolazzi E, VanRaden P, Misztal I. Multibreed genomic evaluation for production traits of dairy cattle in the United States using single-step genomic best linear unbiased predictor. J Dairy Sci 2022; 105:5141-5152. [DOI: 10.3168/jds.2021-21505] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/30/2021] [Accepted: 01/27/2022] [Indexed: 01/01/2023]

10
Torres LG, de Oliveira EJ, Ogbonna AC, Bauchet GJ, Mueller LA, Azevedo CF, Fonseca e Silva F, Simiqueli GF, de Resende MDV. Can Cross-Country Genomic Predictions Be a Reasonable Strategy to Support Germplasm Exchange? - A Case Study With Hydrogen Cyanide in Cassava. Front Plant Sci 2021; 12:742638. [PMID: 34956254 PMCID: PMC8692580 DOI: 10.3389/fpls.2021.742638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2021] [Accepted: 11/08/2021] [Indexed: 06/14/2023]
Abstract
Genomic prediction (GP) offers great opportunities for accelerated genetic gains by optimizing the breeding pipeline. One of the key factors to be considered is how the training populations (TP) are composed in terms of genetic improvement, kinship/origin, and their impacts on GP. Hydrogen cyanide content (HCN) is a determinant trait guiding the usage and processing of cassava products. This work aimed to achieve the following objectives: (i) evaluate the feasibility of using cross-country (CC) GP between the germplasm of Embrapa Mandioca e Fruticultura (Embrapa, Brazil) and The International Institute of Tropical Agriculture (IITA, Nigeria) for HCN; (ii) provide an assessment of population structure for the joint dataset; (iii) estimate the genetic parameters based on single nucleotide polymorphisms (SNPs) and a haplotype approach. Datasets of HCN from Embrapa and IITA breeding programs were analyzed, separately and jointly, with 1,230, 590, and 1,820 clones, respectively. After quality control, ∼14K SNPs were used for GP. The genomic estimated breeding values (GEBVs) were predicted based on SNP effects from analyses with TP composed of the following: (i) Embrapa genotypic and phenotypic data, (ii) IITA genotypic and phenotypic data, and (iii) the joint datasets. GEBV estimates were compared under the hypothetical situation in which a research institute or country lacks phenotypic characterization for a set of clones and must rely on marker effects trained on germplasm data from another institute or country to estimate the GEBVs of its clones. Fixation index (FST) among the genetic groups identified within the joint dataset ranged from 0.002 to 0.091. The joint dataset provided an improved accuracy (0.8-0.85) compared to the prediction accuracy of either germplasm source individually (0.51-0.67). CC GP showed potential under the present study's scenario: the correlation between GEBVs predicted with TP from Embrapa and IITA was 0.55 for Embrapa's germplasm, whereas for IITA's it was 0.1. This seems to be among the first attempts to evaluate CC GP in plants. As such, it provides useful new information on the subject, which can guide new research in this important and emerging field.
Affiliation(s)
- Lívia Gomes Torres
  - Department of Plant Science, Universidade Federal de Viçosa, Viçosa, Brazil
- Alex C. Ogbonna
  - Department of Plant Breeding and Genetics, Cornell University, Ithaca, NY, United States
  - Boyce Thompson Institute, Ithaca, NY, United States
- Lukas A. Mueller
  - Department of Plant Breeding and Genetics, Cornell University, Ithaca, NY, United States
  - Boyce Thompson Institute, Ithaca, NY, United States
- Marcos Deon Vilela de Resende
  - Department of Forestry Engineering, Universidade Federal de Viçosa, Viçosa, Brazil
  - Embrapa Café, Universidade Federal de Viçosa, Viçosa, Brazil

11
Duenk P, Bijma P, Wientjes YCJ, Calus MPL. Review: optimizing genomic selection for crossbred performance by model improvement and data collection. J Anim Sci 2021; 99:skab205. [PMID: 34223907 PMCID: PMC8499581 DOI: 10.1093/jas/skab205] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 03/12/2021] [Accepted: 07/02/2021] [Indexed: 11/26/2022]
Abstract
Breeding programs aiming to improve the performance of crossbreds may benefit from genomic prediction of crossbred (CB) performance for purebred (PB) selection candidates. In this review, we compared genomic prediction strategies that differed in 1) the genomic prediction model used or 2) the data used in the reference population. We found 27 unique studies, two of which used deterministic simulation, 11 used stochastic simulation, and 14 real data. Differences in accuracy and response to selection between strategies depended on i) the value of the purebred crossbred genetic correlation (rpc), ii) the genetic distance between the parental lines, iii) the size of PB and CB reference populations, and iv) the relatedness of these reference populations to the selection candidates. In studies where a PB reference population was used, the use of a dominance model yielded accuracies that were equal to or higher than those of additive models. When rpc was lower than ~0.8, and was caused mainly by G × E, it was beneficial to create a reference population of PB animals that are tested in a CB environment. In general, the benefit of collecting CB information increased with decreasing rpc. For a given rpc, the benefit of collecting CB information increased with increasing size of the reference populations. Collecting CB information was not beneficial when rpc was higher than ~0.9, especially when the reference populations were small. Collecting only phenotypes of CB animals may slightly improve accuracy and response to selection, but requires that the pedigree is known. It is, therefore, advisable to genotype these CB animals as well. Finally, considering the breed-origin of alleles allows for modeling breed-specific effects in the CB, but this did not always lead to higher accuracies. Our review shows that the differences in accuracy and response to selection between strategies depend on several factors. 
One of the most important factors is rpc, and we therefore recommend obtaining accurate estimates of rpc for all breeding goal traits. Furthermore, knowledge about the importance of the components of rpc (i.e., dominance, epistasis, and G × E) can help breeders decide which model to use and whether to collect data on animals in a CB environment. Future research should focus on developing a tool that predicts accuracy and response to selection from scenario-specific parameters.
Affiliation(s)
- Pascal Duenk: Animal Breeding and Genomics, Wageningen University and Research, P.O. Box 338, 6700 AH Wageningen, The Netherlands
- Piter Bijma: Animal Breeding and Genomics, Wageningen University and Research, P.O. Box 338, 6700 AH Wageningen, The Netherlands
- Yvonne C J Wientjes: Animal Breeding and Genomics, Wageningen University and Research, P.O. Box 338, 6700 AH Wageningen, The Netherlands
- Mario P L Calus: Animal Breeding and Genomics, Wageningen University and Research, P.O. Box 338, 6700 AH Wageningen, The Netherlands
|
12
|
Steyn Y, Gonzalez-Pena D, Bernal Rubio YL, Vukasinovic N, DeNise SK, Lourenco DAL, Misztal I. Indirect genomic predictions for milk yield in crossbred Holstein-Jersey dairy cattle. J Dairy Sci 2021; 104:5728-5737. [PMID: 33685678 DOI: 10.3168/jds.2020-19451] [Received: 08/11/2020] [Accepted: 01/05/2021] [Indexed: 11/19/2022]
Abstract
The objective of this study was to predict genomic breeding values for milk yield of crossbred dairy cattle under different scenarios using single-step genomic BLUP (ssGBLUP). The data set included 13,880,217 milk yield measurements on 6,830,415 cows. Genotypes of 89,558 Holstein, 40,769 Jersey, and 22,373 Holstein-Jersey crossbred animals were used, of which all Holstein, 9,313 Jersey, and 1,667 crossbred animals had phenotypic records. Genotypes were imputed to 45K SNP markers. The SNP effects were estimated from single-breed evaluations for Jersey (JE), Holstein (HO) and crossbreds (CROSS), and multibreed evaluations including all Jersey and Holstein (JE_HO) or approximately equal proportions of Jersey, Holstein, and crossbred animals (MIX). Indirect predictions (IP) of the validation animals (358 crossbred animals with phenotypes excluded from evaluations) were calculated using the resulting SNP effects. Additionally, breed proportions (BP) of crossbred animals were applied as a weight when IP were estimated based on each pure breed. The predictive ability of IP was calculated as the Pearson correlation between IP and phenotypes of the validation animals adjusted for fixed effects in the model. Regression of adjusted phenotypes on IP was used to assess the inflation of IP. The predictive ability of IP for CROSS, JE, HO, JE_HO, and MIX scenario was 0.50, 0.50, 0.47, 0.50, and 0.46, respectively. Using BP was the least successful, with a predictive ability of 0.32. The inflation of the IP for crossbred animals using CROSS, JE, HO, JE_HO, MIX, and BP scenarios were 1.17, 0.65, 0.55, 0.78, 1.00, and 0.85, respectively. The IP of crossbred animals can be predicted using single-step GBLUP under a scenario that includes purebred genotypes.
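The validation statistics named in this abstract can be sketched in a few lines. This is a minimal illustration, assuming NumPy, centred genotype codes, and hypothetical function and variable names; it is not the authors' pipeline, and the adjustment of phenotypes for fixed effects is taken as already done.

```python
import numpy as np

def indirect_predictions(genotypes, snp_effects):
    """IP as genotype matrix (animals x SNPs) times estimated SNP effects."""
    return np.asarray(genotypes) @ np.asarray(snp_effects)

def predictive_ability(ip, adjusted_phenotypes):
    """Pearson correlation between IP and adjusted phenotypes."""
    return float(np.corrcoef(ip, adjusted_phenotypes)[0, 1])

def inflation_slope(ip, adjusted_phenotypes):
    """Slope of the regression of adjusted phenotypes on IP (1.0 means no inflation)."""
    slope, _intercept = np.polyfit(ip, adjusted_phenotypes, 1)
    return float(slope)

def breed_weighted_ip(ip_by_breed, breed_proportions):
    """Weight per-breed IP (animals x breeds) by breed proportions (rows sum to 1)."""
    return np.sum(np.asarray(ip_by_breed) * np.asarray(breed_proportions), axis=1)

if __name__ == "__main__":
    # Simulated data only; none of these numbers come from the study.
    rng = np.random.default_rng(0)
    Z = rng.integers(0, 3, size=(358, 1000)).astype(float)
    Z -= Z.mean(axis=0)
    effects = rng.normal(0.0, 0.05, size=1000)
    ip = indirect_predictions(Z, effects)
    pheno = ip + rng.normal(0.0, ip.std(), size=ip.size)
    print(predictive_ability(ip, pheno), inflation_slope(ip, pheno))
```

A slope below 1 indicates that the predictions are inflated (too spread out), and a slope above 1 that they are deflated.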
Affiliation(s)
- Y Steyn: Department of Animal and Dairy Science, University of Georgia, 425 River Road, Athens 30602
- S K DeNise: Zoetis, 333 Portage Street, Kalamazoo, MI 49007
- D A L Lourenco: Department of Animal and Dairy Science, University of Georgia, 425 River Road, Athens 30602
- I Misztal: Department of Animal and Dairy Science, University of Georgia, 425 River Road, Athens 30602
|
13
|
See GM, Mote BE, Spangler ML. Selective genotyping and phenotypic data inclusion strategies of crossbred progeny for combined crossbred and purebred selection in swine breeding. J Anim Sci 2021; 99:6131744. [PMID: 33560334 DOI: 10.1093/jas/skab041] [Received: 10/18/2020] [Accepted: 02/04/2021] [Indexed: 11/14/2022]
Abstract
Inclusion of crossbred (CB) data into traditionally purebred (PB) genetic evaluations has been shown to increase the response in CB performance. Currently, it is unrealistic to collect data on all CB animals in swine production systems; thus, a subset of CB animals must be selected to contribute genomic/phenotypic information. The aim of this study was to evaluate selective genotyping strategies in a simulated 3-way swine crossbreeding scheme. The swine crossbreeding scheme was simulated and produced 3-way CB animals for 6 generations with 3 distinct PB breeds, each with 25 and 175 mating males and females, respectively. F1 crosses (400 mating females) produced 4,000 terminal CB progeny, which were subjected to selective genotyping. The genome consisted of 18 chromosomes with 1,800 QTL and 72k SNP markers. Selection was performed using estimated breeding values (EBV) for CB performance. It was assumed that both PB and CB performance were moderately heritable (h² = 0.4). Several scenarios altering the genetic correlation between PB and CB performance (rpc = 0.1, 0.3, 0.5, 0.7, or 0.9) were considered. CB animals were chosen based on phenotypes to select 200, 400, or 800 CB animals to genotype per generation. Selection strategies included: (1) Random: random selection, (2) Top: highest phenotype, (3) Bottom: lowest phenotype, (4) Extreme: half highest and half lowest phenotypes, and (5) Middle: average phenotype. Each selective genotyping strategy, except for Random, was considered by selecting animals in half-sib (HS) or full-sib (FS) families. The number of PB animals with genotypes and phenotypes each generation was fixed at 1,680. Each unique genotyping strategy and rpc scenario was replicated 10 times. Selection of CB animals based on the Extreme strategy resulted in the highest (P < 0.05) rates of genetic gain in CB performance (ΔG) when rpc < 0.9. For highly correlated traits (rpc = 0.9), selective genotyping did not impact (P > 0.05) ΔG.
No differences (P > 0.05) were observed in ΔG between Top, Bottom, or Middle when rpc > 0.1. Higher correlations between true breeding values (TBV) and EBV were observed using Extreme when rpc < 0.9. In general, family sampling method did not impact ΔG or the correlation between TBV and EBV. Overall, the Extreme genotyping strategy produced the greatest genetic gain and the highest correlations between TBV and EBV, suggesting that 2-tailed sampling of CB animals is the most informative when CB performance is the selection goal.
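The five selective genotyping strategies listed in this abstract can be illustrated with a short sketch. It assumes NumPy and a hypothetical function name, and it interprets "Middle" as the animals closest to the phenotypic mean, which is an assumption rather than a detail given in the abstract. The within-family (half-sib or full-sib) sampling used in the study is not reproduced here.

```python
import numpy as np

def select_for_genotyping(phenotypes, n_select, strategy, rng=None):
    """Return sorted indices of the animals chosen for genotyping.

    Illustrative only: selection is across all animals, not within families.
    """
    phenotypes = np.asarray(phenotypes, dtype=float)
    if not 1 <= n_select <= phenotypes.size:
        raise ValueError("n_select must be between 1 and the number of animals")
    order = np.argsort(phenotypes)  # ascending phenotype
    if strategy == "random":
        rng = np.random.default_rng() if rng is None else rng
        chosen = rng.choice(phenotypes.size, size=n_select, replace=False)
    elif strategy == "top":
        chosen = order[-n_select:]
    elif strategy == "bottom":
        chosen = order[:n_select]
    elif strategy == "extreme":
        n_low = n_select // 2
        n_high = n_select - n_low
        chosen = np.concatenate([order[:n_low], order[-n_high:]])
    elif strategy == "middle":
        # Assumption: "average phenotype" means closest to the mean.
        distance = np.abs(phenotypes - phenotypes.mean())
        chosen = np.argsort(distance)[:n_select]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return np.sort(chosen)
```

For example, `select_for_genotyping(y, 400, "extreme")` would return the 200 lowest and 200 highest phenotypes from a vector `y` of CB records.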
Affiliation(s)
- Garrett M See: Department of Animal Science, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
- Benny E Mote: Department of Animal Science, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
- Matthew L Spangler: Department of Animal Science, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
|
14
|
Misztal I, Tsuruta S, Pocrnic I, Lourenco D. Core-dependent changes in genomic predictions using the Algorithm for Proven and Young in single-step genomic best linear unbiased prediction. J Anim Sci 2020; 98:skaa374. [PMID: 33211798 PMCID: PMC7739885 DOI: 10.1093/jas/skaa374] [Received: 08/04/2020] [Accepted: 11/11/2020] [Indexed: 11/28/2022]
Abstract
Single-step genomic best linear unbiased prediction with the Algorithm for Proven and Young (APY) is a popular method for large-scale genomic evaluations. With the APY algorithm, animals are designated as core or noncore, and the computing resources to create the inverse of the genomic relationship matrix (GRM) are reduced by inverting only a portion of that matrix for core animals. However, using different core sets of the same size causes fluctuations in genomic estimated breeding values (GEBVs) of up to one additive standard deviation without affecting prediction accuracy. About 2% of the variation in the GRM is noise. In the recursion formula for APY, the error term modeling the noise is different for every set of core animals, creating changes in breeding values. While average changes are small, and correlations between breeding values estimated with different core animals are close to 1.0, based on normal distribution theory, outliers can be several times bigger than the average. Tests included commercial datasets from beef and dairy cattle and from pigs. Beyond a certain number of core animals, the prediction accuracy did not improve, but fluctuations decreased with more animals. Fluctuations were much smaller than the possible changes based on prediction error variance. GEBVs change over time even for animals with no new data because genomic relationships tie all genotyped animals together, causing reranking of top animals. In contrast, changes in nongenomic models without new data are small. Also, GEBVs can change due to details in the model, such as redefinition of contemporary groups or unknown parent groups. In particular, increasing the fraction of blending of the GRM with a pedigree relationship matrix from 5% to 20% caused changes in GEBV of up to 0.45 SD, with a correlation of GEBV > 0.99.
Fluctuations in genomic predictions are part of genomic evaluation models and are also present without the APY algorithm when genomic evaluations are computed with updated data. The best approach to reduce the impact of fluctuations in genomic evaluations is to make selection decisions not on individual animals with limited individual accuracy but on groups of animals with high average accuracy.
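The core/noncore construction described in this abstract can be made concrete with a small dense sketch of the APY inverse in its commonly published block form, where only the core block is inverted directly and each noncore animal contributes one diagonal term. The function name and the use of NumPy are assumptions for illustration; production software works with sparse storage and far larger matrices.

```python
import numpy as np

def apy_inverse(G, core):
    """Dense illustration of the APY inverse of a genomic relationship matrix.

    G    : (n, n) symmetric positive definite matrix
    core : indices of core animals; all others are treated as noncore
    """
    n = G.shape[0]
    core = np.asarray(core)
    noncore = np.setdiff1d(np.arange(n), core)

    Gcc_inv = np.linalg.inv(G[np.ix_(core, core)])  # only the core block is inverted
    Gcn = G[np.ix_(core, noncore)]
    P = Gcc_inv @ Gcn                               # recursion coefficients on core animals

    # Diagonal error term per noncore animal: g_ii - g_ic Gcc^-1 g_ci
    m = G[noncore, noncore] - np.einsum("ij,ij->j", Gcn, P)
    m_inv = 1.0 / m

    G_inv = np.zeros((n, n))
    G_inv[np.ix_(core, core)] = Gcc_inv + (P * m_inv) @ P.T
    G_inv[np.ix_(core, noncore)] = -P * m_inv
    G_inv[np.ix_(noncore, core)] = (-P * m_inv).T
    G_inv[noncore, noncore] = m_inv                 # diagonal noncore block
    return G_inv
```

Because the diagonal term `m` depends on which animals are in `core`, calling `apy_inverse` with two different core sets of the same size gives slightly different inverses, which is the mechanism behind the GEBV fluctuations the abstract describes.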
Affiliation(s)
- Ignacy Misztal: Department of Animal and Dairy Science, University of Georgia, Athens, GA
- Shogo Tsuruta: Department of Animal and Dairy Science, University of Georgia, Athens, GA
- Ivan Pocrnic: The Roslin Institute, The University of Edinburgh, Edinburgh, UK
- Daniela Lourenco: Department of Animal and Dairy Science, University of Georgia, Athens, GA
|
15
|
Gualdrón Duarte JL, Gori AS, Hubin X, Lourenco D, Charlier C, Misztal I, Druet T. Performances of Adaptive MultiBLUP, Bayesian regressions, and weighted-GBLUP approaches for genomic predictions in Belgian Blue beef cattle. BMC Genomics 2020; 21:545. [PMID: 32762654 PMCID: PMC7430838 DOI: 10.1186/s12864-020-06921-3] [Received: 04/29/2020] [Accepted: 07/17/2020] [Indexed: 11/10/2022]
Abstract
BACKGROUND Genomic selection has been successfully implemented in many livestock and crop species. The genomic best linear unbiased predictor (GBLUP) approach, assigning equal variance to all SNP effects, is one of the reference methods. When large-effect variants contribute to complex traits, it has been shown that genomic prediction methods that assign a higher variance to subsets of SNP effects can achieve higher prediction accuracy. We herein compared the efficiency of several such approaches, including the Adaptive MultiBLUP (AM-BLUP) that uses local genomic relationship matrices (GRM) to automatically identify and weight genomic regions with large effects, to predict genetic merit in Belgian Blue beef cattle. RESULTS We used a population of approximately 10,000 genotyped cows and their phenotypes for 14 traits, mostly related to muscular development and body dimensions. Depending on the trait, we found that 4 to 25% of the genetic variance could be associated with 2 to 12 genomic regions harbouring large-effect variants. Notably, three previously identified recessive deleterious variants presented heterozygote advantage and were among the most significant SNPs for several traits. The AM-BLUP resulted in increased reliability of genomic predictions compared to GBLUP (+2%), but Bayesian methods proved more efficient (+3%). Overall, the reliability gains thus remained limited, although higher gains were observed for skin thickness, a trait affected by two genomic regions having particularly large effects. Higher accuracies than those from the original AM-BLUP were achieved when applying the Bayesian Sparse Linear Mixed Model to pre-select groups of SNPs with large effects and subsequently using their estimated variance to build a weighted GRM. Finally, the single-step GBLUP performed best and could be further improved (+3% prediction accuracy) by using these weighted GRM.
CONCLUSIONS The AM-BLUP is an attractive method to automatically identify and weight genomic regions with large effects on complex traits. However, the method was less accurate than Bayesian methods. Overall, weighted methods achieved modest accuracy gains compared to GBLUP. Nevertheless, the computational efficiency of the AM-BLUP might be valuable at higher marker density, including with whole-genome sequencing data. Furthermore, weighted GRM are particularly useful to account for large variance loci in the single-step GBLUP.
Affiliation(s)
- José Luis Gualdrón Duarte: Unit of Animal Genomics, GIGA-R, 11 Avenue de l'Hôpital (B34), University of Liège, 4000, Liège, Belgium
- Ann-Stephan Gori: Innovation Department, Elevéo asbl and Inovéo, Awé Group, 5590, Ciney, Belgium
- Xavier Hubin: Innovation Department, Elevéo asbl and Inovéo, Awé Group, 5590, Ciney, Belgium
- Daniela Lourenco: Department of Animal and Dairy Science, University of Georgia, 425 River Rd, Athens, GA, 30602, USA
- Carole Charlier: Unit of Animal Genomics, GIGA-R, 11 Avenue de l'Hôpital (B34), University of Liège, 4000, Liège, Belgium
- Ignacy Misztal: Department of Animal and Dairy Science, University of Georgia, 425 River Rd, Athens, GA, 30602, USA
- Tom Druet: Unit of Animal Genomics, GIGA-R, 11 Avenue de l'Hôpital (B34), University of Liège, 4000, Liège, Belgium
|
16
|
Lourenco D, Legarra A, Tsuruta S, Masuda Y, Aguilar I, Misztal I. Single-Step Genomic Evaluations from Theory to Practice: Using SNP Chips and Sequence Data in BLUPF90. Genes (Basel) 2020; 11:E790. [PMID: 32674271 PMCID: PMC7397237 DOI: 10.3390/genes11070790] [Received: 06/19/2020] [Revised: 07/03/2020] [Accepted: 07/06/2020] [Indexed: 11/16/2022]
Abstract
Single-step genomic evaluation has become a standard procedure in livestock breeding, mainly because it can combine all available pedigree, phenotype, and genotype data into a single evaluation without the need for post-analysis processing. Therefore, the incorporation of data on genotyped and non-genotyped animals in this method is straightforward. Since 2009, two main implementations of single-step have been proposed. One is called single-step genomic best linear unbiased prediction (ssGBLUP) and uses single nucleotide polymorphisms (SNP) to construct the genomic relationship matrix; the other is the single-step Bayesian regression (ssBR), which is a marker effect model. Under the same assumptions, both models are equivalent. In this review, we focus solely on ssGBLUP. ssGBLUP was implemented in the BLUPF90 software suite in 2009, and since then, several changes have been made to make ssGBLUP flexible to any model, number of traits, number of phenotypes, and number of genotyped animals. Single-step GBLUP from the BLUPF90 software suite has been used for genomic evaluations worldwide. In this review, we show theoretical developments and numerical examples of ssGBLUP using SNP data from regular chips to sequence data.
Affiliation(s)
- Daniela Lourenco: Department of Animal and Dairy Science, University of Georgia, Athens, GA 30602, USA
- Andres Legarra: Institut National de la Recherche Agronomique, UMR1388 GenPhySE, 31326 Castanet Tolosan, France
- Shogo Tsuruta: Department of Animal and Dairy Science, University of Georgia, Athens, GA 30602, USA
- Yutaka Masuda: Department of Animal and Dairy Science, University of Georgia, Athens, GA 30602, USA
- Ignacio Aguilar: Instituto Nacional de Investigación Agropecuaria (INIA), 11500 Montevideo, Uruguay
- Ignacy Misztal: Department of Animal and Dairy Science, University of Georgia, Athens, GA 30602, USA
|
17
|
Misztal I, Lourenco D, Legarra A. Current status of genomic evaluation. J Anim Sci 2020; 98:skaa101. [PMID: 32267923 PMCID: PMC7183352 DOI: 10.1093/jas/skaa101] [Received: 01/29/2020] [Accepted: 04/07/2020] [Indexed: 12/14/2022]
Abstract
Early application of genomic selection relied on SNP estimation with phenotypes or de-regressed proofs (DRP). Chips of 50k SNP seemed sufficient for an accurate estimation of SNP effects. Genomic estimated breeding values (GEBV) were composed of an index with parent average, direct genomic value, and deduction of a parental index to eliminate double counting. Use of SNP selection or weighting increased accuracy with small data sets but had minimal to no impact with large data sets. Efforts to include potentially causative SNP derived from sequence data or high-density chips showed limited or no gain in accuracy. After the implementation of genomic selection, EBV by BLUP became biased because of genomic preselection and DRP computed based on EBV required adjustments, and the creation of DRP for females is hard and subject to double counting. Genomic selection was greatly simplified by single-step genomic BLUP (ssGBLUP). This method based on combining genomic and pedigree relationships automatically creates an index with all sources of information, can use any combination of male and female genotypes, and accounts for preselection. To avoid biases, especially under strong selection, ssGBLUP requires that pedigree and genomic relationships are compatible. Because the inversion of the genomic relationship matrix (G) becomes costly with more than 100k genotyped animals, large data computations in ssGBLUP were solved by exploiting limited dimensionality of genomic data due to limited effective population size. With such dimensionality ranging from 4k in chickens to about 15k in cattle, the inverse of G can be created directly (e.g., by the algorithm for proven and young) at a linear cost. Due to its simplicity and accuracy, ssGBLUP is routinely used for genomic selection by the major chicken, pig, and beef industries. Single step can be used to derive SNP effects for indirect prediction and for genome-wide association studies, including computations of the P-values. 
Alternative single-step formulations exist that use SNP effects for genotyped or for all animals. Although genomics is the new standard in breeding and genetics, there are still some problems that need to be solved. This involves new validation procedures that are unaffected by selection, parameter estimation that accounts for all the genomic data used in selection, and strategies to address reduction in genetic variances after genomic selection was implemented.
Affiliation(s)
- Ignacy Misztal: Department of Animal and Dairy Science, University of Georgia, Athens, GA
- Daniela Lourenco: Department of Animal and Dairy Science, University of Georgia, Athens, GA
- Andres Legarra: Department of Animal Genetics, Institut National de la Recherche Agronomique, Castanet-Tolosan, France
|
18
|
Steyn Y, Lourenco DAL, Misztal I. Genomic predictions in purebreds with a multibreed genomic relationship matrix. J Anim Sci 2020; 97:4418-4427. [PMID: 31539424 DOI: 10.1093/jas/skz296] [Received: 05/17/2019] [Accepted: 09/10/2019] [Indexed: 11/14/2022]
Abstract
Combining breeds in a multibreed evaluation can have a negative impact on prediction accuracy, especially if single nucleotide polymorphism (SNP) effects differ among breeds. The aim of this study was to evaluate the use of a multibreed genomic relationship matrix (G), where SNP effects are considered to be unique to each breed, that is, nonshared. This multibreed G was created by treating SNP of different breeds as if they were on nonoverlapping positions on the chromosome, although, in reality, they were not. This simple setup may avoid spurious Identity by state (IBS) relationships between breeds and automatically considers breed-specific allele frequencies. This scenario was contrasted to a regular multibreed evaluation where all SNPs were shared, that is, the same position, and to single-breed evaluations. Different SNP densities (9k and 45k) and different effective population sizes (Ne) were tested. Five breeds mimicking recent beef cattle populations that diverged from the same historical population were simulated using different selection criteria. It was assumed that quantitative trait locus (QTL) effects were the same over all breeds. For the recent population, generations 1-9 had approximately half of the animals genotyped, whereas all animals in generation 10 were genotyped. Generation 10 animals were set for validation; therefore, each breed had a validation group. Analyses were performed using single-step genomic best linear unbiased prediction. Prediction accuracy was calculated as the correlation between true (T) and genomic estimated breeding values (GEBV). Accuracies of GEBV were lower for the larger Ne and low SNP density. All three evaluation scenarios using 45k resulted in similar accuracies, suggesting that the marker density is high enough to account for relationships and linkage disequilibrium with QTL. A shared multibreed evaluation using 9k resulted in a decrease of accuracy of 0.08 for a smaller Ne and 0.12 for a larger Ne. 
This loss was mostly avoided when markers were treated as nonshared within the same G matrix. A G matrix with nonshared SNP enables multibreed evaluations without considerably changing accuracy, especially with limited information per breed.
Affiliation(s)
- Yvette Steyn: Department of Animal and Dairy Science, University of Georgia, Athens, GA
- Ignacy Misztal: Department of Animal and Dairy Science, University of Georgia, Athens, GA
|
19
|
VanRaden PM, Tooker ME, Chud TCS, Norman HD, Megonigal JH, Haagen IW, Wiggans GR. Genomic predictions for crossbred dairy cattle. J Dairy Sci 2019; 103:1620-1631. [PMID: 31837783 DOI: 10.3168/jds.2019-16634] [Received: 03/15/2019] [Accepted: 10/14/2019] [Indexed: 01/14/2023]
Abstract
Genomic evaluations are useful for crossbred as well as purebred populations when selection is applied to commercial herds. Dairy farmers had already spent more than $1 million to genotype over 32,000 crossbred animals before US genomic evaluations became available for those animals. Thus, new tools were needed to provide accurate genomic predictions for crossbreds. Genotypes for crossbreds are imputed more accurately when the imputation reference population includes purebreds. Therefore, genotypes of 6,296 crossbred animals were imputed from lower-density chips by including either 3,119 ancestors or 834,367 genotyped animals in the reference population. Crossbreds in the imputation study included 733 Jersey × Holstein F1 animals, 55 Brown Swiss × Holstein F1 animals, 2,300 Holstein backcrosses, 2,026 Jersey backcrosses, 27 Brown Swiss backcrosses, and 502 other crossbreds of various breed combinations. Another 653 animals appeared to be purebreds that owners had miscoded as a different breed. Genomic breed composition was estimated from 60,671 markers using the known breed identities for purebred, progeny-tested Holstein, Jersey, Brown Swiss, Ayrshire, and Guernsey bulls as the 5 traits (breed fractions) to be predicted. Estimates of breed composition were adjusted so that no percentages were negative or exceeded 100%, and breed percentages summed to 100%. Another adjustment set percentages above 93.5% equal to 100%, and the resulting value was termed breed base representation (BBR). Larger percentages of missing alleles were imputed by using a crossbred reference population rather than only the closest purebred reference population. Crossbred predictions were averages of genomic predictions computed using marker effects for each pure breed, which were weighted by the animal's BBR. Marker and polygenic effects were estimated separately for each breed on the all-breed scale instead of within-breed scales. 
For crossbreds, genomic predictions weighted by BBR were more accurate than the average of parents' breeding values and slightly more accurate than predictions using only the predominant breed. For purebreds, single-trait predictions using only within-breed data were as accurate as multi-trait predictions with allele effects in different breeds treated as correlated effects. Crossbred genomic predicted transmitting abilities were implemented by the Council on Dairy Cattle Breeding in April 2019 and will aid producers in managing their breeding programs and selecting replacement heifers.
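The breed base representation (BBR) adjustments and the BBR-weighted crossbred prediction described in this abstract can be sketched as follows. The function names, the order in which the adjustments are applied, and the use of NumPy are assumptions for illustration; the abstract gives only the constraints (no negative values, none above 100%, sum of 100%, and values above 93.5% set to 100%), not the exact procedure.

```python
import numpy as np

def breed_base_representation(raw_pct, purebred_cutoff=93.5):
    """Adjust raw breed-composition estimates (in percent) to BBR-like values.

    Assumed order of steps: clip to [0, 100], rescale to sum to 100,
    then treat an animal above the cutoff as 100% of that breed.
    """
    pct = np.clip(np.asarray(raw_pct, dtype=float), 0.0, 100.0)
    total = pct.sum()
    if total <= 0.0:
        raise ValueError("at least one breed percentage must be positive")
    pct = 100.0 * pct / total
    top = int(np.argmax(pct))
    if pct[top] > purebred_cutoff:
        adjusted = np.zeros_like(pct)
        adjusted[top] = 100.0
        return adjusted
    return pct

def crossbred_prediction(pred_by_breed, bbr_pct):
    """Average of per-breed genomic predictions weighted by the animal's BBR."""
    return float(np.dot(pred_by_breed, bbr_pct) / 100.0)
```

For instance, an animal with hypothetical raw estimates of 60% and 45% for two breeds and a slightly negative value for a third would be rescaled to roughly 57% and 43%, and its prediction would be the correspondingly weighted average of the two purebred predictions.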
Affiliation(s)
- P M VanRaden: USDA, Agricultural Research Service, Animal Genomics and Improvement Laboratory, Beltsville, MD 20705-2350
- M E Tooker: USDA, Agricultural Research Service, Animal Genomics and Improvement Laboratory, Beltsville, MD 20705-2350
- T C S Chud: Departamento de Ciências Exatas, Universidade Estadual Paulista (Unesp), Faculdade de Ciências Agrárias e Veterinárias, Jaboticabal, São Paulo CEP 14884-900, Brazil
- H D Norman: Council on Dairy Cattle Breeding, Bowie, MD 20716
- I W Haagen: Council on Dairy Cattle Breeding, Bowie, MD 20716
- G R Wiggans: Council on Dairy Cattle Breeding, Bowie, MD 20716
|