1
Wang J, Chai J, Chen L, Zhang T, Long X, Diao S, Chen D, Guo Z, Tang G, Wu P. Enhancing Genomic Prediction Accuracy of Reproduction Traits in Rongchang Pigs Through Machine Learning. Animals (Basel) 2025; 15:525. [PMID: 40003007 PMCID: PMC11852217 DOI: 10.3390/ani15040525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2025] [Revised: 02/02/2025] [Accepted: 02/10/2025] [Indexed: 02/27/2025] Open
Abstract
The increasing volume of genome sequencing data presents challenges for traditional genome-wide prediction methods in handling large datasets. Machine learning (ML) techniques, which can process high-dimensional data, offer promising solutions. This study aimed to find a genome-wide prediction method for local pig breeds, using 10 datasets with varying SNP densities derived from imputed sequencing data of 515 Rongchang pigs and the Pig QTL database. Three reproduction traits (litter weight, total number of piglets born, and number of piglets born alive) were predicted using six traditional methods and five ML methods: kernel ridge regression, random forest, Gradient Boosting Decision Tree (GBDT), Light Gradient Boosting Machine, and AdaBoost. The methods' efficacy was evaluated using fivefold cross-validation and independent tests. The predictive performance of both traditional and ML methods initially increased with SNP density, peaking at 800-900 k SNPs. ML methods outperformed traditional ones, showing improvements of 0.4-4.1%. The integration of GWAS and the Pig QTL database enhanced ML robustness. ML models exhibited superior generalizability, with high correlation coefficients (0.935-0.998) between cross-validation and independent test results. GBDT and random forest showed high computational efficiency, making them promising methods for genomic prediction in livestock breeding.
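The fivefold cross-validation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the single-marker least-squares predictor stands in for the genomic models, and all function names (`k_fold_indices`, `pearson`, `fit_line`, `cross_validated_accuracy`) are hypothetical. Accuracy is taken here as the Pearson correlation between predicted and held-out phenotypes, which is an assumption about the metric.

```python
import random

def k_fold_indices(n, k=5, seed=0):
    """Shuffle 0..n-1 and split the indices into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def fit_line(x, y):
    """Least-squares slope and intercept for a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx

def cross_validated_accuracy(x, y, k=5, seed=0):
    """Mean correlation between predictions and held-out phenotypes over k folds."""
    scores = []
    for test in k_fold_indices(len(x), k, seed):
        held_out = set(test)
        train = [i for i in range(len(x)) if i not in held_out]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        pred = [slope * x[i] + intercept for i in test]
        scores.append(pearson(pred, [y[i] for i in test]))
    return sum(scores) / k

# Synthetic toy data (not real pig data): one genotype code per animal plus noise.
rng = random.Random(1)
marker = [rng.choice([0, 1, 2]) for _ in range(515)]
phenotype = [0.5 * m + rng.gauss(0, 1) for m in marker]
accuracy = cross_validated_accuracy(marker, phenotype)
```

Each animal is held out exactly once, so the averaged correlation reflects prediction of unseen phenotypes rather than goodness of fit on the training animals.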
Affiliation(s)
- Junge Wang
- Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu 611130, China; (J.W.); (D.C.)
- Jie Chai
- Chongqing Academy of Animal Sciences, Chongqing 402460, China; (J.C.); (L.C.); (T.Z.); (X.L.); (S.D.); (Z.G.)
- National Center of Technology Innovation for Pigs, Chongqing 402460, China
- Li Chen
- Chongqing Academy of Animal Sciences, Chongqing 402460, China; (J.C.); (L.C.); (T.Z.); (X.L.); (S.D.); (Z.G.)
- National Center of Technology Innovation for Pigs, Chongqing 402460, China
- Tinghuan Zhang
- Chongqing Academy of Animal Sciences, Chongqing 402460, China; (J.C.); (L.C.); (T.Z.); (X.L.); (S.D.); (Z.G.)
- National Center of Technology Innovation for Pigs, Chongqing 402460, China
- Xi Long
- Chongqing Academy of Animal Sciences, Chongqing 402460, China; (J.C.); (L.C.); (T.Z.); (X.L.); (S.D.); (Z.G.)
- National Center of Technology Innovation for Pigs, Chongqing 402460, China
- Shuqi Diao
- Chongqing Academy of Animal Sciences, Chongqing 402460, China; (J.C.); (L.C.); (T.Z.); (X.L.); (S.D.); (Z.G.)
- National Center of Technology Innovation for Pigs, Chongqing 402460, China
- Dong Chen
- Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu 611130, China; (J.W.); (D.C.)
- Zongyi Guo
- Chongqing Academy of Animal Sciences, Chongqing 402460, China; (J.C.); (L.C.); (T.Z.); (X.L.); (S.D.); (Z.G.)
- National Center of Technology Innovation for Pigs, Chongqing 402460, China
- Guoqing Tang
- Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu 611130, China; (J.W.); (D.C.)
- Pingxian Wu
- Chongqing Academy of Animal Sciences, Chongqing 402460, China; (J.C.); (L.C.); (T.Z.); (X.L.); (S.D.); (Z.G.)
- National Center of Technology Innovation for Pigs, Chongqing 402460, China
2
Ziadi C, Demyda-Peyrás S, Valera M, Perdomo-González D, Laseca N, Rodríguez-Sainz de los Terreros A, Encina A, Azor P, Molina A. Comparative Analysis of Genomic and Pedigree-Based Approaches for Genetic Evaluation of Morphological Traits in Pura Raza Española Horses. Genes (Basel) 2025; 16:131. [PMID: 40004460 PMCID: PMC11855142 DOI: 10.3390/genes16020131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2024] [Revised: 01/17/2025] [Accepted: 01/21/2025] [Indexed: 02/27/2025] Open
Abstract
BACKGROUND: The single-step genomic best linear unbiased predictor (ssGBLUP) has emerged as a reference method for genomic selection in recent years due to its advantages over traditional approaches. Although its application in horses remains limited, ssGBLUP has demonstrated the potential to improve the reliability of estimated breeding values in livestock species. This study aimed to assess the impact of incorporating genomic data using single-step restricted maximum likelihood (ssGREML) on reliability (R²) in the Pura Raza Española (PRE) horse breed, compared to traditional pedigree-based REML. METHODS: The analysis involved 14 morphological traits from 7152 animals, including 2916 genotyped individuals. Genetic parameters were estimated using a multivariate model. RESULTS: Heritability estimates were similar between the two approaches, ranging from 0.08 to 0.76. However, a significant increase in reliability (R²) was observed for ssGREML compared to REML across all morphological traits, with overall gains ranging from 1.56% to 13.30% depending on the trait evaluated. R² ranged from 6.93% to 22.70% in genotyped animals and was significantly lower in non-genotyped animals (0.82% to 12.37%). Interestingly, individuals with low R² values in REML demonstrated the largest R² gains in ssGREML. Additionally, this improvement was much greater (5.96% to 19.25%) when only considering stallions with fewer than 40 controlled foals. CONCLUSIONS: We demonstrated that the application of genomic selection can contribute to improving the reliability of mating decisions in a large horse breeding program such as that of the PRE breed.
Affiliation(s)
- Chiraz Ziadi
- Departamento de Genética, Universidad de Córdoba, 14014 Córdoba, Spain; (C.Z.); (A.M.)
- Mercedes Valera
- Departamento de Agronomía, ETSIA, Universidad de Sevilla, 41013 Sevilla, Spain; (M.V.); (N.L.)
- Nora Laseca
- Departamento de Agronomía, ETSIA, Universidad de Sevilla, 41013 Sevilla, Spain; (M.V.); (N.L.)
- Real Asociación Nacional de Criadores de Caballos de Pura Raza Española (ANCCE), 41014 Sevilla, Spain; (A.R.-S.d.l.T.); (A.E.); (P.A.)
- Ana Encina
- Real Asociación Nacional de Criadores de Caballos de Pura Raza Española (ANCCE), 41014 Sevilla, Spain; (A.R.-S.d.l.T.); (A.E.); (P.A.)
- Pedro Azor
- Real Asociación Nacional de Criadores de Caballos de Pura Raza Española (ANCCE), 41014 Sevilla, Spain; (A.R.-S.d.l.T.); (A.E.); (P.A.)
- Antonio Molina
- Departamento de Genética, Universidad de Córdoba, 14014 Córdoba, Spain; (C.Z.); (A.M.)
3
Jin L, Xu L, Jin H, Zhao S, Jia Y, Li J, Hua J. Accuracy of Genomic Predictions Cross Populations with Different Linkage Disequilibrium Patterns. Genes (Basel) 2024; 15:1419. [PMID: 39596619 PMCID: PMC11594128 DOI: 10.3390/genes15111419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2024] [Accepted: 10/29/2024] [Indexed: 11/29/2024] Open
Abstract
BACKGROUND/OBJECTIVES: There is a considerable global population of beef cattle, with numerous small-scale groups. Establishing separate reference groups for each breed in breeding practices is challenging, severely limiting the application of genomic selection (GS). Combining data from multiple populations becomes particularly attractive and practical for small-scale populations, offering increased reference population size, operational ease, and data sharing. METHODS: To assess this potential for Chinese indigenous cattle, we evaluated the influence of combining multiple populations on genomic prediction reliability for 10 breeds using simulated data. RESULTS: Within-breed evaluations consistently yielded the highest accuracies across various simulated genetic architectures. Genomic selection accuracy was lower in Group B populations referencing a Group A population (n = 400), but significantly higher in Group A populations with the addition of a small Group B (n = 200). However, accuracy remained low when using the Group A reference group (n = 400) to predict Group B. Incorporating a few Group B individuals (n = 200) into the reference group resulted in relatively high accuracy (~60% of Group A predictions). Accuracy increased with the growing number of individuals from Group B joining the reference group. CONCLUSIONS: Our results suggested that multi-breed genomic selection was feasible for Chinese indigenous cattle populations with genetic relationships. This study's results also offer valuable insights into genomic selection across multiple populations.
Affiliation(s)
- Lei Jin
- College of Animal Science, Anhui Science and Technology University, Chuzhou 233100, China;
- Anhui Province Key Laboratory of Livestock and Poultry Product Safety Engineering, Institute of Animal Husbandry and Veterinary Medicine, Anhui Academy of Agricultural Sciences, Hefei 230031, China; (L.X.); (H.J.); (S.Z.); (Y.J.)
- Lei Xu
- Anhui Province Key Laboratory of Livestock and Poultry Product Safety Engineering, Institute of Animal Husbandry and Veterinary Medicine, Anhui Academy of Agricultural Sciences, Hefei 230031, China; (L.X.); (H.J.); (S.Z.); (Y.J.)
- Hai Jin
- Anhui Province Key Laboratory of Livestock and Poultry Product Safety Engineering, Institute of Animal Husbandry and Veterinary Medicine, Anhui Academy of Agricultural Sciences, Hefei 230031, China; (L.X.); (H.J.); (S.Z.); (Y.J.)
- Shuanping Zhao
- Anhui Province Key Laboratory of Livestock and Poultry Product Safety Engineering, Institute of Animal Husbandry and Veterinary Medicine, Anhui Academy of Agricultural Sciences, Hefei 230031, China; (L.X.); (H.J.); (S.Z.); (Y.J.)
- Yutang Jia
- Anhui Province Key Laboratory of Livestock and Poultry Product Safety Engineering, Institute of Animal Husbandry and Veterinary Medicine, Anhui Academy of Agricultural Sciences, Hefei 230031, China; (L.X.); (H.J.); (S.Z.); (Y.J.)
- Junya Li
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
- Jinling Hua
- College of Animal Science, Anhui Science and Technology University, Chuzhou 233100, China;
4
Lozada-Soto EA, Maltecca C, Jiang J, Cole JB, VanRaden PM, Tiezzi F. Effect of germplasm exchange strategies on genetic gain, homozygosity, and genetic diversity in dairy stud populations: A simulation study. J Dairy Sci 2024:S0022-0302(24)01085-3. [PMID: 39216524 DOI: 10.3168/jds.2024-24992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2024] [Accepted: 07/25/2024] [Indexed: 09/04/2024]
Abstract
While genomic selection has led to considerable improvements in genetic gain, it has also seemingly led to increased rates of inbreeding and homozygosity, which can negatively affect genetic diversity and the long-term sustainability of dairy populations. Using genotypes from US Holstein animals from 3 distinct stud populations, we performed a simulation study consisting of 10 rounds of selection, with each breeding population composed of 200 males and 2000 females. The investigated selection strategies consisted of selection using true breeding values (TBV), estimated breeding values (EBV), estimated breeding values penalized for the average future genomic inbreeding of progeny (PEN-EBV), or random selection (RAND). We also simulated several germplasm exchange strategies where the germplasm of males from other populations was used for breeding. These strategies included exchanging males based on EBV, PEN-EBV, low genomic future inbreeding of progeny (GFI), or randomly (RAND). Variations of several parameters, such as the correlation between the selection objectives of populations and the size of the exchange, were simulated. Penalizing genetic merit to minimize genomic inbreeding of progeny provided similar genetic gain and reduced the average homozygosity of populations compared with the EBV strategy. Germplasm exchange was found to generally provide long-term benefits to all stud populations. In both the short and long-term, germplasm exchange using the EBV or PEN-EBV strategies provided more cumulative genetic progress than the no-exchange strategy; the amount of long-term genetic progress achieved with germplasm exchange using these strategies was higher for scenarios with a higher genetic correlation between the traits selected by the studs and for a larger size of the exchange. 
Both the PEN-EBV and GFI exchange strategies allowed decreases in homozygosity and provided significant benefits to genetic diversity compared with other strategies, including larger average minor allele frequencies and smaller proportions of markers near fixation. Overall, this study showed the value of breeding strategies to balance genetic progress and genetic diversity and the benefits of cooperation between studs to ensure the sustainability of their respective breeding programs.
Affiliation(s)
- Christian Maltecca
- Department of Animal Science, North Carolina State University, Raleigh, NC 27607, USA
- Jicai Jiang
- Department of Animal Science, North Carolina State University, Raleigh, NC 27607, USA
- Paul M VanRaden
- Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Service, USDA, Beltsville, MD 20705, USA
- Francesco Tiezzi
- Department of Agriculture, Food, Environment and Forestry (DAGRI), University of Florence, 50144 Florence, Italy.
5
Zhang M, Xu L, Lu H, Luo H, Zhou J, Wang D, Zhang X, Huang X, Wang Y. Genomic prediction based on a joint reference population for the Xinjiang Brown cattle. Front Genet 2024; 15:1394636. [PMID: 38737126 PMCID: PMC11082323 DOI: 10.3389/fgene.2024.1394636] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2024] [Accepted: 04/10/2024] [Indexed: 05/14/2024] Open
Abstract
Introduction: Xinjiang Brown cattle constitute the largest breed of cattle in Xinjiang. Therefore, it is crucial to establish a genomic evaluation system, especially for those with low levels of breed improvement. Methods: This study aimed to establish a cross-breed joint reference population by analyzing the genetic structure of 485 Xinjiang Brown cattle and 2,633 Chinese Holstein cattle (Illumina GeneSeek GGP bovine 150 K chip). The Bayes method single-step genome-wide best linear unbiased prediction was used to conduct a genomic evaluation of the joint reference population for the milk traits of Xinjiang Brown cattle. The reference population of Chinese Holstein cattle was randomly divided into groups to construct the joint reference population. By comparing the prediction accuracy, estimation bias, and inflation coefficient of the validation population, the optimal number of joint reference populations was determined. Results and Discussion: The results indicated a distinct difference in genetic structure between the adult cows of the two breeds, and both breeds should be considered when constructing multi-breed joint reference and validation populations. The reliability of genomic prediction of milk traits in the joint reference population ranged from 0.142 to 0.465. Initially, it was determined that the inclusion of 600 and 900 Chinese Holstein cattle in the joint reference population positively impacted the genomic prediction of Xinjiang Brown cattle to a certain extent. It was feasible to incorporate the Chinese Holstein into the Xinjiang Brown cattle population to form a joint reference population for multi-breed genomic evaluation. However, for different Xinjiang Brown cattle populations, a fixed number of Chinese Holstein cattle cannot be directly added during multi-breed genomic selection. 
Pre-evaluation analysis based on the genetic structure, kinship, and other factors of the current population is required to ensure the authenticity and reliability of genomic predictions and improve estimation accuracy.
Affiliation(s)
- Menghua Zhang
- College of Animal Science, Xinjiang Agricultural University, Urumqi, China
- Lei Xu
- College of Animal Science, Xinjiang Agricultural University, Urumqi, China
- Haibo Lu
- Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture of China, National Engineering Laboratory of Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China
- Hanpeng Luo
- Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture of China, National Engineering Laboratory of Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China
- Jinghang Zhou
- Shijiazhuang Molbreeding Biotechnology Co., Ltd., Shijiazhuang, China
- Dan Wang
- College of Animal Science, Xinjiang Agricultural University, Urumqi, China
- Xiaoxue Zhang
- College of Animal Science, Xinjiang Agricultural University, Urumqi, China
- Xixia Huang
- College of Animal Science, Xinjiang Agricultural University, Urumqi, China
- Yachun Wang
- Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture of China, National Engineering Laboratory of Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China
6
Yin C, Shi H, Zhou P, Wang Y, Tao X, Yin Z, Zhang X, Liu Y. Genomic Prediction of Growth Traits in Yorkshire Pigs of Different Reference Group Sizes Using Different Estimated Breeding Value Models. Animals (Basel) 2024; 14:1098. [PMID: 38612337 PMCID: PMC11010886 DOI: 10.3390/ani14071098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Revised: 03/31/2024] [Accepted: 04/02/2024] [Indexed: 04/14/2024] Open
Abstract
The need for sufficient reference population data poses a significant challenge in breeding programs aimed at improving pig farming on a small to medium scale. To overcome this hurdle, investigating the advantages of combining reference populations of varying sizes is crucial for enhancing the accuracy of the genomic estimated breeding value (GEBV). Genomic selection (GS) in populations with limited reference data can be optimized by combining populations of the same breed or related breeds. This study focused on understanding the effect of combining different reference group sizes on the accuracy of GS for determining the growth effectiveness and percentage of lean meat in Yorkshire pigs. Specifically, our study investigated two important traits: the age at 100 kg live weight (AGE100) and the backfat thickness at 100 kg live weight (BF100). This research assessed the efficiency of genomic prediction (GP) using different GEBV models across three Yorkshire populations with varying genetic backgrounds. The GeneSeek 50K GGP porcine high-density array was used for genotyping. A total of 2295 Yorkshire pigs were included, representing three Yorkshire pig populations with different genetic backgrounds: 295 from Danish (small) lines from Huaibei City, Anhui Province, 500 from Canadian (medium) lines from Lixin County, Anhui Province, and 1500 from American (large) lines from Shanghai. To evaluate the impact of different population combination scenarios on the GS accuracy, three approaches were explored: (1) combining all three populations for prediction, (2) combining two populations to predict the third, and (3) predicting each population independently. Five GEBV models, including three Bayesian models (BayesA, BayesB, and BayesC), the genomic best linear unbiased prediction (GBLUP) model, and single-step GBLUP (ssGBLUP), were implemented through 20 repetitions of five-fold cross-validation (CV). 
The results indicate that predicting one target population using the other two populations yielded the highest accuracy, providing a novel approach for improving the genomic selection accuracy in Yorkshire pigs. In this study, it was found that using different populations of the same breed to predict small- and medium-sized herds might be effective in improving the GEBV. This investigation highlights the significance of incorporating population combinations in genetic models for predicting the breeding value, particularly for pig farmers confronted with resource limitations.
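The three population-combination approaches listed in this abstract can be enumerated in a short sketch. The population sizes come from the abstract; everything else (the labels, the `reference_designs` and `reference_size` helpers, and the way each design is represented) is a hypothetical illustration rather than the authors' code. The sizes shown are the pooled animals available before any cross-validation folds are held out.

```python
from itertools import combinations

# Population sizes as stated in the abstract; the keys are shorthand labels.
populations = {"Danish": 295, "Canadian": 500, "American": 1500}

def reference_designs(pops):
    """Return (strategy, reference populations, target) triples for the three approaches."""
    names = list(pops)
    # (1) Pool all three populations and predict within the pool.
    designs = [("combined", tuple(names), "all")]
    # (2) Two populations form the reference; the remaining one is the target.
    for pair in combinations(names, 2):
        target = next(n for n in names if n not in pair)
        designs.append(("two-predict-third", pair, target))
    # (3) Each population is evaluated on its own.
    for name in names:
        designs.append(("within", (name,), name))
    return designs

def reference_size(pops, reference):
    """Number of animals available in the listed reference populations."""
    return sum(pops[name] for name in reference)

designs = reference_designs(populations)
sizes = {(strategy, target): reference_size(populations, ref)
         for strategy, ref, target in designs}
```

Laying the designs out this way makes the trade-off visible: the small Danish line has 295 animals on its own but 2000 reference animals when the other two lines are used to predict it.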
Affiliation(s)
- Chang Yin
- Department of Animal Genetics and Breeding, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing 210095, China; (C.Y.); (H.S.); (P.Z.); (Y.W.); (X.T.)
- Haoran Shi
- Department of Animal Genetics and Breeding, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing 210095, China; (C.Y.); (H.S.); (P.Z.); (Y.W.); (X.T.)
- Peng Zhou
- Department of Animal Genetics and Breeding, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing 210095, China; (C.Y.); (H.S.); (P.Z.); (Y.W.); (X.T.)
- Yuwei Wang
- Department of Animal Genetics and Breeding, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing 210095, China; (C.Y.); (H.S.); (P.Z.); (Y.W.); (X.T.)
- Xuzhe Tao
- Department of Animal Genetics and Breeding, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing 210095, China; (C.Y.); (H.S.); (P.Z.); (Y.W.); (X.T.)
- Zongjun Yin
- College of Animal Science and Technology, Anhui Agricultural University, Hefei 230036, China; (Z.Y.); (X.Z.)
- Xiaodong Zhang
- College of Animal Science and Technology, Anhui Agricultural University, Hefei 230036, China; (Z.Y.); (X.Z.)
- Yang Liu
- Department of Animal Genetics and Breeding, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing 210095, China; (C.Y.); (H.S.); (P.Z.); (Y.W.); (X.T.)
7
Cuyabano BCD, Boichard D, Gondro C. Expected values for the accuracy of predicted breeding values accounting for genetic differences between reference and target populations. Genet Sel Evol 2024; 56:15. [PMID: 38424504 PMCID: PMC11234767 DOI: 10.1186/s12711-024-00876-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2022] [Accepted: 01/08/2024] [Indexed: 03/02/2024] Open
Abstract
BACKGROUND: Genetic merit, or breeding values as referred to in livestock and crop breeding programs, is one of the keys to the successful selection of animals in commercial farming systems. The developments in statistical methods during the twentieth century and single nucleotide polymorphism (SNP) chip technologies in the twenty-first century have revolutionized agricultural production, by allowing highly accurate predictions of breeding values for selection candidates at a very early age. Nonetheless, for many breeding populations, realized accuracies of predicted breeding values (PBV) remain below the theoretical maximum, even when the reference population is sufficiently large, and SNPs included in the model are in sufficient linkage disequilibrium (LD) with the quantitative trait locus (QTL). This is particularly noticeable over generations, as we observe the so-called erosion of the effects of SNPs due to recombinations, accompanied by the erosion of the accuracy of prediction. While accurately quantifying the erosion at the individual SNP level is a difficult and unresolved task, quantifying the erosion of the accuracy of prediction is a more tractable problem. In this paper, we describe a method that uses the relationship between reference and target populations to calculate expected values for the accuracies of predicted breeding values for non-phenotyped individuals accounting for erosion. The accuracy of the expected values was evaluated through simulations, and a further evaluation was performed on real data. RESULTS: Using simulations, we empirically confirmed that our expected values for the accuracy of PBV accounting for erosion were able to correctly determine the prediction accuracy of breeding values for non-phenotyped individuals. When comparing the expected to the realized accuracies of PBV with real data, only one out of the four traits evaluated presented accuracies that were significantly higher than the expected, approaching h².
CONCLUSIONS: We defined an index of genetic correlation between reference and target populations, which summarizes the expected overall erosion due to differences in allele frequencies and LD patterns between populations. We used this correlation along with a trait's heritability to derive expected values for the accuracy (R) of PBV accounting for the erosion, and demonstrated that our derived E[R | erosion] is a reliable metric.
Affiliation(s)
- Beatriz C D Cuyabano
- INRAE, AgroParisTech, GABI, Université Paris Saclay, 78350, Jouy-en-Josas, France.
- Didier Boichard
- INRAE, AgroParisTech, GABI, Université Paris Saclay, 78350, Jouy-en-Josas, France
- Cedric Gondro
- Department of Animal Science, Michigan State University, 474 S Shaw Ln, East Lansing, MI, 48824, USA
8
Ma H, Li H, Ge F, Zhao H, Zhu B, Zhang L, Gao H, Xu L, Li J, Wang Z. Improving Genomic Predictions in Multi-Breed Cattle Populations: A Comparative Analysis of BayesR and GBLUP Models. Genes (Basel) 2024; 15:253. [PMID: 38397242 PMCID: PMC10887749 DOI: 10.3390/genes15020253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2024] [Revised: 02/09/2024] [Accepted: 02/16/2024] [Indexed: 02/25/2024] Open
Abstract
Numerous studies have shown that combining populations from similar or closely related genetic breeds improves the accuracy of genomic predictions (GP). Diverse Bayesian and genomic best linear unbiased prediction (GBLUP) models have been tested extensively to explore multi-breed genomic selection (GS) in livestock, ultimately establishing them as successful approaches for predicting genomic estimated breeding value (GEBV). This study aimed to assess the effectiveness of using BayesR and GBLUP models with linkage disequilibrium (LD)-weighted genomic relationship matrices (GRMs) for genomic prediction in three different beef cattle breeds to identify the best approach for enhancing the accuracy of multi-breed genomic selection in beef cattle. Additionally, a comparison was conducted to evaluate the predictive precision of different marker densities and genetic correlations among the three breeds of beef cattle. The GRM between Yunling cattle (YL) and other breeds demonstrated modest affinity and highlighted a notable genetic concordance of 0.87 between Chinese Wagyu (WG) and Huaxi (HX) cattle. In the within-breed GS, BayesR demonstrated an advantage over GBLUP. The prediction accuracies for HX cattle using the BayesR model were 0.52 with BovineHD BeadChip data (HD) and 0.46 with whole-genome sequencing data (WGS). In comparison to the GBLUP model, the accuracy increased by 26.8% for HD data and 9.5% for WGS data. For WG and YL, BayesR doubled the within-breed prediction accuracy to 14.3% from 7.1%, outperforming GBLUP across both HD and WGS datasets. Moreover, analyzing multiple breeds using genomic selection showed that BayesR consistently outperformed GBLUP in terms of predictive accuracy, especially when using WGS. For instance, in a mixed reference population of HX and WG, BayesR achieved an accuracy of 0.53 using WGS for HX, which was a substantial enhancement over the accuracies obtained with GBLUP models. 
The research further highlights the benefit of including various breeds in the reference group, leading to enhanced accuracy in predictions and emphasizing the importance of comprehensive genomic selection methods. Our research findings indicate that BayesR exhibits superior performance compared to GBLUP in multi-breed genomic prediction accuracy, achieving a maximum improvement of 33.3%, especially in genetically diverse breeds. The improvement can be attributed to the effective utilization of higher single nucleotide polymorphism (SNP) marker density by BayesR, resulting in enhanced prediction accuracy. This evidence conclusively demonstrates the significant impact of BayesR on enhancing genomic predictions in diverse cattle populations, underscoring the crucial role of genetic relatedness in selection methodologies. In parallel, subsequent studies should focus on refining GRM and exploring alternative models for GP.
Affiliation(s)
- Haoran Ma
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China; (H.M.); (H.L.); (L.Z.); (J.L.)
- Hongwei Li
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China; (H.M.); (H.L.); (L.Z.); (J.L.)
- Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB 510632, Canada
- Fei Ge
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China; (H.M.); (H.L.); (L.Z.); (J.L.)
- Huqiong Zhao
- College of Animal Science, Shanxi Agricultural University, Jinzhong 030801, China
- Bo Zhu
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China; (H.M.); (H.L.); (L.Z.); (J.L.)
- Lupei Zhang
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China; (H.M.); (H.L.); (L.Z.); (J.L.)
- Huijiang Gao
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China; (H.M.); (H.L.); (L.Z.); (J.L.)
- Lingyang Xu
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China; (H.M.); (H.L.); (L.Z.); (J.L.)
- Junya Li
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China; (H.M.); (H.L.); (L.Z.); (J.L.)
- Zezhao Wang
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China; (H.M.); (H.L.); (L.Z.); (J.L.)
9
Yin C, Zhou P, Wang Y, Yin Z, Liu Y. Using genomic selection to improve the accuracy of genomic prediction for multi-populations in pigs. Animal 2024; 18:101062. [PMID: 38211414 DOI: 10.1016/j.animal.2023.101062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2023] [Revised: 12/11/2023] [Accepted: 12/12/2023] [Indexed: 01/13/2024] Open
Abstract
The size of the reference group is among the most critical determinants of the accuracy of genomic estimated breeding values (GEBVs). However, small- and medium-sized pig farms often struggle to accumulate adequate reference data, posing significant challenges to breeding programs. To solve this problem, exploring the potential benefits of combining reference groups of different sizes is necessary to improve GEBV accuracy. The primary objective of this investigation was to assess a more effective statistical model for combined multi-populations and its potential to enhance the accuracy of GEBVs for small and medium populations. Three populations of different sizes (300, 600, and 1 500, respectively) were simulated using the QMSim software. To assess the impact of heritability on the accuracy of GEBVs, four different levels of heritability (0.05, 0.15, 0.35, and 0.5) were simulated. Simultaneously, to investigate the impact of kinship on multi-populations, the study created four distinct scenarios for the three sizes of populations. These scenarios included: (1) the three groups are all independent, (2) the large group and the small group with a familial connection (n = 1 800), a middle group (n = 600) acting independently with no kinship, (3) the large group with a familial connection to the middle group (n = 2 100) but no connection to the small group (n = 300), and (4) the small group with a familial connection to the middle group (n = 900), while the large group (n = 1 500) acted independently with no kinship. This study evaluates and compares the accuracy of predicting breeding values using four different methods, including genomic best linear unbiased prediction (GBLUP), single-step GBLUP (ssGBLUP), and two Bayesian models (Bayes A and Bayes B), with varying sizes of reference groups. 
In each scenario, three different prediction strategies were compared: (1) merging all three populations of different sizes for prediction, (2) predicting each independent population separately, and (3) using two populations to predict the remaining one. Our findings reveal that combining populations enhances the Bayesian models, with Bayes B yielding the highest accuracy. In independent populations, the best linear unbiased prediction (BLUP) models demonstrated the highest accuracy. However, in cases where populations were related and the heritability was high, the Bayes B model exhibited the highest overall accuracy (slightly higher than BLUP models) in the independent population. Our results underscore the importance of considering population combinations when using genetic models to predict breeding values, particularly for pig farmers with limited resources.
Affiliation(s)
- Chang Yin
  - Department of Animal Genetics and Breeding, College of Animal Science and Technology, National Experimental Teaching Demonstration Centre of Animal Science, Nanjing Agricultural University, Nanjing 210095, PR China
- Peng Zhou
  - Department of Animal Genetics and Breeding, College of Animal Science and Technology, National Experimental Teaching Demonstration Centre of Animal Science, Nanjing Agricultural University, Nanjing 210095, PR China
- Yuwei Wang
  - Department of Animal Genetics and Breeding, College of Animal Science and Technology, National Experimental Teaching Demonstration Centre of Animal Science, Nanjing Agricultural University, Nanjing 210095, PR China
- Zongjun Yin
  - College of Animal Science and Technology, Anhui Agricultural University, Hefei 230036, PR China
- Yang Liu
  - Department of Animal Genetics and Breeding, College of Animal Science and Technology, National Experimental Teaching Demonstration Centre of Animal Science, Nanjing Agricultural University, Nanjing 210095, PR China
|
10
|
Jones HE, Wilson PB. Progress and opportunities through use of genomics in animal production. Trends Genet 2022; 38:1228-1252. [PMID: 35945076 DOI: 10.1016/j.tig.2022.06.014] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/10/2022] [Revised: 06/08/2022] [Accepted: 06/17/2022] [Indexed: 01/24/2023]
Abstract
The rearing of farmed animals is a vital component of global food production systems, but its impact on the environment, human health, animal welfare, and biodiversity is being increasingly challenged. Developments in genetic and genomic technologies have had a key role in improving the productivity of farmed animals for decades. Advances in genome sequencing, annotation, and editing offer a means not only to continue that trend, but also, when combined with advanced data collection, analytics, cloud computing, appropriate infrastructure, and regulation, to take precision livestock farming (PLF) and conservation to an advanced level. Such an approach could generate substantial additional benefits in terms of reducing use of resources, health treatments, and environmental impact, while also improving animal health and welfare.
Affiliation(s)
- Huw E Jones
  - UK Genetics for Livestock and Equines (UKGLE) Committee, Department for Environment, Food and Rural Affairs, Nobel House, 17 Smith Square, London, SW1P 3JR, UK; Nottingham Trent University, Brackenhurst Campus, Brackenhurst Lane, Southwell, NG25 0QF, UK
- Philippe B Wilson
  - UK Genetics for Livestock and Equines (UKGLE) Committee, Department for Environment, Food and Rural Affairs, Nobel House, 17 Smith Square, London, SW1P 3JR, UK; Nottingham Trent University, Brackenhurst Campus, Brackenhurst Lane, Southwell, NG25 0QF, UK
|
11
|
Bijma P, Dekkers JCM. Predictions of the accuracy of genomic prediction: connecting R2, selection index theory, and Fisher information. Genet Sel Evol 2022; 54:13. [PMID: 35164676 PMCID: PMC8842959 DOI: 10.1186/s12711-022-00700-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/07/2021] [Accepted: 01/18/2022] [Indexed: 11/10/2022]
Abstract
Background
Deterministic predictions of the accuracy of genomic estimated breeding values (GEBV) when combining information sources have been developed based on selection index theory (SIT) and on Fisher information (FI). These two approaches have resulted in slightly different results when considering the combination of pedigree and genomic information. Here, we clarify this apparent contradiction, both for the combination of pedigree and genomic information and for the combination of subpopulations into a joint reference population.
Results
First, we show that existing expressions for the squared accuracy of GEBV can be understood as a proportion of the variance explained. Next, we show that the apparent discrepancy that has been observed between accuracies based on SIT vs. FI originated from two sources. First, the FI referred to the genetic component that is captured by the marker genotypes, rather than the full genetic component. Second, the common SIT-based derivations did not account for the increase in the accuracy of GEBV due to a reduction of the residual variance when combining information sources. The SIT and FI approaches are equivalent when these sources are accounted for.
Conclusions
The squared accuracy of GEBV can be understood as a proportion of the variance explained. The SIT and FI approaches for combining information for GEBV are equivalent and provide identical accuracies when the underlying assumptions are equivalent.
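The "proportion of variance explained" reading can be written out in one line. The sketch below uses notation chosen here rather than taken from the paper: $A$ is the true breeding value, $\hat{A}$ the GEBV, and $r$ their correlation. It assumes the GEBV is an unbiased, BLUP-type predictor, for which $\operatorname{Cov}(A,\hat{A}) = \operatorname{Var}(\hat{A})$.

```latex
r^2
  = \frac{\operatorname{Cov}(A,\hat{A})^2}{\operatorname{Var}(A)\,\operatorname{Var}(\hat{A})}
  = \frac{\operatorname{Var}(\hat{A})^2}{\operatorname{Var}(A)\,\operatorname{Var}(\hat{A})}
  = \frac{\operatorname{Var}(\hat{A})}{\operatorname{Var}(A)}
```

Under that assumption the squared accuracy is the share of the variance in true breeding values that the GEBV accounts for.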
|
12
|
Richardson C, Amer P, Quinton C, Crowley J, Hely F, van den Berg I, Pryce J. Reducing greenhouse gas emissions through genetic selection in the Australian dairy industry. J Dairy Sci 2022; 105:4272-4288. [DOI: 10.3168/jds.2021-21277] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/11/2021] [Accepted: 12/22/2021] [Indexed: 11/19/2022]
|
13
|
Genetic and genomic characterization followed by single-step genomic evaluation of withers height in German Warmblood horses. J Appl Genet 2022; 63:369-378. [PMID: 35028913 PMCID: PMC8979901 DOI: 10.1007/s13353-021-00681-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2021] [Revised: 12/06/2021] [Accepted: 12/23/2021] [Indexed: 11/21/2022]
Abstract
Reliability of genomic predictions is influenced by the size and genetic composition of the reference population. For German Warmblood horses, compilation of a reference population has been enabled through the cooperation of five German breeding associations. In this study, preliminary data from this joint reference population were used to genetically and genomically characterize withers height and to apply single-step methodology for estimating genomic breeding values for withers height. Using data on 2113 mares and their genomic information on about 62,000 single nucleotide polymorphisms (SNPs), analysis of the genomic relationship revealed substructures reflecting breed origin and different breeding goals of the contributing breeding associations. A genome-wide association study confirmed a known quantitative trait locus (QTL) for withers height on equine chromosome (ECA) 3 close to LCORL and identified a further significant peak on ECA 1. Using a single-step approach with a combined relationship matrix, the estimated heritability for withers height was 0.31 (SE = 0.08) and the corresponding genomic breeding values ranged from −2.94 to 2.96 cm. A mean reliability of 0.38 was realized for these breeding values. The analyses of withers height showed that compiling a reference population across breeds is a suitable strategy for German Warmblood horses. The single-step method is an appealing approach for practical genomic prediction in horses, because not many genotypes are available yet and animals without genotypes can in this way contribute directly to the estimation system.
|
14
|
Exploring the size of reference population for expected accuracy of genomic prediction using simulated and real data in Japanese Black cattle. BMC Genomics 2021; 22:799. [PMID: 34742249 PMCID: PMC8572443 DOI: 10.1186/s12864-021-08121-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 06/10/2021] [Accepted: 10/21/2021] [Indexed: 11/19/2022] Open
Abstract
Background
The size of the reference population is a crucial factor affecting the accuracy of prediction of the genomic estimated breeding value (GEBV). Few studies in beef cattle have compared accuracies achieved using real data with those achieved using simulated data and deterministic predictions. Thus, the extent to which traits of interest affect the accuracy of genomic prediction in Japanese Black cattle remains obscure. This study aimed to explore the size of reference population for expected accuracy of genomic prediction for simulated and carcass traits in Japanese Black cattle using a large number of samples.
Results
A simulation analysis showed that heritability and size of reference population substantially impacted the accuracy of GEBV, whereas the number of quantitative trait loci did not. The estimated numbers of independent chromosome segments (Me) and the related weighting factor (w) derived from simulation results and a maximum likelihood (ML) approach were 1900–3900 and 1, respectively. The expected accuracy for a trait with heritability of 0.1–0.5 fitted well with empirical values when the reference population comprised > 5000 animals. The heritability for carcass traits was estimated to be 0.29–0.41 and the accuracy of GEBVs was relatively consistent with simulation results. When the reference population comprised 7000–11,000 animals, the accuracy of GEBV for carcass traits ranged from 0.73 to 0.79, which is comparable to the estimated breeding value obtained in the progeny test.
Conclusion
Our simulation analysis demonstrated that the expected accuracy of GEBV for a polygenic trait with low-to-moderate heritability could be practical in the Japanese Black cattle population. For carcass traits, a total of 7000–11,000 animals can be a sufficient size of reference population for genomic prediction.
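To make the dependence on reference size, heritability, and Me concrete, the sketch below evaluates a widely used deterministic expression for expected GEBV accuracy, sqrt(N·h² / (N·h² + Me)). That this is the exact form used in the study is an assumption, and the weighting factor w from the abstract is left out because the abstract does not say how it enters. The function name and the example grid are illustrative only.

```python
import math


def expected_accuracy(n_ref: int, h2: float, me: float) -> float:
    """Expected GEBV accuracy, sqrt(N*h2 / (N*h2 + Me)).

    n_ref: number of animals in the reference population (N)
    h2:    heritability of the trait
    me:    effective number of independent chromosome segments (Me)
    """
    if n_ref < 0 or not 0.0 <= h2 <= 1.0 or me <= 0:
        raise ValueError("need n_ref >= 0, 0 <= h2 <= 1 and me > 0")
    nh2 = n_ref * h2
    return math.sqrt(nh2 / (nh2 + me))


if __name__ == "__main__":
    # Illustrative grid only; Me bounds and h2 values are taken from the
    # ranges quoted in the abstract, the reference sizes are arbitrary.
    for me in (1900, 3900):
        for h2 in (0.29, 0.41):
            for n_ref in (1000, 5000, 7000, 11000):
                acc = expected_accuracy(n_ref, h2, me)
                print(f"Me={me} h2={h2} N={n_ref}: r={acc:.2f}")
```

The expression rises with N and h² and falls with Me, which matches the qualitative pattern described in the Results.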
|
15
|
McGaugh SE, Lorenz AJ, Flagel LE. The utility of genomic prediction models in evolutionary genetics. Proc Biol Sci 2021; 288:20210693. [PMID: 34344180 PMCID: PMC8334854 DOI: 10.1098/rspb.2021.0693] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 03/23/2021] [Accepted: 07/15/2021] [Indexed: 12/25/2022] Open
Abstract
Variation in complex traits is the result of contributions from many loci of small effect. Based on this principle, genomic prediction methods are used to make predictions of breeding value for an individual using genome-wide molecular markers. Genomic prediction models have been used in plant and animal breeding for almost two decades to increase rates of genetic improvement and reduce the length of artificial selection experiments. However, evolutionary genomics studies have been slow to incorporate this technique to select individuals for breeding in a conservation context or to learn more about the genetic architecture of traits, the genetic value of missing individuals, or the microevolution of breeding values. Here, we outline the utility of genomic prediction and provide an overview of the methodology. We highlight opportunities to apply genomic prediction in evolutionary genetics of wild populations and the best practices when using these methods on field-collected phenotypes.
Affiliation(s)
- Suzanne E. McGaugh
  - Ecology, Evolution, and Behavior, University of Minnesota, 140 Gortner Lab, 1479 Gortner Avenue, Saint Paul, MN 55108, USA
- Aaron J. Lorenz
  - Agronomy and Plant Genetics, University of Minnesota, 411 Borlaug Hall, 1991 Upper Buford Circle, Saint Paul, MN 55108, USA
- Lex E. Flagel
  - Plant and Microbial Biology, University of Minnesota, 140 Gortner Lab, 1479 Gortner Avenue, Saint Paul, MN 55108, USA
  - Bayer Crop Science, 700 W Chesterfield Parkway, Chesterfield, MO 63017, USA
|
16
|
Mota LFM, Pegolo S, Baba T, Morota G, Peñagaricano F, Bittante G, Cecchinato A. Comparison of Single-Breed and Multi-Breed Training Populations for Infrared Predictions of Novel Phenotypes in Holstein Cows. Animals (Basel) 2021; 11:ani11071993. [PMID: 34359121 PMCID: PMC8300349 DOI: 10.3390/ani11071993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2021] [Revised: 06/30/2021] [Accepted: 07/01/2021] [Indexed: 11/16/2022] Open
Abstract
In general, Fourier-transform infrared (FTIR) predictions are developed using a single-breed population split into a training and a validation set. However, using populations formed of different breeds is an attractive way to design cross-validation scenarios aimed at increasing prediction for difficult-to-measure traits in the dairy industry. This study aimed to evaluate the potential of FTIR prediction using training set combining specialized and dual-purpose dairy breeds to predict different phenotypes divergent in terms of biological meaning, variability, and heritability, such as body condition score (BCS), serum β-hydroxybutyrate (BHB), and kappa casein (k-CN) in the major cattle breed, i.e., Holstein-Friesian. Data were obtained from specialized dairy breeds: Holstein (468 cows) and Brown Swiss (657 cows), and dual-purpose breeds: Simmental (157 cows), Alpine Grey (75 cows), and Rendena (104 cows), giving a total of 1461 cows from 41 multi-breed dairy herds. The FTIR prediction model was developed using a gradient boosting machine (GBM), and predictive ability for the target phenotype in Holstein cows was assessed using different cross-validation (CV) strategies: a within-breed scenario using 10-fold cross-validation, for which the Holstein population was randomly split into 10 folds, one for validation and the remaining nine for training (10-fold_HO); an across-breed scenario (BS_HO) where the Brown Swiss cows were used as the training set and the Holstein cows as the validation set; a specialized multi-breed scenario (BS+HO_10-fold), where the entire Brown Swiss and Holstein populations were combined then split into 10 folds, and a multi-breed scenario (Multi-breed), where the training set comprised specialized (Holstein and Brown Swiss) and dual-purpose (Simmental, Alpine Grey, and Rendena) dairy cows, combined with nine folds of the Holstein cows. 
Lastly, a Multi-breed CV2 scenario was implemented, assuming the same number of records as the reference scenario and using the same proportions as the multi-breed. Within-Holstein, FTIR predictions had a predictive ability of 0.63 for BCS, 0.81 for BHB, and 0.80 for k-CN. Using a specific breed (Brown Swiss) as the training set for prediction in the Holstein population reduced the prediction accuracy by 10% for BCS, 7% for BHB, and 11% for k-CN. Notably, the combination of Holstein and Brown Swiss cows in the training set increased the predictive ability of the model by 6%, to 0.66 for BCS, 0.85 for BHB, and 0.87 for k-CN. Using multiple specialized and dual-purpose animals in the training set outperformed the 10-fold_HO (standard) approach, with an increase in predictive ability of 8% for BCS, 7% for BHB, and 10% for k-CN. When the Multi-breed CV2 was implemented, no improvement was observed. Our findings suggest that FTIR prediction of different phenotypes in the Holstein breed can be improved by including different specialized and dual-purpose breeds in the training population. Our study also shows that predictive ability is enhanced when the size of the training population and the phenotypic variability are increased.
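The cross-validation scenarios differ only in which animals go into the training and validation sets, so they can be sketched as index bookkeeping. The Python below is a minimal illustration, not the authors' pipeline: animal ids, function names, and the seed are hypothetical, breed counts are those given in the abstract, and no model is fitted. The BS+HO_10-fold and Multi-breed CV2 scenarios are omitted because the abstract does not fully specify how their folds are built.

```python
import random

# Cow counts per breed as reported in the abstract (total 1461).
BREED_COUNTS = {
    "Holstein": 468,
    "BrownSwiss": 657,
    "Simmental": 157,
    "AlpineGrey": 75,
    "Rendena": 104,
}


def make_ids(counts):
    """Return {breed: [animal ids]} using hypothetical consecutive ids."""
    ids, start = {}, 0
    for breed, n in counts.items():
        ids[breed] = list(range(start, start + n))
        start += n
    return ids


def k_folds(items, k, seed=0):
    """Shuffle a copy of items and deal it into k near-equal folds."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def scenario_splits(ids, k=10, seed=0):
    """Return (scenario, train_ids, validation_ids) tuples."""
    ho_folds = k_folds(ids["Holstein"], k, seed)
    other_breeds = [a for b, lst in ids.items() if b != "Holstein" for a in lst]
    # Across-breed: train on Brown Swiss, validate on all Holstein cows.
    splits = [("BS_HO", list(ids["BrownSwiss"]), list(ids["Holstein"]))]
    for i, validation in enumerate(ho_folds):
        ho_train = [a for j, f in enumerate(ho_folds) if j != i for a in f]
        # Within-breed: nine Holstein folds train, one validates.
        splits.append(("10-fold_HO", ho_train, validation))
        # Multi-breed: all other breeds plus nine Holstein folds train.
        splits.append(("Multi-breed", other_breeds + ho_train, validation))
    return splits


if __name__ == "__main__":
    for name, train, validation in scenario_splits(make_ids(BREED_COUNTS))[:3]:
        print(name, len(train), len(validation))
```

Validation always consists of Holstein cows only, which is what makes the predictive abilities of the scenarios comparable.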
Affiliation(s)
- Lucio Flavio Macedo Mota
  - Department of Agronomy, Food, Natural Resources, Animals and Environment (DAFNAE), University of Padova, 35020 Legnaro, Italy
- Sara Pegolo
  - Department of Agronomy, Food, Natural Resources, Animals and Environment (DAFNAE), University of Padova, 35020 Legnaro, Italy
- Toshimi Baba
  - Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
- Gota Morota
  - Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
  - Center for Advanced Innovation in Agriculture, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
- Francisco Peñagaricano
  - Department of Animal and Dairy Sciences, University of Wisconsin-Madison, Madison, WI 53706, USA
- Giovanni Bittante
  - Department of Agronomy, Food, Natural Resources, Animals and Environment (DAFNAE), University of Padova, 35020 Legnaro, Italy
- Alessio Cecchinato
  - Department of Agronomy, Food, Natural Resources, Animals and Environment (DAFNAE), University of Padova, 35020 Legnaro, Italy
|
17
|
Dekkers JCM, Su H, Cheng J. Predicting the accuracy of genomic predictions. Genet Sel Evol 2021; 53:55. [PMID: 34187354 PMCID: PMC8244147 DOI: 10.1186/s12711-021-00647-w] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 11/10/2020] [Accepted: 06/11/2021] [Indexed: 11/22/2022] Open
Abstract
Background Mathematical models are needed for the design of breeding programs using genomic prediction. While deterministic models for selection on pedigree-based estimates of breeding values (PEBV) are available, these have not been fully developed for genomic selection, with a key missing component being the accuracy of genomic EBV (GEBV) of selection candidates. Here, a deterministic method was developed to predict this accuracy within a closed breeding population based on the accuracy of GEBV and PEBV in the reference population and the distance of selection candidates from their closest ancestors in the reference population. Methods The accuracy of GEBV was modeled as a combination of the accuracy of PEBV and of EBV based on genomic relationships deviated from pedigree (DEBV). Loss of the accuracy of DEBV from the reference to the target population was modeled based on the effective number of independent chromosome segments in the reference population (Me). Measures of Me derived from the inverse of the variance of relationships and from the accuracies of GEBV and PEBV in the reference population, derived using either a Fisher information or a selection index approach, were compared by simulation. Results Using simulation, both the Fisher and the selection index approach correctly predicted accuracy in the target population over time, both with and without selection. The index approach, however, resulted in estimates of Me that were less affected by heritability, reference size, and selection, and which are, therefore, more appropriate as a population parameter. The variance of relationships underpredicted Me and was greatly affected by selection. A leave-one-out cross-validation approach was proposed to estimate required accuracies of EBV in the reference population. Aspects of the methods were validated using real data. Conclusions A deterministic method was developed to predict the accuracy of GEBV in selection candidates in a closed breeding population. 
The population parameter Me that is required for these predictions can be derived from an available reference data set, and applied to other reference data sets and traits for that population. This method can be used to evaluate the benefit of genomic prediction and to optimize genomic selection breeding programs.
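One of the measures of Me mentioned above, the inverse of the variance of relationships, can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' procedure: genotypes are simulated for unrelated animals with independent SNPs, so pedigree relationships between different animals are zero and the variance of the off-diagonal genomic relationships is used directly. The genomic relationship matrix follows the common allele-frequency-centred construction, and all function names are hypothetical.

```python
import random
import statistics


def simulate_genotypes(n_animals, n_snps, seed=1):
    """Simulate 0/1/2 allele counts for unrelated animals, independent SNPs."""
    rng = random.Random(seed)
    freqs = [rng.uniform(0.1, 0.9) for _ in range(n_snps)]
    return [
        [(rng.random() < p) + (rng.random() < p) for p in freqs]
        for _ in range(n_animals)
    ]


def genomic_relationships(geno):
    """G = Z Z' / (2 * sum p(1-p)), with Z centred on observed frequencies."""
    n, m = len(geno), len(geno[0])
    freqs = [sum(row[j] for row in geno) / (2 * n) for j in range(m)]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    centred = [[row[j] - 2 * freqs[j] for j in range(m)] for row in geno]
    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(i, n):
            value = sum(a * b for a, b in zip(centred[i], centred[k])) / scale
            g[i][k] = g[k][i] = value
    return g


def effective_segments(g):
    """Me as the reciprocal of the variance of off-diagonal relationships."""
    n = len(g)
    off_diagonal = [g[i][k] for i in range(n) for k in range(i + 1, n)]
    return 1.0 / statistics.pvariance(off_diagonal)


if __name__ == "__main__":
    genotypes = simulate_genotypes(n_animals=40, n_snps=300)
    print(f"Estimated Me: {effective_segments(genomic_relationships(genotypes)):.0f}")
```

With real data, linkage disequilibrium and family structure would change the estimate considerably, so the simulated value is only a check that the calculation runs.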
Affiliation(s)
- Jack C M Dekkers
  - Department of Animal Science, Iowa State University, Ames, Iowa, USA
- Hailin Su
  - Department of Animal Science, Iowa State University, Ames, Iowa, USA
- Jian Cheng
  - Department of Animal Science, Iowa State University, Ames, Iowa, USA
|
18
|
Karaman E, Su G, Croue I, Lund MS. Genomic prediction using a reference population of multiple pure breeds and admixed individuals. Genet Sel Evol 2021; 53:46. [PMID: 34058971 PMCID: PMC8168010 DOI: 10.1186/s12711-021-00637-y] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 04/09/2020] [Accepted: 05/11/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND In dairy cattle populations in which crossbreeding has been used, animals show some level of diversity in their origins. In rotational crossbreeding, for instance, crossbred dams are mated with purebred sires from different pure breeds, and the genetic composition of crossbred animals is an admixture of the breeds included in the rotation. How to use the data of such individuals in genomic evaluations is still an open question. In this study, we aimed at providing methodologies for the use of data from crossbred individuals with an admixed genetic background together with data from multiple pure breeds, for the purpose of genomic evaluations for both purebred and crossbred animals. A three-breed rotational crossbreeding system was mimicked using simulations based on animals genotyped with the 50 K single nucleotide polymorphism (SNP) chip. RESULTS For purebred populations, within-breed genomic predictions generally led to higher accuracies than those from multi-breed predictions using combined data of pure breeds. Adding admixed population's (MIX) data to the combined pure breed data considering MIX as a different breed led to higher accuracies. When prediction models were able to account for breed origin of alleles, accuracies were generally higher than those from combining all available data, depending on the correlation of quantitative trait loci (QTL) effects between the breeds. Accuracies varied when using SNP effects from any of the pure breeds to predict the breeding values of MIX. Using those breed-specific SNP effects that were estimated separately in each pure breed, while accounting for breed origin of alleles for the selection candidates of MIX, generally improved the accuracies. Models that are able to accommodate MIX data with the breed origin of alleles approach generally led to higher accuracies than models without breed origin of alleles, depending on the correlation of QTL effects between the breeds. 
CONCLUSIONS Combining all available data, pure breeds' and admixed population's data, in a multi-breed reference population is beneficial for the estimation of breeding values for pure breeds with a small reference population. For MIX, such an approach can lead to higher accuracies than considering breed origin of alleles for the selection candidates, and using breed-specific SNP effects estimated separately in each pure breed. Accuracies can be further improved by including MIX data in the multi-breed reference population while accounting for the breed origin of alleles. Our findings are relevant for breeding programs in which crossbreeding is systematically applied, and also for populations that involve different subpopulations between which exchange of genetic material is routine practice.
Affiliation(s)
- Emre Karaman
  - Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark
- Guosheng Su
  - Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark
- Mogens S Lund
  - Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark
|
19
|
Cao L, Mulder HA, Liu H, Nielsen HM, Sørensen AC. Competitive gene flow does not necessarily maximize the genetic gain of genomic breeding programs in the presence of genotype-by-environment interaction. J Dairy Sci 2021; 104:8122-8134. [PMID: 33934864 DOI: 10.3168/jds.2020-19823] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2020] [Accepted: 03/15/2021] [Indexed: 11/19/2022]
Abstract
National and international across-population selection is often recommended and fairly common in the current breeding practice of dairy cattle, with the primary aims of increasing genetic gain and genetic variability. The aim of this study was to test the hypothesis that truncation selection of sires across populations [i.e., the competitive gene flow strategy (CGF)] may not necessarily maximize genetic gain in the long term in the presence of genotype-by-environment interaction (G×E). The two alternative strategies compared with CGF were forced gene flow (FGF) strategies, with 10 or 50% of domestic dams forced to be mated with foreign sires (FGF10%, FGF50%). Two equal-size populations (Ndams = 1,000) that were selected for the same breeding goal trait (h² = 0.3) under a G×E correlation (rg) of either 0.9 or 0.8 were simulated to test these 3 strategies. Each population first experienced either 5 or 20 differentiation generations (Gd), then 15 migration generations. Discrete generations were simulated for simplicity. Each population ran a within-population conventional breeding program during the differentiation generations and the 3 across-population sire selection strategies based on joint genomic prediction during the migration generations. The 4 Gd_rg combinations defined 4 levels of differentiation between the 2 populations at the start of migration. The true rate of inbreeding over the last 10 migration generations in each scenario was constrained at 0.01 to provide a fair basis for comparison of genetic gain across scenarios. Results showed that CGF maximized the genetic gain after 15 migration generations in the 5_0.9 combination only, the case of the lowest differentiation, with a superiority of 0.4% (0.04 genetic SD units) over the suboptimal strategy. In the 5_0.8, 20_0.9, and 20_0.8 combinations, by contrast, the 2 FGF strategies, especially FGF50%, had a superiority in genetic gain of 2.3 to 12.5% (0.21-1.07 genetic SD units) over CGF after 15 migration generations. The FGF strategies were superior to CGF because they alleviated inbreeding, introduced new genetic variance in the early migration period, and improved accuracy over the entire migration period. Therefore, we concluded that CGF does not necessarily maximize the genetic gain of across-population genomic breeding programs given moderate G×E. The across-population selection strategy remains to be optimized to maximize genetic gain.
Affiliation(s)
- L Cao
  - Center for Quantitative Genetics and Genomics, Aarhus University, Blichers Alle 20, 8830 Tjele, Denmark
- H A Mulder
  - Animal Breeding and Genomics Group, Wageningen University and Research, 6700 AH Wageningen, the Netherlands
- H Liu
  - Center for Quantitative Genetics and Genomics, Aarhus University, Blichers Alle 20, 8830 Tjele, Denmark
- H M Nielsen
  - Center for Quantitative Genetics and Genomics, Aarhus University, Blichers Alle 20, 8830 Tjele, Denmark
- A C Sørensen
  - Center for Quantitative Genetics and Genomics, Aarhus University, Blichers Alle 20, 8830 Tjele, Denmark; Danish Pig Research Centre, SEGES, Axeltorv 3, 1609 Copenhagen V, Denmark
|
20
|
An Overview of Key Factors Affecting Genomic Selection for Wheat Quality Traits. Plants (Basel) 2021; 10:plants10040745. [PMID: 33920359 PMCID: PMC8069980 DOI: 10.3390/plants10040745] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 03/18/2021] [Revised: 04/06/2021] [Accepted: 04/08/2021] [Indexed: 11/17/2022]
Abstract
Selection for wheat (Triticum aestivum L.) grain quality is often costly and time-consuming since it requires extensive phenotyping in the last phases of development of new lines and cultivars. The development of high-throughput genotyping in the last decade enabled reliable and rapid predictions of breeding values based only on marker information. Genomic selection (GS) is a method that enables the prediction of breeding values of individuals by simultaneously incorporating all available marker information into a model. The success of GS depends on the obtained prediction accuracy, which is influenced by various molecular, genetic, and phenotypic factors, as well as the factors of the selected statistical model. The objectives of this article are to review research on GS for wheat quality done so far and to highlight the key factors affecting prediction accuracy, in order to suggest the most applicable approach in GS for wheat quality traits.
|
21
|
Granado-Tajada I, Varona L, Ugarte E. Genotyping strategies for maximizing genomic information in evaluations of the Latxa dairy sheep breed. J Dairy Sci 2021; 104:6861-6872. [PMID: 33773777 DOI: 10.3168/jds.2020-19978] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/30/2020] [Accepted: 02/12/2021] [Indexed: 12/12/2022]
Abstract
Genomic selection has been implemented over the years in several livestock species, due to the achievable higher genetic progress. The use of genomic information in evaluations provides better prediction accuracy than do pedigree-based evaluations, and the makeup of the genotyped population is a decisive point. The aim of this work is to compare the effect of different genotyping strategies (number and type of animals) on the prediction accuracy for dairy sheep Latxa breeds. A simulation study was designed based on the real data structure of each population, and the phenotypic and genotypic data obtained were used in genetic (BLUP) and genomic (single-step genomic BLUP) evaluations of different genotyping strategies. The genotyping of males was beneficial when they were genetically connected individuals and if they had daughters with phenotypic records. Genotyping females with their own lactation records increased prediction accuracy, and the connection level has less relevance. The differences in genotyping females were independent of their estimated breeding value. The combined genotyping of males and females provided intermediate accuracy results regardless of the female selection strategy. Therefore, assuming that genotyping rams is interesting, the incorporation of genotyped females would be beneficial and worthwhile. The benefits of genotyping individuals from various generations were highlighted, although it was also possible to gain prediction accuracy when historic individuals were not considered. Greater genotyped population sizes resulted in more accuracy, even if the increase seems to reach a plateau.
Affiliation(s)
- I Granado-Tajada
  - Department of Animal Production, NEIKER-BRTA Basque Institute of Agricultural Research and Development, Agrifood Campus of Arkaute s/n, E-01080 Arkaute, Spain
- L Varona
  - Departamento de Anatomía Embriología y Genética Animal, Instituto Agroalimentario de Aragón (IA2), Universidad de Zaragoza, 50013 Zaragoza, Spain
- E Ugarte
  - Department of Animal Production, NEIKER-BRTA Basque Institute of Agricultural Research and Development, Agrifood Campus of Arkaute s/n, E-01080 Arkaute, Spain
|
22
|
Marjanovic J, Calus MPL. Factors affecting accuracy of estimated effective number of chromosome segments for numerically small breeds. J Anim Breed Genet 2021; 138:151-160. [PMID: 33040409 PMCID: PMC7891385 DOI: 10.1111/jbg.12512] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/02/2020] [Revised: 08/25/2020] [Accepted: 09/12/2020] [Indexed: 11/28/2022]
Abstract
For numerically small breeds, obtaining a sufficiently large breed-specific reference population for genomic prediction is challenging or simply not possible, but may be overcome by adding individuals from another breed. To prioritize among available breeds, the effective number of chromosome segments (Me) can be used as an indicator of relatedness between individuals from different breeds. The Me is also an important parameter in determining the accuracy of genomic prediction. The Me can be estimated both within a population and between two populations or breeds, as the reciprocal of the variance of genomic relationships. However, the threshold for number of individuals needed to accurately estimate within- or between-population Me is currently unknown. It is also unknown if a discrepancy in number of genotyped individuals in two breeds affects the estimates of Me between populations. In this study, we conducted a simulation that mimics current domestic cattle populations in order to investigate how estimated Me is affected by number of genotyped individuals, single-nucleotide polymorphism (SNP) density and pedigree availability. Our results show that a small sample of 10 genotyped individuals may result in substantial over- or underestimation of Me. While estimates of within-population Me were hardly affected by SNP density, between-population Me values were highly dependent on the number of available SNPs, with higher SNP densities being able to detect more independent chromosome segments. When subtracting pedigree from genomic relationships before computing Me, estimates of within-population Me were three to four times higher than estimates with genotypes only; however, between Me estimates remained the same. For accurate estimation of within- and between-population Me, at least 50 individuals should be genotyped per population. Estimates of within Me were highly affected by whether pedigree was used or not. For within Me, even the smallest SNP density (~11k) resulted in accurate representation of family relationships in the population; however, for between Me, many more markers are needed to capture all independent segments.
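The estimator described in this abstract, Me as the reciprocal of the variance of genomic relationships, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the VanRaden-style G matrix, the simulated genotypes, and all function names are introduced here for illustration only.

```python
import numpy as np


def vanraden_g(genotypes):
    """Genomic relationship matrix from 0/1/2 genotypes (VanRaden-style; an assumption)."""
    m = np.asarray(genotypes, dtype=float)
    p = m.mean(axis=0) / 2.0                 # observed allele frequencies
    z = m - 2.0 * p                          # centre each SNP column
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))


def me_within(g, idx):
    """Me within one population: 1 / variance of off-diagonal relationships."""
    sub = g[np.ix_(idx, idx)]
    upper = sub[np.triu_indices_from(sub, k=1)]
    return 1.0 / np.var(upper)


def me_between(g, idx_a, idx_b):
    """Me between two populations: 1 / variance of cross-population relationships."""
    return 1.0 / np.var(g[np.ix_(idx_a, idx_b)])


# Hypothetical data: two simulated populations of 50 animals each, 5,000 SNPs.
rng = np.random.default_rng(0)
freq_a = rng.uniform(0.1, 0.9, size=5000)
freq_b = np.clip(freq_a + rng.normal(0.0, 0.1, size=5000), 0.05, 0.95)
geno = np.vstack([
    rng.binomial(2, freq_a, size=(50, 5000)),
    rng.binomial(2, freq_b, size=(50, 5000)),
])
g = vanraden_g(geno)
pop_a, pop_b = np.arange(0, 50), np.arange(50, 100)
me_a = me_within(g, pop_a)
me_ab = me_between(g, pop_a, pop_b)
```

The simulated genotypes carry no family or linkage structure, so the resulting values are not biologically meaningful; the sketch only shows where the within and between estimates come from in the G matrix. Subtracting pedigree relationships before taking the variance, as the abstract also discusses, would require an A matrix that is not modelled here.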
Affiliation(s)
- Jovana Marjanovic
  - Animal Breeding and Genomics, Wageningen University & Research, Wageningen, The Netherlands
- Mario P. L. Calus
  - Animal Breeding and Genomics, Wageningen University & Research, Wageningen, The Netherlands
|
23
|
Meuwissen T, van den Berg I, Goddard M. On the use of whole-genome sequence data for across-breed genomic prediction and fine-scale mapping of QTL. Genet Sel Evol 2021; 53:19. [PMID: 33637049 PMCID: PMC7908738 DOI: 10.1186/s12711-021-00607-4] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 07/03/2020] [Accepted: 01/25/2021] [Indexed: 11/10/2022] Open
Abstract
Background: Whole-genome sequence (WGS) data are increasingly available on large numbers of individuals in animal and plant breeding and in human genetics through second-generation resequencing technologies, 1000 genomes projects, and large-scale genotype imputation from lower marker densities. Here, we present a computationally fast implementation of a variable selection genomic prediction method that could handle WGS data on more than 35,000 individuals, test its accuracy for across-breed predictions and assess its quantitative trait locus (QTL) mapping precision. Methods: The Monte Carlo Markov chain (MCMC) variable selection model (Bayes GC) fits simultaneously a genomic best linear unbiased prediction (GBLUP) term, i.e. a polygenic effect whose correlations are described by a genomic relationship matrix (G), and a Bayes C term, i.e. a set of single nucleotide polymorphisms (SNPs) with large effects selected by the model. Computational speed is improved by a Metropolis–Hastings sampling that directs computations to the SNPs which are, a priori, most likely to be included into the model. Speed is also improved by running many relatively short MCMC chains. Memory requirements are reduced by storing the genotype matrix in binary form. The model was tested on a WGS dataset containing Holstein, Jersey and Australian Red cattle. The data contained 4,809,520 genotypes on 35,549 individuals together with their milk, fat and protein yields, and fat and protein percentage traits. Results: The prediction accuracies of the Jersey individuals improved by 1.5% when using across-breed GBLUP compared to within-breed predictions. Using WGS instead of 600 k SNP-chip data yielded on average a 3% accuracy improvement for Australian Red cows. QTL were fine-mapped by locating the SNP with the highest posterior probability of being included in the model. Various QTL known from the literature were rediscovered, and a new SNP affecting milk production was discovered on chromosome 20 at 34.501126 Mb. Due to the high mapping precision, it was clear that many of the discovered QTL were the same across the five dairy traits. Conclusions: Across-breed Bayes GC genomic prediction improved prediction accuracies compared to GBLUP. The combination of across-breed WGS data and Bayesian genomic prediction proved remarkably effective for the fine-mapping of QTL.
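The structure of the model described in the Methods (a polygenic GBLUP term plus a Bayes C term of selected SNPs) can be written out as a sketch. The notation below is an assumption made here for illustration; the abstract does not give the equations, and the priors used in the paper may differ.

```latex
\begin{aligned}
\mathbf{y} &= \mathbf{1}\mu + \mathbf{g} + \sum_{j=1}^{m} \mathbf{x}_j\,\delta_j\,a_j + \mathbf{e},\\
\mathbf{g} &\sim N(\mathbf{0},\ \mathbf{G}\sigma_g^2),\qquad
\delta_j \in \{0,1\},\qquad
a_j \sim N(0,\ \sigma_a^2),\qquad
\mathbf{e} \sim N(\mathbf{0},\ \mathbf{I}\sigma_e^2).
\end{aligned}
```

Here $\mathbf{g}$ is the polygenic effect with covariance given by the genomic relationship matrix $\mathbf{G}$, $\mathbf{x}_j$ is the genotype vector of SNP $j$, and $\delta_j$ indicates whether that SNP is currently included in the model. Under this reading, fine-mapping amounts to locating the SNP with the highest posterior probability that $\delta_j = 1$, which matches the description in the Results.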
Affiliation(s)
- Theo Meuwissen
  - Norwegian University of Life Sciences, Box 5003, 1432, Ås, Norway
- Mike Goddard
  - Agriculture Victoria, Bundoora, Australia; Faculty of Veterinary and Agricultural Sciences, University of Melbourne, Parkville, Australia
|
24
|
Marjanovic J, Hulsegge B, Calus MPL. Relatedness between numerically small Dutch Red dairy cattle populations and possibilities for multibreed genomic prediction. J Dairy Sci 2021; 104:4498-4506. [PMID: 33551169 DOI: 10.3168/jds.2020-19573] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/02/2020] [Accepted: 11/05/2020] [Indexed: 11/19/2022]
Abstract
Red dairy breeds are a valuable cultural and historical asset, and often a source of unique genetic diversity. However, they have difficulties competing with other, more productive, dairy breeds. Improving competitiveness of Red dairy breeds, by accelerating their genetic improvement using genomic selection, may be a promising strategy to secure their long-term future. For many Red dairy breeds, establishing a sufficiently large breed-specific reference population for genomic prediction is often not possible, but may be overcome by adding individuals from another breed. Relatedness between breeds strongly decides the benefit of adding another breed to the reference population. To prioritize among available breeds, the effective number of chromosome segments (Me) can be used as an indicator of relatedness between individuals from different breeds. The Me is also an important parameter in determining the accuracy of genomic prediction. The Me can be estimated both within a population and between 2 populations or breeds, as the reciprocal of the variance of genomic relationships. We investigated relatedness between 6 Dutch Red cattle breeds, Groningen White Headed (GWH), Dutch Friesian (DF), Meuse-Rhine-Yssel (MRY), Dutch Belted (DB), Deep Red (DR), and Improved Red (IR), focusing primarily on the Me, to predict which of those breeds may benefit from including reference animals of the other breeds. All of these breeds, except MRY, are at high risk of extinction. Our results indicated high variability of Me, especially of the between-breed Me, which ranged from ∼3,500 to ∼17,400, indicating different levels of relatedness between the breeds. Two clusters are especially important, one formed by MRY, DR, and IR, and the other comprising DF and DB. Although relatedness between breeds within each of these 2 clusters is high, across-breed genomic prediction is still limited by the current number of genotyped individuals, which for many breeds is low. However, adding MRY individuals would increase the reference population of DR substantially. We estimated that between 11 and 133 individuals from other breeds are needed to achieve accuracy of genomic prediction equivalent to using one additional individual from the same breed. Given the variation in size of the breeds in this study, the benefit of a multibreed reference population is expected to be lower for larger breeds than for the smaller ones.
Affiliation(s)
- J Marjanovic
  - Animal Breeding and Genomics, Wageningen University & Research, Droevendaalsesteeg 1, 6700AH Wageningen, the Netherlands
- B Hulsegge
  - Animal Breeding and Genomics, Wageningen University & Research, Droevendaalsesteeg 1, 6700AH Wageningen, the Netherlands; Centre for Genetic Resources, Wageningen University & Research, Droevendaalsesteeg 1, 6700AH Wageningen, the Netherlands
- M P L Calus
  - Animal Breeding and Genomics, Wageningen University & Research, Droevendaalsesteeg 1, 6700AH Wageningen, the Netherlands
|
25
|
van den Berg I, Ho PN, Luke TDW, Haile-Mariam M, Bolormaa S, Pryce JE. The use of milk mid-infrared spectroscopy to improve genomic prediction accuracy of serum biomarkers. J Dairy Sci 2020; 104:2008-2017. [PMID: 33358169 DOI: 10.3168/jds.2020-19468] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 08/12/2020] [Accepted: 10/07/2020] [Indexed: 01/24/2023]
Abstract
Breeding objectives in the dairy industry have shifted from being solely focused on production to including fertility, animal health, and environmental impact. Increased serum concentrations of candidate biomarkers of health and fertility, such as β-hydroxybutyric acid (BHB), fatty acids, and urea are difficult and costly to measure, and thus limit the number of records. Accurate genomic prediction requires a large reference population. The inclusion of milk mid-infrared (MIR) spectroscopic predictions of biomarkers may increase genomic prediction accuracy of these traits. Our objectives were to (1) estimate the heritability of, and genetic correlations between, selected serum biomarkers and their respective MIR predictions, and (2) evaluate genomic prediction accuracies of either only measured serum traits, or serum traits plus MIR-predicted traits. The MIR-predicted traits were either fitted in a single-trait model, assuming the measured trait and predicted trait were the same trait, or in a multitrait model, where measured and predicted trait were assumed to be correlated traits. We performed all analyses using relationship matrices constructed from pedigree (A matrix), genotypes (G matrix), or both pedigree and genotypes (H matrix). Our data set comprised up to 2,198 and 9,657 Holstein cows with records for serum biomarkers and MIR-predicted traits, respectively. Heritabilities of measured serum traits ranged from 0.04 to 0.07 for BHB, from 0.13 to 0.21 for fatty acids, and from 0.10 to 0.12 for urea. Heritabilities for MIR-predicted traits were not significantly different from those for the measured traits. Genetic correlations between measured traits and MIR-predicted traits were close to 1 for urea. For BHB and fatty acids, genetic correlations were lower and had large standard errors. The inclusion of MIR-predicted urea substantially increased prediction accuracy for urea. For BHB, including MIR-predicted BHB reduced the genomic prediction accuracy, whereas for fatty acids, prediction accuracies were similar with either measured fatty acids, MIR-predicted fatty acids, or both. The high genetic correlation between urea and MIR-predicted urea, in combination with the increased prediction accuracy, demonstrated the potential of using MIR-predicted urea for genomic prediction of urea. For BHB and fatty acids, further studies with larger data sets are required to obtain more accurate estimates of genetic correlations.
Affiliation(s)
- I van den Berg
  - Agriculture Victoria Research, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria 3083, Australia
- P N Ho
  - Agriculture Victoria Research, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria 3083, Australia
- T D W Luke
  - Agriculture Victoria Research, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria 3083, Australia; School of Applied Systems Biology, La Trobe University, Bundoora, Victoria 3083, Australia
- M Haile-Mariam
  - Agriculture Victoria Research, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria 3083, Australia
- S Bolormaa
  - Agriculture Victoria Research, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria 3083, Australia
- J E Pryce
  - Agriculture Victoria Research, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria 3083, Australia; School of Applied Systems Biology, La Trobe University, Bundoora, Victoria 3083, Australia
|
26
|
Khansefid M, Goddard ME, Haile-Mariam M, Konstantinov KV, Schrooten C, de Jong G, Jewell EG, O’Connor E, Pryce JE, Daetwyler HD, MacLeod IM. Improving Genomic Prediction of Crossbred and Purebred Dairy Cattle. Front Genet 2020; 11:598580. [PMID: 33381150 PMCID: PMC7767986 DOI: 10.3389/fgene.2020.598580] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 08/25/2020] [Accepted: 11/19/2020] [Indexed: 11/17/2022] Open
Abstract
This study assessed the accuracy and bias of genomic prediction (GP) in purebred Holstein (H) and Jersey (J) as well as crossbred (H and J) validation cows using different reference sets and prediction strategies. The reference sets were made up of different combinations of 36,695 H and J purebreds and crossbreds. Additionally, the effect of using different sets of marker genotypes on GP was studied (conventional panel: 50k, custom panel enriched with, or close to, causal mutations: XT_50k, and conventional high-density with a limited custom set: pruned HDnGBS). We also compared the use of genomic best linear unbiased prediction (GBLUP) and Bayesian (emBayesR) models, and the traits tested were milk, fat, and protein yields. On average, by including crossbred cows in the reference population, the prediction accuracies increased by 0.01-0.08 and were less biased (regression coefficient closer to 1 by 0.02-0.16), and the benefit was greater for crossbreds compared to purebreds. The accuracy of prediction increased by 0.02 using XT_50k compared to 50k genotypes without affecting the bias. Although using pruned HDnGBS instead of 50k also increased the prediction accuracy by about 0.02, it increased the bias for purebred predictions in emBayesR models. Generally, emBayesR outperformed GBLUP for prediction accuracy when using 50k or pruned HDnGBS genotypes, but the benefits diminished with XT_50k genotypes. Crossbred predictions derived from a joint pure H and J reference were similar in accuracy to crossbred predictions derived from the two separate purebred reference sets and combined proportional to breed composition. However, the latter approach was less biased by 0.13. Most interestingly, using an equalized breed reference instead of an H-dominated reference, on average, reduced the bias of prediction by 0.16-0.19 and increased the accuracy by 0.04 for crossbred and J cows, with little change in the H accuracy. In conclusion, we observed improved genomic predictions for both crossbreds and purebreds by equalizing breed contributions in a mixed breed reference that included crossbred cows. Furthermore, we demonstrate that, compared to the conventional 50k or high-density panels, our customized set of 50k sequence markers improved or matched the prediction accuracy and reduced bias with both GBLUP and Bayesian models.
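The two validation statistics reported in this abstract can be illustrated with a short sketch. It assumes the common convention that accuracy is the Pearson correlation between observed and predicted values in the validation set, and that bias is judged from the slope of the regression of observed on predicted values (a slope of 1 meaning no inflation or deflation). The function name and the toy data are hypothetical, and the abstract does not state the exact definitions used by the authors.

```python
import numpy as np


def accuracy_and_bias(observed, predicted):
    """Return (correlation, regression slope of observed on predicted)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    accuracy = np.corrcoef(observed, predicted)[0, 1]
    # Slope of observed ~ predicted; values below 1 indicate inflated predictions.
    slope = np.cov(observed, predicted, ddof=1)[0, 1] / np.var(predicted, ddof=1)
    return accuracy, slope


# Hypothetical validation data: predictions rank animals perfectly
# but their spread is twice as large as the spread of the observations.
predicted = np.array([1.0, 2.0, 3.0, 4.0])
observed = 0.5 * predicted + 1.0
acc, slope = accuracy_and_bias(observed, predicted)
```

In this toy case the correlation is perfect while the slope is 0.5, which shows why the two statistics are reported separately: a change of reference set can leave accuracy almost unchanged and still move the regression coefficient closer to 1.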
Affiliation(s)
- Majid Khansefid
  - AgriBio Centre for AgriBioscience, Agriculture Victoria Services, Bundoora, VIC, Australia
- Michael E. Goddard
  - AgriBio Centre for AgriBioscience, Agriculture Victoria Services, Bundoora, VIC, Australia
  - Faculty of Veterinary and Agricultural Sciences, University of Melbourne, Parkville, VIC, Australia
- Mekonnen Haile-Mariam
  - AgriBio Centre for AgriBioscience, Agriculture Victoria Services, Bundoora, VIC, Australia
- Jennie E. Pryce
  - AgriBio Centre for AgriBioscience, Agriculture Victoria Services, Bundoora, VIC, Australia
  - School of Applied Systems Biology, La Trobe University, Bundoora, VIC, Australia
- Hans D. Daetwyler
  - AgriBio Centre for AgriBioscience, Agriculture Victoria Services, Bundoora, VIC, Australia
  - School of Applied Systems Biology, La Trobe University, Bundoora, VIC, Australia
- Iona M. MacLeod
  - AgriBio Centre for AgriBioscience, Agriculture Victoria Services, Bundoora, VIC, Australia
|
27
|
Wientjes YCJ, Bijma P, Calus MPL. Optimizing genomic reference populations to improve crossbred performance. Genet Sel Evol 2020; 52:65. [PMID: 33158416 PMCID: PMC7648379 DOI: 10.1186/s12711-020-00573-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 12/20/2019] [Accepted: 09/18/2020] [Indexed: 11/10/2022] Open
Abstract
Background: In pig and poultry breeding, the objective is to improve the performance of crossbred production animals, while selection takes place in the purebred parent lines. One way to achieve this is to use genomic prediction with a crossbred reference population. A crossbred reference population benefits from expressing the breeding goal trait but suffers from a lower genetic relatedness with the purebred selection candidates than a purebred reference population. Our aim was to investigate the benefit of using a crossbred reference population for genomic prediction of crossbred performance for: (1) different levels of relatedness between the crossbred reference population and purebred selection candidates, (2) different levels of the purebred-crossbred correlation, and (3) different reference population sizes. We simulated a crossbred breeding program with 0, 1 or 2 multiplication steps to generate the crossbreds, and compared the accuracy of genomic prediction of crossbred performance in one generation using either a purebred or a crossbred reference population. For each scenario, we investigated the empirical accuracy based on simulation and the predicted accuracy based on the estimated effective number of independent chromosome segments between the reference animals and selection candidates. Results: When the purebred-crossbred correlation was 0.75, the accuracy was highest for a two-way crossbred reference population but similar for purebred and four-way crossbred reference populations, for all reference population sizes. When the purebred-crossbred correlation was 0.5, a purebred reference population always resulted in the lowest accuracy. Among the different crossbred reference populations, the accuracy was slightly lower when more multiplication steps were used to create the crossbreds. In general, the benefit of crossbred reference populations increased when the size of the reference population increased. All predicted accuracies overestimated their corresponding empirical accuracies, but the different scenarios were ranked accurately when the reference population was large. Conclusions: The benefit of a crossbred reference population becomes larger when the crossbred population is more related to the purebred selection candidates, when the purebred-crossbred correlation is lower, and when the reference population is larger. The purebred-crossbred correlation and reference population size interact with each other with respect to their impact on the accuracy of genomic estimated breeding values.
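The abstract refers to a predicted accuracy based on the effective number of independent chromosome segments (Me) without giving the equation. A widely used deterministic form is r = r_pc * sqrt(N h² / (N h² + Me)), and the sketch below uses it purely as an assumption; the equation in the paper may differ. The function name, the parameter names, and the example values are hypothetical.

```python
import math


def predicted_accuracy(n_ref, h2, me, r_pc=1.0):
    """Deterministic accuracy sketch: r_pc * sqrt(N*h2 / (N*h2 + Me)).

    n_ref: reference population size
    h2:    heritability of the trait recorded in the reference population
    me:    effective number of chromosome segments between reference
           animals and selection candidates
    r_pc:  purebred-crossbred correlation (1.0 when the reference
           population records the breeding goal trait itself)
    """
    if n_ref < 0 or not 0.0 <= h2 <= 1.0 or me <= 0:
        raise ValueError("expected n_ref >= 0, 0 <= h2 <= 1 and me > 0")
    info = n_ref * h2
    return r_pc * math.sqrt(info / (info + me))


# Hypothetical scenarios, chosen only to show the direction of the effects.
small = predicted_accuracy(n_ref=1000, h2=0.3, me=1000)
large = predicted_accuracy(n_ref=10000, h2=0.3, me=1000)
purebred_ref = predicted_accuracy(n_ref=10000, h2=0.3, me=1000, r_pc=0.75)
```

Under this form, a purebred reference population is penalised by the factor r_pc, while a crossbred reference population is penalised by a larger Me because it is less related to the purebred candidates. Which penalty dominates depends on N, which is consistent with the interaction between the purebred-crossbred correlation and reference population size noted in the Conclusions.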
Affiliation(s)
- Yvonne C J Wientjes
  - Wageningen University & Research, Animal Breeding and Genomics, 6700 AH, Wageningen, The Netherlands
- Piter Bijma
  - Wageningen University & Research, Animal Breeding and Genomics, 6700 AH, Wageningen, The Netherlands
- Mario P L Calus
  - Wageningen University & Research, Animal Breeding and Genomics, 6700 AH, Wageningen, The Netherlands
|
28
|
van den Berg I, MacLeod I, Reich C, Breen E, Pryce J. Optimizing genomic prediction for Australian Red dairy cattle. J Dairy Sci 2020; 103:6276-6298. [DOI: 10.3168/jds.2019-17914] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 11/15/2019] [Accepted: 02/13/2020] [Indexed: 12/18/2022]
|
29
|
Raymond B, Wientjes YCJ, Bouwman AC, Schrooten C, Veerkamp RF. A deterministic equation to predict the accuracy of multi-population genomic prediction with multiple genomic relationship matrices. Genet Sel Evol 2020; 52:21. [PMID: 32345213 PMCID: PMC7189707 DOI: 10.1186/s12711-020-00540-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 09/11/2019] [Accepted: 04/14/2020] [Indexed: 11/10/2022] Open
Abstract
Background: A multi-population genomic prediction (GP) model in which important pre-selected single nucleotide polymorphisms (SNPs) are differentially weighted (MPMG) has been shown to result in better prediction accuracy than a multi-population, single genomic relationship matrix ([Formula: see text]) GP model (MPSG) in which all SNPs are weighted equally. Our objective was to underpin theoretically the advantages and limits of the MPMG model over the MPSG model, by deriving and validating a deterministic prediction equation for its accuracy. Methods: Using selection index theory, we derived an equation to predict the accuracy of estimated total genomic values of selection candidates from population [Formula: see text] ([Formula: see text]), when individuals from two populations, [Formula: see text] and [Formula: see text], are combined in the training population and two [Formula: see text], made respectively from pre-selected and remaining SNPs, are fitted simultaneously in MPMG. We used simulations to validate the prediction equation in scenarios that differed in the level of genetic correlation between populations, heritability, and proportion of genetic variance explained by the pre-selected SNPs. Empirical accuracy of the MPMG model in each scenario was calculated and compared to the predicted accuracy from the equation. Results: In general, the derived prediction equation resulted in accurate predictions of [Formula: see text] for the scenarios evaluated. Using the prediction equation, we showed that an important advantage of the MPMG model over the MPSG model is its ability to benefit from the small number of independent chromosome segments ([Formula: see text]) due to the pre-selected SNPs, both within and across populations, whereas for the MPSG model, there is only a single value for [Formula: see text], calculated based on all SNPs, which is very large. However, this advantage is dependent on the pre-selected SNPs that explain some proportion of the total genetic variance for the trait. Conclusions: We developed an equation that gives insight into why, and under which conditions, the MPMG outperforms the MPSG model for GP. The equation can be used as a deterministic tool to assess the potential benefit of combining information from different populations, e.g., different breeds or lines for GP in livestock or plants, or different groups of people based on their ethnic background for prediction of disease risk scores.
Affiliation(s)
- Biaty Raymond
  - Animal Breeding and Genomics, Wageningen University and Research, P.O. Box 338, 6700 AH, Wageningen, The Netherlands; Biometris, Wageningen University and Research, 6700AA, Wageningen, The Netherlands
- Yvonne C J Wientjes
  - Animal Breeding and Genomics, Wageningen University and Research, P.O. Box 338, 6700 AH, Wageningen, The Netherlands
- Aniek C Bouwman
  - Animal Breeding and Genomics, Wageningen University and Research, P.O. Box 338, 6700 AH, Wageningen, The Netherlands
- Roel F Veerkamp
  - Animal Breeding and Genomics, Wageningen University and Research, P.O. Box 338, 6700 AH, Wageningen, The Netherlands
|
30
|
Theoretical Evaluation of Multi-Breed Genomic Prediction in Chinese Indigenous Cattle. Animals (Basel) 2019; 9:ani9100789. [PMID: 31614691 PMCID: PMC6827096 DOI: 10.3390/ani9100789] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/30/2019] [Revised: 09/24/2019] [Accepted: 10/02/2019] [Indexed: 12/19/2022] Open
Abstract
Simple Summary: In order to evaluate the potential application of genomic selection (GS) for Chinese indigenous cattle, we assessed the influence of combining multiple populations on the reliability of genomic predictions for 10 indigenous breeds of Chinese cattle using simulated data. We found the predictive accuracies to be low when the reference and validation populations were sampled from different breeds. When using multiple breeds for the reference population, the predictive accuracies were higher if the reference was comprised of breeds with close relationships. In addition, the accuracy increased in all scenarios when the heritability increased, and the genetic architecture of the QTL can affect genomic prediction. Our study suggested that the application of meta-populations can increase accuracy in scenarios with a reduced size of reference populations. Abstract: Genomic selection (GS) has been widely considered as a valuable strategy for enhancing the rate of genetic gain in farm animals. However, the construction of a large reference population is a big challenge for small populations like indigenous cattle. In order to evaluate the potential application of GS for Chinese indigenous cattle, we assessed the influence of combining multiple populations on the reliability of genomic predictions for 10 indigenous breeds of Chinese cattle using simulated data. Also, we examined the effect of different genetic architectures on prediction accuracy. In this study, we simulated a set of genotype data by a resampling approach which can reflect the realistic linkage disequilibrium pattern for multiple populations. We found that within-breed evaluations yielded the highest accuracies, ranging from 0.64 to 0.68 for four different simulated genetic architectures. For scenarios using multiple breeds as reference, the predictive accuracies were higher when the reference was comprised of breeds with a close relationship, while the accuracies were low when predictions were carried out across breeds. In addition, the accuracy increased in all scenarios as the heritability increased. Our results suggested that using a meta-population as the reference can increase the accuracy of genomic predictions for small populations. Moreover, multi-breed genomic selection was feasible for Chinese indigenous populations with genetic relationships.
|