1
Lee J, Mun H, Koo Y, Park S, Kim J, Yu S, Shin J, Lee J, Son J, Park C, Lee S, Song H, Kim S, Dang C, Park J. Enhancing Genomic Prediction Accuracy for Body Conformation Traits in Korean Holstein Cattle. Animals (Basel) 2024; 14:1052. [PMID: 38612291 PMCID: PMC11011013 DOI: 10.3390/ani14071052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2023] [Revised: 03/18/2024] [Accepted: 03/28/2024] [Indexed: 04/14/2024] Open
Abstract
The Holstein breed is the mainstay of dairy production in Korea. In this study, we evaluated the genomic prediction accuracy for body conformation traits in Korean Holstein cattle, using a range of π levels (0.75, 0.90, 0.99, and 0.995) in Bayesian methods (BayesB and BayesC). Focusing on 24 traits, we analyzed the impact of different π levels on prediction accuracy. We observed a general increase in accuracy at higher levels for specific traits, with variations depending on the Bayesian method applied. Notably, the highest accuracy was achieved for rear teat angle when using deregressed estimated breeding values including parent average as a response variable. We further demonstrated that incorporating parent average into deregressed estimated breeding values enhances genomic prediction accuracy, showcasing the effectiveness of the model in integrating both offspring and parental genetic information. Additionally, we identified 18 significant window regions through genome-wide association studies, which are crucial for future fine mapping and discovery of causal mutations. These findings provide valuable insights into the efficiency of genomic selection for body conformation traits in Korean Holstein cattle and highlight the potential for advancements in the prediction accuracy using larger datasets and more sophisticated genomic models.
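To make the role of π in the abstract above concrete, here is a minimal sketch. It assumes the usual BayesB/BayesC mixture prior, in which π is the prior probability that a marker has zero effect, so a higher π means fewer markers in the model. The panel size `N_MARKERS` is hypothetical and not taken from the study, and both function names are illustrative.

```python
import random


def expected_nonzero(n_markers: int, pi: float) -> float:
    """Expected number of markers with a non-zero effect under the mixture prior."""
    return n_markers * (1.0 - pi)


def sample_inclusion(n_markers: int, pi: float, seed: int = 1) -> list:
    """Draw 0/1 inclusion indicators: a marker is included with probability 1 - pi."""
    rng = random.Random(seed)
    # rng.random() is uniform on [0, 1), so the test below is true with probability 1 - pi.
    return [1 if rng.random() >= pi else 0 for _ in range(n_markers)]


N_MARKERS = 50_000  # hypothetical panel size, not reported in the abstract

for pi in (0.75, 0.90, 0.99, 0.995):  # the pi levels named in the abstract
    included = sum(sample_inclusion(N_MARKERS, pi))
    print(pi, round(expected_nonzero(N_MARKERS, pi)), included)
```

Under this assumption, moving from π = 0.75 to π = 0.995 shrinks the expected number of markers carrying an effect from a quarter of the panel to half a percent of it, which is why the choice of π can matter for traits controlled by few or many loci.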
Affiliation(s)
- Jungjae Lee
- Department of Animal Science and Technology, College of Biotechnology and Natural Resources, Chung-Ang University, Anseong 17546, Republic of Korea
- Hyosik Mun
- Korea Animal Improvement Association, Seoul 06668, Republic of Korea; (H.M.); (Y.K.); (S.P.); (J.K.); (S.Y.); (J.S.); (C.P.); (S.K.)
- Yangmo Koo
- Korea Animal Improvement Association, Seoul 06668, Republic of Korea; (H.M.); (Y.K.); (S.P.); (J.K.); (S.Y.); (J.S.); (C.P.); (S.K.)
- Sangchul Park
- Korea Animal Improvement Association, Seoul 06668, Republic of Korea; (H.M.); (Y.K.); (S.P.); (J.K.); (S.Y.); (J.S.); (C.P.); (S.K.)
- Junsoo Kim
- Korea Animal Improvement Association, Seoul 06668, Republic of Korea; (H.M.); (Y.K.); (S.P.); (J.K.); (S.Y.); (J.S.); (C.P.); (S.K.)
- Seongpil Yu
- Korea Animal Improvement Association, Seoul 06668, Republic of Korea; (H.M.); (Y.K.); (S.P.); (J.K.); (S.Y.); (J.S.); (C.P.); (S.K.)
- Jiseob Shin
- Dairy Cattle Improvement Center of NH-Agree Business Group, National Agricultural Cooperative Federation, Goyang 10292, Republic of Korea; (J.S.); (S.L.); (H.S.)
- Jaegu Lee
- Animal Breeding and Genetics Division, National Institute of Animal Science, Rural Development Administration, Cheonan 31000, Republic of Korea
- Jihyun Son
- Korea Animal Improvement Association, Seoul 06668, Republic of Korea; (H.M.); (Y.K.); (S.P.); (J.K.); (S.Y.); (J.S.); (C.P.); (S.K.)
- Chanhyuk Park
- Korea Animal Improvement Association, Seoul 06668, Republic of Korea; (H.M.); (Y.K.); (S.P.); (J.K.); (S.Y.); (J.S.); (C.P.); (S.K.)
- Seokhyun Lee
- Dairy Cattle Improvement Center of NH-Agree Business Group, National Agricultural Cooperative Federation, Goyang 10292, Republic of Korea; (J.S.); (S.L.); (H.S.)
- Hyungjun Song
- Dairy Cattle Improvement Center of NH-Agree Business Group, National Agricultural Cooperative Federation, Goyang 10292, Republic of Korea; (J.S.); (S.L.); (H.S.)
- Sungjin Kim
- Korea Animal Improvement Association, Seoul 06668, Republic of Korea; (H.M.); (Y.K.); (S.P.); (J.K.); (S.Y.); (J.S.); (C.P.); (S.K.)
- Changgwon Dang
- Animal Breeding and Genetics Division, National Institute of Animal Science, Rural Development Administration, Cheonan 31000, Republic of Korea
- Jun Park
- Department of Animal Biotechnology, Jeonbuk National University, Jeonju 54896, Republic of Korea
2
Pedrosa VB, Chen SY, Gloria LS, Doucette JS, Boerman JP, Rosa GJM, Brito LF. Machine learning methods for genomic prediction of cow behavioral traits measured by automatic milking systems in North American Holstein cattle. J Dairy Sci 2024:S0022-0302(24)00497-1. [PMID: 38395400 DOI: 10.3168/jds.2023-24082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2023] [Accepted: 01/18/2024] [Indexed: 02/25/2024]
Abstract
Identifying genome-enabled methods that provide more accurate genomic prediction is crucial when evaluating complex traits such as dairy cow behavior. In this study, we aimed to compare the predictive performance of traditional genomic prediction methods and deep learning algorithms for genomic prediction of milking refusals (MREF) and milking failures (MFAIL) in North American Holstein cows measured by automatic milking systems (milking robots). A total of 1,993,509 daily records from 4,511 genotyped Holstein cows were collected by 36 milking robot stations. After quality control, 57,600 single nucleotide polymorphisms (SNP) were available for the analyses. Four genomic prediction methods were considered: Bayesian Lasso (LASSO), Multiple Layer Perceptron (MLP), Convolutional Neural Network (CNN), and Genomic Best Linear Unbiased Prediction (GBLUP). We implemented the first 3 methods using the Keras and TensorFlow libraries in Python (v.3.9) while the GBLUP method was implemented using the BLUPF90+ family programs. The accuracy of genomic prediction (Mean Square Error) for MREF and MFAIL was 0.34 (0.08) and 0.27 (0.08) based on LASSO, 0.36 (0.09) and 0.32 (0.09) for MLP, 0.37 (0.08) and 0.30 (0.09) for CNN, and 0.35 (0.09) and 0.31(0.09) based on GBLUP, respectively. Additionally, we observed a lower re-ranking of top selected individuals based on the MLP versus CNN methods compared with the other approaches for both MREF and MFAIL. Although the deep learning methods showed slightly higher accuracies than GBLUP, the results may not be sufficient to justify their use over traditional methods due to their higher computational demand and the difficulty of performing genomic prediction for non-genotyped individuals using deep learning procedures. Overall, this study provides insights into the potential feasibility of using deep learning methods to enhance genomic prediction accuracy for behavioral traits in livestock. 
Further research is needed to determine their practical applicability to large dairy cattle breeding programs.
Affiliation(s)
- Victor B Pedrosa
- Department of Animal Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Shi-Yi Chen
- Department of Animal Sciences, Purdue University, West Lafayette, IN, 47907, USA; Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu, Sichuan, 611130, China
- Leonardo S Gloria
- Department of Animal Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Jarrod S Doucette
- Agriculture Information Technology (AgIT), Purdue University, West Lafayette, IN, 47907, USA
- Jacquelyn P Boerman
- Department of Animal Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Guilherme J M Rosa
- Department of Animal and Dairy Sciences, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Luiz F Brito
- Department of Animal Sciences, Purdue University, West Lafayette, IN, 47907, USA
3
Afrazandeh M, Abdolahi-Arpanahi R, Abbasi MA, Kashan NEJ, Torshizi RV. Comparison of different response variables in genomic prediction using GBLUP and ssGBLUP methods in Iranian Holstein cattle. J Dairy Res 2022; 89:1-7. [PMID: 35604025 DOI: 10.1017/s0022029922000395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
Abstract
We compared the reliability and bias of genomic evaluation of Holstein bulls for milk, fat, and protein yield with two methods of genomic best linear unbiased prediction (GBLUP) and single-step GBLUP (ssGBLUP). Four response variables of estimated breeding value (EBV), daughter yield deviation (DYD), de-regressed proofs based on Garrick (DRPGR) and VanRaden (DRPVR) were used as dependent variables. The effects of three weighting methods for diagonal elements of the incidence matrix associated with residuals were also explored. The reliability and the absolute deviation from 1 of the regression coefficient of the response variable on genomic prediction (Dev) using GBLUP and ssGBLUP methods were estimated in the validation population. In the ssGBLUP method, the genomic prediction reliability and Dev from un-weighted DRPGR method for milk yield were 0.44 and 0.002, respectively. In the GBLUP method, the corresponding measurements from un-weighted EBV for fat were 0.52 and 0.008, respectively. Moreover, the un-weighted DRPGR performed well in ssGBLUP with fat yield values for reliability and Dev of 0.49 and 0.001, respectively, compared to equivalent protein yield values of 0.38 and 0.056, respectively. In general, the results from ssGBLUP of the un-weighted DRPGR for milk and fat yield and weighted DRPGR for protein yield outperformed other models. The average reliability of genomic predictions for three traits from ssGBLUP was 0.39 which was 0.98% higher than the average reliability from GBLUP. Likewise, the Dev of genomic predictions was lower in ssGBLUP than GBLUP. The average Dev of predictions for three traits from ssGBLUP and GBLUP were 0.110 and 0.144, respectively. In conclusion, genomic prediction using ssGBLUP outperformed GBLUP both in terms of reliability and bias.
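The bias measure in the abstract above, Dev, is the absolute deviation from 1 of the regression coefficient of the response variable on the genomic prediction. A minimal sketch of that calculation follows. The squared correlation is included only as a simple stand-in for reliability, which is an assumption (the abstract does not give the reliability formula), and the toy numbers are invented for illustration.

```python
def mean(values):
    return sum(values) / len(values)


def regression_slope(response, prediction):
    """Slope b of the regression of the response variable on the genomic prediction."""
    mx, my = mean(prediction), mean(response)
    sxy = sum((x - mx) * (y - my) for x, y in zip(prediction, response))
    sxx = sum((x - mx) ** 2 for x in prediction)
    return sxy / sxx


def dev(response, prediction):
    """Absolute deviation of the regression slope from 1; 0 means no scale bias."""
    return abs(1.0 - regression_slope(response, prediction))


def squared_correlation(response, prediction):
    """Squared Pearson correlation, used here as an assumed proxy for reliability."""
    mx, my = mean(prediction), mean(response)
    sxy = sum((x - mx) * (y - my) for x, y in zip(prediction, response))
    sxx = sum((x - mx) ** 2 for x in prediction)
    syy = sum((y - my) ** 2 for y in response)
    return sxy * sxy / (sxx * syy)


# Invented validation data: genomic predictions and de-regressed proofs.
gebv = [0.2, -0.1, 0.5, 0.0, -0.4]
drp = [0.3, -0.2, 0.6, 0.1, -0.5]
print(dev(drp, gebv), squared_correlation(drp, gebv))
```

A slope above 1 means the predictions are compressed relative to the response and a slope below 1 means they are inflated; Dev treats both directions alike.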
Affiliation(s)
- Mohamadreza Afrazandeh
- Department of Animal Science, Faculty of Agriculture Sciences and Food Industries, Science and Research Branch, Islamic Azad University, Tehran, Iran
- Rostam Abdolahi-Arpanahi
- Department of Animal and Dairy Science, College of Agricultural and Environmental Sciences, University of Georgia, Athens, USA
- Mokhtar Ali Abbasi
- Animal Science Research Institute of Iran, Agricultural Research, Education and Extension Organization (AREEO), Karaj, Iran
- Nasser Emam Jomeh Kashan
- Department of Animal Science, Faculty of Agriculture Sciences and Food Industries, Science and Research Branch, Islamic Azad University, Tehran, Iran
- Rasoul Vaez Torshizi
- Department of Animal Science, Faculty of Agriculture, Tarbiat Modares University, Tehran, Iran
4
Understanding the genomic architecture of clinical mastitis in Bos indicus. 3 Biotech 2021; 11:466. [PMID: 34745817 DOI: 10.1007/s13205-021-03012-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/01/2021] [Accepted: 10/01/2021] [Indexed: 12/26/2022] Open
Abstract
This study elucidated potential genetic variants and QTLs associated with clinical mastitis incidence traits in the Bos indicus breed Sahiwal. Estimated breeding values for the traits (calculated using Bayesian inference) were used as pseudo-phenotypes for association with genome-wide SNPs, and QTL regions underlying the traits were then identified. In all, 25 SNPs were found to be associated with the traits at the genome-wide suggestive threshold (p ≤ 5 × 10⁻⁴), and these SNPs were used to define QTL boundaries based on the linkage disequilibrium structure. A total of 16 QTLs were associated with the trait EBVs, including seven each for clinical mastitis incidence (CMI) in first and second lactations and two for CMI in third lactation. Nine of the sixteen QTLs overlapped with previously reported QTLs for mastitis traits, whereas seven were judged to be novel. Important candidates for clinical mastitis in the identified QTL regions included the DNAJB9, ELMO1, ARHGAP26, NR3C1, CACNB2, RAB4A, GRB2, NUP85, SUMO2, RBPJ, and RAB33B genes. These findings shed light on the genetic architecture of the disease in Bos indicus and present potential regions for fine mapping and downstream analysis in future.
5
Liu L, Zhou J, Chen CJ, Zhang J, Wen W, Tian J, Zhang Z, Gu Y. GWAS-Based Identification of New Loci for Milk Yield, Fat, and Protein in Holstein Cattle. Animals (Basel) 2020; 10:ani10112048. [PMID: 33167458 PMCID: PMC7694478 DOI: 10.3390/ani10112048] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 09/23/2020] [Revised: 11/01/2020] [Accepted: 11/03/2020] [Indexed: 12/20/2022] Open
Abstract
Simple Summary: Understanding the genetic architecture underlying milk production traits in cattle is beneficial so that genetic variants can be targeted for genetic improvement. In this study, we performed a genome-wide association study for milk production and quality traits in Holstein cattle. Of the ten significant single-nucleotide polymorphisms (SNPs) associated with milk fat and protein, six are located in previously reported quantitative trait locus (QTL) regions. The study not only identified the effect of the DGAT1 gene on milk fat and protein but also found several novel candidate genes. In addition, some pleiotropic SNPs and QTLs were identified that were associated with more than two traits; these results could provide a basis for molecular breeding in dairy cattle.
Abstract: High milk yield and high milk quality are the primary goals of dairy production. Understanding the genetic architecture underlying these milk-related traits is beneficial so that genetic variants can be targeted for genetic improvement. In this study, we measured five milk production and quality traits in a Holstein cattle population from China. These traits included milk yield, fat, and protein. We used the estimated breeding values as dependent variables to conduct the genome-wide association studies (GWAS). Breeding values were estimated through pedigree relationships by using a linear mixed model. Genotyping was carried out on the individuals with phenotypes by using the Illumina BovineSNP150 BeadChip. The association analyses were conducted by using the fixed and random model Circulating Probability Unification (FarmCPU) method. A total of ten single-nucleotide polymorphisms (SNPs) were detected above the genome-wide significance threshold (p < 4.0 × 10⁻⁷), including six located in previously reported quantitative trait locus (QTL) regions. We found eight candidate genes within 120 kb upstream or downstream of the associated SNPs. The study not only identified the effect of the DGAT1 gene on milk fat and protein, but also discovered novel genetic loci and candidate genes related to milk traits. These novel genetic loci would be an important basis for molecular breeding in dairy cattle.
Affiliation(s)
- Liyuan Liu
- School of Agriculture, Ningxia University, Yinchuan 750021, Ningxia, China; (L.L.); (J.Z.); (J.Z.)
- Department of Crop and Soil Sciences, Washington State University, Pullman, WA 99164, USA
- Jinghang Zhou
- School of Agriculture, Ningxia University, Yinchuan 750021, Ningxia, China; (L.L.); (J.Z.); (J.Z.)
- Department of Crop and Soil Sciences, Washington State University, Pullman, WA 99164, USA
- Chunpeng James Chen
- Department of Crop and Soil Sciences, Washington State University, Pullman, WA 99164, USA
- Juan Zhang
- School of Agriculture, Ningxia University, Yinchuan 750021, Ningxia, China; (L.L.); (J.Z.); (J.Z.)
- Wan Wen
- Animal Husbandry Workstation, Yinchuan 750001, Ningxia, China; (W.W.); (J.T.)
- Jia Tian
- Animal Husbandry Workstation, Yinchuan 750001, Ningxia, China; (W.W.); (J.T.)
- Zhiwu Zhang
- Department of Crop and Soil Sciences, Washington State University, Pullman, WA 99164, USA
- Correspondence: (Z.Z.); (Y.G.)
- Yaling Gu
- School of Agriculture, Ningxia University, Yinchuan 750021, Ningxia, China; (L.L.); (J.Z.); (J.Z.)
- Correspondence: (Z.Z.); (Y.G.)
6
Klápště J, Dungey HS, Telfer EJ, Suontama M, Graham NJ, Li Y, McKinley R. Marker Selection in Multivariate Genomic Prediction Improves Accuracy of Low Heritability Traits. Front Genet 2020; 11:499094. [PMID: 33193595 PMCID: PMC7662070 DOI: 10.3389/fgene.2020.499094] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 09/19/2019] [Accepted: 09/18/2020] [Indexed: 11/13/2022] Open
Abstract
Multivariate analysis using mixed models allows for the exploration of genetic correlations between traits. Additionally, the transition to a genomic based approach is simplified by substituting classic pedigrees with a marker-based relationship matrix. It also enables the investigation of correlated responses to selection, trait integration and modularity in different kinds of populations. This study investigated a strategy for the construction of a marker-based relationship matrix that prioritized markers using Partial Least Squares. The efficiency of this strategy was found to depend on the correlation structure between investigated traits. In terms of accuracy, we found no benefit of this strategy compared with the all-marker-based multivariate model for the primary trait of diameter at breast height (DBH) in a radiata pine (Pinus radiata) population, possibly due to the presence of strong and well-estimated correlation with other highly heritable traits. Conversely, we did see benefit in a shining gum (Eucalyptus nitens) population, where the primary trait had low or only moderate genetic correlation with other low/moderately heritable traits. Marker selection in multivariate analysis can therefore be an efficient strategy to improve prediction accuracy for low heritability traits due to improved precision in poorly estimated low/moderate genetic correlations. Additionally, our study identified the genetic diversity as a factor contributing to the efficiency of marker selection in multivariate approaches due to higher precision of genetic correlation estimates.
Affiliation(s)
- Jaroslav Klápště
- Scion (New Zealand Forest Research Institute Ltd.), Rotorua, New Zealand
- Heidi S Dungey
- Scion (New Zealand Forest Research Institute Ltd.), Rotorua, New Zealand
- Emily J Telfer
- Scion (New Zealand Forest Research Institute Ltd.), Rotorua, New Zealand
- Mari Suontama
- Scion (New Zealand Forest Research Institute Ltd.), Rotorua, New Zealand; Skogforsk, Umeå, Sweden
- Natalie J Graham
- Scion (New Zealand Forest Research Institute Ltd.), Rotorua, New Zealand
- Yongjun Li
- Scion (New Zealand Forest Research Institute Ltd.), Rotorua, New Zealand; Agriculture Victoria, AgriBio Center, Bundoora, VIC, Australia
- Russell McKinley
- Scion (New Zealand Forest Research Institute Ltd.), Rotorua, New Zealand
7
Genomic Analysis Using Bayesian Methods under Different Genotyping Platforms in Korean Duroc Pigs. Animals (Basel) 2020; 10:ani10050752. [PMID: 32344859 PMCID: PMC7277155 DOI: 10.3390/ani10050752] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/17/2020] [Revised: 04/16/2020] [Accepted: 04/22/2020] [Indexed: 12/03/2022] Open
Abstract
Simple Summary: This study investigated the informative regions and the efficiency of genomic predictions for backfat thickness, days to 90 kg body weight, loin muscle area, and lean percentage in Korean Duroc pigs. Several regions of the genome were identified, and a significant marker was found near the MC4R gene for growth and production-related traits. No differences in genomic accuracy were identified between the Bayesian approaches in these four growth and production-related traits. Genomic accuracy was improved by using deregressed estimated breeding values including parental information as a response variable in Korean Duroc pigs.
Abstract: Genomic evaluation has been widely applied to several species using commercial single nucleotide polymorphism (SNP) genotyping platforms. This study investigated the informative genomic regions and the efficiency of genomic prediction by using two Bayesian approaches (BayesB and BayesC) under two moderate-density SNP genotyping panels in Korean Duroc pigs. A total of 1026 individuals with growth and production records were genotyped using two medium-density SNP genotyping platforms: Illumina60K and GeneSeek80K. These platforms consisted of 61,565 and 68,528 SNP markers, respectively. The deregressed estimated breeding values (DEBVs) derived from estimated breeding values (EBVs) and their reliabilities were taken as response variables. Two Bayesian approaches were implemented to perform the genome-wide association study (GWAS) and genomic prediction. Multiple significant regions for days to 90 kg (DAYS), loin muscle area (LMA), and lean percent (PCL) were detected. The most significant SNP marker, located near the MC4R gene, was detected using GeneSeek80K. Accuracy of genomic predictions was higher using the GeneSeek80K SNP panel for DAYS (Δ2%) and LMA (Δ2–3%) with two response variables, with no gains in accuracy attributable to the choice of Bayesian approach in the four growth and production-related traits. Of the two DEBVs, the one including parental information gave the best genomic prediction accuracy as a response variable, regardless of the genotyping platform or the Bayesian method, in Korean Duroc pig breeding.
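The abstract above takes DEBVs derived from EBVs and their reliabilities as response variables. As a rough sketch only: the simplest deregression divides an EBV by its reliability, which leaves parent-average information in the value. The weight below follows the form commonly attributed to Garrick and colleagues; treating it as the study's weighting is an assumption, and the heritability `h2` and the constant `c` (the share of genetic variance not captured by markers) are hypothetical inputs.

```python
def deregress(ebv: float, reliability: float) -> float:
    """Simple deregression: EBV divided by its reliability (parent average retained)."""
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in (0, 1]")
    return ebv / reliability


def debv_weight(reliability: float, h2: float, c: float) -> float:
    """Assumed weight for a deregressed value; grows with the reliability of the EBV."""
    return (1.0 - h2) / ((c + (1.0 - reliability) / reliability) * h2)


# Invented example: an EBV of 1.2 with reliability 0.6, h2 = 0.3, c = 0.5.
print(deregress(1.2, 0.6), debv_weight(0.6, h2=0.3, c=0.5))
```

Removing the parent average before deregressing needs extra steps that this sketch does not attempt.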
8
Lee J, Lee S, Park JE, Moon SH, Choi SW, Go GW, Lim D, Kim JM. Genome-wide association study and genomic predictions for exterior traits in Yorkshire pigs. J Anim Sci 2019; 97:2793-2802. [PMID: 31087081 PMCID: PMC6606491 DOI: 10.1093/jas/skz158] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 01/25/2019] [Accepted: 05/10/2019] [Indexed: 11/13/2022] Open
Abstract
The objectives of this study were to identify informative genomic regions that affect the exterior traits of purebred Korean Yorkshire pigs and to investigate and compare the accuracy of genomic prediction for response variables. Phenotypic data on body height (BH), body length (BL), and total teat number (TTN) from 2,432 Yorkshire pigs were used to obtain three response variables: the estimated breeding value (EBV) and 2 types of deregressed EBVs, one including the parent average (DEBVincPA) and the other excluding it (DEBVexcPA). A final genotype panel comprising 46,199 SNP markers was retained for analysis after quality control for common SNPs. The BayesB and BayesC methods, with various π and weighted response variables (EBV, DEBVincPA, or DEBVexcPA), were used to estimate SNP effects through the genome-wide association study. The significance of genomic windows (1 Mb) was obtained at 1.0% additive genetic variance and was subsequently used to identify informative genomic regions. Furthermore, SNPs with a high model frequency (≥0.90) were considered informative. The accuracy of genomic prediction was estimated using a 5-fold cross-validation with the K-means clustering method. Genomic accuracy was measured as the genomic correlation between the molecular breeding value and the individual weighted response variables (EBV, DEBVincPA, or DEBVexcPA). The number of identified informative windows (1 Mb) for BH, BL, and TTN was 4, 3, and 4, respectively. The number of significant SNPs for BH, BL, and TTN was 6, 4, and 5, respectively. Different π values did not influence the accuracy of genomic prediction. The BayesB method showed slightly higher genomic accuracy for exterior traits than the BayesC method in this study. In addition, the genomic accuracy using DEBVincPA as response variable was higher than that using other response variables. Therefore, BayesB (π = 0.90) with DEBVincPA as a response variable was the most effective in this study. The genomic accuracy values for BH, BL, and TTN were calculated to be 0.52, 0.60, and 0.51, respectively.
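The cross-validation described above can be sketched as a leave-one-fold-out loop. This is a simplified illustration, not the study's procedure: the fold labels are assumed to come from a prior clustering step such as K-means, `fit_predict` is a hypothetical callback standing in for the BayesB or BayesC fit, and accuracy is taken here as an unweighted Pearson correlation averaged over folds.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cross_validated_accuracy(folds, response, fit_predict):
    """Average over folds of cor(prediction, response) in the held-out fold.

    folds       : one fold label per animal (assumed to come from clustering)
    response    : response variable per animal (e.g. EBV or a deregressed EBV)
    fit_predict : hypothetical callback (train_idx, test_idx) -> predictions
                  for the animals in test_idx, trained only on train_idx
    """
    accuracies = []
    for label in sorted(set(folds)):
        test_idx = [i for i, f in enumerate(folds) if f == label]
        train_idx = [i for i, f in enumerate(folds) if f != label]
        predicted = fit_predict(train_idx, test_idx)
        observed = [response[i] for i in test_idx]
        accuracies.append(pearson(predicted, observed))
    return sum(accuracies) / len(accuracies)
```

Clustering related animals into the same fold, rather than assigning folds at random, keeps close relatives of validation animals out of the training set, which tends to give more conservative accuracy estimates.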
Affiliation(s)
- Jungjae Lee
- Jung P&C Institute, Inc., 1504 U-TOWER, Yongin-si, Gyeonggi-do, Republic of Korea
- SeokHyun Lee
- Division of Animal and Dairy Science, Chungnam National University, Daejeon, Korea
- Jong-Eun Park
- Animal Genomics and Bioinformatics Division, National Institute of Animal Science, RDA, Wanju, Republic of Korea
- Sung-Ho Moon
- National Agricultural Cooperative Federation Agribusiness Group, 92, Daeseong-ro, Daema-myeon, Yeonggwang-gun, Jeollanam-do, Republic of Korea
- Sung-Woon Choi
- National Agricultural Cooperative Federation Agribusiness Group, 92, Daeseong-ro, Daema-myeon, Yeonggwang-gun, Jeollanam-do, Republic of Korea
- Gwang-Woong Go
- Department of Food and Nutrition, Hanyang University, Seoul, Republic of Korea
- Dajeong Lim
- Animal Genomics and Bioinformatics Division, National Institute of Animal Science, RDA, Wanju, Republic of Korea
- Jun-Mo Kim
- Department of Animal Science and Technology, Chung-Ang University, Anseong-si, Gyeonggi-do, Republic of Korea
9
Abdalla E, Lopes F, Byrem T, Weigel K, Rosa G. Genomic prediction of bovine leukosis incidence in a US Holstein population. Livest Sci 2019. [DOI: 10.1016/j.livsci.2019.05.004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/24/2022]
10
de Oliveira HR, Brito LF, Sargolzaei M, E Silva FF, Jamrozik J, Lourenco DAL, Schenkel FS. Impact of including information from bulls and their daughters in the training population of multiple-step genomic evaluations in dairy cattle: A simulation study. J Anim Breed Genet 2019; 136:441-452. [PMID: 31161635 DOI: 10.1111/jbg.12407] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 02/08/2019] [Revised: 05/02/2019] [Accepted: 05/07/2019] [Indexed: 12/23/2022]
Abstract
The objective of this study was to investigate the impact of accounting for parent average (PA) and genotyped daughters' average (GDA) on the estimation of deregressed estimated breeding values (dEBVs) used as pseudo-phenotypes in multiple-step genomic evaluations. Genomic estimated breeding values (GEBVs) were predicted, in eight different simulated scenarios, using dEBVs calculated based on four methods. These methods included PA and GDA in the dEBV (VR) or only GDA (VRpa) and excluded both PA and GDA from the dEBV with either all information or only information from PA and GDA (JA and NEW, respectively). In general, VR and NEW showed the lowest and highest GEBV reliabilities across scenarios, respectively. Among all deregression methods, VRpa and NEW provided the most consistent bias estimates across the majority of scenarios, and they significantly yielded the least biased GEBVs. Our results indicate that removing PA and GDA information from dEBVs used in multiple-step genomic evaluations can increase the reliability of GEBVs, when both bulls and their daughters are included in the training population.
Affiliation(s)
- Hinayah Rojas de Oliveira
- Department of Animal Science, Universidade Federal de Viçosa, Viçosa, Minas Gerais, Brazil; Department of Animal Biosciences, University of Guelph, Guelph, Ontario, Canada
- Luiz Fernando Brito
- Department of Animal Biosciences, University of Guelph, Guelph, Ontario, Canada; Department of Animal Sciences, Purdue University, West Lafayette, Indiana
- Mehdi Sargolzaei
- Department of Animal Biosciences, University of Guelph, Guelph, Ontario, Canada; HiggsGene Solutions Inc., Guelph, Ontario, Canada
- Janusz Jamrozik
- Department of Animal Biosciences, University of Guelph, Guelph, Ontario, Canada; Canadian Dairy Network, Guelph, Ontario, Canada
11
Baller JL, Howard JT, Kachman SD, Spangler ML. The impact of clustering methods for cross-validation, choice of phenotypes, and genotyping strategies on the accuracy of genomic predictions. J Anim Sci 2019; 97:1534-1549. [PMID: 30721970 DOI: 10.1093/jas/skz055] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 11/23/2018] [Accepted: 02/04/2019] [Indexed: 01/22/2023] Open
Abstract
For genomic predictors to be of use in genetic evaluation, their predicted accuracy must be a reliable indicator of their utility, and thus unbiased. The objective of this paper was to evaluate the accuracy of prediction of genomic breeding values (GBV) using different clustering strategies and response variables. Red Angus genotypes (n = 9,763) were imputed to a reference 50K panel. The influence of clustering method [k-means, k-medoids, principal component (PC) analysis on the numerator relationship matrix (A) and the identical-by-state genomic relationship matrix (G) as both data and covariance matrices, and random] and response variables [deregressed estimated breeding values (DEBV) and adjusted phenotypes] were evaluated for cross-validation. The GBV were estimated using a Bayes C model for all traits. Traits for DEBV included birth weight (BWT), marbling (MARB), rib-eye area (REA), and yearling weight (YWT). Adjusted phenotypes included BWT, YWT, and ultrasonically measured intramuscular fat percentage and REA. Prediction accuracies were estimated using the genetic correlation between GBV and associated response variable using a bivariate animal model. A simulation mimicking a cattle population, replicated 5 times, was conducted to quantify differences between true and estimated accuracies. The simulation used the same clustering methods and response variables, with the addition of 2 genotyping strategies (random and top 25% of individuals), and forward validation. The prediction accuracies were estimated similarly, and true accuracies were estimated as the correlation between the residuals of a bivariate model including true breeding value (TBV) and GBV. Using the adjusted Rand index, random clusters were clearly different from relationship-based clustering methods. In both real and simulated data, random clustering consistently led to the largest estimates of accuracy, while no method was consistently associated with more or less bias than other methods. 
In simulation, random genotyping led to higher estimated accuracies than selection of the top 25% of individuals. Interestingly, random genotyping seemed to overpredict true accuracy while selective genotyping tended to underpredict accuracy. When forward in time validation was used, DEBV led to less biased estimates of GBV accuracy. Results suggest the highest, least biased GBV accuracies are associated with random genotyping and DEBV.
Collapse
Affiliation(s)
- Johnna L Baller
- Department of Animal Science, University of Nebraska, Lincoln, NE
- Jeremy T Howard
- Department of Animal Science, University of Nebraska, Lincoln, NE
|
12
|
Piccoli ML, Brito LF, Braccini J, Brito FV, Cardoso FF, Cobuci JA, Sargolzaei M, Schenkel FS. A comprehensive comparison between single- and two-step GBLUP methods in a simulated beef cattle population. Canadian Journal of Animal Science 2018. [DOI: 10.1139/cjas-2017-0176] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 11/22/2022]
Abstract
The statistical methods used in the genetic evaluations are a key component of the process and can be best compared by using simulated data. The latter is especially true in grazing beef cattle production systems, where the number of proven bulls with highly reliable estimated breeding values is too limited to allow for a trustworthy validation of genomic predictions. Therefore, we simulated data for 4980 beef cattle to compare single-step genomic best linear unbiased prediction (ssGBLUP), which simultaneously incorporates pedigree, phenotypic, and genomic data into genomic evaluations, with two-step GBLUP (tsGBLUP) procedures and genomic estimated breeding values (GEBVs) blending methods. The greatest increases in GEBV accuracies compared with the parents’ average estimated breeding values (EBVPA) were 0.364 and 0.341 for ssGBLUP and tsGBLUP, respectively. Direct genomic value and GEBV accuracies when using ssGBLUP and tsGBLUP procedures were similar, except for the GEBV accuracies using Hayes’ blending method in tsGBLUP. There was no significant or slight bias in genomic predictions from ssGBLUP or tsGBLUP (using VanRaden’s blending method), indicating that these predictions are on the same scale as the true breeding values. Overall, genetic evaluations including genomic information resulted in gains in accuracy >100% compared with the EBVPA. In addition, there were no significant differences between the animals selected (10% males and 50% females) by using ssGBLUP or tsGBLUP.
Affiliation(s)
- Mario L. Piccoli
- Departamento de Zootecnia, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS 91540-000, Brazil
- GenSys Consultores Associados S/S, Porto Alegre, RS 90460-060, Brazil
- Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, ON N1G 2W1, Canada
- Luiz F. Brito
- Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, ON N1G 2W1, Canada
- José Braccini
- Departamento de Zootecnia, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS 91540-000, Brazil
- Conselho Nacional de Desenvolvimento Científico e Tecnológico, Brasília, DF 71605-001, Brazil
- Fernanda V. Brito
- GenSys Consultores Associados S/S, Porto Alegre, RS 90460-060, Brazil
- Fernando F. Cardoso
- Conselho Nacional de Desenvolvimento Científico e Tecnológico, Brasília, DF 71605-001, Brazil
- Embrapa Pecuária Sul, Bagé, RS 96401-970, Brazil
- Jaime A. Cobuci
- Departamento de Zootecnia, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS 91540-000, Brazil
- Mehdi Sargolzaei
- Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, ON N1G 2W1, Canada
- The Semex Alliance, Guelph, ON N1H 6J2, Canada
- Flávio S. Schenkel
- Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, ON N1G 2W1, Canada
|
13
|
de Oliveira H, Silva F, Brito L, Guarini A, Jamrozik J, Schenkel F. Comparing deregression methods for genomic prediction of test-day traits in dairy cattle. J Anim Breed Genet 2018; 135:97-106. [DOI: 10.1111/jbg.12317] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 11/19/2017] [Accepted: 01/22/2018] [Indexed: 11/28/2022]
Affiliation(s)
- H.R. de Oliveira
- Centre for Genetic Improvement of Livestock; Department of Animal Biosciences; University of Guelph; Guelph ON Canada
- Department of Animal Science; Universidade Federal de Viçosa; Viçosa Minas Gerais Brazil
- F.F. Silva
- Department of Animal Science; Universidade Federal de Viçosa; Viçosa Minas Gerais Brazil
- L.F. Brito
- Centre for Genetic Improvement of Livestock; Department of Animal Biosciences; University of Guelph; Guelph ON Canada
- A.R. Guarini
- Centre for Genetic Improvement of Livestock; Department of Animal Biosciences; University of Guelph; Guelph ON Canada
- J. Jamrozik
- Centre for Genetic Improvement of Livestock; Department of Animal Biosciences; University of Guelph; Guelph ON Canada
- Canadian Dairy Network; Guelph ON Canada
- F.S. Schenkel
- Centre for Genetic Improvement of Livestock; Department of Animal Biosciences; University of Guelph; Guelph ON Canada
|
14
|
Abstract
Genomic selection has become increasingly important in the breeding of animals and plants. The response variable is an important factor influencing the accuracy of genomic selection. The de-regressed proof (DRP) based on traditional estimated breeding value (EBV) is commonly used as the response variable. In the current study, simulated data from the 16th QTL-MAS Workshop and real data from Chinese Holstein cattle were used to compare the accuracy and bias of genomic prediction with two methods of calculating DRP. Our results with simulated data showed that the correlation between genomic EBV and true breeding value achieved using the Jairath method (DRP_J) was superior to that achieved using the Garrick method (DRP_G) for simulated trait 1, but the reverse was true for simulated trait 3, and the two methods performed comparably for simulated trait 2. For all three simulated traits, DRP_J yielded larger bias of genomic prediction. However, DRP_J outperformed DRP_G in both accuracy and unbiasedness for four milk production traits in Chinese Holstein. In the estimation of genomic breeding values using a genomic BLUP model, two methods for weighting the diagonal elements of the incidence matrix associated with residual error were also compared. As the proportion of genetic variance unexplained by markers increased, the accuracy of genomic prediction decreased and the bias increased. Weighting by the reliability of DRP produced accuracy comparable to the evaluation where the proportion of genetic variance unexplained by markers was considered, but with smaller bias in general.
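The two residual-weighting schemes contrasted at the end of this abstract can be illustrated with a short Python sketch. It assumes the weight form commonly associated with the Garrick method, w = (1 - h²) / ((c + (1 - r²)/r²) h²), where r² is the reliability of the DRP, h² the heritability, and c the proportion of genetic variance unexplained by markers, and a plain reliability weight r²/(1 - r²). Whether the paper used exactly these forms is an assumption, and both function names are hypothetical.

```python
def garrick_style_weight(reliability, h2, c=0.0):
    """Weight for a DRP given its reliability r2, heritability h2, and the
    proportion c of genetic variance not explained by markers (assumed form)."""
    if not 0.0 < reliability < 1.0:
        raise ValueError("reliability must lie strictly between 0 and 1")
    if not 0.0 < h2 < 1.0:
        raise ValueError("h2 must lie strictly between 0 and 1")
    if c < 0.0:
        raise ValueError("c must be non-negative")
    return (1.0 - h2) / ((c + (1.0 - reliability) / reliability) * h2)


def reliability_weight(reliability):
    """Weight that depends only on the DRP reliability (assumed form)."""
    if not 0.0 < reliability < 1.0:
        raise ValueError("reliability must lie strictly between 0 and 1")
    return reliability / (1.0 - reliability)


# Hypothetical values: a larger c lowers the weight given to the same DRP.
w_c0 = garrick_style_weight(0.8, 0.3, c=0.0)
w_c4 = garrick_style_weight(0.8, 0.3, c=0.4)
```

In this form, a larger c pulls the weights of high- and low-reliability animals closer together, which is one way to see why the choice of c matters for accuracy and bias.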
|
15
|
Li H, Su G, Jiang L, Bao Z. An efficient unified model for genome-wide association studies and genomic selection. Genet Sel Evol 2017; 49:64. [PMID: 28836943 PMCID: PMC5569572 DOI: 10.1186/s12711-017-0338-x] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 02/12/2017] [Accepted: 08/07/2017] [Indexed: 11/10/2022] Open
Abstract
Background A quantitative trait is controlled both by major variants with large genetic effects and by minor variants with small effects. Genome-wide association studies (GWAS) are an efficient approach to identify quantitative trait loci (QTL), and genomic selection (GS) with high-density single nucleotide polymorphisms (SNPs) can achieve higher accuracy of estimated breeding values than conventional best linear unbiased prediction (BLUP). GWAS and GS address different aspects of quantitative traits, but, as statistical models, they are quite similar in their description of the genetic mechanisms that underlie quantitative traits. Methods Here, we propose a stepwise linear regression mixed model (StepLMM) to unify GWAS and GS in a single statistical model. First, the variance components of the genomic-BLUP (GBLUP) model are estimated. Then, in the SNP selection step, the linear mixed model (LMM) for GWAS is equivalently transformed into a simple linear regression to improve computation speed, and the most significant SNP is selected and included into the evaluation model. In the SNP dropping step, the SNPs in the evaluation model are tested according to the standard errors of their estimated effects. If non-significant SNPs are present, the least significant one is dropped from the model and variance components are re-estimated. We used extended Bayesian information criteria (eBIC) to evaluate the model optimization, i.e. the model with the smallest eBIC is the final one and includes only significant SNPs. Results We simulated scenarios with different heritabilities with 100 QTL. StepLMM estimated heritability accurately and mapped QTL precisely. Genomic prediction accuracy was much higher with StepLMM than with GBLUP. 
The comparison of StepLMM with other GWAS and GS methods based on a dataset from the 16th QTLMAS Workshop showed that StepLMM had medium mapping power, the lowest rate of false positives for QTL mapping, and the highest accuracy for genomic prediction. Conclusions StepLMM is a combination of GWAS and GBLUP. GWAS and GBLUP are beneficial to each other in a single statistical model: GWAS improves genomic prediction accuracy, while GBLUP increases mapping precision and decreases the rate of false positives of GWAS. StepLMM has a high performance in both GWAS and GS and is feasible for agricultural breeding programs and human genetic studies.
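The model-selection rule described in this abstract (keep the model with the smallest eBIC) can be illustrated with a small Python sketch. It assumes the usual extended BIC for a Gaussian linear model, n ln(RSS/n) + k ln(n) + 2γ ln C(p, k), with constants dropped. The paper applies eBIC within a mixed model, so this ordinary least-squares version is a simplification, and the function name and numbers are hypothetical.

```python
from math import comb, log


def ebic(rss, n, k, p, gamma=1.0):
    """Extended BIC (up to a constant) for a Gaussian linear model.

    rss: residual sum of squares, n: number of records,
    k: number of SNPs in the model, p: number of candidate SNPs,
    gamma: extra penalty on the size of the model space (0 gives plain BIC).
    """
    if rss <= 0 or n <= 0:
        raise ValueError("rss and n must be positive")
    if not 0 <= k <= p:
        raise ValueError("k must lie between 0 and p")
    return n * log(rss / n) + k * log(n) + 2.0 * gamma * log(comb(p, k))


# Hypothetical comparison: adding a SNP only pays off if RSS drops enough.
base = ebic(rss=500.0, n=1000, k=3, p=50000)
useless_snp = ebic(rss=499.0, n=1000, k=4, p=50000)
useful_snp = ebic(rss=450.0, n=1000, k=4, p=50000)
```

With these made-up numbers, the fourth SNP is retained only in the second case, where the drop in residual variance outweighs both penalty terms.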
Affiliation(s)
- Hengde Li
- Ministry of Agriculture Key Laboratory of Aquatic Genomics, CAFS Key Laboratory of Aquatic Genomics and Beijing Key Laboratory of Fishery Biotechnology, Center for Applied Aquatic Genomics, Chinese Academy of Fishery Sciences, Beijing, 100141, China.
- Guosheng Su
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830, Tjele, Denmark
- Li Jiang
- Ministry of Agriculture Key Laboratory of Aquatic Genomics, CAFS Key Laboratory of Aquatic Genomics and Beijing Key Laboratory of Fishery Biotechnology, Center for Applied Aquatic Genomics, Chinese Academy of Fishery Sciences, Beijing, 100141, China
- Zhenmin Bao
- College of Marine Life, Ocean University of China, Qingdao, 266003, China
|
16
|
Lee J, Kachman SD, Spangler ML. The impact of training strategies on the accuracy of genomic predictors in United States Red Angus cattle. J Anim Sci 2017; 95:3406-3414. [DOI: 10.2527/jas.2017.1604] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 01/09/2023] Open
|
17
|
Silva RMO, Fragomeni BO, Lourenco DAL, Magalhães AFB, Irano N, Carvalheiro R, Canesin RC, Mercadante MEZ, Boligon AA, Baldi FS, Misztal I, Albuquerque LG. Accuracies of genomic prediction of feed efficiency traits using different prediction and validation methods in an experimental Nelore cattle population. J Anim Sci 2017; 94:3613-3623. [PMID: 27898889 DOI: 10.2527/jas.2016-0401] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Indexed: 01/17/2023] Open
Abstract
Animal feeding is the most important economic component of beef production systems. Selection for feed efficiency has not been effective, mainly because of the difficulty and high cost of obtaining the phenotypes. The application of genomic selection using SNP can decrease the cost of animal evaluation as well as the generation interval. The objective of this study was to compare methods for genomic evaluation of feed efficiency traits using different cross-validation layouts in an experimental beef cattle population genotyped for a high-density SNP panel (BovineHD BeadChip assay 700k, Illumina Inc., San Diego, CA). After quality control, a total of 437,197 SNP genotypes were available for 761 Nelore animals from the Institute of Animal Science, Sertãozinho, São Paulo, Brazil. The studied traits were residual feed intake, feed conversion ratio, ADG, and DMI. Methods of analysis were traditional BLUP, single-step genomic BLUP (ssGBLUP), genomic BLUP (GBLUP), and a Bayesian regression method (BayesCπ). Direct genomic values (DGV) from the last 2 methods were compared directly or in an index that combines DGV with parent average. Three cross-validation approaches were used to validate the models: 1) YOUNG, in which the partition into training and testing sets was based on year of birth and testing animals were born after 2010; 2) UNREL, in which the data set was split into 3 less related subsets and the validation was done in one subset at a time; and 3) RANDOM, in which the data set was randomly divided into 4 subsets (considering the contemporary groups) and the validation was done in one subset at a time. On average, the RANDOM design provided the most accurate predictions. Average accuracies ranged from 0.10 to 0.58 using BLUP, from 0.09 to 0.48 using GBLUP, from 0.06 to 0.49 using BayesCπ, and from 0.22 to 0.49 using ssGBLUP. The most accurate and consistent predictions were obtained using ssGBLUP for all analyzed traits. 
The ssGBLUP seems to be more suitable to obtain genomic predictions for feed efficiency traits on an experimental population of genotyped animals.
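Two of the cross-validation layouts described in this abstract, YOUNG and RANDOM, can be sketched in a few lines of Python. The sketch assumes animals are given as (id, birth year) pairs and ignores contemporary groups, which the study took into account for RANDOM. All names and the toy data are hypothetical, and the Pearson correlation stands in for whatever accuracy measure a given study uses.

```python
import math
import random


def split_young(animals, cutoff_year):
    """YOUNG-style split: train on animals born up to cutoff_year, test on later ones."""
    train = [aid for aid, year in animals if year <= cutoff_year]
    test = [aid for aid, year in animals if year > cutoff_year]
    return train, test


def split_random(ids, n_folds, seed=0):
    """RANDOM-style split: shuffle the ids and deal them into n_folds subsets."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_folds] for i in range(n_folds)]


def pearson(x, y):
    """Pearson correlation between predictions and a validation target."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical usage with toy data.
animals = [("a1", 2008), ("a2", 2010), ("a3", 2011), ("a4", 2012)]
train_ids, test_ids = split_young(animals, cutoff_year=2010)
folds = split_random([aid for aid, _ in animals], n_folds=4)
```

Each fold from `split_random` serves once as the validation set while the remaining folds form the training set.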
|
18
|
Incorporating the single-step strategy into a random regression model to enhance genomic prediction of longitudinal traits. Heredity (Edinb) 2016; 119:459-467. [PMID: 28029150 DOI: 10.1038/hdy.2016.91] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 02/17/2016] [Revised: 08/01/2016] [Accepted: 08/15/2016] [Indexed: 01/05/2023] Open
Abstract
In prediction of genomic values, the single-step method has been demonstrated to outperform multi-step methods. In statistical analyses of longitudinal traits, the random regression test-day model (RR-TDM) has clear advantages over other models. Our goal in this study was to evaluate the performance of a model that integrates both single-step and RR-TDM prediction methods, called the single-step random regression test-day model (SS RR-TDM), in comparison with the pedigree-based RR-TDM and genomic best linear unbiased prediction (GBLUP) model. We performed extensive simulations to explore the potential advantages of SS RR-TDM over the other two models under various scenarios with different levels of heritability, numbers of quantitative trait loci, and selection schemes. SS RR-TDM was found to achieve the highest accuracy and unbiasedness under all scenarios, exhibiting robust prediction ability in longitudinal trait analyses. Moreover, SS RR-TDM showed better persistency of accuracy over generations than the GBLUP model. In addition, we found that SS RR-TDM had advantages over RR-TDM and GBLUP when applied to a real human data set contributed by the Genetic Analysis Workshop 18. The findings of our study demonstrated the feasibility and advantages of SS RR-TDM, thus enhancing the strategies for genomic prediction of longitudinal traits in the future.
|
19
|
Vandenplas J, Spehar M, Potocnik K, Gengler N, Gorjanc G. National single-step genomic method that integrates multi-national genomic information. J Dairy Sci 2016; 100:465-478. [PMID: 27865486 DOI: 10.3168/jds.2016-11733] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 07/12/2016] [Accepted: 10/05/2016] [Indexed: 01/27/2023]
Abstract
The aim of this paper was to develop a national single-step genomic BLUP that integrates multi-national genomic estimated breeding values (EBV) and associated reliabilities without double counting dependent data contributions from the different evaluations. Simultaneous use of all data, including phenotypes, pedigree, and genotypes, is a condition to obtain unbiased EBV. However, this condition is not always fully met, mainly due to unavailability of foreign raw data for imported animals. In dairy cattle genetic evaluations, this issue is traditionally tackled through the multiple across-country evaluation (MACE) of sires, performed by Interbull Centre (Uppsala, Sweden). Multiple across-country evaluation regresses all the available national information onto a joint pedigree to obtain country-specific rankings of all sires without sharing the raw data. In the context of genomic selection, the issue is handled by exchanging sire genotypes and by using MACE information (i.e., MACE EBV and reliabilities), as a valuable source of "phenotypic" data. Although all the available data are considered, these "multi-national" genomic evaluations use multi-step methods assuming independence of various sources of information, which is not met in all situations. We developed a method that handles this by single-step genomic evaluation that jointly (1) uses national phenotypic, genomic, and pedigree data; (2) uses multi-national genomic information; and (3) avoids double counting dependent data contributions from an animal's own records and relatives' records. The method was demonstrated by integrating multi-national genomic EBV and reliabilities of Brown Swiss sires, included in the InterGenomics consortium at Interbull Centre, into the national evaluation in Slovenia. 
The results showed that the method could (1) increase reliability of a national (genomic) evaluation; (2) provide consistent ranking of all animals: bulls, cows, and young animals; and (3) increase the size of a genomic training population. These features provide more efficient and transparent selection throughout a breeding program.
Affiliation(s)
- J Vandenplas
- Agriculture, Bio-engineering and Chemistry Department, Gembloux Agro-Bio Tech, University of Liege, 5030 Gembloux, Belgium; National Fund for Scientific Research, 1000 Brussels, Belgium.
- M Spehar
- Croatian Agricultural Agency, 10000 Zagreb, Croatia; Biotechnical Faculty, University of Ljubljana, 1000 Ljubljana, Slovenia
- K Potocnik
- Biotechnical Faculty, University of Ljubljana, 1000 Ljubljana, Slovenia
- N Gengler
- Agriculture, Bio-engineering and Chemistry Department, Gembloux Agro-Bio Tech, University of Liege, 5030 Gembloux, Belgium
- G Gorjanc
- Biotechnical Faculty, University of Ljubljana, 1000 Ljubljana, Slovenia; The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Easter Bush Research Centre, Midlothian EH25 9RG, United Kingdom
|
20
|
Zhang X, Lourenco D, Aguilar I, Legarra A, Misztal I. Weighting Strategies for Single-Step Genomic BLUP: An Iterative Approach for Accurate Calculation of GEBV and GWAS. Front Genet 2016; 7:151. [PMID: 27594861 PMCID: PMC4990542 DOI: 10.3389/fgene.2016.00151] [Citation(s) in RCA: 85] [Impact Index Per Article: 10.6] [Received: 05/15/2016] [Accepted: 08/04/2016] [Indexed: 01/16/2023] Open
Abstract
Genomic Best Linear Unbiased Predictor (GBLUP) assumes equal variance for all single nucleotide polymorphisms (SNP). When traits are influenced by major SNP, Bayesian methods have the advantage of SNP selection. To overcome the limitation of GBLUP, unequal variance or weights for all SNP are applied in a method called weighted GBLUP (WGBLUP). If only a fraction of animals is genotyped, single-step WGBLUP (WssGBLUP) can be used. Default weights in WGBLUP or WssGBLUP are obtained iteratively based on single SNP effect squared (u²) and/or heterozygosity. When the weights are optimal, prediction accuracy and ability to detect major SNP are maximized. The objective was to develop optimal weights for WGBLUP-based methods. We evaluated 5 new procedures that accounted for locus-specific or windows-specific variance to maximize accuracy of predicting genomic estimated breeding value (GEBV) and SNP effect. Simulated datasets consisted of phenotypes for 13,000 animals, including 1540 animals genotyped for 45,000 SNP. Scenarios with 5, 100, and 500 simulated quantitative trait loci (QTL) were considered. The 5 new procedures for SNP weighting were: (1) u² plus a constant equal to the weight of the top SNP; (2) from a heavy-tailed distribution (similar to BayesA); (3) for every 20 SNP in a window along the whole genome, the largest effect (u²) among them; (4) the mean effect of every 20 SNP; and (5) the summation of every 20 SNP. Those methods were compared to the default WssGBLUP, GBLUP, BayesB, and BayesC. WssGBLUP methods were evaluated over 10 iterations. The accuracy of predicting GEBV was the correlation between true and estimated genomic breeding values for 300 genotyped animals from the last generation. The ability to detect the simulated QTL was also investigated. For most of the QTL scenarios, the accuracies obtained with all WssGBLUP procedures were higher compared to those from BayesB and BayesC, partly due to automatic inclusion of parent average in the former. 
Manhattan plots had higher resolution with 5 and 100 QTL. Using a common weight for a window of 20 SNP that sums or averages the SNP variance enhances accuracy of predicting GEBV and provides accurate estimation of marker effects.
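Procedures (3) to (5) in this abstract assign a common weight to every SNP in a window of 20, based on the squared SNP effects. A minimal Python sketch of that idea follows. It assumes non-overlapping windows in map order and a list of estimated SNP effects as input. The function name is hypothetical, and any rescaling of weights between iterations is left out.

```python
def window_weights(snp_effects, window=20, mode="mean"):
    """Give every SNP in a window the max, mean, or sum of squared effects."""
    squared = [u * u for u in snp_effects]
    weights = []
    for start in range(0, len(squared), window):
        block = squared[start:start + window]
        if mode == "max":
            shared = max(block)
        elif mode == "mean":
            shared = sum(block) / len(block)
        elif mode == "sum":
            shared = sum(block)
        else:
            raise ValueError("mode must be 'max', 'mean', or 'sum'")
        # Every SNP in the window receives the same weight.
        weights.extend([shared] * len(block))
    return weights


# Hypothetical example with a window of 2 SNPs instead of 20.
toy_max = window_weights([1, 2, 3, 4], window=2, mode="max")
toy_sum = window_weights([1, 2, 3, 4], window=2, mode="sum")
```

Sharing one weight across a window lets a large effect captured by one marker raise the weight of its neighbours, which may be in linkage disequilibrium with the same QTL.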
Affiliation(s)
- Xinyue Zhang
- Animal and Dairy Science, Animal Breeding and Genetics, University of Georgia Athens, GA, USA
- Daniela Lourenco
- Animal and Dairy Science, Animal Breeding and Genetics, University of Georgia Athens, GA, USA
- Ignacio Aguilar
- National Agricultural Research Institute Las Brujas, Uruguay
- Andres Legarra
- Institut National de la Recherche Agronomique, UMR1388 GenPhySE Castanet-Tolosan, France
- Ignacy Misztal
- Animal and Dairy Science, Animal Breeding and Genetics, University of Georgia Athens, GA, USA
|
21
|
Calus M, Vandenplas J, ten Napel J, Veerkamp R. Validation of simultaneous deregression of cow and bull breeding values and derivation of appropriate weights. J Dairy Sci 2016; 99:6403-6419. [DOI: 10.3168/jds.2016-11028] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 02/12/2016] [Accepted: 04/06/2016] [Indexed: 01/17/2023]
|
22
|
Accuracy of Genomic Prediction in Switchgrass (Panicum virgatum L.) Improved by Accounting for Linkage Disequilibrium. G3: Genes, Genomes, Genetics 2016; 6:1049-62. [PMID: 26869619 PMCID: PMC4825640 DOI: 10.1534/g3.115.024950] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Indexed: 12/04/2022]
Abstract
Switchgrass is a relatively high-yielding and environmentally sustainable biomass crop, but further genetic gains in biomass yield must be achieved to make it an economically viable bioenergy feedstock. Genomic selection (GS) is an attractive technology to generate rapid genetic gains in switchgrass, and meet the goals of a substantial displacement of petroleum use with biofuels in the near future. In this study, we empirically assessed prediction procedures for genomic selection in two different populations, consisting of 137 and 110 half-sib families of switchgrass, tested in two locations in the United States for three agronomic traits: dry matter yield, plant height, and heading date. Marker data were produced for the families’ parents by exome capture sequencing, generating up to 141,030 polymorphic markers with available genomic-location and annotation information. We evaluated prediction procedures that varied not only by learning schemes and prediction models, but also by the way the data were preprocessed to account for redundancy in marker information. More complex genomic prediction procedures were generally not significantly more accurate than the simplest procedure, likely due to limited population sizes. Nevertheless, a highly significant gain in prediction accuracy was achieved by transforming the marker data through a marker correlation matrix. Our results suggest that marker-data transformations and, more generally, the account of linkage disequilibrium among markers, offer valuable opportunities for improving prediction procedures in GS. Some of the achieved prediction accuracies should motivate implementation of GS in switchgrass breeding programs.
|
23
|
Fernandes Júnior GA, Rosa GJM, Valente BD, Carvalheiro R, Baldi F, Garcia DA, Gordo DGM, Espigolan R, Takada L, Tonussi RL, de Andrade WBF, Magalhães AFB, Chardulo LAL, Tonhati H, de Albuquerque LG. Genomic prediction of breeding values for carcass traits in Nellore cattle. Genet Sel Evol 2016; 48:7. [PMID: 26830208 PMCID: PMC4734869 DOI: 10.1186/s12711-016-0188-y] [Citation(s) in RCA: 43] [Impact Index Per Article: 5.4] [Received: 06/19/2015] [Accepted: 01/18/2016] [Indexed: 01/20/2023] Open
Abstract
Background The objective of this study was to evaluate the accuracy of genomic predictions for rib eye area (REA), backfat thickness (BFT), and hot carcass weight (HCW) in Nellore beef cattle from Brazilian commercial herds using different prediction models. Methods Phenotypic data from 1756 Nellore steers from ten commercial herds in Brazil were used. Animals were offspring of 294 sires and 1546 dams, reared on pasture, feedlot finished, and slaughtered at approximately 2 years of age. All animals were genotyped using a 777k Illumina Bovine HD SNP chip. Accuracy of genomic predictions of breeding values was evaluated by using a 5-fold cross-validation scheme and considering three models: Bayesian ridge regression (BRR), Bayes C (BC) and Bayesian Lasso (BL), and two types of response variables: traditional estimated breeding value (EBV), and phenotype adjusted for fixed effects (Y*). Results The prediction accuracies achieved with the BRR model were equal to 0.25 (BFT), 0.33 (HCW) and 0.36 (REA) when EBV was used as response variable, and 0.21 (BFT), 0.37 (HCW) and 0.46 (REA) when using Y*. Results obtained with the BC and BL models were similar. Accuracies increased for traits with a higher heritability, and using Y* instead of EBV as response variable resulted in higher accuracy when heritability was higher. Conclusions Our results indicate that the accuracy of genomic prediction of carcass traits in Nellore cattle is moderate to high. Prediction of genomic breeding values from adjusted phenotypes Y* was more accurate than from EBV, especially for highly heritable traits. The three models considered (BRR, BC and BL) led to similar predictive abilities and, thus, either one could be used to implement genomic prediction for carcass traits in Nellore cattle.
Affiliation(s)
- Guilherme J M Rosa
- Department of Animal Sciences, University of Wisconsin-Madison, Madison, WI, 53706, USA.
- Bruno D Valente
- Department of Animal Sciences, University of Wisconsin-Madison, Madison, WI, 53706, USA.
- Roberto Carvalheiro
- Faculdade de Ciências Agrárias e Veterinárias, UNESP, Jaboticabal, SP, 14884-900, Brazil.
- Fernando Baldi
- Faculdade de Ciências Agrárias e Veterinárias, UNESP, Jaboticabal, SP, 14884-900, Brazil.
- Diogo A Garcia
- Faculdade de Ciências Agrárias e Veterinárias, UNESP, Jaboticabal, SP, 14884-900, Brazil.
- Daniel G M Gordo
- Faculdade de Ciências Agrárias e Veterinárias, UNESP, Jaboticabal, SP, 14884-900, Brazil.
- Rafael Espigolan
- Faculdade de Ciências Agrárias e Veterinárias, UNESP, Jaboticabal, SP, 14884-900, Brazil.
- Luciana Takada
- Faculdade de Ciências Agrárias e Veterinárias, UNESP, Jaboticabal, SP, 14884-900, Brazil.
- Rafael L Tonussi
- Faculdade de Ciências Agrárias e Veterinárias, UNESP, Jaboticabal, SP, 14884-900, Brazil.
- Willian B F de Andrade
- Faculdade de Ciências Agrárias e Veterinárias, UNESP, Jaboticabal, SP, 14884-900, Brazil.
- Ana F B Magalhães
- Faculdade de Ciências Agrárias e Veterinárias, UNESP, Jaboticabal, SP, 14884-900, Brazil.
- Luis A L Chardulo
- Faculdade de Medicina Veterinária e Zootecnia, UNESP, Botucatu, SP, 18618-970, Brazil.
- Humberto Tonhati
- Faculdade de Ciências Agrárias e Veterinárias, UNESP, Jaboticabal, SP, 14884-900, Brazil.
- Lucia G de Albuquerque
- Faculdade de Ciências Agrárias e Veterinárias, UNESP, Jaboticabal, SP, 14884-900, Brazil.
|
24
|
A comparison of genomic selection models across time in interior spruce (Picea engelmannii × glauca) using unordered SNP imputation methods. Heredity (Edinb) 2015. [PMID: 26126540 DOI: 10.1038/hdy.2015.57] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022] Open
Abstract
Genomic selection (GS) potentially offers an unparalleled advantage over traditional pedigree-based selection (TS) methods by reducing the time commitment required to carry out a single cycle of tree improvement. This quality is particularly appealing to tree breeders, where lengthy improvement cycles are the norm. We explored the prospect of implementing GS for interior spruce (Picea engelmannii × glauca) utilizing a genotyped population of 769 trees belonging to 25 open-pollinated families. A series of repeated tree height measurements through ages 3-40 years permitted the testing of GS methods temporally. The genotyping-by-sequencing (GBS) platform was used for single nucleotide polymorphism (SNP) discovery in conjunction with three unordered imputation methods applied to a data set with 60% missing information. Further, three diverse GS models were evaluated based on predictive accuracy (PA), and their marker effects. Moderate levels of PA (0.31-0.55) were observed and were of sufficient capacity to deliver improved selection response over TS. Additionally, PA varied substantially through time accordingly with spatial competition among trees. As expected, temporal PA was well correlated with age-age genetic correlation (r=0.99), and decreased substantially with increasing difference in age between the training and validation populations (0.04-0.47). Moreover, our imputation comparisons indicate that k-nearest neighbor and singular value decomposition yielded a greater number of SNPs and gave higher predictive accuracies than imputing with the mean. Furthermore, the ridge regression (rrBLUP) and BayesCπ (BCπ) models both yielded equal, and better PA than the generalized ridge regression heteroscedastic effect model for the traits evaluated.
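The simplest of the imputation approaches compared in this abstract, imputing with the mean, can be sketched in Python as follows. The sketch assumes genotypes coded as allele counts (0, 1, 2) in a list of rows, one row per tree, with `None` marking a missing call. The function name and the fallback value for fully missing markers are assumptions made for illustration.

```python
def impute_with_mean(genotypes):
    """Replace missing calls (None) with the mean of the observed calls per SNP."""
    if not genotypes:
        return []
    n_snps = len(genotypes[0])
    column_means = []
    for j in range(n_snps):
        observed = [row[j] for row in genotypes if row[j] is not None]
        # Fallback of 0.0 for a SNP with no observed calls is an arbitrary choice;
        # such markers would normally be filtered out beforehand.
        column_means.append(sum(observed) / len(observed) if observed else 0.0)
    return [
        [column_means[j] if row[j] is None else row[j] for j in range(n_snps)]
        for row in genotypes
    ]


# Hypothetical example: three trees, two SNPs, two missing calls.
toy = [[0, None], [2, 1], [None, 1]]
imputed = impute_with_mean(toy)
```

Mean imputation ignores similarity among individuals, which is one plausible reason the neighbour-based and decomposition-based methods performed better in the study.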
|
25
|
Ratcliffe B, El-Dien OG, Klápště J, Porth I, Chen C, Jaquish B, El-Kassaby YA. A comparison of genomic selection models across time in interior spruce (Picea engelmannii × glauca) using unordered SNP imputation methods. Heredity (Edinb) 2015; 115:547-55. [PMID: 26126540 DOI: 10.1038/hdy.2015.57] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Received: 02/27/2015] [Revised: 04/29/2015] [Accepted: 05/26/2015] [Indexed: 11/09/2022] Open
Affiliation(s)
- B Ratcliffe
- Department of Forest and Conservation Sciences, Faculty of Forestry, The University of British Columbia, Vancouver, British Columbia, Canada
- O G El-Dien
- Department of Forest and Conservation Sciences, Faculty of Forestry, The University of British Columbia, Vancouver, British Columbia, Canada
- J Klápště
- Department of Forest and Conservation Sciences, Faculty of Forestry, The University of British Columbia, Vancouver, British Columbia, Canada; Department of Genetics and Physiology of Forest Trees, Faculty of Forestry and Wood Sciences, Czech University of Life Sciences Prague, Praha 6, Czech Republic
- I Porth
- Department of Forest and Conservation Sciences, Faculty of Forestry, The University of British Columbia, Vancouver, British Columbia, Canada
- C Chen
- Department of Biochemistry and Molecular Biology, Oklahoma State University, Stillwater, OK, USA
- B Jaquish
- British Columbia Ministry of Forests, Lands and Natural Resource Operations, Tree Improvement Branch, Kalamalka Research Station and Seed Orchard, Vernon, British Columbia, Canada
- Y A El-Kassaby
- Department of Forest and Conservation Sciences, Faculty of Forestry, The University of British Columbia, Vancouver, British Columbia, Canada
|
26
|
Parker Gaddis KL, Tiezzi F, Cole JB, Clay JS, Maltecca C. Genomic prediction of disease occurrence using producer-recorded health data: a comparison of methods. Genet Sel Evol 2015; 47:41. [PMID: 25951822 PMCID: PMC4423125 DOI: 10.1186/s12711-015-0093-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/25/2014] [Accepted: 01/14/2015] [Indexed: 11/10/2022]
Abstract
BACKGROUND Genetic selection has been successful in achieving increased production in dairy cattle; however, corresponding declines in fitness traits have been documented. Selection for fitness traits is more difficult, since they have low heritabilities and are influenced by various non-genetic factors. The objective of this paper was to investigate the predictive ability of two-stage and single-step genomic selection methods applied to health data collected from on-farm computer systems in the U.S. METHODS Implementation of single-trait and two-trait sire models was investigated using BayesA and single-step methods for mastitis and somatic cell score. Variance components were estimated. The complete dataset was divided into training and validation sets to perform model comparison. Estimated sire breeding values were used to estimate the number of daughters expected to develop mastitis. Predictive ability of each model was assessed by the sum of χ² values that compared predicted and observed numbers of daughters with mastitis and the proportion of wrong predictions. RESULTS According to the model applied, estimated heritabilities of liability to mastitis ranged from 0.05 (SD=0.02) to 0.11 (SD=0.03) and estimated heritabilities of somatic cell score ranged from 0.08 (SD=0.01) to 0.18 (SD=0.03). Posterior mean of genetic correlation between mastitis and somatic cell score was equal to 0.63 (SD=0.17). The single-step method had the best predictive ability. Conversely, the smallest number of wrong predictions was obtained with the univariate BayesA model. The best model fit was found for single-step and pedigree-based models. Bivariate single-step analysis had a better predictive ability than bivariate BayesA; however, the latter led to the smallest number of wrong predictions. CONCLUSIONS Genomic data improved our ability to predict animal breeding values. Performance of genomic selection methods depends on a multitude of factors. Heritability of traits and reliability of genotyped individuals have a large impact on the performance of genomic evaluation methods. Given the current characteristics of producer-recorded health data, single-step methods have several advantages compared to two-step methods.
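The χ²-based criterion described in the methods can be sketched as follows. This is a minimal illustration that assumes the usual Pearson form, (observed - expected)² / expected, summed over sires; the abstract does not give the exact formulation, and the function name and daughter counts are hypothetical.

```python
def chi_square_sum(observed, expected):
    """Sum of Pearson chi-square terms (O - E)^2 / E over sires.

    observed: daughters with mastitis actually recorded for each sire
    expected: daughters predicted to develop mastitis for each sire
    A smaller total indicates closer agreement between model and data.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    total = 0.0
    for o, e in zip(observed, expected):
        if e <= 0:
            raise ValueError("expected counts must be positive")
        total += (o - e) ** 2 / e
    return total


# Hypothetical counts for three sires, not taken from the study.
observed = [12, 7, 20]
expected = [10.0, 8.0, 18.0]
print(chi_square_sum(observed, expected))
```

Under this criterion, competing models would be ranked by their totals on the validation set, with the lowest sum indicating the best predictive ability.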
Affiliation(s)
- John B Cole
- Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, 20705-2350, MD, USA.
- John S Clay
- Dairy Records Management Systems, Raleigh, 27603, NC, USA.
|
27
|
Liu T, Qu H, Luo C, Shu D, Wang J, Lund MS, Su G. Accuracy of genomic prediction for growth and carcass traits in Chinese triple-yellow chickens. BMC Genet 2014; 15:110. [PMID: 25316160 PMCID: PMC4201679 DOI: 10.1186/s12863-014-0110-y] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 04/30/2014] [Accepted: 10/01/2014] [Indexed: 11/10/2022]
Abstract
Background Growth and carcass traits are very important traits for broiler chickens. However, carcass traits can only be measured postmortem. Genomic selection may be a powerful tool for such traits because of its accurate prediction of breeding values of animals without own phenotypic information. This study investigated the efficiency of genomic prediction in Chinese triple-yellow chickens. As a new line, Chinese triple-yellow chicken was developed by cross-breeding and had a small effective population. Two growth traits and three carcass traits were analyzed: body weight at 6 weeks, body weight at 12 weeks, eviscerating percentage, breast muscle percentage and leg muscle percentage. Results Genomic prediction was assessed using a 4-fold cross-validation procedure for two validation scenarios. In the first scenario, each test data set comprised two half-sib families (family sample) and the rest represented the reference data. In the second scenario, the whole data were randomly divided into four subsets (random sample). In each fold of validation, one subset was used as the test data and the others as the reference data. Genomic breeding values were predicted using a genomic best linear unbiased prediction model, a Bayesian least absolute shrinkage and selection operator model, and a Bayesian mixture model with four distributions. The accuracy of genomic estimated breeding value (GEBV) was measured as the correlation between GEBV and the corrected phenotypic value. Using the three models, the correlations ranged from 0.448 to 0.468 for the two growth traits and from 0.176 to 0.255 for the three carcass traits in the family sample scenario, and were between 0.487 and 0.536 for growth traits and between 0.312 and 0.430 for carcass traits in the random sample scenario. The differences in the prediction accuracies between the three models were very small; the Bayesian mixture model was slightly more accurate. According to the results from the random sample scenario, the accuracy of GEBV was 0.197 higher than that of the conventional pedigree index, averaged over the five traits. Conclusions The results indicated that genomic selection could greatly improve the accuracy of selection in chickens, compared with conventional selection. Genomic selection for growth and carcass traits in broiler chickens is promising.
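The two validation scenarios in this abstract differ only in how animals are assigned to folds. The sketch below shows that difference under stated assumptions: the helper names, the round-robin assignment of families to folds, and the eight hypothetical half-sib families are illustrative, not the authors' procedure. Accuracy is computed, as in the abstract, as the Pearson correlation between GEBV and corrected phenotypes.

```python
import random


def family_folds(family_of, n_folds=4):
    """Family sample: whole families go to one fold, so test animals have
    no half-sibs in the reference data."""
    families = sorted(set(family_of.values()))
    fold_of_family = {fam: i % n_folds for i, fam in enumerate(families)}
    return {animal: fold_of_family[fam] for animal, fam in family_of.items()}


def random_folds(animals, n_folds=4, seed=0):
    """Random sample: animals are shuffled and dealt into folds, so relatives
    are usually split between test and reference data."""
    shuffled = list(animals)
    random.Random(seed).shuffle(shuffled)
    return {animal: i % n_folds for i, animal in enumerate(shuffled)}


def pearson(x, y):
    """Pearson correlation, used here as the accuracy measure."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical data: 8 half-sib families of 3 animals each.
family_of = {f"animal{i}": f"family{i // 3}" for i in range(24)}
by_family = family_folds(family_of)
by_random = random_folds(family_of)
print(sorted({family_of[a] for a, f in by_family.items() if f == 0}))
print(sorted({family_of[a] for a, f in by_random.items() if f == 0}))
```

With eight families and four folds, each family-sample test set holds two complete families, mirroring the first scenario. The lower accuracies reported for that scenario are consistent with test animals having no close relatives in the reference data.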
Affiliation(s)
- Dingming Shu
- Institute of Animal Science, Guangdong Academy of Agricultural Sciences, Guangzhou 510640, China.
|
28
|
Su G, Christensen O, Janss L, Lund M. Comparison of genomic predictions using genomic relationship matrices built with different weighting factors to account for locus-specific variances. J Dairy Sci 2014; 97:6547-59. [DOI: 10.3168/jds.2014-8210] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.0] [Received: 04/06/2014] [Accepted: 07/07/2014] [Indexed: 12/24/2022]
|
29
|
Sahana G, Guldbrandtsen B, Thomsen B, Holm LE, Panitz F, Brøndum RF, Bendixen C, Lund MS. Genome-wide association study using high-density single nucleotide polymorphism arrays and whole-genome sequences for clinical mastitis traits in dairy cattle. J Dairy Sci 2014; 97:7258-75. [PMID: 25151887 DOI: 10.3168/jds.2014-8141] [Citation(s) in RCA: 64] [Impact Index Per Article: 6.4] [Received: 03/16/2014] [Accepted: 07/14/2014] [Indexed: 12/21/2022]
Abstract
Mastitis is a mammary disease that frequently affects dairy cattle. Despite considerable research on the development of effective prevention and treatment strategies, mastitis continues to be a significant issue in bovine veterinary medicine. To identify major genes that affect mastitis in dairy cattle, 6 chromosomal regions on Bos taurus autosome (BTA) 6, 13, 16, 19, and 20 were selected from a genome scan for 9 mastitis phenotypes using imputed high-density single nucleotide polymorphism arrays. Association analyses using sequence-level variants for the 6 targeted regions were carried out to map causal variants using whole-genome sequence data from 3 breeds. The quantitative trait loci (QTL) discovery population comprised 4,992 progeny-tested Holstein bulls, and QTL were confirmed in 4,442 Nordic Red and 1,126 Jersey cattle. The targeted regions were imputed to the sequence level. The highest association signal for clinical mastitis was observed on BTA 6 at 88.97 Mb in Holstein cattle and was confirmed in Nordic Red cattle. The peak association region on BTA 6 contained 2 genes: vitamin D-binding protein precursor (GC) and neuropeptide FF receptor 2 (NPFFR2), which, based on known biological functions, are good candidates for affecting mastitis. However, strong linkage disequilibrium in this region prevented conclusive determination of the causal gene. A different QTL on BTA 6 located at 88.32 Mb in Holstein cattle affected mastitis. In addition, QTL on BTA 13 and 19 were confirmed to segregate in Nordic Red cattle and QTL on BTA 16 and 20 were confirmed in Jersey cattle. Although several candidate genes were identified in these targeted regions, it was not possible to identify a gene or polymorphism as the causal factor for any of these regions.
Affiliation(s)
- G Sahana
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark.
- B Guldbrandtsen
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
- B Thomsen
- Molecular Genetics and Systems Biology, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
- L-E Holm
- Molecular Genetics and Systems Biology, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
- F Panitz
- Molecular Genetics and Systems Biology, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
- R F Brøndum
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
- C Bendixen
- Molecular Genetics and Systems Biology, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
- M S Lund
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
|
30
|
Gunia M, Saintilan R, Venot E, Hozé C, Fouilloux MN, Phocas F. Genomic prediction in French Charolais beef cattle using high-density single nucleotide polymorphism markers. J Anim Sci 2014; 92:3258-69. [DOI: 10.2527/jas.2013-7478] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Indexed: 12/23/2022]
Affiliation(s)
- M. Gunia
- INRA, UMR 1313 Génétique Animale et Biologie Intégrative, 78350 Jouy-en-Josas, France
- AgroParisTech, UMR 1313 Génétique Animale et Biologie Intégrative, 75231 Paris, France
- R. Saintilan
- Union Nationale des Coopératives agricoles d'Elevage et d'Insémination Animale, 149 rue de Bercy, 75595 Paris Cedex 12, France
- E. Venot
- INRA, UMR 1313 Génétique Animale et Biologie Intégrative, 78350 Jouy-en-Josas, France
- AgroParisTech, UMR 1313 Génétique Animale et Biologie Intégrative, 75231 Paris, France
- C. Hozé
- INRA, UMR 1313 Génétique Animale et Biologie Intégrative, 78350 Jouy-en-Josas, France
- AgroParisTech, UMR 1313 Génétique Animale et Biologie Intégrative, 75231 Paris, France
- Union Nationale des Coopératives agricoles d'Elevage et d'Insémination Animale, 149 rue de Bercy, 75595 Paris Cedex 12, France
- M. N. Fouilloux
- Institut de l'Elevage, 149 rue de Bercy, 75595 Paris Cedex 12, France
- F. Phocas
- INRA, UMR 1313 Génétique Animale et Biologie Intégrative, 78350 Jouy-en-Josas, France
- AgroParisTech, UMR 1313 Génétique Animale et Biologie Intégrative, 75231 Paris, France
|
31
|
Morota G, Boddhireddy P, Vukasinovic N, Gianola D, Denise S. Kernel-based variance component estimation and whole-genome prediction of pre-corrected phenotypes and progeny tests for dairy cow health traits. Front Genet 2014; 5:56. [PMID: 24715901 PMCID: PMC3970026 DOI: 10.3389/fgene.2014.00056] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 11/14/2013] [Accepted: 03/04/2014] [Indexed: 11/13/2022]
Abstract
Prediction of complex trait phenotypes in the presence of unknown gene action is an ongoing challenge in animals, plants, and humans. Development of flexible predictive models that perform well irrespective of genetic and environmental architectures is desirable. Methods that can address non-additive variation in a non-explicit manner are gaining attention for this purpose and, in particular, semi-parametric kernel-based methods have been applied to diverse datasets, mostly providing encouraging results. On the other hand, the gains obtained from these methods have been smaller when smoothed values such as estimated breeding value (EBV) have been used as response variables. However, less emphasis has been placed on the choice of phenotypes to be used in kernel-based whole-genome prediction. This study aimed to evaluate differences between semi-parametric and parametric approaches using two types of response variables and molecular markers as inputs. Pre-corrected phenotypes (PCP) and EBV obtained for dairy cow health traits were used for this comparison. We observed that non-additive genetic variances were major contributors to total genetic variances in PCP, whereas additivity was the largest contributor to variability of EBV, as expected. Within the kernels evaluated, non-parametric methods yielded slightly better predictive performance across traits relative to their additive counterparts regardless of the type of response variable used. This reinforces the view that non-parametric kernels aiming to capture non-linear relationships between a panel of SNPs and phenotypes are appealing for complex trait prediction. However, like past studies, the gain in predictive correlation was not large for either PCP or EBV. We conclude that capturing non-additive genetic variation, especially epistatic variation, in a cross-validation framework remains a significant challenge even when it is important, as seems to be the case for health traits in dairy cows.
Affiliation(s)
- Gota Morota
- Department of Animal Sciences, University of Wisconsin-Madison, Madison, WI, USA
- Daniel Gianola
- Department of Animal Sciences, University of Wisconsin-Madison, Madison, WI, USA; Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA; Department of Dairy Science, University of Wisconsin-Madison, Madison, WI, USA
|
32
|
Guo G, Zhao F, Wang Y, Zhang Y, Du L, Su G. Comparison of single-trait and multiple-trait genomic prediction models. BMC Genet 2014; 15:30. [PMID: 24593261 PMCID: PMC3975852 DOI: 10.1186/1471-2156-15-30] [Citation(s) in RCA: 97] [Impact Index Per Article: 9.7] [Received: 11/19/2013] [Accepted: 02/26/2014] [Indexed: 11/30/2022]
Abstract
Background In this study, a single-trait genomic model (STGM) is compared with a multiple-trait genomic model (MTGM) for genomic prediction using conventional estimated breeding values (EBVs) calculated using conventional single-trait and multiple-trait linear mixed models as the response variables. Three scenarios with and without missing data were simulated: no missing data, 90% missing data in a trait with high heritability, and 90% missing data in a trait with low heritability. The simulated genome had a length of 500 cM with 5000 equally spaced single nucleotide polymorphism markers and 300 randomly distributed quantitative trait loci (QTL). The true breeding values of each trait were determined using 200 of the QTLs, and the remaining 100 QTLs were assumed to affect both the high (trait I with heritability of 0.3) and the low (trait II with heritability of 0.05) heritability traits. The genetic correlation between traits I and II was 0.5, and the residual correlation was zero. Results The results showed that when there were no missing records, MTGM and STGM gave the same reliability for the genomic predictions for trait I while, for trait II, MTGM performed better than STGM. When there were missing records for one of the two traits, MTGM performed much better than STGM. In general, the difference in reliability of genomic EBVs predicted using the EBV response variables estimated from either the multiple-trait or single-trait models was relatively small for the trait without missing data. However, for the trait with missing data, the EBV response variable obtained from the multiple-trait model gave a more reliable genomic prediction than the EBV response variable from the single-trait model. Conclusions These results indicate that MTGM performed better than STGM for the trait with low heritability and for the trait with a limited number of records. Even when the EBV response variable was obtained using the multiple-trait model, the genomic prediction using MTGM was more reliable than the prediction using the STGM.
Affiliation(s)
- Lixin Du
- National Center for Molecular Genetics and Breeding of Animal, Institute of Animal Sciences, Chinese academy of Agricultural Sciences, Beijing 100193, China.
|
34
|
Lourenco D, Misztal I, Tsuruta S, Aguilar I, Ezra E, Ron M, Shirak A, Weller J. Methods for genomic evaluation of a relatively small genotyped dairy population and effect of genotyped cow information in multiparity analyses. J Dairy Sci 2014; 97:1742-52. [DOI: 10.3168/jds.2013-6916] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.9] [Received: 09/10/2013] [Accepted: 12/06/2013] [Indexed: 01/06/2023]
|
35
|
Boddhireddy P, Kelly MJ, Northcutt S, Prayaga KC, Rumph J, DeNise S. Genomic predictions in Angus cattle: Comparisons of sample size, response variables, and clustering methods for cross-validation. J Anim Sci 2014; 92:485-97. [DOI: 10.2527/jas.2013-6757] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.6] [Indexed: 12/20/2022]
Affiliation(s)
- M. J. Kelly
- Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane St. Lucia, QLD, 4072, Australia
- S. Northcutt
- American Angus Association, 3201 Frederick Ave, Saint Joseph, MO 64506
- K. C. Prayaga
- Zoetis Inc., 45 Poplar Road, Parkville, Victoria, 3052, Australia
- J. Rumph
- Zoetis Inc., Kalamazoo, MI 49007
|
36
|
Ding X, Zhang Z, Li X, Wang S, Wu X, Sun D, Yu Y, Liu J, Wang Y, Zhang Y, Zhang S, Zhang Y, Zhang Q. Accuracy of genomic prediction for milk production traits in the Chinese Holstein population using a reference population consisting of cows. J Dairy Sci 2013; 96:5315-23. [DOI: 10.3168/jds.2012-6194] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.6] [Received: 09/24/2012] [Accepted: 03/25/2013] [Indexed: 11/19/2022]
|
37
|
Sahana G, Guldbrandtsen B, Thomsen B, Lund MS. Confirmation and fine-mapping of clinical mastitis and somatic cell score QTL in Nordic Holstein cattle. Anim Genet 2013; 44:620-6. [DOI: 10.1111/age.12053] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Accepted: 03/18/2013] [Indexed: 11/27/2022]
Affiliation(s)
- G. Sahana
- Department of Molecular Biology and Genetics; Faculty of Science and Technology; Aarhus University; DK-8830 Tjele Denmark
- B. Guldbrandtsen
- Department of Molecular Biology and Genetics; Faculty of Science and Technology; Aarhus University; DK-8830 Tjele Denmark
- B. Thomsen
- Department of Molecular Biology and Genetics; Faculty of Science and Technology; Aarhus University; DK-8830 Tjele Denmark
- M. S. Lund
- Department of Molecular Biology and Genetics; Faculty of Science and Technology; Aarhus University; DK-8830 Tjele Denmark
|
38
|
Gao H, Lund MS, Zhang Y, Su G. Accuracy of genomic prediction using different models and response variables in the Nordic Red cattle population. J Anim Breed Genet 2013; 130:333-40. [PMID: 24074170 DOI: 10.1111/jbg.12039] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 11/21/2012] [Accepted: 03/13/2013] [Indexed: 12/01/2022]
Abstract
Breeding animals can be accurately evaluated using appropriate genomic prediction models, based on marker data and phenotype information. In this study, direct genomic values (DGV) were estimated for 16 traits of the Nordic Total Merit (NTM) index in the Nordic Red cattle population using three models and two different response variables. The three models were as follows: a linear mixed model (GBLUP), a Bayesian variable selection model similar to BayesA (BayesA*) and a Bayesian least absolute shrinkage and selection operator model (Bayesian Lasso). The response variables were deregressed proofs (DRP) and conventional estimated breeding values (EBV). The reliability of genomic predictions was measured on bulls in the validation data set as the squared correlation between DGV and DRP divided by the reliability of DRP. Using DRP as response variable, the reliabilities of DGV among the 16 traits ranged from 0.151 to 0.569 (average 0.317) for GBLUP, from 0.152 to 0.576 (average 0.318) for BayesA* and from 0.150 to 0.570 (average 0.320) for Bayesian Lasso. Using EBV as response variable, the reliabilities ranged from 0.159 to 0.580 (average 0.322) for GBLUP, from 0.157 to 0.578 (average 0.319) for BayesA* and from 0.159 to 0.582 (average 0.325) for Bayesian Lasso. In summary, Bayesian Lasso performed slightly better than the other two models, and EBV performed slightly better than DRP as response variable, with regard to prediction reliability of DGV. However, these differences were not statistically significant. Moreover, using EBV as response variable would result in problems with the scale of the resulting DGV and potential problems due to double counting.
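The reliability measure defined in this abstract (squared correlation between DGV and DRP, divided by the reliability of DRP) can be written out as a short sketch. The function names and the toy values are illustrative, and using a single average DRP reliability for the validation bulls is an assumption, since the abstract does not say how that reliability was summarised.

```python
def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def validation_reliability(dgv, drp, drp_reliability):
    """Squared correlation of DGV with DRP, scaled by the DRP reliability.

    Dividing by the DRP reliability corrects for the fact that DRP is itself
    a noisy measure of the true breeding value.
    """
    if not 0.0 < drp_reliability <= 1.0:
        raise ValueError("drp_reliability must be in (0, 1]")
    r = pearson(dgv, drp)
    return r * r / drp_reliability


# Hypothetical validation bulls, not data from the study.
dgv = [1.0, 2.0, 3.0, 4.0]
drp = [2.0, 1.0, 4.0, 3.0]
print(validation_reliability(dgv, drp, drp_reliability=0.8))
```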
Affiliation(s)
- H Gao
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, Tjele, Denmark; College of Animal Science and Technology, China Agricultural University, Beijing, China
|
39
|
de Los Campos G, Hickey JM, Pong-Wong R, Daetwyler HD, Calus MPL. Whole-genome regression and prediction methods applied to plant and animal breeding. Genetics 2013; 193:327-45. [PMID: 22745228 PMCID: PMC3567727 DOI: 10.1534/genetics.112.143313] [Citation(s) in RCA: 471] [Impact Index Per Article: 42.8] [Received: 05/18/2012] [Accepted: 06/11/2012] [Indexed: 11/18/2022]
Abstract
Genomic-enabled prediction is becoming increasingly important in animal and plant breeding and is also receiving attention in human genetics. Deriving accurate predictions of complex traits requires implementing whole-genome regression (WGR) models where phenotypes are regressed on thousands of markers concurrently. Methods exist that allow implementing these large-p with small-n regressions, and genome-enabled selection (GS) is being implemented in several plant and animal breeding programs. The list of available methods is long, and the relationships between them have not been fully addressed. In this article we provide an overview of available methods for implementing parametric WGR models, discuss selected topics that emerge in applications, and present a general discussion of lessons learned from simulation and empirical data analysis in the last decade.
Affiliation(s)
- Gustavo de Los Campos
- Department of Biostatistics, School of Public Health, University of Alabama, Birmingham, AL 35294, USA.
|
40
|
Koivula M, Strandén I, Su G, Mäntysaari EA. Different methods to calculate genomic predictions--comparisons of BLUP at the single nucleotide polymorphism level (SNP-BLUP), BLUP at the individual level (G-BLUP), and the one-step approach (H-BLUP). J Dairy Sci 2012; 95:4065-73. [PMID: 22720963 DOI: 10.3168/jds.2011-4874] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Received: 08/26/2011] [Accepted: 02/18/2012] [Indexed: 11/19/2022]
Abstract
Several strategies to use genomic data in predictions have been proposed. The aim of this study was to compare different genomic prediction methods. The response variables used in the genomic predictions were deregressed proofs, which were derived from 2 estimated breeding value (EBV) data sets. The full EBV data set from March 2010 included the EBV for production and mastitis traits for all Nordic red bulls. The reduced data set included the same animals as the full data set, but the EBV were predicted from a data set that excluded the last 5 yr of observations. Genomic predictions were obtained using different BLUP models: BLUP at the single nucleotide polymorphism level (SNP-BLUP), BLUP at the individual level (G-BLUP), and the one-step approach (H-BLUP). For the selection candidate bulls, the SNP-BLUP and G-BLUP models gave the same direct genomic breeding values (e.g., correlation of direct genomic breeding values between SNP-BLUP and G-BLUP for protein was 0.99), but slightly different from genomic EBV obtained from H-BLUP (correlations of SNP-BLUP or G-BLUP with H-BLUP were about 0.96). For all traits, SNP-BLUP and G-BLUP gave the same validation reliability, whereas H-BLUP led to slightly higher reliability. Therefore, the results support a slight advantage of using H-BLUP for genomic evaluation.
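The near-identical results reported for SNP-BLUP and G-BLUP reflect a standard algebraic identity. For a centred genotype matrix Z, a centred response y and the same shrinkage parameter λ, the fitted genomic values Z(Z'Z + λI)⁻¹Z'y and ZZ'(ZZ' + λI)⁻¹y coincide. The sketch below checks this numerically on toy data. It assumes no fixed effects, equal weights for all records, and an unscaled relationship matrix ZZ' (any scaling is absorbed into λ); it is not the model used in the study, and all names and values are hypothetical.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def add_ridge(a, lam):
    """Return a + lam * I."""
    return [[v + (lam if i == j else 0.0) for j, v in enumerate(row)]
            for i, row in enumerate(a)]


def snp_blup(z, y, lam):
    """Estimate marker effects, then sum them per animal."""
    zt = transpose(z)
    rhs = [sum(zi * yi for zi, yi in zip(col, y)) for col in zt]
    effects = solve(add_ridge(matmul(zt, z), lam), rhs)
    return [sum(zij * a for zij, a in zip(row, effects)) for row in z]


def g_blup(z, y, lam):
    """Predict animal values directly from the relationship matrix ZZ'."""
    k = matmul(z, transpose(z))
    alpha = solve(add_ridge(k, lam), y)
    return [sum(kij * a for kij, a in zip(row, alpha)) for row in k]


# Hypothetical centred genotypes (4 animals x 3 markers) and centred response.
z = [[1.0, 0.0, -1.0], [0.0, 1.0, 1.0], [-1.0, 1.0, 0.0], [0.0, -2.0, 0.0]]
y = [0.5, -0.2, 0.1, -0.4]
a = snp_blup(z, y, lam=2.0)
b = g_blup(z, y, lam=2.0)
print(max(abs(u - v) for u, v in zip(a, b)))  # difference at rounding-error level
```

The practical difference between the two is which system is solved: SNP-BLUP solves one equation per marker, G-BLUP one per animal, so the cheaper choice depends on which dimension is smaller.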
Affiliation(s)
- M Koivula
- MTT Agrifood Research Finland, Biotechnology and Food Research, Biometrical Genetics FI-31600 Jokioinen, Finland.
|
41
|
Pintus MA, Gaspa G, Nicolazzi EL, Vicario D, Rossoni A, Ajmone-Marsan P, Nardone A, Dimauro C, Macciotta NPP. Prediction of genomic breeding values for dairy traits in Italian Brown and Simmental bulls using a principal component approach. J Dairy Sci 2012; 95:3390-400. [PMID: 22612973 DOI: 10.3168/jds.2011-4274] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 02/15/2011] [Accepted: 02/13/2012] [Indexed: 01/18/2023]
Abstract
The large number of markers available compared with phenotypes represents one of the main issues in genomic selection. In this work, principal component analysis was used to reduce the number of predictors for calculating genomic breeding values (GEBV). Bulls of 2 cattle breeds farmed in Italy (634 Brown and 469 Simmental) were genotyped with the 54K Illumina beadchip (Illumina Inc., San Diego, CA). After data editing, 37,254 and 40,179 single nucleotide polymorphisms (SNP) were retained for Brown and Simmental, respectively. Principal component analysis carried out on the SNP genotype matrix extracted 2,257 and 3,596 new variables in the 2 breeds, respectively. Bulls were sorted by birth year to create reference and prediction populations. The effect of principal components on deregressed proofs in reference animals was estimated with a BLUP model. Results were compared with those obtained by using SNP genotypes as predictors with either the BLUP or Bayes_A method. Traits considered were milk, fat, and protein yields, fat and protein percentages, and somatic cell score. The GEBV were obtained for prediction population by blending direct genomic prediction and pedigree indexes. No substantial differences were observed in squared correlations between GEBV and EBV in prediction animals between the 3 methods in the 2 breeds. The principal component analysis method allowed for a reduction of about 90% in the number of independent variables when predicting direct genomic values, with a substantial decrease in calculation time and without loss of accuracy.
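The "about 90%" reduction quoted in this abstract can be checked directly from the SNP and principal component counts it reports. The short calculation below uses only those four numbers; the function name is illustrative.

```python
def predictor_reduction(n_snps, n_components):
    """Fraction of predictors removed when SNPs are replaced by components."""
    return 1.0 - n_components / n_snps


# Counts as reported in the abstract: SNPs retained after editing and
# principal components extracted, per breed.
reported = [("Brown", 37254, 2257), ("Simmental", 40179, 3596)]
for breed, n_snps, n_components in reported:
    share = predictor_reduction(n_snps, n_components)
    print(f"{breed}: {share:.1%} fewer predictors")
```

The two breeds come out at roughly 94% and 91%, consistent with the rounded figure in the abstract.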
Affiliation(s)
- M A Pintus
- Dipartimento di Scienze Zootecniche, Università di Sassari, Sassari 07100, Italy
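The principal component approach summarized in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the 90% variance threshold, the ridge shrinkage value `lam`, the toy data, and the function names `pca_scores` and `ridge_effects` are all assumptions made for the example.

```python
import numpy as np

def pca_scores(genotypes, var_explained=0.90):
    """Return PC scores, keeping enough components to reach var_explained."""
    centered = genotypes - genotypes.mean(axis=0)        # center each SNP column
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)           # cumulative share of variance
    k = min(int(np.searchsorted(cumulative, var_explained)) + 1, len(s))
    return u[:, :k] * s[:k]                               # animals x k score matrix

def ridge_effects(x, y, lam):
    """Ridge (BLUP-like) solution for predictor effects with shrinkage lam."""
    return np.linalg.solve(x.T @ x + lam * np.eye(x.shape[1]), x.T @ (y - y.mean()))

rng = np.random.default_rng(0)
genotypes = rng.integers(0, 3, size=(60, 500)).astype(float)  # toy 0/1/2 SNP codes
response = rng.normal(size=60)                                 # toy deregressed proofs
scores = pca_scores(genotypes)                                 # reduced predictors
effects = ridge_effects(scores, response, lam=10.0)            # assumed shrinkage value
predicted = response.mean() + scores @ effects                 # direct genomic values
```

Because the number of retained components can never exceed the number of animals, the regression is fitted on far fewer predictors than the original SNP panel, which is where the savings in calculation time come from.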
|
42
|
Su G, Christensen OF, Ostersen T, Henryon M, Lund MS. Estimating additive and non-additive genetic variances and predicting genetic merits using genome-wide dense single nucleotide polymorphism markers. PLoS One 2012; 7:e45293. [PMID: 23028912 PMCID: PMC3441703 DOI: 10.1371/journal.pone.0045293] [Citation(s) in RCA: 213] [Impact Index Per Article: 17.8] [Received: 06/13/2012] [Accepted: 08/14/2012] [Indexed: 11/29/2022] Open
Abstract
Non-additive genetic variation is usually ignored when genome-wide markers are used to study the genetic architecture and genomic prediction of complex traits in humans, wildlife, model organisms or farm animals. However, non-additive genetic effects may make an important contribution to the total genetic variation of complex traits. This study presented a genomic BLUP model including additive and non-additive genetic effects, in which additive and non-additive genetic relationship matrices were constructed from genome-wide dense single nucleotide polymorphism (SNP) markers. In addition, this study for the first time proposed a method to construct a dominance relationship matrix using SNP markers and demonstrated it in detail. The proposed model was used to investigate the amounts of additive genetic, dominance and epistatic variation, and to assess the accuracy and unbiasedness of genomic predictions for daily gain in pigs. In the analysis of daily gain, four linear models were used: 1) a simple additive genetic model (MA), 2) a model including both additive and additive by additive epistatic genetic effects (MAE), 3) a model including both additive and dominance genetic effects (MAD), and 4) a full model including all three genetic components (MAED). Estimates of narrow-sense heritability were 0.397, 0.373, 0.379 and 0.357 for models MA, MAE, MAD and MAED, respectively. Estimated dominance variance and additive by additive epistatic variance accounted for 5.6% and 9.5% of the total phenotypic variance, respectively. Based on model MAED, the estimate of broad-sense heritability was 0.506. Reliabilities of genomic predicted breeding values for the animals without performance records were 28.5%, 28.8%, 29.2% and 29.5% for models MA, MAE, MAD and MAED, respectively. In addition, models including non-additive genetic effects improved unbiasedness of genomic predictions.
Affiliation(s)
- Guosheng Su
- Department of Molecular Biology and Genetics, Aarhus University, AU-Foulum, Tjele, Denmark.
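The abstract above does not state the formulas, so the following is only a sketch of one common way to build additive and dominance relationship matrices from SNP codes. It assumes genotypes coded 0/1/2, allele frequencies estimated from the sample itself, and a heterozygote indicator centered by 2pq for the dominance part; the function name `relationship_matrices` and the toy data are hypothetical.

```python
import numpy as np

def relationship_matrices(m):
    """Build additive (G) and dominance (D) matrices from 0/1/2 genotype codes."""
    p = m.mean(axis=0) / 2.0                 # frequency of the counted allele
    q = 1.0 - p
    het = 2.0 * p * q                        # expected heterozygosity per SNP
    z = m - 2.0 * p                          # centered additive codes
    h = (m == 1).astype(float) - het         # heterozygote indicator minus 2pq
    g = z @ z.T / het.sum()                  # additive genomic relationships
    d = h @ h.T / (het * (1.0 - het)).sum()  # dominance relationships
    return g, d

rng = np.random.default_rng(1)
m = rng.integers(0, 3, size=(20, 300)).astype(float)  # toy genotype matrix
g, d = relationship_matrices(m)
```

Both matrices are cross-products of centered code matrices, so they are symmetric and positive semi-definite and can be used as covariance structures for the additive and dominance effects in a genomic BLUP model.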
|
43
|
van Hulzen KJE, Schopen GCB, van Arendonk JAM, Nielen M, Koets AP, Schrooten C, Heuven HCM. Genome-wide association study to identify chromosomal regions associated with antibody response to Mycobacterium avium subspecies paratuberculosis in milk of Dutch Holstein-Friesians. J Dairy Sci 2012; 95:2740-8. [PMID: 22541504 DOI: 10.3168/jds.2011-5005] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Received: 09/30/2011] [Accepted: 01/13/2012] [Indexed: 11/19/2022]
Abstract
Heritability of susceptibility to Johne's disease in cattle has been shown to vary from 0.041 to 0.159. Although the presence of genetic variation involved in susceptibility to Johne's disease has been demonstrated, the understanding of genes contributing to the genetic variance is far from complete. The objective of this study was to contribute to further understanding of genetic variation involved in susceptibility to Johne's disease by identifying associated chromosomal regions using a genome-wide association approach. Log-transformed ELISA test results of 265,290 individual Holstein-Friesian cows from 3,927 herds from the Netherlands were analyzed to obtain sire estimated breeding values for Mycobacterium avium subspecies paratuberculosis (MAP)-specific antibody response in milk using a sire-maternal grandsire model with fixed effects for parity, year of birth, lactation stage, and herd; a covariate for milk yield on test day; and random effects for sire, maternal grandsire, and error. For 192 sires with estimated breeding values with a minimum reliability of 70%, single nucleotide polymorphism (SNP) typing was conducted by a multiple SNP analysis with a random polygenic effect fitting 37,869 SNP simultaneously. Five SNP associated with MAP-specific antibody response in milk were identified distributed over 4 chromosomal regions (chromosome 4, 15, 18, and 28). Thirteen putative SNP associated with MAP-specific antibody response in milk were identified distributed over 10 chromosomes (chromosome 4, 14, 16, 18, 19, 20, 21, 26, 27, and 29). This knowledge contributes to the current understanding of genetic variation involved in Johne's disease susceptibility and facilitates control of Johne's disease and improvement of health status by breeding.
Affiliation(s)
- K J E van Hulzen
- Department of Farm Animal Health, Utrecht University, Utrecht, the Netherlands.
|
44
|
Kumar S, Chagné D, Bink MCAM, Volz RK, Whitworth C, Carlisle C. Genomic selection for fruit quality traits in apple (Malus×domestica Borkh.). PLoS One 2012; 7:e36674. [PMID: 22574211 PMCID: PMC3344927 DOI: 10.1371/journal.pone.0036674] [Citation(s) in RCA: 107] [Impact Index Per Article: 8.9] [Received: 01/11/2012] [Accepted: 04/05/2012] [Indexed: 11/18/2022] Open
Abstract
The genome sequence of apple (Malus×domestica Borkh.) was published more than a year ago, which helped develop an 8K SNP chip to assist in implementing genomic selection (GS). In apple breeding programmes, GS can be used to obtain genomic breeding values (GEBV) for choosing next-generation parents or selections for further testing as potential commercial cultivars at a very early stage. Thus GS has the potential to accelerate breeding efficiency significantly because of decreased generation interval or increased selection intensity. We evaluated the accuracy of GS in a population of 1120 seedlings generated from a factorial mating design of four female and two male parents. All seedlings were genotyped using an Illumina Infinium chip comprising 8,000 single nucleotide polymorphisms (SNPs), and were phenotyped for various fruit quality traits. Random-regression best linear unbiased prediction (RR-BLUP) and the Bayesian LASSO method were used to obtain GEBV, and were compared using a cross-validation approach for their accuracy in predicting unobserved BLUP-BV. Accuracies were very similar for both methods, varying from 0.70 to 0.90 for various fruit quality traits. The selection response per unit time using GS compared with traditional BLUP-based selection was very high (>100%), especially for low-heritability traits. Genome-wide average estimated linkage disequilibrium (LD) between adjacent SNPs was 0.32, with a relatively slow decay of LD in the long range (r² = 0.33 and 0.19 at 100 kb and 1,000 kb, respectively), contributing to the higher accuracy of GS. Distribution of estimated SNP effects revealed involvement of large-effect genes with likely pleiotropic effects. These results demonstrated that genomic selection is a credible alternative to conventional selection for fruit quality traits.
Affiliation(s)
- Satish Kumar
- The New Zealand Institute for Plant & Food Research Limited, Havelock North, New Zealand.
|
45
|
Pérez-Cabal MA, Vazquez AI, Gianola D, Rosa GJM, Weigel KA. Accuracy of Genome-Enabled Prediction in a Dairy Cattle Population using Different Cross-Validation Layouts. Front Genet 2012; 3:27. [PMID: 22403583 PMCID: PMC3288819 DOI: 10.3389/fgene.2012.00027] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Received: 11/08/2011] [Accepted: 02/13/2012] [Indexed: 11/26/2022] Open
Abstract
The impact of extent of genetic relatedness on accuracy of genome-enabled predictions was assessed using a dairy cattle population and alternative cross-validation (CV) strategies were compared. The CV layouts consisted of training and testing sets obtained from either random allocation of individuals (RAN) or from a kernel-based clustering of individuals using the additive relationship matrix, to obtain two subsets that were as unrelated as possible (UNREL), as well as a layout based on stratification by generation (GEN). The UNREL layout decreased the average genetic relationships between training and testing animals but produced similar accuracies to the RAN design, which were about 15% higher than in the GEN setting. Results indicate that the CV structure can have an important effect on the accuracy of whole-genome predictions. However, the connection between average genetic relationships across training and testing sets and the estimated predictive ability is not straightforward, and may depend also on the kind of relatedness that exists between the two subsets and on the heritability of the trait. For high heritability traits, close relatives such as parents and full-sibs make the greatest contributions to accuracy, which can be compensated by half-sibs or grandsires in the case of lack of close relatives. However, for the low heritability traits the inclusion of close relatives is crucial and including more relatives of various types in the training set tends to lead to greater accuracy. In practice, CV designs should resemble the intended use of the predictive models, e.g., within or between family predictions, or within or across generation predictions, such that estimation of predictive ability is consistent with the actual application to be considered.
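Two of the cross-validation layouts named in the abstract above, random allocation (RAN) and stratification by generation (GEN), can be sketched as index splits. This is an illustrative sketch only: the function names, the 30% test fraction, and the toy generation labels are assumptions, and the kernel-based UNREL clustering is not reproduced here.

```python
import numpy as np

def random_split(n_animals, test_fraction, rng):
    """RAN layout: assign animals to training and testing sets at random."""
    order = rng.permutation(n_animals)
    n_test = int(round(n_animals * test_fraction))
    return np.sort(order[n_test:]), np.sort(order[:n_test])

def generation_split(generation):
    """GEN layout: train on older generations, test on the youngest one."""
    generation = np.asarray(generation)
    is_test = generation == generation.max()
    return np.flatnonzero(~is_test), np.flatnonzero(is_test)

rng = np.random.default_rng(2)
generation = np.array([0, 0, 0, 1, 1, 1, 1, 2, 2, 2])   # toy generation labels
ran_train, ran_test = random_split(len(generation), 0.3, rng)
gen_train, gen_test = generation_split(generation)
```

The GEN split mirrors the practical case of predicting young animals from older relatives, which is why it is often the more conservative layout for judging predictive ability.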
|
46
|
Rius-Vilarrasa E, Brøndum RF, Strandén I, Guldbrandtsen B, Strandberg E, Lund MS, Fikse WF. Influence of model specifications on the reliabilities of genomic prediction in a Swedish-Finnish red breed cattle population. J Anim Breed Genet 2012; 129:369-79. [PMID: 22963358 DOI: 10.1111/j.1439-0388.2012.00989.x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/26/2022]
Abstract
Using a combined multi-breed reference population, this study explored the influence of model specification and the effect of including a polygenic effect on the reliability of genomic breeding values (DGV and GEBV). The combined reference population consisted of 2986 Swedish Red Breed (SRB) and Finnish Ayrshire (FAY) dairy cattle. Bayesian methodology (common prior and mixture models with different prior distribution settings for the marker effects) as well as best linear unbiased prediction with a genomic relationship matrix [genomic best linear unbiased predictor (GBLUP)] was used in the prediction of DGV. Mixture models including a polygenic effect were used to predict GEBV. In total, five traits with low, medium and high heritability were analysed. For the models using a mixture prior distribution, reliabilities of DGV tended to decrease with an increasing proportion of markers with small effects. The influence of the inclusion of a polygenic effect on the reliability of DGV varied across traits and model specifications. The average correlation between DGV and the Mendelian sampling term, across traits, was highest (R² = 0.25) for the GBLUP model and decreased with increasing proportion of markers with large effects. Reliabilities increased when DGV and parent average information were combined in an index. The GBLUP model with the largest gain across traits in the reliability of the index achieved the highest DGV mean reliability. However, the polygenic models were shown to be less biased and more consistent in the estimation of DGV, regardless of the model specifications, compared with the mixture models without the polygenic effect.
Affiliation(s)
- E Rius-Vilarrasa
- Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Uppsala, Sweden.
|
47
|
Su G, Madsen P, Nielsen U, Mäntysaari E, Aamand G, Christensen O, Lund M. Genomic prediction for Nordic Red Cattle using one-step and selection index blending. J Dairy Sci 2012; 95:909-17. [DOI: 10.3168/jds.2011-4804] [Citation(s) in RCA: 93] [Impact Index Per Article: 7.8] [Received: 08/05/2011] [Accepted: 10/13/2011] [Indexed: 11/19/2022]
|
48
|
|
49
|
Lund MS, Roos APWD, Vries AGD, Druet T, Ducrocq V, Fritz S, Guillaume F, Guldbrandtsen B, Liu Z, Reents R, Schrooten C, Seefried F, Su G. A common reference population from four European Holstein populations increases reliability of genomic predictions. Genet Sel Evol 2011; 43:43. [PMID: 22152008 PMCID: PMC3292506 DOI: 10.1186/1297-9686-43-43] [Citation(s) in RCA: 162] [Impact Index Per Article: 12.5] [Received: 05/17/2011] [Accepted: 12/12/2011] [Indexed: 11/17/2022] Open
Abstract
Background: Size of the reference population and reliability of phenotypes are crucial factors influencing the reliability of genomic predictions. It is therefore useful to combine closely related populations. Increased accuracies of genomic predictions depend on the number of individuals added to the reference population, the reliability of their phenotypes, and the relatedness of the populations that are combined. Methods: This paper assesses the increase in reliability achieved when combining four Holstein reference populations of 4000 bulls each, from European breeding organizations, i.e. UNCEIA (France), VikingGenetics (Denmark, Sweden, Finland), DHV-VIT (Germany) and CRV (The Netherlands, Flanders). Each partner validated its own bulls using their national reference data and the combined data, respectively. Results: Combining the data significantly increased the reliability of genomic predictions for bulls in all four populations. Reliabilities increased by 10%, compared to reliabilities obtained with national reference populations alone, when they were averaged over countries and the traits evaluated. For different traits and countries, the increase in reliability ranged from 2% to 19%. Conclusions: Genomic selection programs benefit greatly from combining data from several closely related populations into a single large reference population.
Affiliation(s)
- Mogens S Lund
- Department of Molecular Biology and Genetics, Faculty of Science and Technology, Aarhus University, AU-Foulum, PO Box 50, DK-8830 Tjele, Denmark.
|
50
|
Ostersen T, Christensen OF, Henryon M, Nielsen B, Su G, Madsen P. Deregressed EBV as the response variable yield more reliable genomic predictions than traditional EBV in pure-bred pigs. Genet Sel Evol 2011; 43:38. [PMID: 22070746 PMCID: PMC3354418 DOI: 10.1186/1297-9686-43-38] [Citation(s) in RCA: 85] [Impact Index Per Article: 6.5] [Received: 03/01/2011] [Accepted: 11/09/2011] [Indexed: 01/11/2023] Open
Abstract
BACKGROUND: Genomic selection can be implemented by a multi-step procedure, which requires a response variable and a statistical method. For pure-bred pigs, it was hypothesised that deregressed estimated breeding values (EBV) with the parent average removed as the response variable generate higher reliabilities of genomic breeding values than EBV, and that the normal, thick-tailed and mixture-distribution models yield similar reliabilities. METHODS: Reliabilities of genomic breeding values were estimated with EBV and deregressed EBV as response variables and under the three statistical methods, genomic BLUP, Bayesian Lasso and MIXTURE. The methods were examined by splitting data into a reference data set of 1375 genotyped animals that were performance tested before October 2008, and 536 genotyped validation animals that were performance tested after October 2008. The traits examined were daily gain and feed conversion ratio. RESULTS: Using deregressed EBV as the response variable yielded 18 to 39% higher reliabilities of the genomic breeding values than using EBV as the response variable. For daily gain, the increase in reliability due to deregression was significant and approximately 35%, whereas for feed conversion ratio it ranged between 18 and 39% and was significant only when MIXTURE was used. Genomic BLUP, Bayesian Lasso and MIXTURE had similar reliabilities. CONCLUSIONS: Deregressed EBV is the preferred response variable, whereas the choice of statistical method is less critical for pure-bred pigs. The increase of 18 to 39% in reliability is worthwhile, since the reliabilities of the genomic breeding values directly affect the returns from genomic selection.
Affiliation(s)
- Tage Ostersen
- The Danish Agricultural and Food Council, Pig Research Centre, Breeding and Genetics, Axeltorv 3, DK-1609, Denmark.
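In its simplest textbook form, deregression divides an EBV by its reliability to undo the shrinkage toward the mean. The sketch below shows only that simple form; the study summarized above additionally removed the parent average, which is not reproduced here. The function name `deregress`, the `min_reliability` cutoff, and the example values are assumptions made for illustration.

```python
import numpy as np

def deregress(ebv, reliability, min_reliability=0.05):
    """Simple deregression: divide each EBV by its reliability."""
    ebv = np.asarray(ebv, dtype=float)
    reliability = np.asarray(reliability, dtype=float)
    keep = reliability >= min_reliability      # drop near-zero reliabilities (assumed cutoff)
    return ebv[keep] / reliability[keep], keep

ebv = np.array([1.2, -0.4, 0.8, 0.1])           # toy EBV
reliability = np.array([0.80, 0.50, 0.40, 0.01])  # toy reliabilities
debv, keep = deregress(ebv, reliability)
```

Animals with very low reliability are excluded because dividing by a value near zero inflates the deregressed value far beyond what the data support.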
|