1
Villar-Hernández BDJ, Pérez-Rodríguez P, Vitale P, Gerard G, Montesinos-Lopez OA, Saint Pierre C, Crossa J, Dreisigacker S. Optimizing Genomic Parental Selection for Categorical and Continuous-Categorical Multi-Trait Mixtures. Genes (Basel) 2024; 15:995. [PMID: 39202356 PMCID: PMC11353433 DOI: 10.3390/genes15080995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2024] [Revised: 07/20/2024] [Accepted: 07/25/2024] [Indexed: 09/03/2024] Open
Abstract
This study presents a novel approach for optimizing genomic parental selection in breeding programs involving categorical and continuous-categorical multi-trait mixtures (CMs and CCMMs). Using Bayesian decision theory (BDT) and latent trait models within a multivariate normal distribution framework, we address the complexities of selecting new parental lines across ordinal and continuous traits for breeding. Our methodology enhances precision and flexibility in genetic selection and is validated through extensive simulations. This unified approach has significant potential to advance genetic improvement in diverse breeding contexts and underscores the importance of integrating both categorical and continuous traits in genomic selection frameworks.
Affiliation(s)
- Bartolo de Jesús Villar-Hernández
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera México-Veracruz, Texcoco CP 52640, Estado de México, Mexico; (B.d.J.V.-H.); (P.V.); (G.G.); (C.S.P.)
- Paolo Vitale
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera México-Veracruz, Texcoco CP 52640, Estado de México, Mexico; (B.d.J.V.-H.); (P.V.); (G.G.); (C.S.P.)
- Guillermo Gerard
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera México-Veracruz, Texcoco CP 52640, Estado de México, Mexico; (B.d.J.V.-H.); (P.V.); (G.G.); (C.S.P.)
- Carolina Saint Pierre
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera México-Veracruz, Texcoco CP 52640, Estado de México, Mexico; (B.d.J.V.-H.); (P.V.); (G.G.); (C.S.P.)
- José Crossa
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera México-Veracruz, Texcoco CP 52640, Estado de México, Mexico; (B.d.J.V.-H.); (P.V.); (G.G.); (C.S.P.)
- Colegio de Postgraduados, Montecillos CP 56230, Estado de México, Mexico
- Louisiana State University, Baton Rouge, LA 70803, USA
- Distinguished Scientist Fellowship Program and Department of Statistics and Operations Research, King Saud University, Riyadh 11459, Saudi Arabia
- Susanne Dreisigacker
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera México-Veracruz, Texcoco CP 52640, Estado de México, Mexico; (B.d.J.V.-H.); (P.V.); (G.G.); (C.S.P.)
2
Zhang M, Xu L, Lu H, Luo H, Zhou J, Wang D, Zhang X, Huang X, Wang Y. Genomic prediction based on a joint reference population for the Xinjiang Brown cattle. Front Genet 2024; 15:1394636. [PMID: 38737126 PMCID: PMC11082323 DOI: 10.3389/fgene.2024.1394636] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2024] [Accepted: 04/10/2024] [Indexed: 05/14/2024] Open
Abstract
Introduction: Xinjiang Brown cattle constitute the largest breed of cattle in Xinjiang. Therefore, it is crucial to establish a genomic evaluation system, especially for those with low levels of breed improvement. Methods: This study aimed to establish a cross-breed joint reference population by analyzing the genetic structure of 485 Xinjiang Brown cattle and 2,633 Chinese Holstein cattle (Illumina GeneSeek GGP bovine 150 K chip). The Bayes method single-step genome-wide best linear unbiased prediction was used to conduct a genomic evaluation of the joint reference population for the milk traits of Xinjiang Brown cattle. The reference population of Chinese Holstein cattle was randomly divided into groups to construct the joint reference population. By comparing the prediction accuracy, estimation bias, and inflation coefficient of the validation population, the optimal number of joint reference populations was determined. Results and Discussion: The results indicated a distinct difference in genetic structure between the two breeds of adult cows, and both breeds should be considered when constructing multi-breed joint reference and validation populations. The reliability of genomic prediction of milk traits in the joint reference population ranged from 0.142 to 0.465. Initially, it was determined that the inclusion of 600 and 900 Chinese Holstein cattle in the joint reference population positively impacted the genomic prediction of Xinjiang Brown cattle to a certain extent. It was feasible to incorporate Chinese Holstein cattle into the Xinjiang Brown cattle population to form a joint reference population for multi-breed genomic evaluation. However, for different Xinjiang Brown cattle populations, a fixed number of Chinese Holstein cattle cannot be directly added during multi-breed genomic selection. Pre-evaluation analysis based on the genetic structure, kinship, and other factors of the current population is required to ensure the authenticity and reliability of genomic predictions and improve estimation accuracy.
Affiliation(s)
- Menghua Zhang
- College of Animal Science, Xinjiang Agricultural University, Urumqi, China
- Lei Xu
- College of Animal Science, Xinjiang Agricultural University, Urumqi, China
- Haibo Lu
- Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture of China, National Engineering Laboratory of Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China
- Hanpeng Luo
- Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture of China, National Engineering Laboratory of Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China
- Jinghang Zhou
- Shijiazhuang Molbreeding Biotechnology Co., Ltd., Shijiazhuang, China
- Dan Wang
- College of Animal Science, Xinjiang Agricultural University, Urumqi, China
- Xiaoxue Zhang
- College of Animal Science, Xinjiang Agricultural University, Urumqi, China
- Xixia Huang
- College of Animal Science, Xinjiang Agricultural University, Urumqi, China
- Yachun Wang
- Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture of China, National Engineering Laboratory of Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China
3
Herry F, Hérault F, Lecerf F, Lagoutte L, Doublet M, Picard-Druet D, Bardou P, Varenne A, Burlot T, Le Roy P, Allais S. Restriction site-associated DNA sequencing technologies as an alternative to low-density SNP chips for genomic selection: a simulation study in layer chickens. BMC Genomics 2023; 24:271. [PMID: 37208589 DOI: 10.1186/s12864-023-09321-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2022] [Accepted: 04/18/2023] [Indexed: 05/21/2023] Open
Abstract
BACKGROUND To reduce the cost of genomic selection, a low-density (LD) single nucleotide polymorphism (SNP) chip can be used in combination with imputation for genotyping selection candidates instead of using a high-density (HD) SNP chip. Next-generation sequencing (NGS) techniques have been increasingly used in livestock species but remain expensive for routine use for genomic selection. An alternative and cost-efficient solution is to use restriction site-associated DNA sequencing (RADseq) techniques to sequence only a fraction of the genome using restriction enzymes. From this perspective, use of RADseq techniques followed by an imputation step on HD chip as alternatives to LD chips for genomic selection was studied in a pure layer line. RESULTS Genome reduction and sequencing fragments were identified on reference genome using four restriction enzymes (EcoRI, TaqI, AvaII and PstI) and a double-digest RADseq (ddRADseq) method (TaqI-PstI). The SNPs contained in these fragments were detected from the 20X sequence data of the individuals in our population. Imputation accuracy on HD chip with these genotypes was assessed as the mean correlation between true and imputed genotypes. Several production traits were evaluated using single-step GBLUP methodology. The impact of imputation errors on the ranking of the selection candidates was assessed by comparing a genomic evaluation based on ancestry using true HD or imputed HD genotyping. The relative accuracy of genomic estimated breeding values (GEBVs) was investigated by considering the GEBVs estimated on offspring as a reference. With AvaII or PstI and ddRADseq with TaqI and PstI, more than 10 K SNPs were detected in common with the HD SNP chip, resulting in an imputation accuracy greater than 0.97. The impact of imputation errors on genomic evaluation of the breeders was reduced, with a Spearman correlation greater than 0.99. Finally, the relative accuracy of GEBVs was equivalent. 
CONCLUSIONS RADseq approaches can be interesting alternatives to low-density SNP chips for genomic selection. With more than 10 K SNPs in common with the SNPs of the HD SNP chip, good imputation and genomic evaluation results can be obtained. However, with real data, heterogeneity between individuals with missing data must be considered.
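The abstract above reports a Spearman correlation greater than 0.99 between breeder rankings obtained with true and imputed HD genotypes. As a minimal sketch of how such a rank correlation can be computed, the following uses the classical no-ties formula. The function names and the toy GEBV values are hypothetical and are not taken from the study.

```python
def ranks(values):
    """Rank values from 1 (largest) to n (smallest). Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman_no_ties(x, y):
    """Spearman rank correlation via 1 - 6*sum(d^2) / (n*(n^2 - 1)).

    This shortcut formula is only valid when neither list contains ties.
    """
    n = len(x)
    rank_x, rank_y = ranks(x), ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


# Hypothetical GEBVs for five candidates (illustrative values only).
gebv_true_hd = [1.20, 0.85, 0.40, -0.10, -0.75]
gebv_imputed_hd = [1.15, 0.90, -0.05, 0.35, -0.80]

rho = spearman_no_ties(gebv_true_hd, gebv_imputed_hd)
print(round(rho, 3))
```

In this toy case two neighbouring candidates swap ranks, which is the kind of re-ranking that imputation errors could cause; with real data containing tied values, an average-rank implementation would be needed instead.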
Affiliation(s)
- Florian Herry
- NOVOGEN, 5 rue des Compagnons, Secteur du Vau Ballier, Plédran, 22960, France
- PEGASE, INRAE, Institut Agro, Saint-Gilles, 35590, France
- Philippe Bardou
- SIGENAE, GenPhySE, Université de Toulouse, INRA, ENVT, 24 chemin de Borde-Rouge - Auzeville Tolosane, Castanet Tolosan, 31326, France
- Amandine Varenne
- NOVOGEN, 5 rue des Compagnons, Secteur du Vau Ballier, Plédran, 22960, France
- Thierry Burlot
- NOVOGEN, 5 rue des Compagnons, Secteur du Vau Ballier, Plédran, 22960, France
- Pascale Le Roy
- PEGASE, INRAE, Institut Agro, Saint-Gilles, 35590, France
- Sophie Allais
- PEGASE, INRAE, Institut Agro, Saint-Gilles, 35590, France
4
Xu P, Li D, Wu Z, Ni L, Liu J, Tang Y, Yu T, Ren J, Zhao X, Huang M. An imputation-based genome-wide association study for growth and fatness traits in Sujiang pigs. Animal 2022; 16:100591. [PMID: 35872387 DOI: 10.1016/j.animal.2022.100591] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2021] [Revised: 06/15/2022] [Accepted: 06/16/2022] [Indexed: 11/01/2022] Open
Abstract
Sujiang pigs are a synthetic breed derived from Jiangquhai, Fengjing, and Duroc pigs. In this study, we sequenced the genome of 62 pigs with a coverage depth of 10× to 20×, including 27 Sujiang and 35 founder breed pigs, and we collected genome sequence data of 360 pigs worldwide from public databases, including 39 Duroc pigs. We obtained a high-quality variant dataset of 365 Sujiang pigs by imputing the porcine 80 K single nucleotide polymorphism (SNP) Beadchip to the whole-genome scale with a total of 422 pigs as a reference panel. The dataset of 365 imputed Sujiang pigs was used to perform single-trait genome-wide association studies (GWAS) and meta-analyses for growth and fatness traits. Single-trait GWAS identified 1 907, 18, and 14 SNPs surpassing the suggestive significance threshold for backfat thickness, chest circumference, and chest width, respectively. Meta-analyses identified 2 400 genome-wide significant SNPs and 520 suggestively significant SNPs for backfat thickness and chest circumference, and 719 genome-wide significant SNPs and 1 225 suggestively significant SNPs for all seven traits. According to the meta-analysis of backfat thickness and chest circumference, a remarkable region of 2.69 Mb on Sus scrofa chromosome 4 containing FAM110B, IMPAD1, LYN, MOS, PENK, PLAG1, SDR16C5 and XKR4 was identified as a candidate region. The haplotype heat map of the 2.69 Mb region verified that Sujiang pigs were derived from Duroc and Chinese indigenous pigs, especially Jiangquhai pigs. The Kruskal-Wallis test showed that haplotypes of the 2.69 Mb region significantly affected backfat thickness and chest circumference traits. We then focused on PLAG1, an important growth-related gene, and identified two synonymous SNPs in the PLAG1 gene with obvious differences among breeds. We then genotyped 365 Sujiang, 150 Duroc, 95 Jiangquhai, and 100 Fengjing pigs to confirm the above result and verified that the two variants significantly affected phenotypes of growth and fatness traits. Our findings not only provide insights into the genetic architecture of porcine growth and fatness traits but also provide potential markers for selective breeding of these traits in Sujiang pigs.
Affiliation(s)
- Pan Xu
- School of Animal Science and Technology, Jiangsu Agri-animal Husbandry Vocational College, Taizhou, PR China
- Desen Li
- College of Animal Science, South China Agricultural University, Guangzhou, PR China
- Zhongping Wu
- Zhongkai University of Agriculture and Engineering, Guangzhou, PR China
- Ligang Ni
- School of Animal Science and Technology, Jiangsu Agri-animal Husbandry Vocational College, Taizhou, PR China
- Jiaxing Liu
- School of Animal Science and Technology, Jiangsu Agri-animal Husbandry Vocational College, Taizhou, PR China
- Ying Tang
- School of Animal Science and Technology, Jiangsu Agri-animal Husbandry Vocational College, Taizhou, PR China
- Tongshun Yu
- School of Animal Science and Technology, Jiangsu Agri-animal Husbandry Vocational College, Taizhou, PR China
- Jun Ren
- College of Animal Science, South China Agricultural University, Guangzhou, PR China
- Xuting Zhao
- School of Animal Science and Technology, Jiangsu Agri-animal Husbandry Vocational College, Taizhou, PR China
- Min Huang
- College of Animal Science, South China Agricultural University, Guangzhou, PR China
5
Genotyping, the Usefulness of Imputation to Increase SNP Density, and Imputation Methods and Tools. Methods Mol Biol 2022; 2467:113-138. [PMID: 35451774 DOI: 10.1007/978-1-0716-2205-6_4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/19/2022]
Abstract
Imputation has become a standard practice in modern genetic research to increase genome coverage and improve the accuracy of genomic selection and genome-wide association studies, because a large number of samples can be genotyped at lower density (and lower cost) and imputed up to denser marker panels or to sequence level using information from a limited reference population. Most genotype imputation algorithms use information from relatives and population linkage disequilibrium. A number of software packages for imputation were developed originally for human genetics and, more recently, for animal and plant genetics, considering pedigree information and very sparse SNP arrays or genotyping-by-sequencing data. In comparison to human populations, the population structures in farmed species and their limited effective sizes allow accurate imputation of high-density genotypes or sequences from very low-density SNP panels and a limited set of reference individuals. Whatever the imputation method, the imputation accuracy, measured by the correct imputation rate or the correlation between true and imputed genotypes, increases with the relatedness of the individual to be imputed to its more densely genotyped ancestors and with its own genotype density. Increasing the imputation accuracy increases the genomic selection accuracy whatever the genomic evaluation method. For given marker densities, the most important factors affecting imputation accuracy are clearly the size of the reference population and the relationship between individuals in the reference and target populations.
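The two accuracy measures named in this abstract, the correct imputation rate and the correlation between true and imputed genotypes, can be sketched as follows. This is an illustrative example only: it assumes genotypes are coded as allele counts 0, 1 or 2, and the function names and toy data are hypothetical.

```python
from math import sqrt


def correct_imputation_rate(true_genotypes, imputed_genotypes):
    """Fraction of genotype calls that are imputed exactly right."""
    matches = sum(t == i for t, i in zip(true_genotypes, imputed_genotypes))
    return matches / len(true_genotypes)


def pearson_correlation(x, y):
    """Pearson correlation between two equal-length numeric lists.

    Undefined (division by zero) if either list has zero variance,
    for example a monomorphic SNP.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(var_x * var_y)


# Toy genotypes at one SNP for eight animals, coded 0/1/2 (assumed coding).
true_g = [0, 1, 2, 1, 0, 2, 1, 0]
imputed_g = [0, 1, 2, 2, 0, 2, 1, 1]

rate = correct_imputation_rate(true_g, imputed_g)
r = pearson_correlation(true_g, imputed_g)
print(rate, round(r, 3))
```

The correlation is less dependent on allele frequency than the raw match rate, because a rare-allele SNP can reach a high match rate simply by imputing the common genotype everywhere.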
6
Yan X, Zhang T, Liu L, Yu Y, Yang G, Han Y, Gong G, Wang F, Zhang L, Liu H, Li W, Yan X, Mao H, Li Y, Du C, Li J, Zhang Y, Wang R, Lv Q, Wang Z, Zhang J, Liu Z, Wang Z, Su R. Accuracy of Genomic Selection for Important Economic Traits of Cashmere and Meat Goats Assessed by Simulation Study. Front Vet Sci 2022; 9:770539. [PMID: 35372544 PMCID: PMC8966406 DOI: 10.3389/fvets.2022.770539] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/04/2021] [Accepted: 01/24/2022] [Indexed: 11/13/2022] Open
Abstract
Genomic selection in plants and animals has become a standard tool for breeding because of the advantages of high accuracy and short generation intervals. Implementation of this technology is hindered by the high cost of genotyping and other factors. The aim of this study was to determine an optimal marker density panel and reference population size for genomic selection in goats, with speculation on the number of QTLs that affect the important economic traits of goats. In addition, the effect of buck population size in the reference population on the accuracy of genomic estimated breeding values (GEBVs) was discussed. Based on the previous genetic evaluation results of Inner Mongolia White Cashmere Goats, live body weight (LBW, h2 = 0.11) and fiber diameter (FD, h2 = 0.34) were chosen to perform genomic selection in this study. Reasonable genome parameters and generation transmission processes were set, and phenotypic and genotype data of the two traits were simulated. Then, different sizes of the reference population and validation population were selected from progeny. The GEBVs were obtained by six methods: GBLUP (Genomic Best Linear Unbiased Prediction), ssGBLUP (Single Step Genomic Best Linear Unbiased Prediction), BayesA, BayesB, Bayesian ridge regression, and Bayesian LASSO. The correlation coefficient between the predicted and realized phenotypes from simulation was calculated and used as a measure of the accuracy of GEBV for each trait. The results showed that the medium marker density panel (45 K) could be used for genomic selection in goats while ensuring the accuracy of the GEBV. A reference population size of 1,500 can achieve greater genetic progress in genomic selection for fiber diameter and live body weight in goats than population sizes below this level. The accuracy of the GEBV for live body weight and fiber diameter was better when the number of QTLs was 100 and 50, respectively. Additionally, the accuracy of GEBV was found to be good when the buck population size was up to 200. Meanwhile, the accuracy of the GEBV for the medium heritability trait (FD) was found to be higher than the accuracy of the GEBV for the low heritability trait (LBW). These findings will provide theoretical guidance for genomic selection in goats using real data.
Affiliation(s)
- Xiaochun Yan
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Tao Zhang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Inner Mongolia Bigvet Co., Ltd., Hohhot, China
- Lichun Liu
- College of Veterinary Medicine, Inner Mongolia Agricultural University, Hohhot, China
- Yongsheng Yu
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Guang Yang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Yaqian Han
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Gao Gong
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Fenghong Wang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Lei Zhang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Hongfu Liu
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Wenze Li
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Xiaomin Yan
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Haoyu Mao
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Yaming Li
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Chen Du
- Department of Obstetrics and Gynaecology, Inner Mongolia Medical University, Hohhot, China
- Jinquan Li
- Key Laboratory of Mutton Sheep Genetics and Breeding, Ministry of Agriculture, Hohhot, China
- Key Laboratory of Animal Genetics, Breeding and Reproduction in Inner Mongolia Autonomous Region, Hohhot, China
- Engineering Research Centre for Goat Genetics and Breeding, Inner Mongolia Autonomous Region, Hohhot, China
- Yanjun Zhang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Ruijun Wang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Qi Lv
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Zhixin Wang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Jiaxin Zhang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Zhihong Liu
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Zhiying Wang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- *Correspondence: Zhiying Wang
- Rui Su
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Rui Su
7
Lamb HJ, Hayes BJ, Nguyen LT, Ross EM. The Future of Livestock Management: A Review of Real-Time Portable Sequencing Applied to Livestock. Genes (Basel) 2020; 11:E1478. [PMID: 33317066 PMCID: PMC7763041 DOI: 10.3390/genes11121478] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 08/28/2020] [Revised: 11/10/2020] [Accepted: 12/01/2020] [Indexed: 12/12/2022] Open
Abstract
Oxford Nanopore Technologies' MinION has proven to be a valuable tool within human and microbial genetics. Its capacity to produce long reads in real time has opened up unique applications for portable sequencing. Examples include tracking the recent African swine fever outbreak in China and providing a diagnostic tool for disease in the cassava plant in Eastern Africa. Here we review the current applications of Oxford Nanopore sequencing in livestock, then focus on proposed applications in livestock agriculture for rapid diagnostics, base modification detection, reference genome assembly and genomic prediction. In particular, we propose a future application: 'crush-side genotyping' for real-time on-farm genotyping for extensive industries such as northern Australian beef production. An initial in silico experiment to assess the feasibility of crush-side genotyping demonstrated promising results. SNPs were called from simulated Nanopore data, that included the relatively high base call error rate that is characteristic of the data, and calling parameters were varied to understand the feasibility of SNP calling at low coverages in a heterozygous population. With optimised genotype calling parameters, over 85% of the 10,000 simulated SNPs were able to be correctly called with coverages as low as 6×. These results provide preliminary evidence that Oxford Nanopore sequencing has potential to be used for real-time SNP genotyping in extensive livestock operations.
Affiliation(s)
- Harrison J. Lamb
- Centre for Animal Science, Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia, QLD 4067, Australia; (B.J.H.); (L.T.N.); (E.M.R.)
8
Liu G, Dong L, Gu L, Han Z, Zhang W, Fang M, Wang Z. Evaluation of Genomic Selection for Seven Economic Traits in Yellow Drum (Nibea albiflora). Mar Biotechnol (NY) 2019; 21:806-812. [PMID: 31745748 PMCID: PMC6890617 DOI: 10.1007/s10126-019-09925-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 06/05/2019] [Accepted: 09/25/2019] [Indexed: 05/27/2023]
Abstract
Yellow drum (Nibea albiflora) is an important maricultural fish in China, and genetic improvement is necessary for this species. This research evaluated the application of genomic selection methods to predict the genetic values of seven economic traits for yellow drum. Using genome-wide single-nucleotide polymorphisms (SNPs), we estimated the genetic parameters for seven traits: body length (BL), swimming bladder index (SBI), swimming bladder weight (SBW), body thickness (BT), body height (BH), body length/body height ratio (LHR), and gonad weight index (GWI). The heritability estimates ranged from 0.309 to 0.843. We evaluated the prediction performance of various statistical methods, and no single method provided the highest predictive ability for all traits. We then compared the use of genome-wide association study (GWAS)-informative SNPs and random SNPs for prediction and found that GWAS-informative SNPs clearly increased predictive ability. Only 5 and 100 informative SNPs were needed for LHR and BT, respectively, to achieve almost the same predictive abilities as using genome-wide SNPs, and for BL, SBI, SBW, BH, and GWI, about 1000 to 3000 informative SNPs were needed to achieve whole-genome level predictive abilities. These results suggest that, for some traits, breeders can use fewer SNPs to reduce the cost of genomic selection.
Affiliation(s)
- Guijia Liu
- Key Laboratory of Healthy Mariculture for the East China Sea, Ministry of Agriculture and Rural Affairs, Jimei University, Xiamen, China
- Linsong Dong
- Key Laboratory of Healthy Mariculture for the East China Sea, Ministry of Agriculture and Rural Affairs, Jimei University, Xiamen, China
- Linlin Gu
- Key Laboratory of Healthy Mariculture for the East China Sea, Ministry of Agriculture and Rural Affairs, Jimei University, Xiamen, China
- Zhaofang Han
- Key Laboratory of Healthy Mariculture for the East China Sea, Ministry of Agriculture and Rural Affairs, Jimei University, Xiamen, China
- Wenjing Zhang
- Key Laboratory of Healthy Mariculture for the East China Sea, Ministry of Agriculture and Rural Affairs, Jimei University, Xiamen, China
- Ming Fang
- Key Laboratory of Healthy Mariculture for the East China Sea, Ministry of Agriculture and Rural Affairs, Jimei University, Xiamen, China
- Zhiyong Wang
- Key Laboratory of Healthy Mariculture for the East China Sea, Ministry of Agriculture and Rural Affairs, Jimei University, Xiamen, China
- Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China
9
Herry F, Hérault F, Picard Druet D, Varenne A, Burlot T, Le Roy P, Allais S. Design of low density SNP chips for genotype imputation in layer chicken. BMC Genet 2018; 19:108. [PMID: 30514201 PMCID: PMC6278067 DOI: 10.1186/s12863-018-0695-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 03/06/2018] [Accepted: 11/14/2018] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND The main goal of selection is to achieve genetic gain for a population by choosing the best breeders among a set of selection candidates. Since 2013, the use of a high density genotyping chip (600K Affymetrix® Axiom® HD genotyping array) for chicken has enabled the implementation of genomic selection in layer and broiler breeding, but genotyping costs remain high for routine use on a large number of selection candidates. A low density genotyping chip that lowers these costs is therefore of interest. To this end, various simulation studies were conducted to find the best way to select a set of SNPs for low density genotyping of two laying hen lines. RESULTS To design low density SNP chips, two methodologies, based on equidistance (EQ) or on linkage disequilibrium (LD), were compared. Imputation accuracy was assessed as the mean correlation between true and imputed genotypes. The correlations were more sensitive to false imputation of SNPs with low Minor Allele Frequency (MAF) when the EQ methodology was used. Imputation accuracy increased when SNP density was increased, either through an increase in the number of selected windows on a chromosome or through a higher LD threshold. Moreover, the results varied depending on the type of chromosome (macro- or micro-chromosome). The LD methodology made it possible to optimize the number of SNPs by reducing the SNP density on macro-chromosomes and increasing it on micro-chromosomes. Imputation accuracy also increased when the size of the reference population was increased. Conversely, imputation accuracy decreased when the degree of kinship between reference and candidate populations was reduced. Finally, adding the selection candidates' dams to the reference population, in addition to their sires, gave better imputation results.
CONCLUSIONS Whichever the SNP chip, the methodology, and the scenario studied, highly accurate imputations were obtained, with mean correlations higher than 0.83. The key point to achieve good imputation results is to take into account chicken lines' LD when designing a low density SNP chip, and to include the candidates' direct parents in the reference population.
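The equidistance (EQ) idea mentioned in this abstract can be illustrated with a short sketch that picks, from the SNPs available on one chromosome, those closest to evenly spaced target positions. This is not the authors' procedure, whose windowing details are not given here; the function name, the nearest-SNP rule and the toy positions are all assumptions made for illustration.

```python
def select_equidistant_snps(positions, n_select):
    """Pick up to n_select SNP positions nearest to evenly spaced targets.

    positions: base-pair positions of candidate SNPs on one chromosome.
    n_select:  desired number of SNPs (must be at least 2).
    Fewer than n_select SNPs are returned if two targets share a nearest SNP.
    """
    ordered = sorted(positions)
    start, end = ordered[0], ordered[-1]
    step = (end - start) / (n_select - 1)
    selected = []
    for k in range(n_select):
        target = start + k * step
        nearest = min(ordered, key=lambda p: abs(p - target))
        if nearest not in selected:
            selected.append(nearest)
    return selected


# Hypothetical SNP positions (bp) on a single chromosome.
hd_positions = [100, 150, 900, 2000, 2100, 3900, 5000, 5100]
low_density_panel = select_equidistant_snps(hd_positions, 3)
print(low_density_panel)
```

An LD-based design would instead use pairwise linkage disequilibrium between SNPs to decide which markers are redundant, which is why, as the abstract notes, it can place fewer SNPs on macro-chromosomes and more on micro-chromosomes.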
Affiliation(s)
- Florian Herry
- NOVOGEN, 5 rue des Compagnons, Secteur du Vau Ballier, 22960, Plédran, France
- PEGASE, INRA, Agrocampus Ouest, 16 Le Clos, 35590, Saint-Gilles, France
- Frédéric Hérault
- PEGASE, INRA, Agrocampus Ouest, 16 Le Clos, 35590, Saint-Gilles, France
- Amandine Varenne
- NOVOGEN, 5 rue des Compagnons, Secteur du Vau Ballier, 22960, Plédran, France
- Thierry Burlot
- NOVOGEN, 5 rue des Compagnons, Secteur du Vau Ballier, 22960, Plédran, France
- Pascale Le Roy
- PEGASE, INRA, Agrocampus Ouest, 16 Le Clos, 35590, Saint-Gilles, France
- Sophie Allais
- PEGASE, INRA, Agrocampus Ouest, 16 Le Clos, 35590, Saint-Gilles, France
10
de Oliveira AA, Pastina MM, de Souza VF, da Costa Parrella RA, Noda RW, Simeone MLF, Schaffert RE, de Magalhães JV, Damasceno CMB, Margarido GRA. Genomic prediction applied to high-biomass sorghum for bioenergy production. Mol Breed 2018; 38:49. [PMID: 29670457 PMCID: PMC5893689 DOI: 10.1007/s11032-018-0802-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 08/10/2017] [Accepted: 03/13/2018] [Indexed: 05/18/2023]
Abstract
The increasing cost of energy and finite oil and gas reserves have created a need to develop alternative fuels from renewable sources. Due to its abiotic stress tolerance and annual cultivation, high-biomass sorghum (Sorghum bicolor L. Moench) shows potential as a bioenergy crop. Genomic selection is a useful tool for accelerating genetic gains and could restructure plant breeding programs by enabling early selection and reducing breeding cycle duration. This work aimed at predicting breeding values via genomic selection models for 200 sorghum genotypes comprising landrace accessions and breeding lines from biomass and saccharine groups. These genotypes were divided into two sub-panels, according to breeding purpose. We evaluated the following phenotypic biomass traits: days to flowering, plant height, fresh and dry matter yield, and fiber, cellulose, hemicellulose, and lignin proportions. Genotyping by sequencing yielded more than 258,000 single-nucleotide polymorphism markers, which revealed population structure between subpanels. We then fitted and compared genomic selection models BayesA, BayesB, BayesCπ, BayesLasso, Bayes Ridge Regression and random regression best linear unbiased predictor. The resulting predictive abilities varied little between the different models, but substantially between traits. Different scenarios of prediction showed the potential of using genomic selection results between sub-panels and years, although the genotype by environment interaction negatively affected accuracies. Functional enrichment analyses performed with the marker-predicted effects suggested several interesting associations, with potential for revealing biological processes relevant to the studied quantitative traits. This work shows that genomic selection can be successfully applied in biomass sorghum breeding programs.
Affiliation(s)
- Amanda Avelar de Oliveira
- Department of Genetics, Luiz de Queiroz College of Agriculture, University of São Paulo, Piracicaba, SP 13418-900 Brazil
|
11
|
Song H, Li L, Ma P, Zhang S, Su G, Lund MS, Zhang Q, Ding X. Short communication: Improving the accuracy of genomic prediction of body conformation traits in Chinese Holsteins using markers derived from high-density marker panels. J Dairy Sci 2018; 101:5250-5254. [PMID: 29550139 DOI: 10.3168/jds.2017-13456] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 07/07/2017] [Accepted: 11/25/2017] [Indexed: 01/02/2023]
Abstract
This study investigated the efficiency of genomic prediction with adding the markers identified by genome-wide association study (GWAS) using a data set of imputed high-density (HD) markers from 54K markers in Chinese Holsteins. Among 3,056 Chinese Holsteins with imputed HD data, 2,401 individuals born before October 1, 2009, were used for GWAS and a reference population for genomic prediction, and the 220 younger cows were used as a validation population. In total, 1,403, 1,536, and 1,383 significant single nucleotide polymorphisms (SNP; false discovery rate at 0.05) associated with conformation final score, mammary system, and feet and legs were identified, respectively. About 2 to 3% genetic variance of 3 traits was explained by these significant SNP. Only a very small proportion of significant SNP identified by GWAS was included in the 54K marker panel. Three new marker sets (54K+) were herein produced by adding significant SNP obtained by linear mixed model for each trait into the 54K marker panel. Genomic breeding values were predicted using a Bayesian variable selection (BVS) model. The accuracies of genomic breeding value by BVS based on the 54K+ data were 2.0 to 5.2% higher than those based on the 54K data. The imputed HD markers yielded 1.4% higher accuracy on average (BVS) than the 54K data. Both the 54K+ and HD data generated lower bias of genomic prediction, and the 54K+ data yielded the lowest bias in all situations. Our results show that the imputed HD data were not very useful for improving the accuracy of genomic prediction and that adding the significant markers derived from the imputed HD marker panel could improve the accuracy of genomic prediction and decrease the bias of genomic prediction.
Affiliation(s)
- H Song
- Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture of China, National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing 100193, P.R. China
| | - L Li
- Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture of China, National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing 100193, P.R. China
| | - P Ma
- Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture of China, National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing 100193, P.R. China; Department of Molecular Biology and Genetics, Aarhus University, DK-8830 Tjele, Denmark; Department of Animal Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200240, P.R. China
| | - S Zhang
- Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture of China, National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing 100193, P.R. China
| | - G Su
- Department of Molecular Biology and Genetics, Aarhus University, DK-8830 Tjele, Denmark
| | - M S Lund
- Department of Molecular Biology and Genetics, Aarhus University, DK-8830 Tjele, Denmark
| | - Q Zhang
- Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture of China, National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing 100193, P.R. China
| | - X Ding
- Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture of China, National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing 100193, P.R. China.
|
12
|
Abstract
Accurate genomic analyses are predicated on access to a large quantity of accurately genotyped and phenotyped animals. Because the cost of genotyping is often less than the cost of phenotyping, interest is increasing in generating genotypes for phenotyped animals. In some instances this may imply the requirement to genotype older animals with greater phenotypic information content. Biological material for these older informative animals may, however, no longer exist. The objective of the present study was to quantify the ability to impute 11 129 single nucleotide polymorphism (SNP) genotypes of non-genotyped animals (in this instance sires) from the genotypes of their progeny with or without including the genotypes of the progenys' dams (i.e. mates of the sire to be imputed). The impact on the accuracy of genotype imputation by including more progeny (and their dams') genotypes in the imputation reference population was also quantified. When genotypes of the dams were not available, genotypes of 41 sires with at least 15 genotyped progeny were used for the imputation; when genotypes of the dams were available, genotypes of 21 sires with at least 10 genotyped progeny were used for the imputation. Imputation was undertaken exploiting family and population level information. The mean and variability in the proportion of genotypes per individual that could not be imputed reduced as the number of progeny genotypes used per individual increased. Little improvement in the proportion of genotypes that could not be imputed was achieved once genotypes of seven progeny and their dams were used or genotypes of 11 progeny without their respective dam's genotypes were used. Mean imputation accuracy per individual (depicted by both concordance rates and correlation between true and imputed) increased with increasing progeny group size. Moreover, the range in mean imputation accuracy per individual reduced as more progeny genotypes were used in the imputation. 
If the genotype of the mate of the sire was also used, high accuracy of imputation (mean genotype concordance rate per individual of 0.988), with little additional benefit thereafter, was achieved with seven genotyped progeny. In the absence of genotypes on the dam, similar imputation accuracy could not be achieved even using genotypes on up to 15 progeny. Results therefore suggest, at least for the SNP density used in the present study, that it is possible to accurately impute the genotypes of a non-genotyped parent from the genotypes of its progeny and there is a benefit of also including the genotype of the sire's mate (i.e. dam of the progeny).
|
13
|
Biffani S, Pausch H, Schwarzenbacher H, Biscarini F. The effect of mislabeled phenotypic status on the identification of mutation-carriers from SNP genotypes in dairy cattle. BMC Res Notes 2017. [PMID: 28651561 PMCID: PMC5485573 DOI: 10.1186/s13104-017-2540-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 01/24/2023] Open
Abstract
Background: Statistical and machine learning applications are increasingly popular in animal breeding and genetics, especially to compute genomic predictions for phenotypes of interest. Noise (errors) in the data may have a negative impact on the accuracy of predictions. The effects of noisy data have been investigated in genome-wide association studies for case–control experiments, and in genomic predictions for binary traits in plants. No studies have been published yet on the impact of noisy data in animal genomics. In this work, the susceptibility to noise of five classification models (Lasso-penalised logistic regression [Lasso], K-nearest neighbours [KNN], random forest [RF], and support vector machines with a linear [SVML] or radial [SVMR] kernel) was tested. As an illustration, the identification of carriers of a recessive mutation in cattle (Bos taurus) was used. A population of 3116 Fleckvieh animals with SNP genotypes on the same chromosome as the mutation locus (BTA 19) was available. The carrier status (0/1 phenotype) was randomly sampled to generate noise. Increasing proportions of noise (up to 20%) were introduced in the data. Results: SVMR and Lasso were relatively more robust to noise in the data, with total accuracy still above 0.975 and TPR (true positive rate; accuracy in the minority class) in the range 0.5–0.80 even with 17.5–20% mislabeled observations. The performance of SVML and RF decreased monotonically with increasing noise in the data, while KNN constantly failed to identify mutation carriers (observations in the minority class). The computation time increased with noise in the data, especially for the two support vector machine classifiers. Conclusions: This work was the first to assess the impact of phenotyping errors on the accuracy of genomic predictions in animal genetics. The choice of the classification method can influence results in terms of higher or lower susceptibility to noise. In the presented problem, SVM with radial kernel performed relatively well even when the proportion of errors in the data reached 12.5%. Lasso was the second best method, while SVML, RF and KNN were very sensitive to noise. Taking into account both accuracy and computation time, Lasso provided the best combination.
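The noise experiment described in this abstract can be pictured with a small Python sketch. It is only an illustration under stated assumptions: labels are corrupted by flipping a fixed proportion of 0/1 values chosen at random (the authors' exact resampling procedure may differ), and the function names `inject_label_noise` and `true_positive_rate` are hypothetical, not taken from the paper.

```python
import random

def inject_label_noise(labels, proportion, seed=0):
    """Return a copy of 0/1 labels with a given proportion flipped at random.

    Assumption: noise is modeled as label flipping; this is one simple
    choice and not necessarily the procedure used in the study.
    """
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must be in [0, 1]")
    rng = random.Random(seed)
    n_flip = int(round(proportion * len(labels)))
    noisy = list(labels)
    # Pick distinct positions and flip 0 <-> 1 at each of them.
    for i in rng.sample(range(len(labels)), n_flip):
        noisy[i] = 1 - noisy[i]
    return noisy

def true_positive_rate(y_true, y_pred):
    """TPR: share of true carriers (label 1) that were predicted as carriers."""
    predictions_for_carriers = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not predictions_for_carriers:
        raise ValueError("no positive observations in y_true")
    hits = sum(1 for p in predictions_for_carriers if p == 1)
    return hits / len(predictions_for_carriers)
```

A classifier would then be trained on the noisy labels and scored against the clean ones, so that total accuracy and TPR can be tracked as the noise proportion grows.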
Affiliation(s)
- Stefano Biffani
- IBBA-CNR, Via Einstein-Loc. Cascina Codazza, 26900, Lodi, Italy; AIA: Associazione Italiana Allevatori, Via Giuseppe Tomassetti 9, 00161, Rome, Italy
| | - Hubert Pausch
- Technische Universität München, Liesel-Beckmann Straße 1, 85354, Freising-Weihenstephan, Germany
| | | | - Filippo Biscarini
- IBBA-CNR, Via Einstein-Loc. Cascina Codazza, 26900, Lodi, Italy; Division of Infection & Immunity, School of Medicine, Cardiff University, Heath Park, CF14 4XN, Cardiff, UK
|
14
|
Status and future perspectives of single nucleotide polymorphisms (SNPs) markers in farmed fishes: Way ahead using next generation sequencing. Gene Reports 2017. [DOI: 10.1016/j.genrep.2016.12.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 12/23/2022]
|
15
|
Piccoli ML, Brito LF, Braccini J, Cardoso FF, Sargolzaei M, Schenkel FS. Genomic predictions for economically important traits in Brazilian Braford and Hereford beef cattle using true and imputed genotypes. BMC Genet 2017; 18:2. [PMID: 28100165 PMCID: PMC5241971 DOI: 10.1186/s12863-017-0475-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 09/10/2016] [Accepted: 01/13/2017] [Indexed: 12/30/2022] Open
Abstract
Background: Genomic selection (GS) has played an important role in cattle breeding programs. However, genotyping prices are still a challenge for implementation of GS in beef cattle and there is still a lack of information about the use of low-density Single Nucleotide Polymorphisms (SNP) chip panels for genomic predictions in breeds such as Brazilian Braford and Hereford. Therefore, this study investigated the effect of using imputed genotypes on the accuracy of genomic predictions for twenty economically important traits in Brazilian Braford and Hereford beef cattle. Various scenarios composed of different percentages of animals with imputed genotypes and different sizes of the training population were compared. De-regressed EBVs (estimated breeding values) were used as pseudo-phenotypes in a Genomic Best Linear Unbiased Prediction (GBLUP) model using two different mimicked panels derived from the 50 K (8 K and 15 K SNP panels), which were subsequently imputed to the 50 K panel. In addition, genomic prediction accuracies generated from a 777 K SNP panel (imputed from the 50 K SNP panel) were presented as an alternate scenario. Results: The accuracy of genomic breeding values averaged over the twenty traits ranged from 0.38 to 0.40 across the different scenarios. The average losses in expected genomic estimated breeding values (GEBV) accuracy (accuracy obtained from the inverse of the mixed model equations) relative to the true 50 K genotypes ranged from −0.0007 to −0.0012 and from −0.0002 to −0.0005 when using the 50 K imputed from the 8 K or 15 K, respectively. When using the imputed 777 K panel, the average loss in expected GEBV accuracy was −0.0021. The average gain in expected EBV accuracy by including genomic information when compared to simple BLUP was between 0.02 and 0.03 across scenarios and traits. Conclusions: The percentage of animals with imputed genotypes in the training population did not significantly influence the validation accuracy. However, the size of the training population played a major role in the accuracies of genomic predictions in this population. The losses in the expected accuracies of GEBV due to imputation of genotypes were lower when using the 50 K SNP chip panel imputed from the 15 K compared to the one imputed from the 8 K SNP chip panel.
Affiliation(s)
- Mario L Piccoli
- Departamento de Zootecnia, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil; GenSys Consultores Associados S/S, Porto Alegre, Brazil; Centre for Genetic Improvement of Livestock, University of Guelph, Guelph, Canada
| | - Luiz F Brito
- Centre for Genetic Improvement of Livestock, University of Guelph, Guelph, Canada
| | - José Braccini
- Departamento de Zootecnia, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil; Conselho Nacional de Desenvolvimento Científico e Tecnológico, Brasília, Brazil
| | - Fernando F Cardoso
- Embrapa Pecuária Sul, Bagé, Brazil; Conselho Nacional de Desenvolvimento Científico e Tecnológico, Brasília, Brazil
| | - Mehdi Sargolzaei
- Centre for Genetic Improvement of Livestock, University of Guelph, Guelph, Canada; The Semex Alliance, Guelph, Canada
| | - Flávio S Schenkel
- Centre for Genetic Improvement of Livestock, University of Guelph, Guelph, Canada
|
16
|
Abstract
Reproductive inefficiency compromises the profitability of dairy herds and the health and longevity of individual cows. In the average dairy herd, the combination of estrus detection and ovulation synchronization protocols yields the best economic return. Genomic selection of animals is particularly profitable in situations in which little is known about their genetic potential. Biosensor systems in milking parlors may allow for the design of reproductive strategies tailored for cows according to their physiologic needs while optimizing economic return.
|
17
|
Wu XL, Xu J, Feng G, Wiggans GR, Taylor JF, He J, Qian C, Qiu J, Simpson B, Walker J, Bauck S. Optimal Design of Low-Density SNP Arrays for Genomic Prediction: Algorithm and Applications. PLoS One 2016; 11:e0161719. [PMID: 27583971 PMCID: PMC5008792 DOI: 10.1371/journal.pone.0161719] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 04/11/2016] [Accepted: 08/10/2016] [Indexed: 11/19/2022] Open
Abstract
Low-density (LD) single nucleotide polymorphism (SNP) arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for the optimal design of LD SNP chips. A multiple-objective, local optimization (MOLO) algorithm was developed for design of optimal LD SNP chips that can be imputed accurately to medium-density (MD) or high-density (HD) SNP genotypes for genomic prediction. The objective function facilitates maximization of non-gap map length and system information for the SNP chip, and the latter is computed either as locus-averaged (LASE) or haplotype-averaged Shannon entropy (HASE) and adjusted for uniformity of the SNP distribution. HASE performed better than LASE with ≤1,000 SNPs, but required considerably more computing time. Nevertheless, the differences diminished when >5,000 SNPs were selected. Optimization was accomplished conditionally on the presence of SNPs that were obligated to each chromosome. The frame location of SNPs on a chip can be either uniform (evenly spaced) or non-uniform. For the latter design, a tunable empirical Beta distribution was used to guide location distribution of frame SNPs such that both ends of each chromosome were enriched with SNPs. The SNP distribution on each chromosome was finalized through the objective function that was locally and empirically maximized. This MOLO algorithm was capable of selecting a set of approximately evenly-spaced and highly-informative SNPs, which in turn led to increased imputation accuracy compared with selection solely of evenly-spaced SNPs. Imputation accuracy increased with LD chip size, and imputation error rate was extremely low for chips with ≥3,000 SNPs. Assuming that genotyping or imputation error occurs at random, imputation error rate can be viewed as the upper limit for genomic prediction error. Our results show that about 25% of imputation error rate was propagated to genomic prediction in an Angus population. 
The utility of this MOLO algorithm was also demonstrated in a real application, in which a 6K SNP panel was optimized conditional on 5,260 obligatory SNP selected based on SNP-trait association in U.S. Holstein animals. With this MOLO algorithm, both imputation error rate and genomic prediction error rate were minimal.
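To give a feel for the locus-averaged Shannon entropy (LASE) named in this abstract, here is a minimal Python sketch. It assumes biallelic SNPs summarized by a single allele frequency and entropy measured in bits; the function names are hypothetical, and the adjustment for uniformity of SNP spacing and the map-length term of the MOLO objective are not modeled.

```python
import math

def locus_entropy(p):
    """Shannon entropy (in bits) of a biallelic locus with allele frequency p."""
    if p <= 0.0 or p >= 1.0:
        # A monomorphic locus carries no information.
        return 0.0
    q = 1.0 - p
    return -(p * math.log2(p) + q * math.log2(q))

def locus_averaged_entropy(allele_freqs):
    """Mean per-locus entropy over a candidate set of SNPs (LASE-style score)."""
    if not allele_freqs:
        raise ValueError("need at least one SNP")
    return sum(locus_entropy(p) for p in allele_freqs) / len(allele_freqs)
```

Under these assumptions, a candidate panel whose SNPs have allele frequencies near 0.5 scores close to 1 bit per locus, which is why entropy-based selection favors highly informative markers.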
Affiliation(s)
- Xiao-Lin Wu
- Bioinformatics and Biostatistics, GeneSeek (a Neogen Company), Lincoln, Nebraska, United States of America
| | - Jiaqi Xu
- Bioinformatics and Biostatistics, GeneSeek (a Neogen Company), Lincoln, Nebraska, United States of America
- Department of Statistics, University of Nebraska, Lincoln, Nebraska, United States of America
| | - Guofei Feng
- Bioinformatics and Biostatistics, GeneSeek (a Neogen Company), Lincoln, Nebraska, United States of America
- Department of Statistics, University of Nebraska, Lincoln, Nebraska, United States of America
| | - George R. Wiggans
- Animal Genomics and Improvement Laboratory, Agricultural Research Service, United States Department of Agriculture, Beltsville, Maryland, United States of America
| | - Jeremy F. Taylor
- Division of Animal Sciences, University of Missouri, Columbia, Missouri, United States of America
| | - Jun He
- College of Animal Sciences and Technology, Hunan Agricultural University, Changsha, China
| | - Changsong Qian
- Marketing and Business Development, Neogen Bio-Scientific Technology (Shanghai) Company Ltd., Shanghai, China
| | - Jiansheng Qiu
- Bioinformatics and Biostatistics, GeneSeek (a Neogen Company), Lincoln, Nebraska, United States of America
| | - Barry Simpson
- Bioinformatics and Biostatistics, GeneSeek (a Neogen Company), Lincoln, Nebraska, United States of America
| | - Jeremy Walker
- Bioinformatics and Biostatistics, GeneSeek (a Neogen Company), Lincoln, Nebraska, United States of America
| | - Stewart Bauck
- Bioinformatics and Biostatistics, GeneSeek (a Neogen Company), Lincoln, Nebraska, United States of America
|
18
|
Samorè AB, Fontanesi L. Genomic selection in pigs: state of the art and perspectives. Italian Journal of Animal Science 2016. [DOI: 10.1080/1828051x.2016.1172034] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 10/21/2022]
|
19
|
Xavier A, Muir WM, Rainey KM. Impact of imputation methods on the amount of genetic variation captured by a single-nucleotide polymorphism panel in soybeans. BMC Bioinformatics 2016; 17:55. [PMID: 26830693 PMCID: PMC4736474 DOI: 10.1186/s12859-016-0899-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 10/08/2015] [Accepted: 01/19/2016] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND: Success in genome-wide association studies and marker-assisted selection depends on good phenotypic and genotypic data. The more complete these data are, the more powerful the results of analysis will be. Nevertheless, there are next-generation technologies that seek to provide genotypic information in spite of great proportions of missing data. The procedures these technologies use to impute genetic data, therefore, greatly affect downstream analyses. This study aims to (1) compare the genetic variance in a single-nucleotide polymorphism panel of soybean with missing data imputed using various methods, (2) evaluate the imputation accuracy and post-imputation quality associated with these methods, and (3) evaluate the impact of imputation method on heritability and the accuracy of genome-wide prediction of soybean traits. The imputation methods we evaluated were as follows: multivariate mixed model, hidden Markov model, logical algorithm, k-nearest neighbor, singular value decomposition, and random forest. We used raw genotypes from the SoyNAM project and the following phenotypes: plant height, days to maturity, grain yield, and seed protein composition. RESULTS: We propose an imputation method based on multivariate mixed models using pedigree information. Our comparison of methods indicates that heritability of traits can be affected by the imputation method. Genotypes with missing values imputed with methods that make use of genealogic information can favor genetic analysis of highly polygenic traits, but not genome-wide prediction accuracy. The genotypic matrix captured the highest amount of genetic variance when missing loci were imputed by the method proposed in this paper. CONCLUSIONS: We concluded that hidden Markov models and random forest imputation are more suitable for studies that aim to analyze highly heritable traits, while pedigree-based methods can be used to best analyze traits with low heritability. Despite the notable contribution to heritability, advantages in genomic prediction were not observed by changing the imputation method. We identified significant differences across imputation methods in a dataset missing 20% of the genotypic values. This means that genotypic data from genotyping technologies that provide a high proportion of missing values, such as GBS, should be handled carefully because the imputation method will impact downstream analysis.
Affiliation(s)
- A Xavier
- Department of Agronomy, Purdue University, Lilly Hall of Life Sciences, 915 W. State St., West Lafayette, Indiana, 47907, USA.
| | - William M Muir
- Department of Animal Science, Purdue University, Lilly Hall of Life Sciences, 915 W. State St., West Lafayette, Indiana, 47907, USA.
| | - Katy M Rainey
- Department of Agronomy, Purdue University, Lilly Hall of Life Sciences, 915 W. State St., West Lafayette, Indiana, 47907, USA.
|
20
|
Economic evaluation of genomic selection in small ruminants: a sheep meat breeding program. Animal 2016; 10:1033-41. [DOI: 10.1017/s1751731115002049] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Indexed: 11/07/2022] Open
|
21
|
Ogawa S, Matsuda H, Taniguchi Y, Watanabe T, Sugimoto Y, Iwaisaki H. Estimation of variance and genomic prediction using genotypes imputed from low-density marker subsets for carcass traits in Japanese black cattle. Anim Sci J 2015; 87:1106-13. [DOI: 10.1111/asj.12570] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 05/29/2015] [Revised: 09/15/2015] [Accepted: 10/07/2015] [Indexed: 12/31/2022]
Affiliation(s)
| | | | - Yukio Taniguchi
- Graduate School of Agriculture; Kyoto University; Kyoto Japan
| | | | - Yoshikazu Sugimoto
- Shirakawa Institute of Animal Genetics; Japan Livestock Technology Association; Nishigo Fukushima Japan
|
22
|
Gao H, Madsen P, Nielsen US, Aamand GP, Su G, Byskov K, Jensen J. Including different groups of genotyped females for genomic prediction in a Nordic Jersey population. J Dairy Sci 2015; 98:9051-9. [PMID: 26433419 DOI: 10.3168/jds.2015-9947] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 06/11/2015] [Accepted: 08/17/2015] [Indexed: 12/24/2022]
Abstract
Including genotyped females in a reference population (RP) is an obvious way to increase the RP in genomic selection, especially for dairy breeds of limited population size. However, the incorporation of these females must be conducted cautiously because of the potential preferential treatment of the genotyped cows and lower reliabilities of phenotypes compared with the proven pseudo-phenotypes of bulls. Breeding organizations in Denmark, Finland, and Sweden have implemented a female-genotyping project with the possibility of genotyping entire herds using the low-density (LD) chip. In the present study, 5 scenarios for building an RP were investigated in the Nordic Jersey population: (1) bulls only, (2) bulls with females from the LD project, (3) bulls with females from the LD project plus non-LD project females genotyped before their first calving, (4) bulls with females from the LD project plus non-LD project females genotyped after their first calving, and (5) bulls with all genotyped females. The genomically enhanced breeding value (GEBV) was predicted for 8 traits in the Nordic total merit index through a genomic BLUP model using deregressed proof (DRP) as the response variable in all scenarios. In addition, (daughter) yield deviation and raw phenotypic data were studied as response variables for comparison with the DRP, using stature as a model trait. The validation population was formed using a cut-off birth year of 2005 based on the genotyped Nordic Jersey bulls with DRP. The average increment in reliability of the GEBV across the 8 traits investigated was 1.9 to 4.5 percentage points compared with using only bulls in the RP (scenario 1). The addition of all the genotyped females to the RP resulted in the highest gain in reliability (scenario 5), followed by scenario 3, scenario 2, and scenario 4. All scenarios led to inflated GEBV because the regression coefficients are less than 1. 
However, scenario 2 and scenario 3 led to less bias of genomic predictions than scenario 5, with regression coefficients showing less deviation from scenario 1. For the study on stature, the daughter yield deviation/daughter yield deviation performed slightly better than the DRP as the response variable in the genomic BLUP (GBLUP) model. Therefore, adding unselected females in the RP could significantly improve the reliabilities and tended to reduce the prediction bias compared with adding selectively genotyped females. Although the DRP has performed robustly so far, the use of raw data is recommended with a single-step model as an optimal solution for future genomic evaluations.
Affiliation(s)
- H Gao
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, DK-8830 Tjele, Denmark.
| | - P Madsen
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, DK-8830 Tjele, Denmark
| | | | - G P Aamand
- Nordic Cattle Genetic Evaluation, DK-8200 Aarhus N, Denmark
| | - G Su
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, DK-8830 Tjele, Denmark
| | - K Byskov
- Seges, DK-8200 Aarhus N, Denmark
| | - J Jensen
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, DK-8830 Tjele, Denmark
|
23
|
van Marle-Kőster E, Visser C, Makgahlela M, Cloete SW. Genomic technologies for food security: A review of challenges and opportunities in Southern Africa. Food Res Int 2015. [DOI: 10.1016/j.foodres.2015.05.057] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 02/09/2023]
|
24
|
Chud TCS, Ventura RV, Schenkel FS, Carvalheiro R, Buzanskas ME, Rosa JO, Mudadu MDA, da Silva MVGB, Mokry FB, Marcondes CR, Regitano LCA, Munari DP. Strategies for genotype imputation in composite beef cattle. BMC Genet 2015; 16:99. [PMID: 26250698 PMCID: PMC4527250 DOI: 10.1186/s12863-015-0251-7] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 03/09/2015] [Accepted: 07/09/2015] [Indexed: 11/23/2022] Open
Abstract
Background: Genotype imputation has been used to increase genomic information, allow more animals in genome-wide analyses, and reduce genotyping costs. In Brazilian beef cattle production, many animals result from crossbreeding, which may alter linkage disequilibrium patterns. Thus, the challenge is to obtain accurately imputed genotypes in crossbred animals. The objective of this study was to evaluate the best fitting and most accurate imputation strategy on the MA genetic group (the progeny of a Charolais sire mated with crossbred Canchim X Zebu cows) and Canchim cattle. The data set contained 400 animals (born between 1999 and 2005) genotyped with the Illumina BovineHD panel. Imputation accuracy of genotypes from the Illumina-Bovine3K (3K), Illumina-BovineLD (6K), GeneSeek-Genomic-Profiler (GGP) BeefLD (GGP9K), GGP-IndicusLD (GGP20Ki), Illumina-BovineSNP50 (50K), GGP-IndicusHD (GGP75Ki), and GGP-BeefHD (GGP80K) to Illumina-BovineHD (HD) SNP panels was investigated. Seven scenarios for reference and target populations were tested; the animals were grouped according to birth year (S1), genetic groups (S2 and S3), genetic groups and birth year (S4 and S5), gender (S6), and gender and birth year (S7). Analyses were performed using FImpute and BEAGLE software and computation run-time was recorded. Genotype imputation accuracy was measured by concordance rate (CR) and allelic R-squared (R²). Results: The highest imputation accuracy scenario consisted of a reference population with males and females and a target population with young females. Among the SNP panels in the tested scenarios, the 50K, GGP75Ki, and GGP80K were the most adequate to impute to HD in Canchim cattle. FImpute reduced the computation run-time to impute genotypes by 20 to 100 times compared with BEAGLE. Conclusion: The genotyping panels possessing at least 50 thousand markers are suitable for genotype imputation to HD with acceptable accuracy. The FImpute algorithm demonstrated a higher efficiency of imputed markers, especially in lower density panels. These considerations may help to increase genotypic information, reduce genotyping costs, and aid in genomic selection evaluations in crossbred animals.
Affiliation(s)
- Tatiane C S Chud
- Departamento de Ciências Exatas, UNESP - Univ Estadual Paulista "Júlio de Mesquita Filho", Jaboticabal, SP, Brazil.
- Ricardo V Ventura
- Beef Improvement Opportunities, Guelph, ON, Canada; University of Guelph, Guelph, ON, Canada.
- Roberto Carvalheiro
- Departamento de Zootecnia, UNESP - Univ Estadual Paulista "Júlio de Mesquita Filho", Jaboticabal, SP, Brazil.
- Marcos E Buzanskas
- Departamento de Ciências Exatas, UNESP - Univ Estadual Paulista "Júlio de Mesquita Filho", Jaboticabal, SP, Brazil.
- Jaqueline O Rosa
- Departamento de Ciências Exatas, UNESP - Univ Estadual Paulista "Júlio de Mesquita Filho", Jaboticabal, SP, Brazil.
- Fabiana B Mokry
- Department of Genetics and Evolution, Federal University of São Carlos, São Carlos, SP, Brazil.
- Cintia R Marcondes
- Embrapa Southeast Livestock - Brazilian Corporation of Agricultural Research, São Carlos, SP, Brazil.
- Luciana C A Regitano
- Embrapa Southeast Livestock - Brazilian Corporation of Agricultural Research, São Carlos, SP, Brazil.
- Danísio P Munari
- Departamento de Ciências Exatas, UNESP - Univ Estadual Paulista "Júlio de Mesquita Filho", Jaboticabal, SP, Brazil.
|
25
|
Boison S, Santos D, Utsunomiya A, Carvalheiro R, Neves H, O’Brien A, Garcia J, Sölkner J, da Silva M. Strategies for single nucleotide polymorphism (SNP) genotyping to enhance genotype imputation in Gyr (Bos indicus) dairy cattle: Comparison of commercially available SNP chips. J Dairy Sci 2015; 98:4969-89. [DOI: 10.3168/jds.2014-9213] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 12/09/2014] [Accepted: 03/22/2015] [Indexed: 01/15/2023]
|
26
|
Ogawa S, Matsuda H, Taniguchi Y, Watanabe T, Takasuga A, Sugimoto Y, Iwaisaki H. Accuracy of imputation of single nucleotide polymorphism marker genotypes from low-density panels in Japanese Black cattle. Anim Sci J 2015; 87:3-12. [PMID: 26032028 DOI: 10.1111/asj.12393] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 11/17/2014] [Accepted: 11/18/2014] [Indexed: 12/25/2022]
Abstract
Using target and reference fattened steer populations, the performance of genotype imputation using lower-density marker panels in Japanese Black cattle was evaluated. Population imputation was performed using BEAGLE software. Genotype information for approximately 40,000 single nucleotide polymorphism (SNP) markers by Illumina BovineSNP50 BeadChip was available, and imputation accuracy was assessed based on the average concordance rates of the genotypes, varying equally spaced SNP densities, and the number of individuals in the reference population. Two additional statistics were also calculated as indicators of imputation performance. The concordance rates tended to be lower for SNPs with greater minor allele frequencies, or those located near the ends of the chromosomes. Longer autosomes yielded greater imputation accuracies than shorter ones. When SNPs were selected based on linkage disequilibrium information, relative imputation accuracy was slightly improved. When 3000 and 10,000 equally spaced SNPs were used, the imputation accuracies were greater than 90% and approximately 97%, respectively. These results indicate that combining genotyping using a lower-density SNP chip with genotype imputation based on a population of individuals genotyped using a higher-density SNP chip is a cost-effective and valid approach for genomic prediction.
Affiliation(s)
- Yukio Taniguchi
- Graduate School of Agriculture, Kyoto University, Kyoto, Japan
|
27
|
Brøndum RF, Su G, Janss L, Sahana G, Guldbrandtsen B, Boichard D, Lund MS. Quantitative trait loci markers derived from whole genome sequence data increases the reliability of genomic prediction. J Dairy Sci 2015; 98:4107-16. [PMID: 25892697 DOI: 10.3168/jds.2014-9005] [Citation(s) in RCA: 114] [Impact Index Per Article: 12.7] [Received: 10/22/2014] [Accepted: 03/12/2015] [Indexed: 12/30/2022]
Abstract
This study investigated the effect on the reliability of genomic prediction when a small number of significant variants from single marker analysis based on whole genome sequence data were added to the regular 54k single nucleotide polymorphism (SNP) array data. The extra markers were selected with the aim of augmenting the custom low-density Illumina BovineLD SNP chip (San Diego, CA) used in the Nordic countries. The single-marker analysis was done breed-wise on all 16 index traits included in the breeding goals for Nordic Holstein, Danish Jersey, and Nordic Red cattle plus the total merit index itself. Depending on the trait's economic weight, 15, 10, or 5 quantitative trait loci (QTL) were selected per trait per breed and 3 to 5 markers were selected to tag each QTL. After removing duplicate markers (same marker selected for more than one trait or breed) and filtering for high pairwise linkage disequilibrium and assaying performance on the array, a total of 1,623 QTL markers were selected for inclusion on the custom chip. Genomic prediction analyses were performed for Nordic and French Holstein and Nordic Red animals using either a genomic BLUP or a Bayesian variable selection model. When using the genomic BLUP model including the QTL markers in the analysis, reliability was increased by up to 4 percentage points for production traits in Nordic Holstein animals, up to 3 percentage points for Nordic Reds, and up to 5 percentage points for French Holstein. Smaller gains of up to 1 percentage point were observed for mastitis, but only a 0.5 percentage point increase was seen for fertility. When using a Bayesian model, accuracies were generally higher with only 54k data compared with the genomic BLUP approach, but increases in reliability were relatively smaller when QTL markers were included. Results from this study indicate that the reliability of genomic prediction can be increased by including markers significant in genome-wide association studies on whole genome sequence data alongside the 54k SNP set.
Affiliation(s)
- R F Brøndum
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Blichers Allé 20, Aarhus University, DK-8830 Tjele, Denmark.
- G Su
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Blichers Allé 20, Aarhus University, DK-8830 Tjele, Denmark
- L Janss
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Blichers Allé 20, Aarhus University, DK-8830 Tjele, Denmark
- G Sahana
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Blichers Allé 20, Aarhus University, DK-8830 Tjele, Denmark
- B Guldbrandtsen
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Blichers Allé 20, Aarhus University, DK-8830 Tjele, Denmark
- D Boichard
- Institut National de la Recherche Agronomique (INRA), UMR 1313 Génétique Animale et Biologie Intégrative, 78350 Jouy-en-Josas, France
- M S Lund
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Blichers Allé 20, Aarhus University, DK-8830 Tjele, Denmark
|
28
|
Pimentel ECG, Edel C, Emmerling R, Götz KU. How imputation errors bias genomic predictions. J Dairy Sci 2015; 98:4131-8. [PMID: 25841966 DOI: 10.3168/jds.2014-9170] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 11/28/2014] [Accepted: 02/20/2015] [Indexed: 12/19/2022]
Abstract
The objective of this study was to investigate in detail the biasing effects of imputation errors on genomic predictions. Direct genomic values (DGV) of 3,494 Brown Swiss selection candidates for 37 production and conformation traits were predicted using either their observed 50K genotypes or their 50K genotypes imputed from a mimicked 6K chip. Changes in DGV caused by imputation errors were shown to be systematic. The DGV of top animals were, on average, underestimated and those of bottom animals were, on average, overestimated when imputed genotypes were used instead of observed genotypes. This pattern might be explained by the fact that imputation algorithms will usually suggest the most frequent haplotype from the sample whenever a haplotype cannot be determined unambiguously. That was empirically shown to cause an advantage for the bottom animals and a disadvantage for the top animals.
Affiliation(s)
- E C G Pimentel
- Institute of Animal Breeding, Bavarian State Research Center for Agriculture, Grub 85586, Germany.
- C Edel
- Institute of Animal Breeding, Bavarian State Research Center for Agriculture, Grub 85586, Germany
- R Emmerling
- Institute of Animal Breeding, Bavarian State Research Center for Agriculture, Grub 85586, Germany
- K-U Götz
- Institute of Animal Breeding, Bavarian State Research Center for Agriculture, Grub 85586, Germany
|
29
|
Bouquet A, Sørensen A, Juga J. Genomic selection strategies to optimize the use of multiple ovulation and embryo transfer schemes in dairy cattle breeding programs. Livest Sci 2015. [DOI: 10.1016/j.livsci.2015.01.014] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 12/18/2022]
|
30
|
Piccoli ML, Braccini J, Cardoso FF, Sargolzaei M, Larmer SG, Schenkel FS. Accuracy of genome-wide imputation in Braford and Hereford beef cattle. BMC Genet 2014; 15:157. [PMID: 25543517 PMCID: PMC4300607 DOI: 10.1186/s12863-014-0157-9] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 07/06/2014] [Accepted: 12/18/2014] [Indexed: 12/31/2022] Open
Abstract
Background: Strategies for imputing genotypes from the Illumina-Bovine3K, Illumina-BovineLD (6K), BeefLD-GGP (8K), a non-commercial-15K and IndicusLD-GGP (20K) to either the Illumina-BovineSNP50 (50K) or the Illumina-BovineHD (777K) SNP panel, as well as for imputing from 50K, GGP-IndicusHD (90iK) and GGP-BeefHD (90tK) to 777K, were investigated. Imputation of low density (<50K) genotypes to 777K was carried out in either one or two steps. Imputation of ungenotyped parents (n = 37 sires) with four or more offspring to the 50K panel was also assessed. There were 2,946 Braford, 664 Hereford and 88 Nellore animals, of which 71, 59 and 88 were genotyped with the 777K panel, while all others had 50K genotypes. The reference population comprised 2,735 animals and 175 bulls for 50K and 777K, respectively. The low density panels were simulated by masking genotypes in the 50K or 777K panel for animals born in 2011. Analyses were performed using both Beagle and FImpute software. Genotype imputation accuracy was measured by concordance rate and allelic R2 between true and imputed genotypes. Results: The average concordance rate using FImpute, averaged across all simulated low density panels, was 0.943 and 0.921 for imputation to 50K or to 777K, respectively, compared with 0.927 and 0.895 using Beagle. The allelic R2 was 0.912 and 0.866 for imputation to 50K or to 777K using FImpute, respectively, and 0.890 and 0.826 using Beagle. One-step and two-step imputation to 777K produced average concordance rates of 0.806 and 0.892 and allelic R2 of 0.674 and 0.819, respectively. Imputation of low density panels to 50K, with the exception of 3K, had overall concordance rates greater than 0.940 and allelic R2 greater than 0.919. Ungenotyped animals were imputed to the 50K panel with an average concordance rate of 0.950 by FImpute. Conclusion: FImpute outperformed Beagle in accuracy for imputation to both 50K and 777K. Two-step imputation outperformed one-step imputation for imputing to 777K. Ungenotyped animals that have four or more offspring can have their 50K genotypes accurately inferred using FImpute. All low density panels, except the 3K, can be used to impute to the 50K using FImpute or Beagle with high concordance rate and allelic R2.
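Several abstracts in this list, including the one above, report imputation accuracy as a concordance rate and an allelic R2. As a rough illustration only, the sketch below computes both for genotypes coded as 0, 1 or 2 copies of an allele. The function names and the toy data are hypothetical, and the definitions assumed here (share of identical genotype calls; squared Pearson correlation between true and imputed allele counts) are one common convention that may differ in detail from what each study or imputation program actually computes.

```python
def concordance_rate(true_genotypes, imputed_genotypes):
    """Share of genotype calls (coded 0/1/2) that match exactly."""
    matches = sum(t == i for t, i in zip(true_genotypes, imputed_genotypes))
    return matches / len(true_genotypes)


def allelic_r2(true_genotypes, imputed_genotypes):
    """Squared Pearson correlation between true and imputed allele counts.

    Assumes both inputs vary (non-zero variance); otherwise the ratio is undefined.
    """
    n = len(true_genotypes)
    mean_t = sum(true_genotypes) / n
    mean_i = sum(imputed_genotypes) / n
    cov = sum((t - mean_t) * (i - mean_i)
              for t, i in zip(true_genotypes, imputed_genotypes))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_i = sum((i - mean_i) ** 2 for i in imputed_genotypes)
    return cov * cov / (var_t * var_i)


# Toy data (hypothetical): eight genotypes at one SNP, two of them imputed wrongly.
true_g = [0, 1, 2, 1, 0, 2, 1, 1]
imputed_g = [0, 1, 2, 1, 0, 1, 1, 2]

print(concordance_rate(true_g, imputed_g))  # 0.75
print(allelic_r2(true_g, imputed_g))        # 0.5625
```

In this toy case the two measures diverge (0.75 versus 0.5625), which is one reason studies often report both: the concordance rate counts exact matches, whereas the correlation-based measure also reflects how much of the genotype variation is recovered.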
Affiliation(s)
- Mario L Piccoli
- Departamento de Zootecnia, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil; GenSys Consultores Associados S/S, Porto Alegre, Brazil; Centre for Genetic Improvement of Livestock, University of Guelph, Guelph, ON, Canada.
- José Braccini
- Departamento de Zootecnia, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil; National Council for Scientific and Technological Development, Brasília, Brazil.
- Fernando F Cardoso
- Embrapa Southern Region Animal Husbandry, Bagé, Brazil; National Council for Scientific and Technological Development, Brasília, Brazil.
- Medhi Sargolzaei
- Centre for Genetic Improvement of Livestock, University of Guelph, Guelph, ON, Canada; The Semex Alliance, Guelph, ON, Canada.
- Steven G Larmer
- Centre for Genetic Improvement of Livestock, University of Guelph, Guelph, ON, Canada.
- Flávio S Schenkel
- Centre for Genetic Improvement of Livestock, University of Guelph, Guelph, ON, Canada.
|
31
|
Felipe VPS, Okut H, Gianola D, Silva MA, Rosa GJM. Effect of genotype imputation on genome-enabled prediction of complex traits: an empirical study with mice data. BMC Genet 2014; 15:149. [PMID: 25544265 PMCID: PMC4333171 DOI: 10.1186/s12863-014-0149-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 09/04/2014] [Accepted: 12/10/2014] [Indexed: 02/01/2023] Open
Abstract
Background: Genotype imputation is an important tool for whole-genome prediction as it allows cost reduction of individual genotyping. However, benefits of genotype imputation have been evaluated mostly for linear additive genetic models. In this study we investigated the impact of employing imputed genotypes when using more elaborate models of phenotype prediction. Our hypothesis was that such models would be able to track genetic signals using the observed genotypes only, with no additional information to be gained from imputed genotypes. Results: For the present study, an outbred mouse population containing 1,904 individuals and genotypes for 1,809 pre-selected markers was used. The effect of imputation was evaluated for a linear model (the Bayesian LASSO, BL) and for semi- and non-parametric models (Reproducing Kernel Hilbert Spaces regression, RKHS, and Bayesian Regularized Artificial Neural Networks, BRANN, respectively). The RKHS method had the best predictive accuracy. Genotype imputation had a similar impact on the effectiveness of BL and RKHS. BRANN predictions were, apparently, more sensitive to imputation errors. In scenarios where the masking rates were 75% and 50%, genotype imputation was not beneficial. However, genotype imputation incorporated information about important markers and improved predictive ability, especially for body mass index (BMI), when genotype information was sparse (90% masking), and for body weight (BW) when the reference sample for imputation was weakly related to the target population. Conclusions: Genotype imputation is not always helpful for phenotype prediction, and so it should be considered on a case-by-case basis. Factors that can affect the usefulness of genotype imputation for prediction of yet-to-be observed traits are: the imputation accuracy itself, the structure of the population, the genetic architecture of the target trait, and the model used for phenotype prediction.
Affiliation(s)
- Vivian P S Felipe
- Department of Animal Sciences, University of Wisconsin, Madison, 53706, USA.
- Hayrettin Okut
- Department of Animal Sciences, Biometry and Genetics Branch, University of Yuzuncu Yil, Van, 65080, Turkey.
- Daniel Gianola
- Department of Animal Sciences, University of Wisconsin, Madison, 53706, USA.
- Martinho A Silva
- Department of Animal Sciences, Federal University of Jequitinhonha and Mucuri Valleys, Minas Gerais, Brazil.
- Guilherme J M Rosa
- Department of Animal Sciences, University of Wisconsin, Madison, 53706, USA.
|
32
|
Bouwman AC, Veerkamp RF. Consequences of splitting whole-genome sequencing effort over multiple breeds on imputation accuracy. BMC Genet 2014; 15:105. [PMID: 25277486 PMCID: PMC4189672 DOI: 10.1186/s12863-014-0105-8] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 05/13/2014] [Accepted: 09/24/2014] [Indexed: 11/18/2022] Open
Abstract
Background: The aim of this study was to determine the consequences of splitting sequencing effort over multiple breeds for imputation accuracy from a high-density SNP chip towards whole-genome sequence. Such information would assist, for instance, breeders of numerically smaller cattle breeds, but also pig and chicken breeders, who have to choose wisely how to spend their sequencing efforts over all the breeds or lines they evaluate. Sequence data from cattle breeds was used, because relatively many individuals from several breeds have currently been sequenced within the 1,000 Bull Genomes project. The advantage of whole-genome sequence data is that it carries the causal mutations, but the question is whether it is possible to impute the causal variants accurately. This study therefore focussed on imputation accuracy of variants with low minor allele frequency and breed specific variants. Results: Imputation accuracy was assessed for chromosomes 1 and 29 as the correlation between observed and imputed genotypes. For chromosome 1, the average imputation accuracy was 0.70 with a reference population of 20 Holstein, and increased to 0.83 when the reference population was increased by including 3 other dairy breeds with 20 animals each. When the same number of animals from the Holstein breed was added the accuracy improved to 0.88, while adding the 3 other breeds to the reference population of 80 Holstein improved the average imputation accuracy marginally to 0.89. For chromosome 29, the average imputation accuracy was lower. Some variants benefitted from the inclusion of other breeds in the reference population, initially determined by the MAF of the variant in each breed, but even Holstein specific variants did gain imputation accuracy from the multi-breed reference population. Conclusions: This study shows that splitting sequencing effort over multiple breeds and combining the reference populations is a good strategy for imputation from high-density SNP panels towards whole-genome sequence when reference populations are small and sequencing effort is limiting. When sequencing effort is limiting and interest lies in multiple breeds or lines, this strategy enables imputation for each breed.
Affiliation(s)
- Aniek C Bouwman
- Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, P.O. Box 338, 6700, AH, Wageningen, Netherlands.
- Roel F Veerkamp
- Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, P.O. Box 338, 6700, AH, Wageningen, Netherlands.
|
33
|
Strategies for imputation to whole genome sequence using a single or multi-breed reference population in cattle. BMC Genomics 2014; 15:728. [PMID: 25164068 PMCID: PMC4152568 DOI: 10.1186/1471-2164-15-728] [Citation(s) in RCA: 84] [Impact Index Per Article: 8.4] [Received: 03/03/2014] [Accepted: 06/18/2014] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND: The advent of low cost next generation sequencing has made it possible to sequence a large number of dairy and beef bulls which can be used as a reference for imputation of whole genome sequence data. The aim of this study was to investigate the accuracy and speed of imputation from a high density SNP marker panel to whole genome sequence level. Data contained 132 Holstein, 42 Jersey, 52 Nordic Red and 16 Brown Swiss bulls with whole genome sequence data; 16 Holstein, 27 Jersey and 29 Nordic Reds had previously been typed with the bovine high density SNP panel and were used for validation. We investigated the effect of enlarging the reference population by combining data across breeds on the accuracy of imputation, and the accuracy and speed of both IMPUTE2 and BEAGLE using either genotype probability reference data or pre-phased reference data. All analyses were done on Bovine autosome 29 using 387,436 bi-allelic variants and 13,612 SNP markers from the bovine HD panel. RESULTS: A combined breed reference population led to higher imputation accuracies than did a single breed reference. The highest accuracy of imputation for all three test breeds was achieved when using BEAGLE with un-phased reference data (mean genotype correlations of 0.90, 0.89 and 0.87 for Holstein, Jersey and Nordic Red respectively) but IMPUTE2 with un-phased reference data gave similar accuracies for Holsteins and Nordic Red. Pre-phasing the reference data led to only a minor decrease in the imputation accuracy, but gave a large improvement in computation time. Pre-phasing with BEAGLE was substantially faster than pre-phasing with SHAPEIT2 (2.5 hours vs. 52 hours for 242 individuals), and imputation with pre-phased data was faster in IMPUTE2 than in BEAGLE (5 minutes vs. 50 minutes per individual). CONCLUSION: Combining reference populations across breeds is a good option to increase the size of the reference data and in turn the accuracy of imputation when only a few animals are available. Pre-phasing the reference data only slightly decreases the accuracy but gives substantial improvements in speed. Using BEAGLE for pre-phasing and IMPUTE2 for imputation is a fast and accurate strategy.
|
34
|
Ma P, Lund MS, Ding X, Zhang Q, Su G. Increasing imputation and prediction accuracy for Chinese Holsteins using joint Chinese-Nordic reference population. J Anim Breed Genet 2014; 131:462-72. [PMID: 25099946 DOI: 10.1111/jbg.12111] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 12/06/2013] [Accepted: 06/30/2014] [Indexed: 11/29/2022]
Abstract
This study investigated the effect of including Nordic Holsteins in the reference population on the imputation accuracy and prediction accuracy for Chinese Holsteins. The data used in this study include 85 Chinese Holstein bulls genotyped with both the 54K chip and the 777K (HD) chip, 2862 Chinese cows genotyped with the 54K chip, 510 Nordic Holstein bulls genotyped with the HD chip, and 4398 Nordic Holstein bulls genotyped with the 54K chip and with deregressed proofs for five milk production traits. Based on these data, the accuracy of imputation from 54K to HD marker data and the accuracy of genomic predictions in Chinese Holstein were assessed. The allele correct rate increased by around 2.7% and 1.7% in imputation from the 54K to the HD marker data for Chinese Holstein bulls and cows, respectively, when the Nordic HD-genotyped bulls were included in the reference data for imputation. However, the prediction accuracy improved only slightly when using the marker data imputed based on the combined HD reference data, compared with using the marker data imputed based on the Chinese HD reference data only. On the other hand, when using the combined reference population including 4398 Nordic Holstein bulls, the accuracy of genomic predictions increased by 6.5 percentage points, together with a reduction of prediction bias. The HD markers did not outperform the 54K markers in genomic prediction based on the present data. The results indicate that for Chinese Holsteins, it is necessary to genotype more individuals with the 54K chip to enlarge the reference population rather than to increase marker density.
Affiliation(s)
- P Ma
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, Tjele, Denmark; Key Laboratory of Animal Genetics and Breeding of the Ministry of Agriculture, Department of Animal Genetics and Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China
|
35
|
Silva MV, dos Santos DJ, Boison SA, Utsunomiya AT, Carmo AS, Sonstegard TS, Cole JB, Van Tassell CP. The development of genomics applied to dairy breeding. Livest Sci 2014. [DOI: 10.1016/j.livsci.2014.05.017] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 01/08/2023]
|
36
|
Su G, Guldbrandtsen B, Aamand GP, Strandén I, Lund MS. Genomic relationships based on X chromosome markers and accuracy of genomic predictions with and without X chromosome markers. Genet Sel Evol 2014; 46:47. [PMID: 25080199 PMCID: PMC4137273 DOI: 10.1186/1297-9686-46-47] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 10/27/2013] [Accepted: 06/18/2014] [Indexed: 01/02/2023] Open
Abstract
BACKGROUND: Although the X chromosome is the second largest bovine chromosome, markers on the X chromosome are not used for genomic prediction in some countries and populations. In this study, we presented a method for computing genomic relationships using X chromosome markers, investigated the accuracy of imputation from a low density (7K) to the 54K SNP (single nucleotide polymorphism) panel, and compared the accuracy of genomic prediction with and without using X chromosome markers. METHODS: The impact of considering X chromosome markers on prediction accuracy was assessed using data from Nordic Holstein bulls and different sets of SNPs: (a) the 54K SNPs for reference and test animals, (b) SNPs imputed from the 7K to the 54K SNP panel for test animals, (c) SNPs imputed from the 7K to the 54K panel for half of the reference animals, and (d) the 7K SNP panel for all animals. Beagle and Findhap were used for imputation. GBLUP (genomic best linear unbiased prediction) models with or without X chromosome markers and with or without a residual polygenic effect were used to predict genomic breeding values for 15 traits. RESULTS: Averaged over the two imputation datasets, correlation coefficients between imputed and true genotypes for autosomal markers, pseudo-autosomal markers, and X-specific markers were 0.971, 0.831 and 0.935 when using Findhap, and 0.983, 0.856 and 0.937 when using Beagle. Estimated reliabilities of genomic predictions based on the imputed datasets using Findhap or Beagle were very close to those using the real 54K data. Genomic prediction using all markers gave slightly higher reliabilities than predictions without X chromosome markers. Based on our data which included only bulls, using a G matrix that accounted for sex-linked relationships did not improve prediction, compared with a G matrix that did not account for sex-linked relationships. A model that included a polygenic effect did not recover the loss of prediction accuracy from exclusion of X chromosome markers. CONCLUSIONS: The results from this study suggest that markers on the X chromosome contribute to accuracy of genomic predictions and should be used for routine genomic evaluation.
Affiliation(s)
- Guosheng Su
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, Tjele DK-8830, Denmark.
|
37
|
Accuracy of estimation of genomic breeding values in pigs using low-density genotypes and imputation. G3-GENES GENOMES GENETICS 2014; 4:623-31. [PMID: 24531728 PMCID: PMC4059235 DOI: 10.1534/g3.114.010504] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Indexed: 11/22/2022]
Abstract
Genomic selection has the potential to increase genetic progress. Genotype imputation of high-density single-nucleotide polymorphism (SNP) genotypes can improve the cost efficiency of genomic breeding value (GEBV) prediction for pig breeding. Consequently, the objectives of this work were to: (1) estimate accuracy of genomic evaluation and GEBV for three traits in a Yorkshire population and (2) quantify the loss of accuracy of genomic evaluation and GEBV when genotypes were imputed under two scenarios: a high-cost, high-accuracy scenario in which only selection candidates were imputed from a low-density platform and a low-cost, low-accuracy scenario in which all animals were imputed using a small reference panel of haplotypes. Phenotypes and genotypes obtained with the PorcineSNP60 BeadChip were available for 983 Yorkshire boars. Genotypes of selection candidates were masked and imputed using tagSNP in the GeneSeek Genomic Profiler (10K). Imputation was performed with BEAGLE using 128 or 1800 haplotypes as reference panels. GEBV were obtained through an animal-centric ridge regression model using de-regressed breeding values as response variables. Accuracy of genomic evaluation was estimated as the correlation between estimated breeding values and GEBV in a 10-fold cross validation design. Accuracy of genomic evaluation using observed genotypes was high for all traits (0.65−0.68). Using genotypes imputed from a large reference panel (accuracy: R2 = 0.95) for genomic evaluation did not significantly decrease accuracy, whereas a scenario with genotypes imputed from a small reference panel (R2 = 0.88) did show a significant decrease in accuracy. Genomic evaluation based on imputed genotypes in selection candidates can be implemented at a fraction of the cost of a genomic evaluation using observed genotypes and still yield virtually the same accuracy. 
On the other hand, using a very small reference panel of haplotypes to impute training animals and candidates for selection results in lower accuracy of genomic evaluation.
|
38
|
Hozé C, Fritz S, Phocas F, Boichard D, Ducrocq V, Croiseau P. Efficiency of multi-breed genomic selection for dairy cattle breeds with different sizes of reference population. J Dairy Sci 2014; 97:3918-29. [PMID: 24704232 DOI: 10.3168/jds.2013-7761] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Received: 11/25/2013] [Accepted: 02/25/2014] [Indexed: 01/13/2023]
Abstract
Single-breed genomic selection (GS) based on medium single nucleotide polymorphism (SNP) density (~50,000; 50K) is now routinely implemented in several large cattle breeds. However, building large enough reference populations remains a challenge for many medium or small breeds. The high-density BovineHD BeadChip (HD chip; Illumina Inc., San Diego, CA) containing 777,609 SNP developed in 2010 is characterized by short-distance linkage disequilibrium expected to be maintained across breeds. Therefore, combining reference populations can be envisioned. A population of 1,869 influential ancestors from 3 dairy breeds (Holstein, Montbéliarde, and Normande) was genotyped with the HD chip. Using this sample, 50K genotypes were imputed within breed to high-density genotypes, leading to a large HD reference population. This population was used to develop a multi-breed genomic evaluation. The goal of this paper was to investigate the gain of multi-breed genomic evaluation for a small breed. The advantage of using a large breed (Normande in the present study) to mimic a small breed is the large potential validation population to compare alternative genomic selection approaches more reliably. In the Normande breed, 3 training sets were defined with 1,597, 404, and 198 bulls, and a unique validation set included the 394 youngest bulls. For each training set, estimated breeding values (EBV) were computed using pedigree-based BLUP, single-breed BayesC, or multi-breed BayesC for which the reference population was formed by any of the Normande training data sets and 4,989 Holstein and 1,788 Montbéliarde bulls. Phenotypes were standardized by within-breed genetic standard deviation, the proportion of polygenic variance was set to 30%, and the estimated number of SNP with a nonzero effect was about 7,000. The 2 genomic selection (GS) approaches were performed using either the 50K or HD genotypes. 
The correlations between EBV and observed daughter yield deviations (DYD) were computed for 6 traits using the different prediction approaches. Compared with pedigree-based BLUP, the average gain in accuracy with GS in small populations was 0.057 for the single-breed approach and 0.086 for the multi-breed approach. This gain was up to 0.193 and 0.209, respectively, with the large reference population. Improvement of EBV prediction due to the multi-breed evaluation was higher for animals not closely related to the reference population. In the case of a breed with a small reference population size, the increase in correlation due to multi-breed GS was 0.141 for bulls without their sire in the reference population compared with 0.016 for bulls with their sire in the reference population. These results demonstrate that multi-breed GS can contribute to increasing genomic evaluation accuracy in small breeds.
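The validation metric in this abstract, the correlation between EBV and DYD and its gain over pedigree-based BLUP, can be sketched in a few lines of Python. This is only an illustration: the function names and the five-bull data set are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def accuracy_gain(dyd, ebv_baseline, ebv_genomic):
    """Difference in validation correlation between a genomic method and a baseline."""
    return pearson(ebv_genomic, dyd) - pearson(ebv_baseline, dyd)


# Made-up values for five validation bulls (illustration only).
dyd = [1.2, -0.4, 0.3, 2.1, -1.0]
ebv_blup = [0.5, 0.1, -0.2, 0.9, 0.0]          # hypothetical pedigree-based BLUP EBV
ebv_multibreed = [0.9, -0.3, 0.1, 1.5, -0.6]   # hypothetical multi-breed genomic EBV

gain = accuracy_gain(dyd, ebv_blup, ebv_multibreed)
print(f"gain in correlation over BLUP: {gain:.3f}")
```

In practice the same calculation would be repeated per trait and per training-set size, and the gains averaged, as the abstract reports.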
Affiliation(s)
- C Hozé
  - Institut National de la Recherche Agronomique (INRA), UMR 1313, Génétique Animale et Biologie Intégrative (GABI), 78350 Jouy-en-Josas, France; Union Nationales des Coopératives d'Élevages et d'Insémination Animales (UNCEIA), 149 rue de Bercy, 75012 Paris, France
- S Fritz
  - Union Nationales des Coopératives d'Élevages et d'Insémination Animales (UNCEIA), 149 rue de Bercy, 75012 Paris, France
- F Phocas
  - Institut National de la Recherche Agronomique (INRA), UMR 1313, Génétique Animale et Biologie Intégrative (GABI), 78350 Jouy-en-Josas, France
- D Boichard
  - Institut National de la Recherche Agronomique (INRA), UMR 1313, Génétique Animale et Biologie Intégrative (GABI), 78350 Jouy-en-Josas, France
- V Ducrocq
  - Institut National de la Recherche Agronomique (INRA), UMR 1313, Génétique Animale et Biologie Intégrative (GABI), 78350 Jouy-en-Josas, France
- P Croiseau
  - Institut National de la Recherche Agronomique (INRA), UMR 1313, Génétique Animale et Biologie Intégrative (GABI), 78350 Jouy-en-Josas, France
39
Schrooten C, Dassonneville R, Ducrocq V, Brøndum RF, Lund MS, Chen J, Liu Z, González-Recio O, Pena J, Druet T. Error rate for imputation from the Illumina BovineSNP50 chip to the Illumina BovineHD chip. Genet Sel Evol 2014; 46:10. [PMID: 24495554 PMCID: PMC3929158 DOI: 10.1186/1297-9686-46-10] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 12/10/2012] [Accepted: 12/02/2013] [Indexed: 11/20/2022] Open
Abstract
Background: Imputation of genotypes from low-density to higher density chips is a cost-effective method to obtain high-density genotypes for many animals, based on genotypes of only a relatively small subset of animals (reference population) on the high-density chip. Several factors influence the accuracy of imputation and our objective was to investigate the effects of the size of the reference population used for imputation and of the imputation method used and its parameters. Imputation of genotypes was carried out from 50 000 (moderate-density) to 777 000 (high-density) SNPs (single nucleotide polymorphisms).
Methods: The effect of reference population size was studied in two datasets: one with 548 and one with 1289 Holstein animals, genotyped with the Illumina BovineHD chip (777 k SNPs). A third dataset included the 548 animals genotyped with the 777 k SNP chip and 2200 animals genotyped with the Illumina BovineSNP50 chip. In each dataset, 60 animals were chosen as validation animals, for which all high-density genotypes were masked, except for the Illumina BovineSNP50 markers. Imputation was studied in a subset of six chromosomes, using the imputation software programs Beagle and DAGPHASE.
Results: Imputation with DAGPHASE and Beagle resulted in 1.91% and 0.87% allelic imputation error rates in the dataset with 548 high-density genotypes, when scale and shift parameters were 2.0 and 0.1, and 1.0 and 0.0, respectively. When Beagle was used alone, the imputation error rate was 0.67%. If the information obtained by Beagle was subsequently used in DAGPHASE, imputation error rates were slightly higher (0.71%). When 2200 moderate-density genotypes were added and Beagle was used alone, imputation error rates were slightly lower (0.64%). The fewest imputation errors were obtained with Beagle in the reference set with 1289 high-density genotypes (0.41%).
Conclusions: For imputation of genotypes from the 50 k to the 777 k SNP chip, Beagle gave the lowest allelic imputation error rates. Imputation error rates decreased with increasing size of the reference population. For applications for which computing time is limiting, DAGPHASE using information from Beagle can be considered as an alternative, since it reduces computation time and increases imputation error rates only slightly.
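The masking-and-comparison validation described in this abstract can be illustrated with a short Python sketch. It assumes genotypes are coded as allele counts (0, 1 or 2) and that the allelic error rate is the number of wrongly imputed alleles divided by the total number of alleles at the masked markers; the abstract does not spell out the formula, so this definition, the function name and the toy data are all assumptions.

```python
def allelic_error_rate(true_genotypes, imputed_genotypes, masked_markers):
    """Share of wrongly imputed alleles at the masked markers.

    Both genotype arguments map an animal ID to a list of allele counts
    (0, 1 or 2), one per marker. Only markers listed in `masked_markers`
    are scored, since the remaining markers were observed, not imputed.
    """
    wrong = 0
    total = 0
    for animal, truth in true_genotypes.items():
        imputed = imputed_genotypes[animal]
        for m in masked_markers:
            wrong += abs(truth[m] - imputed[m])  # 0, 1 or 2 wrong alleles
            total += 2                           # two alleles per genotype
    return wrong / total


# Toy data: two validation animals, five markers, markers 1, 3 and 4 masked.
true_genotypes = {"A": [0, 1, 2, 1, 0], "B": [2, 2, 1, 0, 1]}
imputed_genotypes = {"A": [0, 1, 2, 2, 0], "B": [2, 2, 1, 0, 1]}
masked = [1, 3, 4]

rate = allelic_error_rate(true_genotypes, imputed_genotypes, masked)
print(f"allelic imputation error rate: {rate:.2%}")
```

Scoring only the masked markers matters: including the unmasked 50K markers would dilute the error rate with genotypes that were never imputed.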
40
Van Eenennaam AL, Weigel KA, Young AE, Cleveland MA, Dekkers JCM. Applied animal genomics: results from the field. Annu Rev Anim Biosci 2013; 2:105-39. [PMID: 25384137 DOI: 10.1146/annurev-animal-022513-114119] [Citation(s) in RCA: 89] [Impact Index Per Article: 8.1] [Indexed: 11/09/2022]
Abstract
Genomic selection (GS) is the use of statistical methods to estimate the genetic merit of a genotyped animal based on prediction equations derived from large ancestral populations with both phenotypes and genotypes. It has revolutionized the dairy cattle breeding industry and has been implemented with varying degrees of success in other animal breeding programs, including swine, poultry, and beef cattle. The findings of empirical field studies applying GS to the breeding sectors of these main animal protein industries are reviewed. Several translational considerations must be addressed before implementing GS in genetic improvement programs. These include determining and obtaining economically relevant phenotypes and determining the optimal size of the training population, cost-effective genotyping strategies, the practicality of field implementation, and the relative costs versus the benefits of the realized rate of genetic gain. GS may additionally change the optimal breeding scheme design, and studies that address this consideration are also reviewed briefly.
41
Ertl J, Edel C, Emmerling R, Pausch H, Fries R, Götz KU. On the limited increase in validation reliability using high-density genotypes in genomic best linear unbiased prediction: observations from Fleckvieh cattle. J Dairy Sci 2013; 97:487-96. [PMID: 24210491 DOI: 10.3168/jds.2013-6855] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 03/26/2013] [Accepted: 09/16/2013] [Indexed: 01/11/2023]
Abstract
This study investigated reliability of genomic predictions using medium-density (40,089; 50K) or high-density (HD; 388,951) marker sets. We developed an approximate method to test differences in validation reliability for significance. Model-based reliability and the effect of HD genotypes on inflation of predictions were analyzed additionally. Genomic breeding values were predicted for at least 1,321 validation bulls based on phenotypes and genotypes of at least 5,324 calibration bulls by means of a linear model in milk, fat, and protein yield; somatic cell score; milkability; muscling; udder, feet, and legs score as well as stature. In total, 1,485 bulls were actually HD genotyped and HD genotypes of the other animals were imputed from 50K genotypes using FImpute software. Validation reliability was measured as the coefficient of determination of the weighted regression of daughter yield deviations on predicted breeding values divided by the reliability of daughter yield deviations and inflation was evaluated by the slope of this regression. Model-based reliability was calculated from the model. Distributions for validation reliability of 50K markers were derived by repeated sampling of 50,000-marker samples from HD to test differences in validation reliability statistically. Additionally, the benefit of HD genotypes in validation reliability was tested by repeated sampling of validation groups and calculation of the difference in validation reliability between HD and 50K genotypes for the sampled groups of bulls. The mean benefit in validation reliability of HD genotypes was 0.015 compared with real 50K genotypes and 0.028 compared with 50K samples from HD affected by imputation error and was significant for all traits. The model-based reliability was, on average, 0.036 lower and the regression coefficient was 0.036 closer to the expected value with HD genotypes. 
The observed gain in validation reliability with HD genotypes was similar to expectations based on the number of markers and the effective number of segregating chromosome segments. Sampling error in the marker-based relationship coefficients causing overestimation of the model-based reliability was smaller with HD genotypes. Inflation of the genomic predictions was reduced with HD genotypes, accordingly. Similar effects on model-based reliability and inflation, but not on the validation reliability, were obtained by shrinkage estimation of the realized relationship matrix from 50K genotypes.
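The two validation statistics defined in this abstract, validation reliability and inflation, can be sketched in plain Python as follows. The sketch assumes that "the reliability of daughter yield deviations" is summarized by its mean over validation bulls and that the regression weights are supplied by the user; both choices, the function name and the data are assumptions for illustration, not the authors' implementation.

```python
def weighted_validation(dyd, gebv, weights, dyd_reliability):
    """Return (validation reliability, regression slope).

    Validation reliability: R^2 of the weighted regression of DYD on
    predicted breeding values, divided by the mean reliability of DYD.
    Slope: coefficient of that regression, used to judge inflation
    (its expected value is 1).
    """
    sw = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, gebv)) / sw
    mean_y = sum(w * y for w, y in zip(weights, dyd)) / sw
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, gebv))
    syy = sum(w * (y - mean_y) ** 2 for w, y in zip(weights, dyd))
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, gebv, dyd))
    slope = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)
    mean_reliability = sum(dyd_reliability) / len(dyd_reliability)
    return r_squared / mean_reliability, slope


# Made-up values for six validation bulls (illustration only).
gebv = [0.2, -0.5, 1.1, 0.7, -1.3, 0.0]
dyd = [0.4, -0.9, 0.8, 1.0, -1.0, 0.3]
weights = [40, 25, 60, 35, 50, 30]               # hypothetical weights
dyd_reliability = [0.85, 0.78, 0.90, 0.82, 0.88, 0.80]

reliability, slope = weighted_validation(dyd, gebv, weights, dyd_reliability)
print(f"validation reliability: {reliability:.3f}, slope: {slope:.3f}")
```

A slope below 1 indicates inflated predictions, which is the quantity the abstract reports as moving closer to its expected value with HD genotypes.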
Affiliation(s)
- J Ertl
  - Institute of Animal Breeding, Bavarian State Research Centre for Agriculture, 85586 Poing, Germany
- C Edel
  - Institute of Animal Breeding, Bavarian State Research Centre for Agriculture, 85586 Poing, Germany
- R Emmerling
  - Institute of Animal Breeding, Bavarian State Research Centre for Agriculture, 85586 Poing, Germany
- H Pausch
  - Chair of Animal Breeding, Technische Universität München, 85354 Freising, Germany
- R Fries
  - Chair of Animal Breeding, Technische Universität München, 85354 Freising, Germany
- K-U Götz
  - Institute of Animal Breeding, Bavarian State Research Centre for Agriculture, 85586 Poing, Germany
42
Smaragdov MG. Genomic selection of milk cattle. The practical application over five years. RUSS J GENET+ 2013. [DOI: 10.1134/s1022795413100104] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/22/2022]
43
Tribout T, Larzul C, Phocas F. Economic aspects of implementing genomic evaluations in a pig sire line breeding scheme. Genet Sel Evol 2013; 45:40. [PMID: 24127883 PMCID: PMC3840607 DOI: 10.1186/1297-9686-45-40] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 03/07/2013] [Accepted: 09/05/2013] [Indexed: 01/31/2023] Open
Abstract
Background: Replacing pedigree-based BLUP evaluations by genomic evaluations in pig breeding schemes can result in greater selection accuracy and genetic gains, especially for traits with limited phenotypes. However, this methodological change would generate additional costs. The objective of this study was to determine whether additional expenditures would be more profitably devoted to implementing genomic evaluations or to increasing phenotyping capacity while retaining traditional evaluations.
Methods: Stochastic simulation was used to simulate a population with 1050 breeding females and 50 boars that was selected for 10 years for a breeding goal with two uncorrelated traits with heritabilities of 0.4. The reference breeding scheme was based on phenotyping 13 770 candidates per year for trait 1 and 270 sibs of candidates per year for trait 2, with selection based on pedigree-based BLUP estimated breeding values. Increased expenditures were allocated to either increasing the phenotyping capacity for trait 2 while maintaining traditional evaluations, or to implementing genomic selection. The genomic scheme was based on two training populations: one for trait 2, consisting of phenotyped sibs of the candidates whose number increased from 1000 to 3430 over time, and one for trait 1, consisting of the selection candidates. Several genomic scenarios were tested, where the size of the training population for trait 1, and the number of genotyped candidates pre-selected based on their parental estimated breeding value, varied.
Results: Both approaches resulted in higher genetic trends for the population breeding goal and lower rates of inbreeding compared to the reference scheme. However, even a very marked increase in phenotyping capacity for trait 2 could not match improvements achieved with genomic selection when the number of genotyped candidates was large. Genotyping just a limited number of pre-selected candidates significantly reduced the extra costs, while preserving most of the benefits in terms of genetic trends and inbreeding. Implementing genomic evaluations was the most efficient approach when major expenditure was possible, whereas increasing phenotypes was preferable when limited resources were available.
Conclusions: Economic decisions on implementing genomic evaluations in a pig nucleus population must take account of population characteristics, phenotyping and genotyping costs, and available funds.
Affiliation(s)
- Thierry Tribout
  - INRA, UMR1313 Génétique Animale et Biologie Intégrative, F-78350, Jouy-en-Josas, France
44
Hozé C, Fouilloux MN, Venot E, Guillaume F, Dassonneville R, Fritz S, Ducrocq V, Phocas F, Boichard D, Croiseau P. High-density marker imputation accuracy in sixteen French cattle breeds. Genet Sel Evol 2013; 45:33. [PMID: 24004563 PMCID: PMC3846489 DOI: 10.1186/1297-9686-45-33] [Citation(s) in RCA: 86] [Impact Index Per Article: 7.8] [Received: 02/08/2013] [Accepted: 07/19/2013] [Indexed: 12/20/2022] Open
Abstract
Background: Genotyping with the medium-density Bovine SNP50 BeadChip® (50K) is now standard in cattle. The high-density BovineHD BeadChip®, which contains 777 609 single nucleotide polymorphisms (SNPs), was developed in 2010. Increasing marker density increases the level of linkage disequilibrium between quantitative trait loci (QTL) and SNPs and the accuracy of QTL localization and genomic selection. However, re-genotyping all animals with the high-density chip is not economically feasible. An alternative strategy is to genotype part of the animals with the high-density chip and to impute high-density genotypes for animals already genotyped with the 50K chip. Thus, it is necessary to investigate the error rate when imputing from the 50K to the high-density chip.
Methods: Five thousand one hundred and fifty three animals from 16 breeds (89 to 788 per breed) were genotyped with the high-density chip. Imputation error rates from the 50K to the high-density chip were computed for each breed with a validation set that included the 20% youngest animals. Marker genotypes were masked for animals in the validation population in order to mimic 50K genotypes. Imputation was carried out using the Beagle 3.3.0 software.
Results: Mean allele imputation error rates ranged from 0.31% to 2.41% depending on the breed. In total, 1980 SNPs had high imputation error rates in several breeds, which is probably due to genome assembly errors, and we recommend discarding these in future studies. Differences in imputation accuracy between breeds were related to the high-density-genotyped sample size and to the genetic relationship between reference and validation populations, whereas differences in effective population size and level of linkage disequilibrium showed limited effects. Accordingly, imputation accuracy was higher in breeds with large populations and in dairy breeds than in beef breeds. More than 99% of the alleles were correctly imputed if more than 300 animals were genotyped at high-density. No improvement was observed when multi-breed imputation was performed.
Conclusion: In all breeds, imputation accuracy was higher than 97%, which indicates that imputation to the high-density chip was accurate. Imputation accuracy depends mainly on the size of the reference population and the relationship between reference and target populations.
Affiliation(s)
- Chris Hozé
  - INRA, UMR 1313 Génétique Animale et Biologie Intégrative, 78350 Jouy-en-Josas, France
45
Sitzenstock F, Ytournel F, Sharifi AR, Cavero D, Täubert H, Preisinger R, Simianer H. Efficiency of genomic selection in an established commercial layer breeding program. Genet Sel Evol 2013; 45:29. [PMID: 23902427 PMCID: PMC3750290 DOI: 10.1186/1297-9686-45-29] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 10/08/2012] [Accepted: 07/03/2013] [Indexed: 12/20/2022] Open
Abstract
Background: In breeding programs for layers, selection of hens and cocks is based on recording phenotypic data from hens in different housing systems. Genomic information can provide additional information for selection and/or allow for a strong reduction in the generation interval. In this study, a typical conventional layer breeding program using a four-line cross was modeled and the expected genetic progress was derived deterministically with the software ZPLAN+. This non-genomic reference scenario was compared to two genomic breeding programs to determine the best strategy for implementing genomic information in layer breeding programs.
Results: In scenario I, genomic information was used in addition to all other information available in the conventional breeding program, so the generation interval was the same as in the reference scenario, i.e. 14.5 months. Here, we assumed that either only young cocks or young cocks and hens were genotyped as selection candidates. In scenario II, we assumed that breeders of both sexes were used at the biologically earliest possible age, so that at the time of selection only performance data of the parent generation and genomic information of the selection candidates were available. In this case, the generation interval was reduced to eight months. In both scenarios, the number of genotyped male selection candidates was varied between 800 and 4800 males and two sizes of the calibration set (500 or 2000 animals) were considered. All genomic scenarios increased the expected genetic gain and the economic profit of the breeding program. In scenario II, the increase was much more pronounced and even in the most conservative implementation led to a 60% improvement in genetic gain and economic profit. This increase was in all cases associated with higher breeding costs.
Conclusions: While genomic selection is shown to have the potential to improve genetic gain in layer breeding programs, its implementation remains a business decision of the breeding company; the possible extra profit for the breeding company depends on whether the customers of breeding stock are willing to pay more for improved genetic quality.
Affiliation(s)
- Florian Sitzenstock
  - Department of Animal Sciences, University of Göttingen, 37075 Göttingen, Germany
46
Use of partial least squares regression to impute SNP genotypes in Italian cattle breeds. Genet Sel Evol 2013; 45:15. [PMID: 23738947 PMCID: PMC3716726 DOI: 10.1186/1297-9686-45-15] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 09/26/2012] [Accepted: 05/31/2013] [Indexed: 01/28/2023] Open
Abstract
Background: The objective of the present study was to test the ability of the partial least squares regression technique to impute genotypes from low-density single nucleotide polymorphism (SNP) panels, i.e. 3K or 7K, to a high-density panel with 50K SNP. No pedigree information was used.
Methods: Data consisted of 2093 Holstein, 749 Brown Swiss and 479 Simmental bulls genotyped with the Illumina 50K Beadchip. First, a single-breed approach was applied by using only data from Holstein animals. Then, to enlarge the training population, data from the three breeds were combined and a multi-breed analysis was performed. Accuracies of genotypes imputed using the partial least squares regression method were compared with those obtained by using the Beagle software. The impact of genotype imputation on breeding value prediction was evaluated for milk yield, fat content and protein content.
Results: In the single-breed approach, the accuracy of imputation using partial least squares regression was around 90% and 94% for the 3K and 7K platforms, respectively; corresponding accuracies obtained with Beagle were around 85% and 90%. Moreover, computing time required by the partial least squares regression method was on average around 10 times lower than computing time required by Beagle. Using the partial least squares regression method in the multi-breed analysis resulted in lower imputation accuracies than using single-breed data. The impact of the SNP-genotype imputation on the accuracy of direct genomic breeding values was small. The correlation between estimates of genetic merit obtained by using imputed versus actual genotypes was around 0.96 for the 7K chip.
Conclusions: Results of the present work suggested that the partial least squares regression imputation method could be useful to impute SNP genotypes when pedigree information is not available.
47
Ma P, Brøndum RF, Zhang Q, Lund MS, Su G. Comparison of different methods for imputing genome-wide marker genotypes in Swedish and Finnish Red Cattle. J Dairy Sci 2013; 96:4666-77. [PMID: 23684022 DOI: 10.3168/jds.2012-6316] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.5] [Received: 10/28/2012] [Accepted: 03/28/2013] [Indexed: 12/25/2022]
Abstract
This study investigated the imputation accuracy of different methods, considering both the minor allele frequency and relatedness between individuals in the reference and test data sets. Two data sets from the combined population of Swedish and Finnish Red Cattle were used to test the influence of these factors on the accuracy of imputation. Data set 1 consisted of 2,931 reference bulls and 971 test bulls, and was used for validation of imputation from 3,000 markers (3K) to 54,000 markers (54K). Data set 2 contained 341 bulls in the reference set and 117 in the test set, and was used for validation of imputation from 54K to high density [777,000 markers (777K)]. Both test sets were divided into 4 groups according to their relationship to the reference population. Five imputation methods (Beagle, IMPUTE2, findhap, AlphaImpute, and FImpute) were used in this study. Imputation accuracy was measured as the allele correct rate and correlation between imputed and true genotypes. Results demonstrated that the accuracy was lower when imputing from 3K to 54K than from 54K to 777K. Using various imputation methods, the allele correct rates varied from 93.5 to 97.1% when imputing from 3K to 54K, and from 97.1 to 99.3% when imputing from 54K to 777K; IMPUTE2 and Beagle resulted in higher accuracies and were more robust under various conditions than the other 3 methods when imputing from 3K to 54K. The accuracy of imputation using FImpute was similar to those results from Beagle and IMPUTE2 when imputing from 54K to high density, and higher than the remaining 2 methods. The results also showed that a closer relationship between test set and reference set led to a higher accuracy for all the methods. In addition, the correct rate was higher when the minor allele frequency was lower, whereas the correlation coefficient was lower when the minor allele frequency was lower. 
The results indicate that Beagle and IMPUTE2 provide the most robust and accurate imputation, but considering computing time and memory usage, FImpute is a viable alternative.
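The abstract reports that, at low minor allele frequency (MAF), the allele correct rate is higher while the correlation between imputed and true genotypes is lower. The Python sketch below shows how the two metrics can move in opposite directions. It assumes genotypes coded as allele counts (0, 1 or 2); the function names and the two toy SNPs are hypothetical and chosen only to make the contrast visible.

```python
from math import sqrt


def allele_correct_rate(true, imputed):
    """Proportion of alleles imputed correctly (genotypes coded 0, 1 or 2)."""
    wrong = sum(abs(t - i) for t, i in zip(true, imputed))
    return 1 - wrong / (2 * len(true))


def genotype_correlation(true, imputed):
    """Pearson correlation between true and imputed allele counts."""
    n = len(true)
    mean_t, mean_i = sum(true) / n, sum(imputed) / n
    sti = sum((t - mean_t) * (i - mean_i) for t, i in zip(true, imputed))
    stt = sum((t - mean_t) ** 2 for t in true)
    sii = sum((i - mean_i) ** 2 for i in imputed)
    return sti / sqrt(stt * sii)


# Rare SNP (MAF 0.05): one of the two carriers is imputed as a non-carrier.
rare_true = [0] * 18 + [1, 1]
rare_imputed = [0] * 18 + [1, 0]

# Common SNP (MAF 0.5): two homozygotes are imputed as heterozygotes.
common_true = [0] * 5 + [1] * 10 + [2] * 5
common_imputed = [0] * 4 + [1] * 12 + [2] * 4

for name, t, i in [("rare", rare_true, rare_imputed),
                   ("common", common_true, common_imputed)]:
    print(name, round(allele_correct_rate(t, i), 3),
          round(genotype_correlation(t, i), 3))
```

With a rare allele, always guessing the major allele is nearly always right, so the correct rate stays high even when carriers are missed; the correlation penalizes exactly those misses, which is why both metrics are worth reporting.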
Affiliation(s)
- P Ma
  - Centre for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, DK 8830 Tjele, Denmark
48
Enlarging a training set for genomic selection by imputation of un-genotyped animals in populations of varying genetic architecture. Genet Sel Evol 2013; 45:12. [PMID: 23621897 PMCID: PMC3652763 DOI: 10.1186/1297-9686-45-12] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 09/07/2012] [Accepted: 03/24/2013] [Indexed: 02/02/2023] Open
Abstract
Background: The most common application of imputation is to infer genotypes of a high-density panel of markers on animals that are genotyped for a low-density panel. However, the increase in accuracy of genomic predictions resulting from an increase in the number of markers tends to reach a plateau beyond a certain density. Another application of imputation is to increase the size of the training set with un-genotyped animals. This strategy can be particularly successful when a set of closely related individuals are genotyped.
Methods: Imputation on completely un-genotyped dams was performed using known genotypes from the sire of each dam, one offspring and the offspring’s sire. Two methods were applied based on either allele or haplotype frequencies to infer genotypes at ambiguous loci. Results of these methods and of two available software packages were compared. Quality of imputation under different population structures was assessed. The impact of using imputed dams to enlarge training sets on the accuracy of genomic predictions was evaluated for different populations, heritabilities and sizes of training sets.
Results: Imputation accuracy ranged from 0.52 to 0.93 depending on the population structure and the method used. The method that used allele frequencies performed better than the method based on haplotype frequencies. Accuracy of imputation was higher for populations with higher levels of linkage disequilibrium and with larger proportions of markers with more extreme allele frequencies. Inclusion of imputed dams in the training set increased the accuracy of genomic predictions. Gains in accuracy ranged from close to zero to 37.14%, depending on the simulated scenario. Generally, the larger the accuracy already obtained with the genotyped training set, the lower the increase in accuracy achieved by adding imputed dams.
Conclusions: Whenever a reference population resembling the family configuration considered here is available, imputation can be used to achieve an extra increase in accuracy of genomic predictions by enlarging the training set with completely un-genotyped dams. This strategy was shown to be particularly useful for populations with lower levels of linkage disequilibrium, for genomic selection on traits with low heritability, and for species or breeds for which the size of the reference population is limited.
49
Imputation of unordered markers and the impact on genomic selection accuracy. G3-GENES GENOMES GENETICS 2013; 3:427-39. [PMID: 23449944 PMCID: PMC3583451 DOI: 10.1534/g3.112.005363] [Citation(s) in RCA: 100] [Impact Index Per Article: 9.1] [Received: 08/04/2012] [Accepted: 12/28/2012] [Indexed: 12/24/2022]
Abstract
Genomic selection, a breeding method that promises to accelerate rates of genetic gain, requires dense, genome-wide marker data. Genotyping-by-sequencing can generate a large number of de novo markers. However, without a reference genome, these markers are unordered and typically have a large proportion of missing data. Because marker imputation algorithms were developed for species with a reference genome, algorithms suited for unordered markers have not been rigorously evaluated. Using four empirical datasets, we evaluate and characterize four such imputation methods, referred to as k-nearest neighbors, singular value decomposition, random forest regression, and expectation maximization imputation, in terms of their imputation accuracies and the factors affecting accuracy. The effect of imputation method on the genomic selection accuracy is assessed in comparison with mean imputation. The effect of excluding markers with a large proportion of missing data on the genomic selection accuracy is also examined. Our results show that imputation of unordered markers can be accurate, especially when linkage disequilibrium between markers is high and genotyped individuals are related. Of the methods evaluated, random forest regression imputation produced superior accuracy. In comparison with mean imputation, all four imputation methods we evaluated led to greater genomic selection accuracies when the level of missing data was high. Including rather than excluding markers with a large proportion of missing data nearly always led to greater GS accuracies. We conclude that high levels of missing data in dense marker sets are not a major obstacle for genomic selection, even when marker order is not known.
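To make the contrast between mean imputation and a neighbor-based method concrete, here is a minimal Python sketch of both. It does not use marker order, in line with the unordered-marker setting. Several choices are assumptions made for illustration and are not taken from the paper: neighbors are individuals (rows), distance is the mean squared difference over markers observed in both individuals, missing values are `None`, and every marker is observed in at least one individual.

```python
def mean_impute(matrix):
    """Replace each missing value (None) with the mean of its marker column."""
    n_markers = len(matrix[0])
    means = []
    for m in range(n_markers):
        observed = [row[m] for row in matrix if row[m] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[m] if v is None else v for m, v in enumerate(row)]
            for row in matrix]


def distance(a, b):
    """Mean squared difference over markers observed in both individuals."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return float("inf")
    return sum((x - y) ** 2 for x, y in shared) / len(shared)


def knn_impute(matrix, k=2):
    """Fill each missing value with the average over the k nearest individuals
    that have the marker observed."""
    result = [list(row) for row in matrix]
    for i, row in enumerate(matrix):
        for m, value in enumerate(row):
            if value is not None:
                continue
            donors = sorted(
                (distance(row, other), j)
                for j, other in enumerate(matrix)
                if j != i and other[m] is not None
            )[:k]
            result[i][m] = sum(matrix[j][m] for _, j in donors) / len(donors)
    return result


# Toy genotype matrix (rows: individuals, columns: markers, coded 0/1/2).
genotypes = [
    [0, 0, 2, None],
    [0, 0, 2, 2],
    [2, 2, 0, 0],
    [2, 2, 0, 0],
]
print(mean_impute(genotypes)[0][3])      # column mean, ignores relatedness
print(knn_impute(genotypes, k=1)[0][3])  # borrows from the most similar individual
```

The toy matrix shows why relatedness helps: the first individual is identical to the second at every observed marker, so a neighbor-based method recovers the second individual's genotype, whereas the column mean is pulled toward the unrelated individuals.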
50
Badke YM, Bates RO, Ernst CW, Schwab C, Fix J, Van Tassell CP, Steibel JP. Methods of tagSNP selection and other variables affecting imputation accuracy in swine. BMC Genet 2013; 14:8. [PMID: 23433396 PMCID: PMC3734000 DOI: 10.1186/1471-2156-14-8] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Received: 10/25/2012] [Accepted: 01/29/2013] [Indexed: 11/24/2022] Open
Abstract
Background: Genotype imputation is a cost-efficient alternative to use of high-density genotypes for implementing genomic selection. The objective of this study was to investigate variables affecting imputation accuracy from low-density tagSNP (average distance between tagSNP from 100 kb to 1 Mb) sets in swine, selected using LD information, physical location, or accuracy for genotype imputation. We compared results of imputation accuracy based on several sets of low-density tagSNP of varying densities and selected using three different methods. In addition, we assessed the effect of varying size and composition of the reference panel of haplotypes used for imputation.
Results: TagSNP density of at least 1 tagSNP per 340 kb (~7000 tagSNP) selected using pairwise LD information was necessary to achieve average imputation accuracy higher than 0.95. A commercial low-density (9K) tagSNP set for swine was developed concurrently with this study and an average accuracy of imputation of 0.951 based on these tagSNP was estimated. Construction of a haplotype reference panel was most efficient when these haplotypes were obtained from randomly sampled individuals. Increasing the size of the original reference haplotype panel (128 haplotypes sampled from 32 sire/dam/offspring trios phased in a previous study) led to an overall increase in imputation accuracy (IA = 0.97 with 512 haplotypes), but was especially useful in increasing imputation accuracy of SNP with MAF below 0.1 and for SNP located in the chromosomal extremes (within 5% of chromosome end).
Conclusion: The new commercially available 9K tagSNP set can be used to obtain imputed genotypes with high accuracy, even when imputation is based on a comparably small panel of reference haplotypes (128 haplotypes). Average imputation accuracy can be further increased by adding haplotypes to the reference panel. In addition, our results show that randomly sampling individuals to genotype for the construction of a reference haplotype panel is more cost-efficient than specifically sampling older animals or trios, with no observed loss in imputation accuracy. We expect that the use of imputed genotypes in swine breeding will yield highly accurate predictions of GEBV, based on the observed accuracy and reported results in dairy cattle, where genomic evaluation of some individuals is based on genotypes imputed with the same accuracy as our Yorkshire population.
Affiliation(s)
- Yvonne M Badke
  - Department of Animal Science, Michigan State University, East Lansing, MI, USA
- Ronald O Bates
  - Department of Animal Science, Michigan State University, East Lansing, MI, USA
- Catherine W Ernst
  - Department of Animal Science, Michigan State University, East Lansing, MI, USA
- Justin Fix
  - National Swine Registry, West Lafayette, IN, USA
- Curtis P Van Tassell
  - Bovine Functional Genomics Laboratory, Agricultural Research Service, United States Department of Agriculture, Beltsville, MD, USA
- Juan P Steibel
  - Department of Animal Science, Michigan State University, East Lansing, MI, USA
  - Department of Fisheries & Wildlife, Michigan State University, East Lansing, MI, USA
|