1
Martini JWR, Gao N, Crossa J. Incorporating Omics Data in Genomic Prediction. Methods Mol Biol 2022; 2467:341-357. [PMID: 35451782 DOI: 10.1007/978-1-0716-2205-6_12]
Abstract
In this chapter, we discuss the motivation for integrating other types of omics data into genomic prediction methods. We give an overview of literature investigating the performance of omics-enhanced predictions, and highlight potential pitfalls when applying these methods in breeding. We emphasize that the statistical methods available for genomic data can be transferred to the general omics case. However, when using a framework of omic relationship matrices, the standardization of the variables may be more relevant than it is for a genomic relationship matrix based on single-nucleotide polymorphisms.
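The point about standardization can be illustrated with a minimal sketch. It assumes an omic relationship matrix is built as a scaled cross-product of a centered feature matrix, by analogy with a SNP-based genomic relationship matrix. The function name `relationship_matrix`, the simulated data, and the feature scales are hypothetical and are not taken from the chapter.

```python
import numpy as np

def relationship_matrix(X, standardize=True):
    """Return an n x n relationship matrix from an n x p feature matrix.

    Hypothetical helper: features are centered, optionally scaled to unit
    variance, and the cross-product is divided by the number of features.
    """
    X = np.asarray(X, dtype=float)
    Z = X - X.mean(axis=0)                 # center each feature
    if standardize:
        sd = Z.std(axis=0)
        keep = sd > 0                      # drop features without variation
        Z = Z[:, keep] / sd[keep]          # scale to unit variance
    return Z @ Z.T / Z.shape[1]

rng = np.random.default_rng(0)
n = 50
# Simulated omics features on very different scales (assumed for illustration)
scales = np.array([1.0, 1.0, 1.0, 1.0, 1000.0])
X = rng.normal(size=(n, 5)) * scales

K_raw = relationship_matrix(X, standardize=False)
K_std = relationship_matrix(X, standardize=True)

# Share of the total sum of squares contributed by the large-scale feature
Z = X - X.mean(axis=0)
share_raw = (Z[:, 4] ** 2).sum() / (Z ** 2).sum()
```

Without standardization, the single large-scale feature accounts for almost all of `K_raw`, whereas in `K_std` every feature contributes equally and the mean diagonal is 1. SNP genotypes coded 0/1/2 share a common scale, which may be why the choice matters less for them.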
Affiliation(s)
- Johannes W R Martini
  - International Maize and Wheat Improvement Center (CIMMYT), Veracruz, CP, Mexico
- Ning Gao
  - School of Life Sciences, Sun Yat-Sen University, Guangzhou, China
- José Crossa
  - International Maize and Wheat Improvement Center (CIMMYT), Veracruz, CP, Mexico
2
Adoption and Optimization of Genomic Selection To Sustain Breeding for Apricot Fruit Quality. G3: Genes, Genomes, Genetics 2020; 10:4513-4529. [PMID: 33067307 PMCID: PMC7718743 DOI: 10.1534/g3.120.401452]
Abstract
Genomic selection (GS) is a breeding approach that exploits genome-wide information, and its unprecedented success has shaped several animal and plant breeding schemes by driving their genetic progress. This is the first study assessing the potential of GS in apricot (Prunus armeniaca) to enhance postharvest fruit quality attributes. Genomic predictions were based on an F1 pseudo-testcross population comprising 153 individuals with contrasting fruit quality traits. The individuals were phenotyped for physical and biochemical fruit metrics in contrasting climatic conditions over two years. Prediction accuracy (PA) varied from 0.31 for glucose content with the Bayesian LASSO (BL) to 0.78 for ethylene production with RR-BLUP. RR-BLUP yielded the most accurate predictions in comparison to Bayesian models, and only 10% of the 61,030 SNPs were sufficient to reach accurate predictions. The study provided insights into the genetic architecture of apricot fruit quality, and integrating this information into prediction models improved their performance, notably for traits governed by major QTL. Furthermore, multivariate modeling yielded promising outcomes in terms of PA within training partitions partially phenotyped for target traits. This provides a useful framework for the implementation of indirect selection based on easy-to-measure traits. We thus highlight the main levers to consider when implementing GS for fruit quality in apricot and, more broadly, when improving genetic gain in perennial species.
3
Liu L, Zhou J, Chen CJ, Zhang J, Wen W, Tian J, Zhang Z, Gu Y. GWAS-Based Identification of New Loci for Milk Yield, Fat, and Protein in Holstein Cattle. Animals (Basel) 2020; 10:E2048. [PMID: 33167458 PMCID: PMC7694478 DOI: 10.3390/ani10112048] [Received: 09/23/2020] [Revised: 11/01/2020] [Accepted: 11/03/2020]
Abstract
High milk yield and high milk quality are the primary goals of dairy production. Understanding the genetic architecture underlying these milk-related traits is beneficial because it allows genetic variants to be targeted for genetic improvement. In this study, we measured five milk production and quality traits in a Holstein cattle population from China. These traits included milk yield, fat, and protein. We used the estimated breeding values as dependent variables to conduct genome-wide association studies (GWAS). Breeding values were estimated from pedigree relationships using a linear mixed model. The phenotyped individuals were genotyped with the Illumina BovineSNP150 BeadChip. The association analyses were conducted using the Fixed and random model Circulating Probability Unification (FarmCPU) method. A total of ten single-nucleotide polymorphisms (SNPs) were detected above the genome-wide significance threshold (p < 4.0 × 10^-7), including six located in previously reported quantitative trait locus (QTL) regions. We found eight candidate genes within 120 kb upstream or downstream of the associated SNPs. The study not only identified the effect of the DGAT1 gene on milk fat and protein, but also discovered novel genetic loci and candidate genes related to milk traits. These novel genetic loci could be an important basis for molecular breeding in dairy cattle.
Affiliation(s)
- Liyuan Liu
  - School of Agriculture, Ningxia University, Yinchuan 750021, Ningxia, China
  - Department of Crop and Soil Sciences, Washington State University, Pullman, WA 99164, USA
- Jinghang Zhou
  - School of Agriculture, Ningxia University, Yinchuan 750021, Ningxia, China
  - Department of Crop and Soil Sciences, Washington State University, Pullman, WA 99164, USA
- Chunpeng James Chen
  - Department of Crop and Soil Sciences, Washington State University, Pullman, WA 99164, USA
- Juan Zhang
  - School of Agriculture, Ningxia University, Yinchuan 750021, Ningxia, China
- Wan Wen
  - Animal Husbandry Workstation, Yinchuan 750001, Ningxia, China
- Jia Tian
  - Animal Husbandry Workstation, Yinchuan 750001, Ningxia, China
- Zhiwu Zhang
  - Department of Crop and Soil Sciences, Washington State University, Pullman, WA 99164, USA
- Yaling Gu
  - School of Agriculture, Ningxia University, Yinchuan 750021, Ningxia, China
4
Yin L, Zhang H, Zhou X, Yuan X, Zhao S, Li X, Liu X. KAML: improving genomic prediction accuracy of complex traits using machine learning determined parameters. Genome Biol 2020; 21:146. [PMID: 32552725 PMCID: PMC7386246 DOI: 10.1186/s13059-020-02052-w] [Received: 01/28/2020] [Accepted: 05/21/2020]
Abstract
Advances in high-throughput sequencing technologies have dramatically reduced the cost of genotyping and led to genomic prediction being widely used in animal and plant breeding, and increasingly in human genetics. Inspired by the computational efficiency of linear mixed models and the prediction accuracy of Bayesian methods, we propose KAML, a machine learning-based method that incorporates cross-validation, multiple regression, grid search, and bisection algorithms and aims to combine high prediction accuracy with computing efficiency. KAML exhibits higher prediction accuracy than existing methods, and it is available at https://github.com/YinLiLin/KAML.
Affiliation(s)
- Lilin Yin
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education & College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, 430070, Hubei, People's Republic of China
  - Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, Huazhong Agricultural University, Wuhan, 430070, Hubei, People's Republic of China
- Haohao Zhang
  - School of Computer Science and Technology, Wuhan University of Technology, Wuhan, 430070, China
- Xiang Zhou
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA
- Xiaohui Yuan
  - School of Computer Science and Technology, Wuhan University of Technology, Wuhan, 430070, China
- Shuhong Zhao
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education & College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, 430070, Hubei, People's Republic of China
  - Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, Huazhong Agricultural University, Wuhan, 430070, Hubei, People's Republic of China
- Xinyun Li
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education & College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, 430070, Hubei, People's Republic of China
  - Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, Huazhong Agricultural University, Wuhan, 430070, Hubei, People's Republic of China
- Xiaolei Liu
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education & College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, 430070, Hubei, People's Republic of China
  - Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, Huazhong Agricultural University, Wuhan, 430070, Hubei, People's Republic of China
5
Sehgal D, Rosyara U, Mondal S, Singh R, Poland J, Dreisigacker S. Incorporating Genome-Wide Association Mapping Results Into Genomic Prediction Models for Grain Yield and Yield Stability in CIMMYT Spring Bread Wheat. Frontiers in Plant Science 2020; 11:197. [PMID: 32194596 PMCID: PMC7064468 DOI: 10.3389/fpls.2020.00197] [Received: 10/05/2019] [Accepted: 02/11/2020]
Abstract
Untangling the genetic architecture of grain yield (GY) and yield stability is important for optimizing genomics-assisted selection strategies in wheat. We conducted an in-depth investigation of both traits using a large set of advanced bread wheat lines (4,302), which were genotyped with genotyping-by-sequencing markers and phenotyped under contrasting (irrigated and stress) environments. A haplotype-based genome-wide association study (GWAS) identified 58 associations with GY and 15 with the superiority index Pi (a measure of stability). Sixteen associations with GY were "environment-specific", with two, on chromosomes 3B and 6B, showing large effects, and 8 associations were consistent across environments and trials. For Pi, 8 associations were on chromosomes 4B and 7B, indicating "hot spot" regions for stability. Epistatic interactions contributed an additional 5-9% of variation on average. We further explored whether integrating consistent and robust associations identified in GWAS as fixed effects in prediction models improves prediction accuracy. For GY, the model accounting for the haplotype-based GWAS loci as fixed effects led to an increase in prediction accuracy of up to 9-10%, whereas for Pi this approach did not provide any advantage. This is the first report of integrating the genetic architecture of GY and yield stability into prediction models in wheat.
Affiliation(s)
- Deepmala Sehgal
  - Global Wheat Program, International Maize and Wheat Improvement Center, Texcoco, Mexico
- Umesh Rosyara
  - Global Wheat Program, International Maize and Wheat Improvement Center, Texcoco, Mexico
- Suchismita Mondal
  - Global Wheat Program, International Maize and Wheat Improvement Center, Texcoco, Mexico
- Ravi Singh
  - Global Wheat Program, International Maize and Wheat Improvement Center, Texcoco, Mexico
- Jesse Poland
  - Department of Plant Pathology, Kansas State University, Manhattan, KS, United States
- Susanne Dreisigacker
  - Global Wheat Program, International Maize and Wheat Improvement Center, Texcoco, Mexico
6
Karaman E, Lund MS, Su G. Multi-trait single-step genomic prediction accounting for heterogeneous (co)variances over the genome. Heredity (Edinb) 2020; 124:274-287. [PMID: 31641237 PMCID: PMC6972913 DOI: 10.1038/s41437-019-0273-4] [Received: 04/03/2019] [Revised: 09/05/2019] [Accepted: 09/06/2019]
Abstract
Widely used genomic prediction models may not properly account for the heterogeneous (co)variance structure across the genome. Models such as BayesA and BayesB assume locus-specific variances, which are highly influenced by the prior for the (co)variance of single nucleotide polymorphism (SNP) effects, regardless of the size of the data. Models such as BayesC or GBLUP assume a common (co)variance for a proportion (BayesC) or all (GBLUP) of the SNP effects. In this study, we propose a multi-trait Bayesian whole-genome regression method (BayesN0), which groups a number of predefined SNPs to account for the heterogeneous (co)variance structure across the genome. This model was also implemented in single-step Bayesian regression (ssBayesN0). For practical implementation, we considered multi-trait single-step SNPBLUP models, using (co)variance estimates from BayesN0 or ssBayesN0. Genotype data were simulated using haplotypes on the first five chromosomes of 2200 Danish Holstein cattle, and phenotypes were simulated for two traits with heritabilities of 0.1 or 0.4, assuming 200 quantitative trait loci (QTL). We compared prediction accuracy across prediction models and region sizes (one SNP, 100 SNPs, one chromosome or the whole genome). In general, the highest accuracies were obtained when 100 adjacent SNPs were grouped together. ssBayesN0 improved accuracies over BayesN0, and using (co)variance estimates from ssBayesN0 generally yielded higher accuracies than using (co)variance estimates from BayesN0 for the 100-SNP region size. Our results suggest that, in routine genomic evaluations, it could be a good strategy to estimate (co)variance components with ssBayesN0 and then to use those estimates in genomic prediction with multi-trait single-step SNPBLUP.
Affiliation(s)
- Emre Karaman
  - Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark
- Mogens S Lund
  - Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark
- Guosheng Su
  - Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark
7
Arbelaez JD, Dwiyanti MS, Tandayu E, Llantada K, Jarana A, Ignacio JC, Platten JD, Cobb J, Rutkoski JE, Thomson MJ, Kretzschmar T. 1k-RiCA (1K-Rice Custom Amplicon) a novel genotyping amplicon-based SNP assay for genetics and breeding applications in rice. Rice (New York, N.Y.) 2019; 12:55. [PMID: 31350673 PMCID: PMC6660535 DOI: 10.1186/s12284-019-0311-0] [Received: 04/26/2019] [Accepted: 07/02/2019]
Abstract
BACKGROUND: While a multitude of genotyping platforms have been developed for rice, the majority of them have not been optimized for breeding, where cost, turnaround time, throughput and ease of use, relative to density and informativeness, are critical parameters of their utility. With that in mind, we report the development of the 1K-Rice Custom Amplicon, or 1k-RiCA, a robust custom sequencing-based amplicon panel of ~1000 SNPs that are uniformly distributed across the rice genome, designed to be highly informative within indica rice breeding pools, and tailored for genomic prediction in elite indica rice breeding programs.
RESULTS: Empirical validation tests performed on the 1k-RiCA showed average marker call rates of 95%, with marker repeatability and concordance rates of 99%. These technical properties were not affected when two common DNA extraction protocols were used. The average distance between SNPs in the 1k-RiCA was 1.5 cM, similar to the theoretical distance expected between 1,000 markers uniformly distributed across the rice genome. The average minor allele frequency on a panel of indica lines was 0.36, and the numbers of polymorphic SNPs in pairwise comparisons between indica by indica accessions and indica by japonica accessions averaged 430 and 450, respectively. The specific design parameters of the 1k-RiCA allow for a detailed view of genetic relationships and unambiguous molecular IDs within indica accessions, and provide a good balance of cost and marker density for genomic prediction applications in elite indica germplasm. Predictive abilities of genomic selection models for flowering time, grain yield, and plant height were on average 0.71, 0.36, and 0.65, respectively, based on cross-validation analysis. Furthermore, the inclusion of important trait markers associated with 11 different genes and QTL adds value to parental selection in crossing schemes and marker-assisted selection (MAS) in forward breeding applications.
CONCLUSIONS: This study validated the marker quality and robustness of the 1k-RiCA genotyping platform for populations derived from the indica rice subpopulation, for genetic and breeding purposes including MAS and genomic selection. The 1k-RiCA has proven to be a cost-effective alternative genotyping system for breeding applications.
Affiliation(s)
- Juan David Arbelaez
  - International Rice Research Institute, DAPO Box 7777, 1301 Los Baños, Metro Manila, Philippines
- Erwin Tandayu
  - International Rice Research Institute, DAPO Box 7777, 1301 Los Baños, Metro Manila, Philippines
- Krizzel Llantada
  - International Rice Research Institute, DAPO Box 7777, 1301 Los Baños, Metro Manila, Philippines
- Annalhea Jarana
  - International Rice Research Institute, DAPO Box 7777, 1301 Los Baños, Metro Manila, Philippines
- John Carlos Ignacio
  - International Rice Research Institute, DAPO Box 7777, 1301 Los Baños, Metro Manila, Philippines
- John Damien Platten
  - International Rice Research Institute, DAPO Box 7777, 1301 Los Baños, Metro Manila, Philippines
- Joshua Cobb
  - International Rice Research Institute, DAPO Box 7777, 1301 Los Baños, Metro Manila, Philippines
- Jessica Elaine Rutkoski
  - International Rice Research Institute, DAPO Box 7777, 1301 Los Baños, Metro Manila, Philippines
- Michael J. Thomson
  - Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843, USA
- Tobias Kretzschmar
  - Southern Cross Plant Sciences, Southern Cross University, PO Box 157, Lismore, NSW 2480, Australia
8
van den Berg S, Vandenplas J, van Eeuwijk FA, Lopes MS, Veerkamp RF. Significance testing and genomic inflation factor using high-density genotypes or whole-genome sequence data. J Anim Breed Genet 2019; 136:418-429. [PMID: 31215703 PMCID: PMC6900143 DOI: 10.1111/jbg.12419] [Received: 04/04/2019] [Revised: 05/21/2019] [Accepted: 05/29/2019]
Abstract
Significance testing for genome-wide association studies (GWAS) with increasing SNP density, up to whole-genome sequence (WGS) data, is not straightforward because of strong LD between SNPs and population stratification. Therefore, the objective of this study was to investigate genomic control and different significance testing procedures using data from a commercial pig breeding scheme. A GWAS was performed in GCTA with data of 4,964 Large White pigs using medium-density, high-density or imputed whole-genome sequence data, fitting a genomic relationship matrix based on a leave-one-chromosome-out approach to account for population structure. Subsequently, genomic inflation factors were assessed at the whole-genome level and the chromosome level. To establish a significance threshold, we evaluated permutation testing, Bonferroni corrections using either the total number of SNPs or the number of independent chromosome fragments, and false discovery rates (FDR) using either the Benjamini-Hochberg procedure or the Benjamini and Yekutieli procedure. We found that genomic inflation factors did not differ between genotype densities but did differ between chromosomes. Also, neither the leave-one-chromosome-out approach for GWAS nor the use of pedigree relationships accounted appropriately for population stratification, and both gave strong genomic inflation. Regarding the procedures for significance testing, when the aim is to find QTL regions that are associated with a trait of interest, we recommend applying the FDR following the Benjamini and Yekutieli approach to establish a significance threshold that is adjusted for multiple testing. When the aim is to pinpoint a specific mutation, the more conservative Bonferroni correction based on the total number of SNPs is more appropriate, until an appropriate method is established to adjust for the number of independent tests.
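The thresholding procedures named in this abstract can be compared with a short sketch. It uses the textbook definitions of the Bonferroni correction and of the Benjamini-Hochberg (BH) and Benjamini-Yekutieli (BY) step-up procedures. The function names and the example p-values are hypothetical, and the code is not the authors' analysis pipeline.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

def fdr_threshold(pvalues, alpha=0.05, method="bh"):
    """Largest p-value declared significant by a step-up FDR procedure.

    method="bh" is Benjamini-Hochberg; method="by" is Benjamini-Yekutieli,
    which divides alpha by the harmonic sum c(m) = 1 + 1/2 + ... + 1/m.
    Returns None when no test is significant.
    """
    m = len(pvalues)
    c_m = sum(1.0 / i for i in range(1, m + 1)) if method == "by" else 1.0
    threshold = None
    for rank, p in enumerate(sorted(pvalues), start=1):
        if p <= rank * alpha / (m * c_m):
            threshold = p          # keep the largest p-value passing its bound
    return threshold

# Made-up p-values for ten tests (illustration only)
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]

bonf = bonferroni_threshold(len(pvals))       # 0.005
bh = fdr_threshold(pvals, method="bh")        # 0.008
by = fdr_threshold(pvals, method="by")        # 0.001
```

In this toy example BH accepts two tests while BY and Bonferroni accept one, which reflects the usual ordering: BY is more conservative than BH because it remains valid under arbitrary dependence between tests.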
Affiliation(s)
- Sanne van den Berg
  - Animal Breeding and Genomics, Wageningen University and Research, Wageningen, The Netherlands
  - Biometris, Wageningen University and Research, Wageningen, The Netherlands
- Jérémie Vandenplas
  - Animal Breeding and Genomics, Wageningen University and Research, Wageningen, The Netherlands
- Fred A van Eeuwijk
  - Biometris, Wageningen University and Research, Wageningen, The Netherlands
- Marcos S Lopes
  - Topigs Norsvin Research Center, Beuningen, the Netherlands
- Roel F Veerkamp
  - Animal Breeding and Genomics, Wageningen University and Research, Wageningen, The Netherlands
9
Nani JP, Rezende FM, Peñagaricano F. Predicting male fertility in dairy cattle using markers with large effect and functional annotation data. BMC Genomics 2019; 20:258. [PMID: 30940077 PMCID: PMC6444482 DOI: 10.1186/s12864-019-5644-y] [Received: 10/21/2018] [Accepted: 03/25/2019]
Abstract
Background: Fertility is among the most important economic traits in dairy cattle. Genomic prediction for cow fertility has received much attention in the last decade, while bull fertility has been largely overlooked. The goal of this study was to assess genomic prediction of dairy bull fertility using markers with large effect and functional annotation data. Sire conception rate (SCR) was used as a measure of service sire fertility. The dataset consisted of 11.5 k U.S. Holstein bulls with SCR records and about 300 k single nucleotide polymorphism (SNP) markers. The analyses used both single-kernel and multi-kernel predictive models fitting either all SNPs, markers with large effect, or markers with presumed functional roles, such as non-synonymous, synonymous, or non-coding regulatory variants.
Results: The entire set of SNPs yielded predictive correlations of 0.340. Five markers located on chromosomes BTA8, BTA9, BTA13, BTA17, and BTA27 showed marked dominance effects. Interestingly, the inclusion of these five major markers as fixed effects in the predictive models increased predictive correlations to 0.403, representing an increase in accuracy of about 19% compared with the standard model. Single-kernel models fitting functional SNP classes outperformed their counterparts using random sets of SNPs, suggesting that the predictive power of these functional variants is driven in part by their biological roles. Multi-kernel models fitting all the functional SNP classes together with the five major markers exhibited predictive correlations around 0.405.
Conclusions: The inclusion of markers with large effect markedly improved the prediction of dairy sire fertility. Functional variants exhibited higher predictive ability than random variants, but did not outperform the standard whole-genome approach. This research is the foundation for the development of novel strategies that could help the dairy industry make accurate genome-guided selection decisions on service sire fertility.
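The reported gain of about 19% can be checked from the two predictive correlations given in the abstract. The sketch below assumes the percentage refers to the relative change over the standard model, which the abstract does not state explicitly. The helper name `relative_increase` is hypothetical.

```python
def relative_increase(baseline, improved):
    """Relative change of `improved` over `baseline`, as a fraction."""
    return (improved - baseline) / baseline

# Predictive correlations quoted in the abstract
standard_model = 0.340        # all SNPs
with_major_markers = 0.403    # five major markers added as fixed effects

gain = relative_increase(standard_model, with_major_markers)
gain_percent = 100 * gain     # roughly 18.5, i.e. about 19% after rounding
```

Under that reading, the absolute gain of 0.063 in predictive correlation corresponds to the quoted relative improvement.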
Affiliation(s)
- Juan Pablo Nani
  - Department of Animal Sciences, University of Florida, 2250 Shealy Drive, Gainesville, FL, 32611, USA
  - Estación Experimental Agropecuaria Rafaela, Instituto Nacional de Tecnología Agropecuaria, 22-2300, Rafaela, SF, Argentina
- Fernanda M Rezende
  - Department of Animal Sciences, University of Florida, 2250 Shealy Drive, Gainesville, FL, 32611, USA
  - Faculdade de Medicina Veterinária, Universidade Federal de Uberlândia, Uberlândia, MG, 38410-337, Brazil
- Francisco Peñagaricano
  - Department of Animal Sciences, University of Florida, 2250 Shealy Drive, Gainesville, FL, 32611, USA
  - University of Florida Genetics Institute, University of Florida, Gainesville, FL, 32610, USA
10
van Son M, Lopes MS, Martell HJ, Derks MFL, Gangsei LE, Kongsro J, Wass MN, Grindflek EH, Harlizius B. A QTL for Number of Teats Shows Breed Specific Effects on Number of Vertebrae in Pigs: Bridging the Gap Between Molecular and Quantitative Genetics. Front Genet 2019; 10:272. [PMID: 30972109 PMCID: PMC6445065 DOI: 10.3389/fgene.2019.00272] [Received: 08/28/2018] [Accepted: 03/12/2019]
Abstract
Modern breeding schemes for livestock species accumulate a large amount of genotype and phenotype data which can be used for genome-wide association studies (GWAS). Many chromosomal regions harboring effects on quantitative traits have been reported from these studies, but the underlying causative mutations remain mostly undetected. In this study, we combine large genotype and phenotype data sets available from a commercial pig breeding scheme for three different breeds (Duroc, Landrace, and Large White) to pinpoint functional variation for a region on porcine chromosome 7 affecting number of teats (NTE). Our results show that refining the trait definition by counting number of vertebrae (NVE) and ribs (RIB) helps to reduce noise from other genetic variation and increases heritability from 0.28 up to 0.62 for NVE and 0.78 for RIB in Duroc. However, in Landrace, the effect of the same QTL on NTE mainly affects NVE and not RIB, which is reflected in reduced heritability for RIB (0.24) compared to NVE (0.59). Further, differences in allele frequencies and in the accuracy of rib counting influence genetic parameters. Correction for the top SNP does not reveal any other QTL effect on NTE, NVE, or RIB in Landrace or Duroc. At the molecular level, haplotypes derived from 660K SNP data reveal a core haplotype of seven SNPs in Duroc. Sequence analysis of 16 Duroc animals shows that two functional mutations of the Vertnin (VRTN) gene known to increase the number of thoracic vertebrae (ribs) reside on this haplotype. In Landrace, the linkage disequilibrium (LD) extends over a region of more than 3 Mb that also contains both VRTN mutations. Here, other modifying loci are expected to cause the breed-specific effect. Additional variants found on the wildtype haplotype surrounding the VRTN region in all sequenced Landrace animals point toward breed-specific differences, which are expected to be present across the whole genome as well. This Landrace-specific haplotype contains two missense mutations in the ABCD4 gene, one of which is expected to have a negative effect on protein function. Together, the integration of large-scale genotype, phenotype and sequence data illustrates how population parameters are influenced by underlying variation at the molecular level.
Affiliation(s)
- Marcos S Lopes
  - Topigs Norsvin Research Center, Beuningen, Netherlands
  - Topigs Norsvin, Curitiba, Brazil
- Henry J Martell
  - School of Biosciences, University of Kent, Canterbury, United Kingdom
- Martijn F L Derks
  - Department of Animal Sciences, Wageningen University and Research, Wageningen, Netherlands
- Lars Erik Gangsei
  - Animalia AS, Oslo, Norway
  - Faculty of Chemistry, Biotechnology and Food Sciences, Norwegian University of Life Sciences, Ås, Norway
- Mark N Wass
  - School of Biosciences, University of Kent, Canterbury, United Kingdom
11
van den Berg S, Vandenplas J, van Eeuwijk FA, Bouwman AC, Lopes MS, Veerkamp RF. Imputation to whole-genome sequence using multiple pig populations and its use in genome-wide association studies. Genet Sel Evol 2019; 51:2. [PMID: 30678638 PMCID: PMC6346588 DOI: 10.1186/s12711-019-0445-y] [Received: 10/12/2018] [Accepted: 01/10/2019]
Abstract
Background: Use of whole-genome sequence data (WGS) is expected to improve identification of quantitative trait loci (QTL). However, this requires imputation to WGS, often with a limited number of sequenced animals for the target population. The objective of this study was to investigate imputation to WGS in two pig lines using a multi-line reference population and, subsequently, to investigate the effect of using these imputed WGS (iWGS) for GWAS.
Methods: Phenotypes and genotypes were available on 12,184 Large White pigs (LW-line) and 4,943 Dutch Landrace pigs (DL-line). Imputed 660 K and 80 K genotypes for the LW-line and DL-line, respectively, were imputed to iWGS using Beagle v.4.1. Since only 32 LW-line and 12 DL-line boars were sequenced, 142 animals from eight commercial lines were added. GWAS were performed for each line using the 80 K and 660 K SNPs, the genotype scores of iWGS SNPs that had an imputation accuracy (Beagle R²) higher than 0.6, and the dosage scores of all iWGS SNPs.
Results: For the DL-line (LW-line), imputation of 80 K genotypes to iWGS resulted in an average Beagle R² of 0.39 (0.49). After quality control, 2.5 × 10^6 (3.5 × 10^6) SNPs had a Beagle R² higher than 0.6, resulting in an average Beagle R² of 0.83 (0.93). Compared to the 80 K and 660 K genotypes, using iWGS led to the identification of 48.9 and 64.4% more QTL regions for the DL-line and LW-line, respectively, and the most significant SNPs in the QTL regions explained a higher proportion of phenotypic variance. Using dosage instead of genotype scores improved the identification of QTL, because the model accounted for uncertainty of imputation and all SNPs were used in the analysis.
Conclusions: Imputation to WGS using the multi-line reference population resulted in relatively poor imputation, especially when imputing from 80 K (DL-line). In spite of the poor imputation accuracies, using iWGS instead of a lower-density SNP chip increased the number of detected QTL and the estimated proportion of phenotypic variance explained by these QTL, especially when dosage scores were used instead of genotype scores. Thus, iWGS, even with poor imputation accuracy, can be used to identify potentially interesting regions for fine mapping.
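The difference between dosage scores and genotype scores can be shown with a small sketch. It assumes that imputation yields, for each animal and SNP, posterior probabilities for 0, 1 and 2 copies of the counted allele, that the dosage is the expected allele count, and that the genotype score is the most probable genotype. The function names and probabilities are made up for illustration and are not taken from the study.

```python
import numpy as np

def dosage_scores(genotype_probs):
    """Expected allele count per animal from an n x 3 array of
    genotype probabilities ordered as P(0 copies), P(1 copy), P(2 copies)."""
    gp = np.asarray(genotype_probs, dtype=float)
    return gp @ np.array([0.0, 1.0, 2.0])

def genotype_scores(genotype_probs):
    """Most probable genotype (0, 1 or 2) per animal."""
    return np.asarray(genotype_probs).argmax(axis=1)

# Hypothetical imputation output for one SNP in three animals
gp = np.array([
    [0.98, 0.02, 0.00],   # confidently imputed
    [0.40, 0.35, 0.25],   # very uncertain
    [0.05, 0.15, 0.80],   # fairly confident
])

dosage = dosage_scores(gp)       # 0.02, 0.85, 1.75
best_guess = genotype_scores(gp) # 0, 0, 2
```

For the uncertain second animal the genotype score collapses to 0, while the dosage of 0.85 retains the uncertainty, which is consistent with the abstract's explanation of why dosage scores helped QTL detection.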
Affiliation(s)
- Sanne van den Berg
- Animal Breeding and Genomics, Wageningen University and Research, P.O. Box 338, 6700 AH, Wageningen, The Netherlands; Biometris, Wageningen University and Research, P.O. Box 16, 6700 AA, Wageningen, The Netherlands
| | - Jérémie Vandenplas
- Animal Breeding and Genomics, Wageningen University and Research, P.O. Box 338, 6700 AH, Wageningen, The Netherlands
| | - Fred A van Eeuwijk
- Biometris, Wageningen University and Research, P.O. Box 16, 6700 AA, Wageningen, The Netherlands
| | - Aniek C Bouwman
- Animal Breeding and Genomics, Wageningen University and Research, P.O. Box 338, 6700 AH, Wageningen, The Netherlands
| | - Marcos S Lopes
- Topigs Norsvin Research Center, 6640 AA, Beuningen, The Netherlands; Topigs Norsvin, Curitiba, 80420-190, Brazil
| | - Roel F Veerkamp
- Animal Breeding and Genomics, Wageningen University and Research, P.O. Box 338, 6700 AH, Wageningen, The Netherlands.
| |
|
12
|
Pszczola M, Strabel T, Mucha S, Sell-Kubiak E. Genome-wide association identifies methane production level relation to genetic control of digestive tract development in dairy cows. Sci Rep 2018; 8:15164. [PMID: 30310168 PMCID: PMC6181922 DOI: 10.1038/s41598-018-33327-9] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 06/16/2018] [Accepted: 09/24/2018] [Indexed: 11/08/2022] Open
Abstract
The global temperatures are increasing. This increase is partly due to methane (CH4) production from ruminants, including dairy cattle. Recent studies on dairy cattle have revealed the existence of a heritable variation in CH4 production that enables mitigation strategies based on selective breeding. We have exploited the available heritable variation to study the genetic architecture of CH4 production and detected genomic regions affecting CH4 production. Although the detected regions explained only a small proportion of the heritable variance, we showed that potential QTL regions affecting CH4 production were located within QTLs related to feed efficiency, milk-related traits, body size and health status. Five candidate genes were found: CYP51A1 on BTA 4, PPP1R16B on BTA 13, and NTHL1, TSC2, and PKD1 on BTA 25. These candidate genes were involved in a number of metabolic processes that are possibly related to CH4 production. One of the most promising candidate genes (PKD1) was related to the development of the digestive tract. The results indicate that CH4 production is a highly polygenic trait.
Affiliation(s)
- M Pszczola
- Department of Genetics and Animal Breeding, Poznan University of Life Sciences, Wolynska 33, Poznan, Poland.
| | - T Strabel
- Department of Genetics and Animal Breeding, Poznan University of Life Sciences, Wolynska 33, Poznan, Poland.
| | - S Mucha
- Department of Genetics and Animal Breeding, Poznan University of Life Sciences, Wolynska 33, Poznan, Poland
| | - E Sell-Kubiak
- Department of Genetics and Animal Breeding, Poznan University of Life Sciences, Wolynska 33, Poznan, Poland
| |
|
13
|
Genomic prediction of the polled and horned phenotypes in Merino sheep. Genet Sel Evol 2018; 50:28. [PMID: 29788905 PMCID: PMC5964914 DOI: 10.1186/s12711-018-0398-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 12/04/2017] [Accepted: 05/15/2018] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND In horned sheep breeds, breeding for polledness has been of interest for decades. The objective of this study was to improve prediction of the horned and polled phenotypes using horn scores classified as polled, scurs, knobs or horns. Derived phenotypes polled/non-polled (P/NP) and horned/non-horned (H/NH) were used to test four different strategies for prediction in 4001 purebred Merino sheep. These strategies include the use of single 'single nucleotide polymorphism' (SNP) genotypes, multiple-SNP haplotypes, genome-wide and chromosome-wide genomic best linear unbiased prediction and information from imputed sequence variants from the region including the RXFP2 gene. Low-density genotypes of these animals were imputed to the Illumina Ovine high-density (600k) chip and the 1.78-kb insertion polymorphism in RXFP2 was included in the imputation process to whole-genome sequence. We evaluated the mode of inheritance and validated models by a fivefold cross-validation and across- and between-family prediction. RESULTS The most significant SNPs for prediction of P/NP and H/NH were OAR10_29546872.1 and OAR10_29458450, respectively, located on chromosome 10 close to the 1.78-kb insertion at 29.5 Mb. The mode of inheritance included an additive effect and a sex-dependent effect for dominance for P/NP and a sex-dependent additive and dominance effect for H/NH. Models with the highest prediction accuracies for H/NH used either single SNPs or 3-SNP haplotypes and included a polygenic effect estimated based on traditional pedigree relationships. Prediction accuracies for H/NH were 0.323 for females and 0.725 for males. For predicting P/NP, the best models were the same as for H/NH but included a genomic relationship matrix with accuracies of 0.713 for females and 0.620 for males. CONCLUSIONS Our results show that prediction accuracy is high using a single SNP, but does not reach 1 since the causative mutation is not genotyped. 
Incomplete penetrance or allelic heterogeneity, which can influence expression of the phenotype, may explain why prediction accuracy did not approach 1 with any of the genetic models tested here. Nevertheless, a breeding program to eradicate horns from Merino sheep can be effective by selecting genotypes GG of SNP OAR10_29458450 or TT of SNP OAR10_29546872.1 since all sheep with these genotypes will be non-horned.
|