1. Wu H, Gao B, Zhang R, Huang Z, Yin Z, Hu X, Yang CX, Du ZQ. Residual network improves the prediction accuracy of genomic selection. Anim Genet 2024; 55:599-611. [PMID: 38746973 DOI: 10.1111/age.13445] [Received: 10/21/2023] [Revised: 04/21/2024] [Accepted: 04/29/2024] [Indexed: 07/04/2024]
Abstract
Genetic improvement of complex traits in animal and plant breeding depends on the efficient and accurate estimation of breeding values. Deep learning methods have been shown to be no better than traditional genomic selection (GS) methods, partly because of the degradation problem (i.e. as model depth increases, the performance of the deeper model deteriorates). Since the deep learning method residual network (ResNet) is designed to solve gradient degradation, we examined its performance and the factors related to its prediction accuracy in GS. Here we compared the prediction accuracy of conventional genomic best linear unbiased prediction, Bayesian methods (BayesA, BayesB, BayesC, and Bayesian Lasso), and two deep learning methods, convolutional neural network and ResNet, on three datasets (wheat, simulated and real pig data). ResNet outperformed the other methods in both Pearson's correlation coefficient (PCC) and mean squared error (MSE) on the wheat and simulated data. For the pig backfat depth trait, ResNet still had the lowest MSE, whereas Bayesian Lasso had the highest PCC. We further clustered the pig data into four groups and, on one separated group, ResNet had the highest prediction accuracy (both PCC and MSE). Transfer learning was adopted and was capable of enhancing the performance of both convolutional neural network and ResNet. Taken together, our findings indicate that ResNet could improve GS prediction accuracy, potentially affected by factors such as the genetic architecture of complex traits, data volume, and heterogeneity.
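The abstract above ranks methods by two metrics, PCC and MSE, and notes that they can disagree (lowest MSE for one method, highest PCC for another). The following is a minimal sketch of both metrics in plain Python. The function names and the toy values are illustrative assumptions and are not taken from the paper.

```python
import math


def pearson_cc(y_true, y_pred):
    """Pearson's correlation coefficient between observed and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: predictions are shifted by a constant 0.5,
# so the ranking is perfect while the squared error is not zero.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]

print("PCC:", pearson_cc(observed, predicted))
print("MSE:", mean_squared_error(observed, predicted))
```

In this toy case the correlation is perfect while the error is nonzero, which shows how a method can lead on PCC without leading on MSE.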
Affiliation(s)
- Huaxuan Wu
  - College of Animal Science and Technology, Yangtze University, Jingzhou, Hubei, China
- Bingxi Gao
  - College of Animal Science and Technology, Yangtze University, Jingzhou, Hubei, China
- Rong Zhang
  - College of Animal Science and Technology, Yangtze University, Jingzhou, Hubei, China
- Zehang Huang
  - College of Animal Science and Technology, Yangtze University, Jingzhou, Hubei, China
- Zongjun Yin
  - College of Animal Science and Technology, Anhui Agricultural University, Hefei, Anhui, China
- Xiaoxiang Hu
  - State Key Laboratory for Agrobiotechnology, China Agricultural University, Beijing, China
- Cai-Xia Yang
  - College of Animal Science and Technology, Yangtze University, Jingzhou, Hubei, China
- Zhi-Qiang Du
  - College of Animal Science and Technology, Yangtze University, Jingzhou, Hubei, China
2. Zayas GA, Rodriguez E, Hernandez A, Rezende FM, Mateescu RG. Breed of origin analysis in genome-wide association studies: enhancing SNP-based insights into production traits in a commercial Brangus population. BMC Genomics 2024; 25:654. [PMID: 38956457 PMCID: PMC11218112 DOI: 10.1186/s12864-024-10465-1] [Received: 12/24/2023] [Accepted: 05/29/2024] [Indexed: 07/04/2024] Open
Abstract
BACKGROUND Carcass weight (HCW) and marbling (MARB) are critical for meat quality and market value in beef cattle. In composite breeds like Brangus, which meld the genetics of Angus and Brahman, SNP-based analyses have illuminated some genetic influences on these traits, but they fall short in fully capturing the nuanced effects of breed of origin alleles (BOA) on these traits. Focus on the impacts of BOA on phenotypic features within Brangus populations can result in a more profound understanding of the specific influences of Angus and Brahman genetics. Moreover, the consideration of BOA becomes particularly significant when evaluating dominance effects contributing to heterosis in crossbred populations. BOA provides a more comprehensive measure of heterosis due to its ability to differentiate the distinct genetic contributions originating from each parent breed. This detailed understanding of genetic effects is essential for making informed breeding decisions to optimize the benefits of heterosis in composite breeds like Brangus. OBJECTIVE This study aims to identify quantitative trait loci (QTL) influencing HCW and MARB by utilizing SNP and BOA information, incorporating additive, dominance, and overdominance effects within a multi-generational Brangus commercial herd. METHODS We analyzed phenotypic data from 1,066 genotyped Brangus steers. BOA inference was performed using LAMP-LD software using Angus and Brahman reference sets. SNP-based and BOA-based GWAS were then conducted considering additive, dominance, and overdominance models. RESULTS The study identified numerous QTLs for HCW and MARB. A notable QTL for HCW was associated to the SGCB gene, pivotal for muscle growth, and was identified solely in the BOA GWAS. Several BOA GWAS QTLs exhibited a dominance effect underscoring their importance in estimating heterosis. 
CONCLUSIONS Our findings demonstrate that SNP-based methods may not detect all genetic variation affecting economically important traits in composite breeds. BOA inclusion in genomic evaluations is crucial for identifying genetic regions contributing to trait variation and for understanding the dominance value underpinning heterosis. By considering BOA, we gain a deeper understanding of genetic interactions and heterosis, which is integral to advancing breeding programs. The incorporation of BOA is recommended for comprehensive genomic evaluations to optimize trait improvements in crossbred cattle populations.
Affiliation(s)
- Gabriel A Zayas
  - Department of Animal Sciences, University of Florida, Gainesville, FL, USA
- Eduardo Rodriguez
  - Department of Animal Sciences, University of Florida, Gainesville, FL, USA
- Aakilah Hernandez
  - Department of Animal Science, North Carolina State University, Raleigh, NC, USA
- Fernanda M Rezende
  - Department of Animal Sciences, University of Florida, Gainesville, FL, USA
- Raluca G Mateescu
  - Department of Animal Sciences, University of Florida, Gainesville, FL, USA
3. Pedrosa VB, Chen SY, Gloria LS, Doucette JS, Boerman JP, Rosa GJM, Brito LF. Machine learning methods for genomic prediction of cow behavioral traits measured by automatic milking systems in North American Holstein cattle. J Dairy Sci 2024; 107:4758-4771. [PMID: 38395400 DOI: 10.3168/jds.2023-24082] [Received: 08/12/2023] [Accepted: 01/18/2024] [Indexed: 02/25/2024]
Abstract
Identifying genome-enabled methods that provide more accurate genomic prediction is crucial when evaluating complex traits such as dairy cow behavior. In this study, we aimed to compare the predictive performance of traditional genomic prediction methods and deep learning algorithms for genomic prediction of milking refusals (MREF) and milking failures (MFAIL) in North American Holstein cows measured by automatic milking systems (milking robots). A total of 1,993,509 daily records from 4,511 genotyped Holstein cows were collected by 36 milking robot stations. After quality control, 57,600 SNPs were available for the analyses. Four genomic prediction methods were considered: Bayesian least absolute shrinkage and selection operator (LASSO), multilayer perceptron (MLP), convolutional neural network (CNN), and GBLUP. We implemented the first 3 methods using the Keras and TensorFlow libraries in Python (v.3.9), whereas the GBLUP method was implemented using the BLUPF90+ family of programs. The accuracy of genomic prediction, with mean square error in parentheses, for MREF and MFAIL was 0.34 (0.08) and 0.27 (0.08) based on LASSO, 0.36 (0.09) and 0.32 (0.09) for MLP, 0.37 (0.08) and 0.30 (0.09) for CNN, and 0.35 (0.09) and 0.31 (0.09) based on GBLUP, respectively. Additionally, we observed a lower reranking of top selected individuals based on the MLP versus CNN methods compared with the other approaches for both MREF and MFAIL. Although the deep learning methods showed slightly higher accuracies than GBLUP, the results may not be sufficient to justify their use over traditional methods due to their higher computational demand and the difficulty of performing genomic prediction for nongenotyped individuals using deep learning procedures. Overall, this study provides insights into the potential feasibility of using deep learning methods to enhance genomic prediction accuracy for behavioral traits in livestock. Further research is needed to determine their practical applicability to large dairy cattle breeding programs.
Affiliation(s)
- Victor B Pedrosa
  - Department of Animal Sciences, Purdue University, West Lafayette, IN 47907
- Shi-Yi Chen
  - Department of Animal Sciences, Purdue University, West Lafayette, IN 47907; Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu, Sichuan, 611130, China
- Leonardo S Gloria
  - Department of Animal Sciences, Purdue University, West Lafayette, IN 47907
- Jarrod S Doucette
  - Agriculture Information Technology (AgIT), Purdue University, West Lafayette, IN 47907
- Guilherme J M Rosa
  - Department of Animal and Dairy Sciences, University of Wisconsin-Madison, Madison, WI, 53706
- Luiz F Brito
  - Department of Animal Sciences, Purdue University, West Lafayette, IN 47907
4. Bhuiyan MSA, Kim YK, Lee DH, Chung Y, Lee DJ, Kang JM, Lee SH. Evaluation of non-additive genetic effects on carcass and meat quality traits in Korean Hanwoo cattle using genomic models. Animal 2024; 18:101152. [PMID: 38701710 DOI: 10.1016/j.animal.2024.101152] [Received: 10/03/2023] [Revised: 03/26/2024] [Accepted: 03/29/2024] [Indexed: 05/05/2024] Open
Abstract
Traditional genetic evaluation methods generally consider additive genetic effects only and often ignore non-additive (dominance and epistasis) effects that may contribute to the genetic variation of complex traits in livestock species. The available dense single nucleotide polymorphism (SNP) panels offer the opportunity to investigate the potential benefits of including non-additive genetic effects in genomic evaluation models. Data from 16 971 genotyped (Illumina Bovine 50 K SNP chip) Korean Hanwoo cattle were used to estimate genetic variance components and prediction accuracy of genomic breeding values (GEBVs) for four carcass and meat quality traits: carcass weight (CWT), eye muscle area (EMA), back fat thickness (BFT) and marbling score (MS). Five different genetic models were evaluated by successively including additive, dominance and epistatic interactions (additive by additive, A × A; additive by dominance, A × D; and dominance by dominance, D × D) in the models. The estimates of additive genetic variances and narrow sense heritabilities ($h_a^2$) were similar across the evaluated models and traits except when the additive-by-additive interaction (A × A) was included. The dominance variance estimates relative to phenotypic variance ranged from 1.7 to 3.4% for the CWT and MS traits, whereas they were close to zero for the EMA and BFT traits. The magnitude of A × A epistatic heritability ($h_{aa}^2$) ranged between 14.8 and 27.7% in all traits. However, heritability estimates for A × D and D × D epistatic interactions ($h_{ad}^2$ and $h_{dd}^2$) were quite low compared to $h_{aa}^2$ and contributed only 0.0-9.7% of the total phenotypic variation. In general, broad sense heritability ($h_G^2$) estimates were almost twice (ranging between 0.54 and 0.68) the $h_a^2$ for all of the investigated traits. The inclusion of dominance effects did not improve the prediction accuracy of GEBV, but accuracy improved by 2.0-3.0% when epistatic effects were included in the model. More importantly, rank correlation revealed that partitioning of variance components considering dominance and epistatic effects in the model would enable re-ranking of top animals with better prediction of GEBV. The present result suggests that dominance and epistatic effects could be included in the genomic evaluation model for better estimates of variance components and more accurate prediction of GEBV for carcass and meat quality traits in Korean Hanwoo cattle.
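The relationship between the narrow sense and broad sense heritabilities reported in this abstract can be illustrated with a short sketch: narrow sense heritability divides the additive variance by the phenotypic variance, whereas broad sense heritability divides the sum of all genetic components (additive, dominance and the three epistatic terms) by the phenotypic variance. The variance values and dictionary keys below are hypothetical and are not estimates from the study.

```python
def heritabilities(components):
    """Return narrow and broad sense heritability from variance components.

    `components` maps component names to variances and must contain
    the keys "additive" and "residual"; every other key is treated
    as a non-additive genetic component.
    """
    phenotypic = sum(components.values())
    genetic = phenotypic - components["residual"]
    return {
        "narrow": components["additive"] / phenotypic,
        "broad": genetic / phenotypic,
    }


# Hypothetical variance components (not taken from the paper).
variances = {
    "additive": 0.30,
    "dominance": 0.02,
    "add_x_add": 0.20,
    "add_x_dom": 0.03,
    "dom_x_dom": 0.02,
    "residual": 0.43,
}

result = heritabilities(variances)
print("narrow sense h2:", round(result["narrow"], 3))
print("broad sense h2:", round(result["broad"], 3))
```

With a sizeable A × A component, as in this invented example, the broad sense value approaches twice the narrow sense value, which is the pattern the abstract describes.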
Affiliation(s)
- M S A Bhuiyan
  - Division of Animal and Dairy Science, Chungnam National University, Daejeon 34134, Republic of Korea
- Y K Kim
  - Division of Animal and Dairy Science, Chungnam National University, Daejeon 34134, Republic of Korea; Quantomic Research & Solution, Yuseong-gu, Daejeon 34134, Republic of Korea
- D H Lee
  - Division of Animal and Dairy Science, Chungnam National University, Daejeon 34134, Republic of Korea; Quantomic Research & Solution, Yuseong-gu, Daejeon 34134, Republic of Korea
- Y Chung
  - Division of Animal and Dairy Science, Chungnam National University, Daejeon 34134, Republic of Korea
- D J Lee
  - Division of Animal and Dairy Science, Chungnam National University, Daejeon 34134, Republic of Korea
- J M Kang
  - Division of Animal and Dairy Science, Chungnam National University, Daejeon 34134, Republic of Korea
- S H Lee
  - Division of Animal and Dairy Science, Chungnam National University, Daejeon 34134, Republic of Korea
5. Lorenzi A, Bauland C, Pin S, Madur D, Combes V, Palaffre C, Guillaume C, Touzy G, Mary-Huard T, Charcosset A, Moreau L. Portability of genomic predictions trained on sparse factorial designs across two maize silage breeding cycles. Theor Appl Genet 2024; 137:75. [PMID: 38453705 DOI: 10.1007/s00122-024-04566-4] [Received: 08/22/2023] [Accepted: 01/30/2024] [Indexed: 03/09/2024]
Abstract
KEY MESSAGE We validated the efficiency of genomic predictions calibrated on sparse factorial training sets to predict the next generation of hybrids and tested different strategies for updating predictions along generations. Genomic selection offers new prospects for revisiting hybrid breeding schemes by replacing extensive phenotyping of individuals with genomic predictions. Finding the ideal design for training genomic prediction models is still an open question. Previous studies have shown promising predictive abilities using sparse factorial instead of tester-based training sets to predict single-cross hybrids from the same generation. This study aims to further investigate the use of factorials and their optimization to predict line general combining abilities (GCAs) and hybrid values across breeding cycles. It relies on two breeding cycles of a maize reciprocal genomic selection scheme involving multiparental connected reciprocal populations from flint and dent complementary heterotic groups selected for silage performances. Selection based on genomic predictions trained on a factorial design resulted in a significant genetic gain for dry matter yield in the new generation. Results confirmed the efficiency of sparse factorial training sets to predict candidate line GCAs and hybrid values across breeding cycles. Compared to a previous study based on the first generation, the advantage of factorial over tester training sets appeared lower across generations. Updating factorial training sets by adding single-cross hybrids between selected lines from the previous generation or a random subset of hybrids from the new generation both improved predictive abilities. The CDmean criterion helped determine the set of single-crosses to phenotype to update the training set efficiently. Our results validated the efficiency of sparse factorial designs for calibrating hybrid genomic prediction experimentally and showed the benefit of updating it along generations.
Affiliation(s)
- Alizarine Lorenzi
  - Université Paris-Saclay, INRAE, CNRS, AgroParisTech, Génétique Quantitative et Evolution (GQE) - Le Moulon, 91190, Gif-Sur-Yvette, France
  - RAGT2n, Genetics and Analytics Unit, 12510, Druelle, France
- Cyril Bauland
  - Université Paris-Saclay, INRAE, CNRS, AgroParisTech, Génétique Quantitative et Evolution (GQE) - Le Moulon, 91190, Gif-Sur-Yvette, France
- Sophie Pin
  - Université Paris-Saclay, INRAE, CNRS, AgroParisTech, Génétique Quantitative et Evolution (GQE) - Le Moulon, 91190, Gif-Sur-Yvette, France
- Delphine Madur
  - Université Paris-Saclay, INRAE, CNRS, AgroParisTech, Génétique Quantitative et Evolution (GQE) - Le Moulon, 91190, Gif-Sur-Yvette, France
- Valérie Combes
  - Université Paris-Saclay, INRAE, CNRS, AgroParisTech, Génétique Quantitative et Evolution (GQE) - Le Moulon, 91190, Gif-Sur-Yvette, France
- Carine Palaffre
  - UE 0394 SMH, INRAE, 2297 Route de l'INRA, 40390, Saint-Martin-de-Hinx, France
- Gaëtan Touzy
  - RAGT2n, Genetics and Analytics Unit, 12510, Druelle, France
- Tristan Mary-Huard
  - Université Paris-Saclay, INRAE, CNRS, AgroParisTech, Génétique Quantitative et Evolution (GQE) - Le Moulon, 91190, Gif-Sur-Yvette, France
  - Université Paris-Saclay, AgroParisTech, INRAE, UMR MIA Paris-Saclay, 91120, Palaiseau, France
- Alain Charcosset
  - Université Paris-Saclay, INRAE, CNRS, AgroParisTech, Génétique Quantitative et Evolution (GQE) - Le Moulon, 91190, Gif-Sur-Yvette, France
- Laurence Moreau
  - Université Paris-Saclay, INRAE, CNRS, AgroParisTech, Génétique Quantitative et Evolution (GQE) - Le Moulon, 91190, Gif-Sur-Yvette, France
6. Song H, Zhang Q, Hu H. polyGBLUP: a modified genomic best linear unbiased prediction improved the genomic prediction efficiency for autopolyploid species. Brief Bioinform 2024; 25:bbae106. [PMID: 38517695 PMCID: PMC10959164 DOI: 10.1093/bib/bbae106] [Received: 10/30/2023] [Revised: 12/22/2023] [Accepted: 02/26/2024] [Indexed: 03/24/2024] Open
Abstract
Given the universality of autopolyploid species in nature, it is crucial to develop genomic selection methods that consider different allele dosages for autopolyploid breeding. However, no method has been developed to deal with autopolyploid data regardless of the ploidy level. In this study, we developed a modified genomic best linear unbiased prediction (GBLUP) model (polyGBLUP) by constructing additive and dominant genomic relationship matrices based on different allele dosages. polyGBLUP can carry out genomic prediction for autopolyploid species regardless of the ploidy level. Through comprehensive simulations and analysis of real data of autotetraploid blueberry and guinea grass and autohexaploid sweet potato, the results showed that polyGBLUP achieved higher prediction accuracy than GBLUP and that its superiority was more obvious when the ploidy level of the autopolyploids was high. Furthermore, when the dominant effect was added to polyGBLUP (polyGDBLUP), the greater the dominance degree, the more obvious the advantages of polyGDBLUP over the diploid models in terms of prediction accuracy, bias, mean squared error and mean absolute error. For real data, the superiority of polyGBLUP over GBLUP appeared in the blueberry and sweet potato populations and in some of the traits in the guinea grass population due to the high correlation coefficients between diploid and polyploid genomic relationship matrices. In addition, polyGDBLUP did not produce higher prediction accuracy than polyGBLUP for most traits of real data, as dominant genetic variance was not captured for these traits. Our study provides a promising method for genomic prediction of autopolyploid species.
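To make the idea of a dosage-based additive relationship matrix concrete, here is a minimal sketch. It assumes a VanRaden-type construction generalized to an arbitrary ploidy level: dosages are centred by ploidy × allele frequency, and the cross-product is scaled by the sum over markers of ploidy × p × (1 − p). This scaling is an assumption made for illustration; the abstract does not give the exact polyGBLUP formula, and the function name and toy dosages are hypothetical.

```python
def additive_grm(dosages, ploidy):
    """Additive genomic relationship matrix from allele dosages.

    `dosages` is a list of individuals, each a list of per-marker
    counts of the reference allele in the range 0..ploidy.
    """
    n = len(dosages)
    m = len(dosages[0])
    # Allele frequency of each marker, estimated from the sample.
    freqs = [sum(row[j] for row in dosages) / (ploidy * n) for j in range(m)]
    # Centre each dosage by its expected value, ploidy * p.
    z = [[row[j] - ploidy * freqs[j] for j in range(m)] for row in dosages]
    # Sum of per-marker dosage variances under binomial sampling.
    scale = sum(ploidy * p * (1.0 - p) for p in freqs)
    return [
        [sum(z[i][j] * z[k][j] for j in range(m)) / scale for k in range(n)]
        for i in range(n)
    ]


# Hypothetical autotetraploid dosages: 3 individuals, 4 markers.
tetraploid_dosages = [
    [0, 2, 4, 1],
    [1, 2, 3, 0],
    [4, 1, 0, 2],
]

G = additive_grm(tetraploid_dosages, ploidy=4)
for row in G:
    print([round(value, 3) for value in row])
```

Setting `ploidy=2` reduces this sketch to the familiar diploid form with the denominator 2 Σ p(1 − p).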
Affiliation(s)
- Hailiang Song
  - Fisheries Science Institute, Beijing Academy of Agriculture and Forestry Sciences & Beijing Key Laboratory of Fisheries Biotechnology, Beijing 100068, China
  - Key Laboratory of Sturgeon Genetics and Breeding, Ministry of Agriculture and Rural Affairs, Hangzhou, 311799, China
- Qin Zhang
  - Shandong Provincial Key Laboratory of Animal Biotechnology and Disease Control and Prevention, Shandong Agricultural University, Taian 271001, China
- Hongxia Hu
  - Fisheries Science Institute, Beijing Academy of Agriculture and Forestry Sciences & Beijing Key Laboratory of Fisheries Biotechnology, Beijing 100068, China
  - Key Laboratory of Sturgeon Genetics and Breeding, Ministry of Agriculture and Rural Affairs, Hangzhou, 311799, China
7. Dong L, Xie Y, Zhang Y, Wang R, Sun X. Genomic dissection of additive and non-additive genetic effects and genomic prediction in an open-pollinated family test of Japanese larch. BMC Genomics 2024; 25:11. [PMID: 38166605 PMCID: PMC10759612 DOI: 10.1186/s12864-023-09891-4] [Received: 06/14/2023] [Accepted: 12/11/2023] [Indexed: 01/05/2024] Open
Abstract
Genomic dissection of genetic effects on desirable traits and the subsequent use of genomic selection hold great promise for accelerating the rate of genetic improvement of forest tree species. In this study, a total of 661 offspring trees from 66 open-pollinated families of Japanese larch (Larix kaempferi (Lam.) Carrière) were sampled at a test site. The contributions of additive and non-additive effects (dominance, imprinting and epistasis) were evaluated for nine valuable traits related to growth, wood physical and chemical properties, and competitive ability using three pedigree-based and four Genomics-based Best Linear Unbiased Predictions (GBLUP) models and used to determine the genetic model. The predictive ability (PA) of two genomic prediction methods, GBLUP and Reproducing Kernel Hilbert Spaces (RKHS), was compared. The traits could be classified into two types based on different quantitative genetic architectures: for type I, including wood chemical properties and Pilodyn penetration, additive effect is the main source of variation (38.20-67.46%); for type II, including growth, competitive ability and acoustic velocity, epistasis plays a significant role (50.76-91.26%). Dominance and imprinting showed low to moderate contributions (< 36.26%). GBLUP was more suitable for traits of type I (PAs = 0.37-0.39 vs. 0.14-0.25), and RKHS was more suitable for traits of type II (PAs = 0.23-0.37 vs. 0.07-0.23). Non-additive effects make no meaningful contribution to the enhancement of PA of GBLUP method for all traits. These findings enhance our current understanding of the architecture of quantitative traits and lay the foundation for the development of genomic selection strategies in Japanese larch.
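One way to see why RKHS can suit traits with strong epistasis is that it replaces the linear genomic relationship matrix used by GBLUP with a nonlinear kernel of marker distances. The sketch below builds a Gaussian kernel, K[i][j] = exp(−h · d²ij / p), where d²ij is the squared Euclidean distance between the marker vectors of two individuals and p is the number of markers. The kernel form, the bandwidth value h and the toy genotypes are assumptions for illustration; the abstract does not state which kernel or bandwidth the authors used.

```python
import math


def gaussian_kernel(markers, bandwidth=1.0):
    """Gaussian kernel matrix from marker genotypes.

    `markers` is a list of individuals, each a list of genotype codes.
    Squared distances are divided by the number of markers so that
    the bandwidth does not depend on panel size.
    """
    n = len(markers)
    p = len(markers[0])
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(markers[i], markers[j]))
            kernel[i][j] = math.exp(-bandwidth * sq_dist / p)
    return kernel


# Hypothetical diploid genotypes coded 0/1/2: 3 individuals, 4 markers.
genotypes = [
    [0, 1, 2, 0],
    [0, 1, 2, 1],
    [2, 1, 0, 2],
]

K = gaussian_kernel(genotypes)
for row in K:
    print([round(value, 3) for value in row])
```

Individuals with similar genotypes receive kernel values close to 1, and the similarity decays nonlinearly with distance rather than linearly as in a cross-product relationship matrix.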
Affiliation(s)
- Leiming Dong
  - State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of State Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, 100091, China
  - Key Laboratory of National Forestry and Grassland Administration on Plant Ex situ Conservation, Beijing Floriculture Engineering Technology Research Centre, Beijing Botanical Garden, Beijing, 100093, China
- Yunhui Xie
  - State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of State Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, 100091, China
- Yalin Zhang
  - State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of State Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, 100091, China
- Ruizhen Wang
  - Key Laboratory of National Forestry and Grassland Administration on Plant Ex situ Conservation, Beijing Floriculture Engineering Technology Research Centre, Beijing Botanical Garden, Beijing, 100093, China
- Xiaomei Sun
  - State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of State Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, 100091, China
8. Oyama H, Nishio M, Shibata E, Takemyo H, Ichinoseki K, Ishii K. Evaluation of genomic prediction considering non-additive genetic effects on fatty acid traits of Japanese Black cattle. Anim Sci J 2024; 95:e13978. [PMID: 38978175 DOI: 10.1111/asj.13978] [Received: 03/26/2024] [Revised: 06/03/2024] [Accepted: 06/24/2024] [Indexed: 07/10/2024]
Abstract
Genomic prediction was conducted using 2494 Japanese Black cattle from Hiroshima Prefecture and both single-nucleotide polymorphism information and phenotype data on monounsaturated fatty acid (MUFA) and oleic acid (C18:1) analyzed with gas chromatography. We compared the prediction accuracy for four models (A, additive genetic effects; AD, as for A with dominance genetic effects; ADR, as for AD with the runs of homozygosity (ROH) effects calculated by ROH-based relationship matrix; and ADF, as for AD with the ROH-based inbreeding coefficient of the linear regression). Bayesian methods were used to estimate variance components. The narrow-sense heritability estimates for MUFA and C18:1 were 0.52-0.53 and 0.57, respectively; the corresponding proportions of dominance genetic variance were 0.04-0.07 and 0.04-0.05, and the proportion of ROH variance was 0.02. The deviance information criterion values showed slight differences among the models, and the models provided similar prediction accuracy.
Affiliation(s)
- Hidemi Oyama
  - Institute of Livestock and Grassland Science, NARO, Tsukuba, Japan
- Motohide Nishio
  - Institute of Livestock and Grassland Science, NARO, Tsukuba, Japan
- Eri Shibata
  - Hiroshima Prefectural Technology Research Institute Livestock Technology Research Center, Shobara, Japan
- Hinaka Takemyo
  - Hiroshima Prefectural Technology Research Institute Livestock Technology Research Center, Shobara, Japan
- Kazuo Ishii
  - Institute of Livestock and Grassland Science, NARO, Tsukuba, Japan
9. de Oliveira LF, Brito LF, Marques DBD, da Silva DA, Lopes PS, Dos Santos CG, Johnson JS, Veroneze R. Investigating the impact of non-additive genetic effects in the estimation of variance components and genomic predictions for heat tolerance and performance traits in crossbred and purebred pig populations. BMC Genom Data 2023; 24:76. [PMID: 38093199 PMCID: PMC10717470 DOI: 10.1186/s12863-023-01174-x] [Received: 06/27/2023] [Accepted: 11/13/2023] [Indexed: 12/18/2023] Open
Abstract
BACKGROUND Non-additive genetic effects are often ignored in livestock genetic evaluations. However, fitting them in the models could improve the accuracy of genomic breeding values. Furthermore, non-additive genetic effects contribute to heterosis, which could be optimized through mating designs. Traits related to fitness and adaptation, such as heat tolerance, tend to be more influenced by non-additive genetic effects. In this context, the primary objectives of this study were to estimate variance components and assess the predictive performance of genomic prediction of breeding values based on alternative models and two independent datasets, including performance records from a purebred pig population and heat tolerance indicators recorded in crossbred lactating sows. RESULTS Including non-additive genetic effects when modelling performance traits in purebred pigs had no effect on the residual variance estimates for most of the traits, but lower additive genetic variances were observed, especially when additive-by-additive epistasis was included in the models. Furthermore, including non-additive genetic effects did not improve the prediction accuracy of genomic breeding values, but there was animal re-ranking across the models. For the heat tolerance indicators recorded in a crossbred population, most traits had small non-additive genetic variance with large standard error estimates. Nevertheless, panting score and hair density presented substantial additive-by-additive epistatic variance. Panting score had an epistatic variance estimate of 0.1379, which accounted for 82.22% of the total genetic variance. For hair density, the epistatic variance estimates ranged from 0.1745 to 0.1845, which represent 64.95-69.59% of the total genetic variance. 
CONCLUSIONS Including non-additive genetic effects in the models did not improve the accuracy of genomic breeding values for performance traits in purebred pigs, but there was substantial re-ranking of selection candidates depending on the model fitted. Except for panting score and hair density, low non-additive genetic variance estimates were observed for heat tolerance indicators in crossbred pigs.
Affiliation(s)
- Letícia Fernanda de Oliveira
  - Department of Animal Science, Federal University of Viçosa, Viçosa, MG, Brazil
  - Department of Animal Sciences, Purdue University, West Lafayette, IN, USA
- Luiz F Brito
  - Department of Animal Sciences, Purdue University, West Lafayette, IN, USA
- Paulo Sávio Lopes
  - Department of Animal Science, Federal University of Viçosa, Viçosa, MG, Brazil
- Jay S Johnson
  - USDA-ARS Livestock Behavior Research Unit, West Lafayette, IN, USA
- Renata Veroneze
  - Department of Animal Science, Federal University of Viçosa, Viçosa, MG, Brazil
10. Alipanah M, Roudbari Z, Momen M, Esmailizadeh A. Impact of inclusion non-additive effects on genome-wide association and variance's components in Scottish black sheep. Anim Biotechnol 2023; 34:3765-3773. [PMID: 37343283 DOI: 10.1080/10495398.2023.2224845] [Indexed: 06/23/2023]
Abstract
CONTEXT Most economic traits have a complex genetic structure controlled by additive and non-additive gene actions. Knowledge of the underlying genetic architecture of such traits could therefore aid in understanding how they respond to selection in breeding and mating programs. Estimating non-additive effects for economic traits in sheep from genome-wide information is important because non-additive gene action plays an important role in the prediction accuracy of genomic breeding values and in the genetic response to selection. AIM This study aimed to assess the impact of non-additive effects (dominance and epistasis) on the estimation of genetic parameters for body weight traits in sheep. METHODS This study used phenotypic and genotypic data from 752 Scottish Blackface lambs. The three live weight traits considered were body weight at 16, 20, and 24 weeks. Three genetic models were used: additive (AM), additive + dominance (ADM), and additive + dominance + epistasis (ADEM). KEY RESULTS The narrow-sense heritability estimates for weight at 16 weeks of age (BW16) were 0.39, 0.35, and 0.23; for 20 weeks of age (BW20), 0.55, 0.54, and 0.42; and for 24 weeks of age (BW24), 0.16, 0.12, and 0.02, using the AM, ADM, and ADEM models, respectively. The additive genetic model significantly outperformed the non-additive genetic model (p < 0.01). The dominance variance of BW16, BW20, and BW24 accounted for 38, 6, and 30% of the total phenotypic variance, respectively. Moreover, the epistatic variance accounted for 39, 0.39, and 47% of the total phenotypic variances of these traits, respectively. In addition, according to the genome-wide association analysis using additive and non-additive genetic models, the most important SNPs for live weight traits are on chromosomes 3 (three SNPs: s12606.1, OAR3_221188082.1, and OAR3_4106875.1), 8 (OAR8_16468019.1, OAR8_18067475.1, and OAR8_18043643.1), and 19 (OAR19_18010247.1). CONCLUSIONS The results emphasized that non-additive genetic effects play an important role in controlling body weight variation at the age of 16-24 weeks in Scottish Blackface lambs. IMPLICATIONS It is expected that using a high-density SNP panel and jointly modeling both additive and non-additive effects can lead to better estimation and prediction of genetic parameters.
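As a rough illustration of how figures such as "dominance variance accounted for 38% of the total phenotypic variance" are derived, the sketch below expresses each variance component as a share of their sum. The function name `variance_proportions`, the component labels, and the numeric values are hypothetical and are not taken from the study; the abstract does not state how its variance components were estimated.

```python
def variance_proportions(components):
    """Return each variance component as a share of the total phenotypic variance.

    `components` maps a label (e.g. "additive") to an estimated variance.
    The total phenotypic variance is assumed to be the plain sum of the parts.
    """
    total = sum(components.values())
    if total <= 0:
        raise ValueError("total phenotypic variance must be positive")
    return {name: value / total for name, value in components.items()}


# Hypothetical components for an additive + dominance + epistasis model.
adem = {"additive": 2.0, "dominance": 1.0, "epistasis": 1.0, "residual": 4.0}
shares = variance_proportions(adem)

# Narrow-sense heritability is the additive share of the phenotypic variance.
h2_narrow = shares["additive"]
```

Under this bookkeeping, adding dominance and epistasis terms can lower the narrow-sense heritability estimate when variance previously attributed to the additive term is reassigned to the new components, which is the pattern the abstract reports across the AM, ADM, and ADEM models.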
Affiliation(s)
- Masoud Alipanah
  - Department of Plant Production, University of Torbat Heydarieh, Torbat-e Heydarieh, Iran
- Zahra Roudbari
  - Department of Animal Science, University of Jiroft, Jiroft, Iran
- Mehdi Momen
  - Department of Surgical Sciences, School of Veterinary Medicine, University of Wisconsin-Madison, Madison, WI, USA
- Ali Esmailizadeh
  - Department of Animal Science, Shahid Bahonar University of Kerman, Kerman, Iran
|
11
|
Chen C, Powell O, Dinglasan E, Ross EM, Yadav S, Wei X, Atkin F, Deomano E, Hayes BJ. Genomic prediction with machine learning in sugarcane, a complex highly polyploid clonally propagated crop with substantial non-additive variation for key traits. Plant Genome 2023; 16:e20390. [PMID: 37728221 DOI: 10.1002/tpg2.20390] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2022] [Revised: 08/01/2023] [Accepted: 08/29/2023] [Indexed: 09/21/2023]
Abstract
Sugarcane has a complex, highly polyploid genome with multi-species ancestry. Additive models for genomic prediction of clonal performance might not capture interactions between genes and alleles from different ploidies and ancestral species. As such, genomic prediction in sugarcane presents an interesting case for machine learning (ML) methods, which are purportedly able to deal with high levels of complexity in prediction. Here, we investigated deep learning (DL) neural networks, including multilayer networks (MLP) and convolution neural networks (CNN), and an ensemble machine learning approach, random forest (RF), for genomic prediction in sugarcane. The data set used was 2912 sugarcane clones, scored for 26,086 genome wide single nucleotide polymorphism markers, with final assessment trial data for total cane harvested (TCH), commercial cane sugar (CCS), and fiber content (Fiber). The clones in the latest trial (2017) were used as a validation set. We compared prediction accuracy of these methods to genomic best linear unbiased prediction (GBLUP) extended to include dominance and epistatic effects. The prediction accuracies from GBLUP models were up to 0.37 for TCH, 0.43 for CCS, and 0.48 for Fiber, while the optimized ML models had prediction accuracies of 0.35 for TCH, 0.38 for CCS, and 0.48 for Fiber. Both RF and DL neural network models have comparable predictive ability with the additive GBLUP model but are less accurate than the extended GBLUP model.
Affiliation(s)
- Chensong Chen
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Queensland, Australia
- Owen Powell
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Queensland, Australia
- Eric Dinglasan
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Queensland, Australia
- Elizabeth M Ross
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Queensland, Australia
- Seema Yadav
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Queensland, Australia
- Ben J Hayes
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Queensland, Australia
|
12
|
Bharati R, Sen MK, Severová L, Svoboda R, Fernández-Cusimamani E. Polyploidization and genomic selection integration for grapevine breeding: a perspective. Front Plant Sci 2023; 14:1248978. [PMID: 38034577 PMCID: PMC10684766 DOI: 10.3389/fpls.2023.1248978] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/27/2023] [Accepted: 10/30/2023] [Indexed: 12/02/2023]
Abstract
Grapevines are economically important woody perennial crops widely cultivated for their fruits that are used for making wine, grape juice, raisins, and table grapes. However, grapevine production is constantly facing challenges due to climate change and the prevalence of pests and diseases, causing yield reduction, lower fruit quality, and financial losses. To ease the burden, continuous crop improvement to develop superior grape genotypes with desirable traits is imperative. Polyploidization has emerged as a promising tool to generate genotypes with novel genetic combinations that can confer desirable traits such as enhanced organ size, improved fruit quality, and increased resistance to both biotic and abiotic stresses. While previous studies have shown high polyploid induction rates in Vitis spp., rigorous screening of genotypes among the produced polyploids to identify those exhibiting desired traits remains a major bottleneck. In this perspective, we propose the integration of the genomic selection approach with omics data to predict genotypes with desirable traits among the vast unique individuals generated through polyploidization. This integrated approach can be a powerful tool for accelerating the breeding of grapevines to develop novel and improved grapevine varieties.
Affiliation(s)
- Rohit Bharati
  - Department of Crop Sciences and Agroforestry, The Faculty of Tropical AgriSciences, Czech University of Life Sciences Prague, Suchdol, Czechia
- Madhab Kumar Sen
  - Department of Agroecology and Crop Production, Faculty of Agrobiology, Food and Natural Resources, Czech University of Life Sciences Prague, Suchdol, Czechia
- Lucie Severová
  - Department of Economic Theories, Faculty of Economics and Management, Czech University of Life Sciences Prague, Prague, Czechia
- Roman Svoboda
  - Department of Economic Theories, Faculty of Economics and Management, Czech University of Life Sciences Prague, Prague, Czechia
- Eloy Fernández-Cusimamani
  - Department of Crop Sciences and Agroforestry, The Faculty of Tropical AgriSciences, Czech University of Life Sciences Prague, Suchdol, Czechia
|
13
|
Liu H, Yu S. A dimensionality-reduction genomic prediction method without direct inverse of the genomic relationship matrix for large genomic data. Plant Cell Rep 2023; 42:1825-1832. [PMID: 37750948 DOI: 10.1007/s00299-023-03069-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 09/08/2023] [Indexed: 09/27/2023]
Abstract
KEY MESSAGE A new genomic prediction method (RHPP) was developed by combining randomized Haseman-Elston regression (RHE-reg), PCR based on genomic information of the core population, and the preconditioned conjugate gradient (PCG) algorithm. Computational efficiency is becoming a pressing issue in the practical application of genomic prediction because of the large amount of data generated by high-throughput genotyping technology. In this study, we developed a fast genomic prediction method, RHPP, by combining randomized Haseman-Elston regression (RHE-reg), PCR based on genomic information of the core population, and the preconditioned conjugate gradient (PCG) algorithm. The simulation results demonstrated similar prediction accuracy between RHPP and GBLUP, and significantly higher computational efficiency of the former as the number of individuals increased. The results on real datasets of both bread wheat and loblolly pine demonstrated that RHPP had similar or better predictive accuracy than GBLUP in most cases. In the future, RHPP may be an attractive choice for analyzing large-scale and high-dimensional data.
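The abstract does not spell out the algorithm, so the following is only a generic sketch of a Jacobi-preconditioned conjugate gradient solver, the kind of iterative method that lets a symmetric positive-definite system such as (G + λI)x = y be solved without forming an explicit inverse. The function name `pcg`, the diagonal preconditioner, and the toy 2 x 2 system are assumptions made for illustration; this is not the RHPP implementation.

```python
def pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite A (list of lists).

    Uses conjugate gradients with a Jacobi (diagonal) preconditioner,
    so A is only ever used in matrix-vector products and never inverted.
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)                                  # residual b - A x for x = 0
    m_inv = [1.0 / A[i][i] for i in range(n)]    # inverse of diag(A)
    z = [m_inv[i] * r[i] for i in range(n)]      # preconditioned residual
    p = list(z)                                  # first search direction
    rz = sum(r[i] * z[i] for i in range(n))
    if rz == 0.0:
        return x                                 # b is zero, so x = 0 solves it
    for _ in range(max_iter):
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rz / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        if sum(v * v for v in r) ** 0.5 < tol:
            break
        z = [m_inv[i] * r[i] for i in range(n)]
        rz_new = sum(r[i] * z[i] for i in range(n))
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x


# Toy system standing in for (G + lambda * I) x = y; values are hypothetical.
A = [[4.0, 1.0], [1.0, 3.0]]
y = [1.0, 2.0]
solution = pcg(A, y)
```

In a genomic setting the matrix-vector product would typically be computed from the marker matrix rather than from a stored dense matrix, which is where most of the savings for large numbers of individuals would come from.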
Affiliation(s)
- Hailan Liu
  - Maize Research Institute, Sichuan Agricultural University, Chengdu, 611130, Sichuan, China.
- Shizhou Yu
  - Molecular Genetics Key Laboratory of China Tobacco, Guizhou Academy of Tobacco Science, Guiyang, 550081, Guizhou, China.
|
14
|
Akutsu H, Na’iem M, Widiyatno, Indrioko S, Sawitri, Purnomo S, Uchiyama K, Tsumura Y, Tani N. Comparing modeling methods of genomic prediction for growth traits of a tropical timber species, Shorea macrophylla. Front Plant Sci 2023; 14:1241908. [PMID: 38023878 PMCID: PMC10644202 DOI: 10.3389/fpls.2023.1241908] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2023] [Accepted: 09/13/2023] [Indexed: 12/01/2023]
Abstract
Introduction Shorea macrophylla is a commercially important tropical tree species grown for timber and oil. It is amenable to plantation forestry due to its fast initial growth. Genomic selection (GS) has been used in tree breeding studies to shorten long breeding cycles but has not previously been applied to S. macrophylla. Methods To build genomic prediction models for GS, leaves and growth trait data were collected from a half-sib progeny population of S. macrophylla in the Sari Bumi Kusuma forest concession, central Kalimantan, Indonesia. A total of 18,037 SNP markers were identified in two ddRAD-seq libraries. Genomic prediction models based on these SNPs were then generated for diameter at breast height and total height in the 7th year from planting (D7 and H7). Results and discussion These traits were chosen because of their relatively high narrow-sense genomic heritability and because seven years was considered long enough to assess initial growth. Genomic prediction models were built using 6 methods and their derivatives with the full set of identified SNPs and with subsets of 48, 96, and 192 SNPs selected based on the results of a genome-wide association study (GWAS). The GBLUP and RKHS methods gave the highest predictive ability for D7 and H7 with the sets of selected SNPs and showed that D7 has an additive genetic architecture while H7 has an epistatic genetic architecture. LightGBM and CNN1D also achieved high predictive abilities for D7 with 48 and 96 selected SNPs, and for H7 with 96 and 192 selected SNPs, showing that gradient boosting decision trees and deep learning can be useful in genomic prediction. Predictive abilities for H7 were higher when smaller subsets of SNPs selected by GWAS p-value were used. However, D7 showed the opposite tendency, which might have originated from the difference in genetic architecture between primary and secondary growth of the species. This study suggests that GS with GWAS-based SNP selection can be used in breeding for non-cultivated tree species to improve initial growth and reduce genotyping costs for next-generation seedlings.
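The GWAS-based subsetting described above can be pictured as ranking markers by p-value and keeping the top k. The sketch below is a minimal illustration under that assumption; the function name `select_top_snps` and the marker labels and p-values are hypothetical, and the abstract does not describe the authors' exact selection procedure.

```python
def select_top_snps(p_values, k):
    """Return the k SNP identifiers with the smallest GWAS p-values.

    `p_values` maps a SNP identifier to its association p-value.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    ranked = sorted(p_values, key=p_values.get)  # ascending p-value
    return ranked[:k]


# Hypothetical GWAS output for four markers.
gwas = {"snp_a": 0.2, "snp_b": 1e-6, "snp_c": 0.01, "snp_d": 0.5}
subset = select_top_snps(gwas, 2)
```

In the study's setting k would take the values 48, 96, and 192, and the resulting subset would replace the full marker set as input to each prediction model.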
Affiliation(s)
- Haruto Akutsu
  - Graduate School of Science and Technology, University of Tsukuba, Tsukuba, Ibaraki, Japan
- Mohammad Na’iem
  - Faculty of Forestry, Gadjah Mada University, Yogyakarta, Indonesia
- Widiyatno
  - Faculty of Forestry, Gadjah Mada University, Yogyakarta, Indonesia
- Sapto Indrioko
  - Faculty of Forestry, Gadjah Mada University, Yogyakarta, Indonesia
- Sawitri
  - Faculty of Forestry, Gadjah Mada University, Yogyakarta, Indonesia
- Susilo Purnomo
  - PT. Sari Bumi Kusuma, Pontianak, West Kalimantan, Indonesia
- Kentaro Uchiyama
  - Department of Forest Molecular Genetics and Biotechnology, Forestry and Forest Products Research Institute, Tsukuba, Ibaraki, Japan
- Yoshihiko Tsumura
  - Faculty of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Ibaraki, Japan
- Naoki Tani
  - Faculty of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Ibaraki, Japan
  - Forestry Division, Japan International Research Center for Agricultural Sciences, Tsukuba, Ibaraki, Japan
|
15
|
Gu Z, Gong J, Zhu Z, Li Z, Feng Q, Wang C, Zhao Y, Zhan Q, Zhou C, Wang A, Huang T, Zhang L, Tian Q, Fan D, Lu Y, Zhao Q, Huang X, Yang S, Han B. Structure and function of rice hybrid genomes reveal genetic basis and optimal performance of heterosis. Nat Genet 2023; 55:1745-1756. [PMID: 37679493 PMCID: PMC10562254 DOI: 10.1038/s41588-023-01495-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/20/2022] [Accepted: 08/02/2023] [Indexed: 09/09/2023]
Abstract
Exploitation of crop heterosis is crucial for increasing global agriculture production. However, the quantitative genomic analysis of heterosis was lacking, and there is currently no effective prediction tool to optimize cross-combinations. Here 2,839 rice hybrid cultivars and 9,839 segregation individuals were resequenced and phenotyped. Our findings demonstrated that indica-indica hybrid-improving breeding was a process that broadened genetic resources, pyramided breeding-favorable alleles through combinatorial selection and collaboratively improved both parents by eliminating the inferior alleles at negative dominant loci. Furthermore, we revealed that widespread genetic complementarity contributed to indica-japonica intersubspecific heterosis in yield traits, with dominance effect loci making a greater contribution to phenotypic variance than overdominance effect loci. On the basis of the comprehensive dataset, a genomic model applicable to diverse rice varieties was developed and optimized to predict the performance of hybrid combinations. Our data offer a valuable resource for advancing the understanding and facilitating the utilization of heterosis in rice.
Affiliation(s)
- Zhoulin Gu
  - National Center for Gene Research, State Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, China
- Junyi Gong
  - State Key Laboratory of Rice Biology, China National Rice Research Institute, Chinese Academy of Agricultural Sciences, Hangzhou, China
- Zhou Zhu
  - National Center for Gene Research, State Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, China
  - University of Chinese Academy of Sciences, Beijing, China
- Zhen Li
  - National Center for Gene Research, State Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, China
  - College of Life Sciences, Anhui Normal University, Wuhu, China
- Qi Feng
  - National Center for Gene Research, State Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, China
- Changsheng Wang
  - National Center for Gene Research, State Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, China
- Yan Zhao
  - National Center for Gene Research, State Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, China
- Qilin Zhan
  - National Center for Gene Research, State Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, China
- Congcong Zhou
  - National Center for Gene Research, State Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, China
- Ahong Wang
  - National Center for Gene Research, State Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, China
- Tao Huang
  - National Center for Gene Research, State Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, China
- Lei Zhang
  - National Center for Gene Research, State Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, China
- Qilin Tian
  - National Center for Gene Research, State Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, China
- Danlin Fan
  - National Center for Gene Research, State Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, China
- Yiqi Lu
  - National Center for Gene Research, State Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, China
- Qiang Zhao
  - National Center for Gene Research, State Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, China
- Xuehui Huang
  - College of Life Sciences, Shanghai Normal University, Shanghai, China
- Shihua Yang
  - State Key Laboratory of Rice Biology, China National Rice Research Institute, Chinese Academy of Agricultural Sciences, Hangzhou, China.
- Bin Han
  - National Center for Gene Research, State Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, China.
|
16
|
Yuan W, Beitel F, Srikant T, Bezrukov I, Schäfer S, Kraft R, Weigel D. Pervasive under-dominance in gene expression underlying emergent growth trajectories in Arabidopsis thaliana hybrids. Genome Biol 2023; 24:200. [PMID: 37667232 PMCID: PMC10478501 DOI: 10.1186/s13059-023-03043-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Accepted: 08/21/2023] [Indexed: 09/06/2023]
Abstract
BACKGROUND Complex traits, such as growth and fitness, are typically controlled by a very large number of variants, which can interact in both additive and non-additive fashion. In an attempt to gauge the relative importance of both types of genetic interactions, we turn to hybrids, which provide a facile means for creating many novel allele combinations. RESULTS We focus on the interaction between alleles of the same locus, i.e., dominance, and perform a transcriptomic study involving 141 random crosses between different accessions of the plant model species Arabidopsis thaliana. Additivity is rare, consistently observed for only about 300 genes enriched for roles in stress response and cell death. Regulatory rare-allele burden affects the expression level of these genes but does not correlate with F1 rosette size. Non-additive, dominant gene expression in F1 hybrids is much more common, with the vast majority of genes (over 90%) being expressed below the parental average. Unlike in the additive genes, regulatory rare-allele burden in the dominant gene set is strongly correlated with F1 rosette size, even though it only mildly covaries with the expression level of these genes. CONCLUSIONS Our study underscores under-dominance as the predominant gene action associated with emergence of rosette growth trajectories in the A. thaliana hybrid model. Our work lays the foundation for understanding molecular mechanisms and evolutionary forces that lead to dominance complementation of rare regulatory alleles.
Affiliation(s)
- Wei Yuan
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076, Tübingen, Germany
- Fiona Beitel
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076, Tübingen, Germany
- Thanvi Srikant
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076, Tübingen, Germany
- Ilja Bezrukov
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076, Tübingen, Germany
- Sabine Schäfer
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076, Tübingen, Germany
- Robin Kraft
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076, Tübingen, Germany
- Detlef Weigel
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076, Tübingen, Germany.
|
17
|
Rio S, Charcosset A, Moreau L, Mary-Huard T. Detecting directional and non-directional epistasis in bi-parental populations using genomic data. Genetics 2023; 224:iyad089. [PMID: 37170627 DOI: 10.1093/genetics/iyad089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2023] [Revised: 01/16/2023] [Accepted: 05/01/2023] [Indexed: 05/13/2023]
Abstract
Epistasis, commonly defined as interaction effects between alleles of different loci, is an important genetic component of the variation of phenotypic traits in natural and breeding populations. In addition to its impact on variance, epistasis can also affect the expected performance of a population and is then referred to as directional epistasis. Before the advent of genomic data, the existence of epistasis (both directional and non-directional) was investigated based on complex and expensive mating schemes involving several generations evaluated for a trait of interest. In this study, we propose a methodology to detect the presence of epistasis based on simple inbred biparental populations, both genotyped and phenotyped, ideally along with their parents. Thanks to genomic data, parental proportions as well as shared parental proportions between inbred individuals can be estimated. They allow the evaluation of epistasis through a test of the expected performance for directional epistasis or the variance of genetic values. This methodology was applied to two large multiparental populations, i.e. the American maize and soybean nested association mapping populations, evaluated for different traits. Results showed significant epistasis, especially for the test of directional epistasis, e.g. the increase in anthesis to silking interval observed in most maize inbred progenies or the decrease in grain yield observed in several soybean inbred progenies. In general, the effects detected suggested that shuffling allelic associations of both elite parents had a detrimental effect on the performance of their progeny. This methodology is implemented in the EpiTest R-package and can be applied to any bi/multiparental inbred population evaluated for a trait of interest.
Affiliation(s)
- Simon Rio
  - CIRAD, UMR AGAP Institut, F-34398 Montpellier, France
  - UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, F-34398 Montpellier, France
- Alain Charcosset
  - Université Paris-Saclay, INRAE, CNRS, AgroParisTech, UMR GQE-Le Moulon, 91190 Gif-sur-Yvette, France
- Laurence Moreau
  - Université Paris-Saclay, INRAE, CNRS, AgroParisTech, UMR GQE-Le Moulon, 91190 Gif-sur-Yvette, France
- Tristan Mary-Huard
  - Université Paris-Saclay, INRAE, CNRS, AgroParisTech, UMR GQE-Le Moulon, 91190 Gif-sur-Yvette, France
  - Université Paris-Saclay, AgroParisTech, INRAE, UMR MIA-Paris, 91120 Palaiseau, France
|
18
|
Werner CR, Gaynor RC, Sargent DJ, Lillo A, Gorjanc G, Hickey JM. Genomic selection strategies for clonally propagated crops. Theor Appl Genet 2023; 136:74. [PMID: 36952013 PMCID: PMC10036424 DOI: 10.1007/s00122-023-04300-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/25/2022] [Accepted: 01/14/2023] [Indexed: 05/27/2023]
Abstract
For genomic selection in clonally propagated crops with diploid (-like) meiotic behavior to be effective, crossing parents should be selected based on genomic predicted cross-performance unless dominance is negligible. For genomic selection (GS) in clonal breeding programs to be effective, parents should be selected based on genomic predicted cross-performance unless dominance is negligible. Genomic prediction of cross-performance enables efficient exploitation of the additive and dominance value simultaneously. Here, we compared different GS strategies for clonally propagated crops with diploid (-like) meiotic behavior, using strawberry as an example. We used stochastic simulation to evaluate six combinations of three breeding programs and two parent selection methods. The three breeding programs included (1) a breeding program that introduced GS in the first clonal stage, and (2) two variations of a two-part breeding program with one and three crossing cycles per year, respectively. The two parent selection methods were (1) parent selection based on genomic estimated breeding values (GEBVs) and (2) parent selection based on genomic predicted cross-performance (GPCP). Selection of parents based on GPCP produced faster genetic gain than selection of parents based on GEBVs because it reduced inbreeding when the dominance degree increased. The two-part breeding programs with one and three crossing cycles per year using GPCP always produced the most genetic gain unless dominance was negligible. We conclude that (1) in clonal breeding programs with GS, parents should be selected based on GPCP, and (2) a two-part breeding program with parent selection based on GPCP to rapidly drive population improvement has great potential to improve breeding clonally propagated crops.
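To make the idea of cross-performance concrete, the sketch below computes the expected mean genetic value of the progeny of one cross from per-locus additive and dominance effects. It assumes a diploid, biallelic model with genotypic values -a, d, and +a for allele dosages 0, 1, and 2, and no epistasis; the function name `predicted_cross_performance` and all numeric values are hypothetical. It is an illustration of the general principle, not the GPCP procedure used in the study.

```python
def predicted_cross_performance(dosage_1, dosage_2, a, d):
    """Expected mean genetic value of the progeny of parent 1 x parent 2.

    dosage_1, dosage_2: per-locus allele dosages (0, 1, or 2) of each parent.
    a, d: per-locus additive and dominance effects, with genotypic values
          -a, d, +a assumed for dosages 0, 1, 2.
    """
    total = 0.0
    for g1, g2, a_k, d_k in zip(dosage_1, dosage_2, a, d):
        p1, p2 = g1 / 2.0, g2 / 2.0       # chance each parent transmits the counted allele
        q1, q2 = 1.0 - p1, 1.0 - p2
        homozygote_diff = p1 * p2 - q1 * q2   # P(dosage 2) - P(dosage 0)
        heterozygote = p1 * q2 + p2 * q1      # P(dosage 1)
        total += a_k * homozygote_diff + d_k * heterozygote
    return total


# One hypothetical locus with a = 1.0 and d = 0.5.
opposite_homozygotes = predicted_cross_performance([2], [0], [1.0], [0.5])
same_homozygotes = predicted_cross_performance([2], [2], [1.0], [0.5])
two_heterozygotes = predicted_cross_performance([1], [1], [1.0], [0.5])
```

The dominance term depends on how the two parents complement each other, so two crosses with the same average parental breeding value can differ in expected progeny performance. This is consistent with the abstract's point that cross-performance matters unless dominance is negligible.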
Affiliation(s)
- Christian R Werner
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, Easter Bush Research Centre, University of Edinburgh, Midlothian, EH25 9RG, UK.
- R Chris Gaynor
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, Easter Bush Research Centre, University of Edinburgh, Midlothian, EH25 9RG, UK
- Daniel J Sargent
  - NIAB EMR, New Road, East Malling, Kent, ME19 6BJ, UK
  - East Malling Enterprise Centre, Driscoll's Genetics Ltd, New Road, East Malling, Kent, ME19 6BJ, UK
- Alessandra Lillo
  - East Malling Enterprise Centre, Driscoll's Genetics Ltd, New Road, East Malling, Kent, ME19 6BJ, UK
- Gregor Gorjanc
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, Easter Bush Research Centre, University of Edinburgh, Midlothian, EH25 9RG, UK
- John M Hickey
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, Easter Bush Research Centre, University of Edinburgh, Midlothian, EH25 9RG, UK
|
19
|
Endelman JB. Fully efficient, two-stage analysis of multi-environment trials with directional dominance and multi-trait genomic selection. Theor Appl Genet 2023; 136:65. [PMID: 36949348 PMCID: PMC10033618 DOI: 10.1007/s00122-023-04298-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/28/2022] [Accepted: 01/02/2023] [Indexed: 06/18/2023]
Abstract
R/StageWise enables fully efficient, two-stage analysis of multi-environment, multi-trait datasets for genomic selection, including support for dominance heterosis and polyploidy. Plant breeders interested in genomic selection often face challenges to fully utilizing multi-trait, multi-environment datasets. R package StageWise was developed to go beyond the capabilities of most specialized software for genomic prediction, without requiring the programming skills needed for more general-purpose software for mixed models. As the name suggests, one of the core features is a fully efficient, two-stage analysis for multiple environments, in which the full variance-covariance matrix of the Stage 1 genotype means is used in Stage 2. Another feature is directional dominance, including for polyploids, to account for inbreeding depression in outbred crops. StageWise enables selection with multi-trait indices, including restricted indices with one or more traits constrained to have zero response. For a potato dataset with 943 genotypes evaluated over 6 years, including the Stage 1 errors in Stage 2 reduced the Akaike Information Criterion (AIC) by 29, 67, and 104 for maturity, yield, and fry color, respectively. The proportion of variation explained by heterosis was largest for yield but still only 0.03, likely because of limited variation for the genomic inbreeding coefficient. Due to the large additive genetic correlation (0.57) between yield and maturity, naïve selection on an index combining yield and fry color led to an undesirable response for later maturity. The restricted index coefficients to maximize genetic merit without delaying maturity were identified. The software and three vignettes are available at https://github.com/jendelman/StageWise .
Affiliation(s)
- Jeffrey B Endelman
  - Department of Horticulture, University of Wisconsin-Madison, Madison, WI, 53706, USA.
|
20
|
Nadeau S, Beaulieu J, Gezan SA, Perron M, Bousquet J, Lenz PRN. Increasing genomic prediction accuracy for unphenotyped full-sib families by modeling additive and dominance effects with large datasets in white spruce. Front Plant Sci 2023; 14:1137834. [PMID: 37035077 PMCID: PMC10073444 DOI: 10.3389/fpls.2023.1137834] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2023] [Accepted: 02/14/2023] [Indexed: 06/19/2023]
Abstract
INTRODUCTION: Genomic selection is becoming a standard technique in plant breeding and is now being introduced into forest tree breeding. Despite promising results to predict the genetic merit of superior material based on their additive breeding values, many studies and operational programs still neglect non-additive effects and their potential for enhancing genetic gains. METHODS: Using two large comprehensive datasets totaling 4,066 trees from 146 full-sib families of white spruce (Picea glauca (Moench) Voss), we evaluated the effect of the inclusion of dominance on the precision of genetic parameter estimates and on the accuracy of conventional pedigree-based (ABLUP-AD) and genomic-based (GBLUP-AD) models. RESULTS: While wood quality traits were mostly additively inherited, considerable non-additive effects and lower heritabilities were detected for growth traits. For growth, GBLUP-AD better partitioned the additive and dominance effects into roughly equal variances, while ABLUP-AD strongly overestimated dominance. The predictive abilities of breeding and total genetic value estimates were similar between ABLUP-AD and GBLUP-AD when predicting individuals from the same families as those included in the training dataset. However, GBLUP-AD outperformed ABLUP-AD when predicting for new unphenotyped families that were not represented in the training dataset, with, on average, 22% and 53% higher predictive ability of breeding and genetic values, respectively. Resampling simulations showed that GBLUP-AD required smaller sample sizes than ABLUP-AD to produce precise estimates of genetic variances and accurate predictions of genetic values. Still, regardless of the method used, large training datasets were needed to estimate additive and non-additive genetic variances precisely. DISCUSSION: This study highlights the different quantitative genetic architectures between growth and wood traits. Furthermore, the usefulness of genomic additive-dominance models for predicting new families should allow practicing mating allocation to maximize the total genetic values for the propagation of elite material.
Affiliation(s)
- Simon Nadeau
- Natural Resources Canada, Canadian Forest Service, Canadian Wood Fibre Centre, Québec, QC, Canada
- Jean Beaulieu
- Canada Research Chair in Forest Genomics, Institute for Systems and Integrative Biology, Université Laval, Québec, QC, Canada
- Martin Perron
- Canada Research Chair in Forest Genomics, Institute for Systems and Integrative Biology, Université Laval, Québec, QC, Canada
- Direction de la Recherche Forestière, Ministère des Ressources Naturelles et des Forêts, Québec, QC, Canada
- Jean Bousquet
- Canada Research Chair in Forest Genomics, Institute for Systems and Integrative Biology, Université Laval, Québec, QC, Canada
- Patrick R. N. Lenz
- Natural Resources Canada, Canadian Forest Service, Canadian Wood Fibre Centre, Québec, QC, Canada
- Canada Research Chair in Forest Genomics, Institute for Systems and Integrative Biology, Université Laval, Québec, QC, Canada

21
Schneider H, Heise J, Tetens J, Thaller G, Wellmann R, Bennewitz J. Genomic dominance variance analysis of health and milk production traits in German Holstein cattle. J Anim Breed Genet 2023. [PMID: 36872841 DOI: 10.1111/jbg.12765] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/13/2022] [Accepted: 02/12/2023] [Indexed: 03/07/2023]
Abstract
Genomic analyses commonly explore the additive genetic variance of traits. The non-additive variance, however, is usually small but often significant in dairy cattle. This study aimed at dissecting the genetic variance of eight health traits that recently entered the total merit index in Germany and the somatic cell score (SCS), as well as four milk production traits by analysing additive and dominance variance components. The heritabilities were low for all health traits (between 0.033 for mastitis and 0.099 for SCS), and moderate for the milk production traits (between 0.261 for milk energy yield and 0.351 for milk yield). For all traits, the contribution of dominance variance to the phenotypic variance was low, varying between 0.018 for ovarian cysts and 0.078 for milk yield. Inbreeding depression, inferred from the SNP-based observed homozygosity, was significant only for the milk production traits. The contribution of dominance variance to the genetic variance was larger for the health traits, ranging from 0.233 for ovarian cysts to 0.551 for mastitis, encouraging further studies that aim at discovering QTLs based on their additive and dominance effects.
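The two dominance ratios quoted in this abstract (dominance relative to phenotypic variance, and dominance relative to genetic variance) are linked by simple arithmetic. The following is a minimal sketch, assuming the reported heritabilities are narrow-sense (additive over phenotypic variance) and that genetic variance is the sum of additive and dominance variance only. The function names are hypothetical, and the implied mastitis value is derived here for illustration; it is not reported in the abstract.

```python
def dominance_share(h2: float, d2: float) -> float:
    """Dominance variance as a fraction of genetic variance (additive + dominance)."""
    return d2 / (h2 + d2)


def implied_d2(h2: float, share: float) -> float:
    """Dominance-to-phenotypic ratio implied by h2 and the dominance share."""
    return share * h2 / (1.0 - share)


# Milk yield: h2 = 0.351 and d2 = 0.078 are both quoted in the abstract.
milk_share = dominance_share(0.351, 0.078)

# Mastitis: h2 = 0.033 and a dominance share of 0.551 are quoted;
# the dominance-to-phenotypic ratio is inferred under the assumptions above.
mastitis_d2 = implied_d2(0.033, 0.551)

print(f"milk yield dominance share of genetic variance: {milk_share:.3f}")
print(f"implied mastitis dominance/phenotypic ratio:   {mastitis_d2:.3f}")
```

Under these assumptions the implied mastitis ratio falls inside the reported 0.018 to 0.078 range, which is consistent with dominance forming a large share of a small genetic variance.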
Affiliation(s)
- Helen Schneider
- Institute of Animal Science, University of Hohenheim, Stuttgart, Germany
- Johannes Heise
- Vereinigte Informationssysteme Tierhaltung w.V. (VIT), Verden, Germany
- Jens Tetens
- Department of Animal Sciences, University of Göttingen, Göttingen, Germany
- Georg Thaller
- Institute of Animal Breeding and Husbandry, Christian-Albrechts University of Kiel, Kiel, Germany
- Robin Wellmann
- Institute of Animal Science, University of Hohenheim, Stuttgart, Germany
- Jörn Bennewitz
- Institute of Animal Science, University of Hohenheim, Stuttgart, Germany

22
Improving Genomic Prediction Accuracy in the Chinese Holstein Population by Combining with the Nordic Holstein Reference Population. Animals (Basel) 2023; 13:ani13040636. [PMID: 36830423 PMCID: PMC9951650 DOI: 10.3390/ani13040636] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Revised: 01/29/2023] [Accepted: 02/01/2023] [Indexed: 02/16/2023]
Abstract
The size of the reference population is critical to the accuracy of genomic prediction, and combining multinational reference populations has proven to be an effective way of improving it. In this study, we investigated the improvement of genomic prediction accuracy in seven complex traits (i.e., milk yield; fat yield; protein yield; somatic cell count; body conformation; feet and legs; and mammary system conformation) by combining the Chinese and Nordic Holstein reference populations. The estimated genetic correlations between the Chinese and Nordic Holstein populations are high for protein yield, fat yield, and milk yield (ranging from 0.621 to 0.720), moderate for somatic cell count (0.449), and low for the three conformation traits (ranging from 0.144 to 0.236). When utilizing the joint reference data and a two-trait GBLUP model, the genomic prediction accuracy in the Chinese Holsteins improves considerably for the traits with moderate-to-high genetic correlations, whereas the improvement in Nordic Holsteins is small. Compared with the single-population analysis, using the joint reference population for genomic prediction in younger animals results in a 2.3 to 8.1 percent improvement in accuracy. In addition, 10 replications of five-fold cross-validation were implemented to evaluate the performance of joint genomic prediction, which showed a 1.6 to 5.2 percent increase in accuracy. The bias of joint genomic prediction was found to be quite low. However, for traits with low genetic correlations, the joint reference data do not improve the prediction accuracy substantially for either population.
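The validation scheme mentioned in this abstract, 10 replications of five-fold cross-validation, can be sketched as follows. This is only an illustration of the general technique: the abstract does not say how animals were assigned to folds, so the random assignment, the function name `repeated_kfold`, and the toy animal IDs are all assumptions.

```python
import random


def repeated_kfold(ids, k=5, replications=10, seed=0):
    """Yield (replication, fold, train_ids, test_ids) for repeated k-fold CV."""
    rng = random.Random(seed)
    ids = list(ids)
    for rep in range(replications):
        shuffled = ids[:]
        rng.shuffle(shuffled)
        # Deal the shuffled ids into k folds of near-equal size.
        folds = [shuffled[i::k] for i in range(k)]
        for f, test_ids in enumerate(folds):
            # Training set is every fold except the held-out one.
            train_ids = [a for j, fold in enumerate(folds) if j != f for a in fold]
            yield rep, f, train_ids, test_ids


# Toy example with 23 hypothetical animal IDs: 10 x 5 = 50 train/test splits.
splits = list(repeated_kfold(range(23), k=5, replications=10))
print(len(splits))
```

Within each replication every animal is held out exactly once, so accuracy can be averaged over folds and then over replications.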

23
Vu NT, Phuc TH, Nguyen NH, Van Sang N. Effects of common full-sib families on accuracy of genomic prediction for tagging weight in striped catfish Pangasianodon hypophthalmus. Front Genet 2023; 13:1081246. [PMID: 36685869 PMCID: PMC9845282 DOI: 10.3389/fgene.2022.1081246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Accepted: 12/06/2022] [Indexed: 01/06/2023]
Abstract
Common full-sib families (c²) make up a substantial proportion of total phenotypic variation in traits of commercial importance in aquaculture species and omission or inclusion of the c² resulted in possible changes in genetic parameter estimates and re-ranking of estimated breeding values. However, the impacts of common full-sib families on accuracy of genomic prediction for commercial traits of economic importance are not well known in many species, including aquatic animals. This research explored the impacts of common full-sib families on accuracy of genomic prediction for tagging weight in a population of striped catfish comprising 11,918 fish traced back to the base population (four generations), in which 560 individuals had genotype records of 14,154 SNPs. Our single-step genomic best linear unbiased prediction (ssGBLUP) showed that the accuracy of genomic prediction for tagging weight was reduced by 96.5%-130.3% when the common full-sib families were included in statistical models. The reduction in the prediction accuracy was to a smaller extent in multivariate analysis than in univariate models. Imputation of missing genotypes somewhat reduced the upward biases in the prediction accuracy for tagging weight. It is therefore suggested that genomic evaluation models for traits recorded during the early phase of growth development should account for the common full-sib families to minimise possible biases in the accuracy of genomic prediction and hence, selection response.
Affiliation(s)
- Nguyen Thanh Vu
- School of Science, Technology and Engineering, University of the Sunshine Coast, Sippy Downs, QLD, Australia
- Center for Bio-Innovation, University of the Sunshine Coast, Maroochydore, QLD, Australia
- Research Institute for Aquaculture No. 2, Ho Chi Minh City, Vietnam
- Tran Huu Phuc
- Research Institute for Aquaculture No. 2, Ho Chi Minh City, Vietnam
- Nguyen Hong Nguyen
- School of Science, Technology and Engineering, University of the Sunshine Coast, Sippy Downs, QLD, Australia
- Center for Bio-Innovation, University of the Sunshine Coast, Maroochydore, QLD, Australia
- Nguyen Van Sang
- Research Institute for Aquaculture No. 2, Ho Chi Minh City, Vietnam
- Correspondence: Nguyen Hong Nguyen; Nguyen Van Sang

24
Andreou GM, Messer M, Tong H, Nikoloski Z, Laitinen RAE. Heritability of temperature-mediated flower size plasticity in Arabidopsis thaliana. Quant Plant Biol 2023; 4:e4. [PMID: 37077703 PMCID: PMC10095859 DOI: 10.1017/qpb.2023.3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2022] [Revised: 02/01/2023] [Accepted: 02/07/2023] [Indexed: 05/03/2023]
Abstract
Phenotypic plasticity is a heritable trait that provides sessile organisms a strategy to rapidly mitigate negative effects of environmental change. Yet, we have little understanding of the mode of inheritance and genetic architecture of plasticity in different focal traits relevant to agricultural applications. This study builds on our recent discovery of genes controlling temperature-mediated flower size plasticity in Arabidopsis thaliana and focuses on dissecting the mode of inheritance and combining ability of plasticity in the context of plant breeding. We created a full diallel cross using 12 A. thaliana accessions displaying different temperature-mediated flower size plasticities, scored as the fold change between two temperatures. Griffing's analysis of variance in flower size plasticity indicated that non-additive genetic action shapes this trait and pointed at challenges and opportunities when breeding for reduced plasticity. Our findings provide an outlook of flower size plasticity that is important for developing resilient crops for future climates.
Affiliation(s)
- Gregory M. Andreou
- Organismal and Evolutionary Research Programme, Faculty of Biological and Environmental Sciences, Viikki Plant Science Centre, University of Helsinki, Helsinki, Finland
- Michaela Messer
- Molecular Mechanisms of Plant Adaptation Group, Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany
- Hao Tong
- Bioinformatics Department, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
- Systems Biology and Mathematical Modeling Group, Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany
- Zoran Nikoloski
- Bioinformatics Department, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
- Systems Biology and Mathematical Modeling Group, Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany
- Roosa A. E. Laitinen
- Organismal and Evolutionary Research Programme, Faculty of Biological and Environmental Sciences, Viikki Plant Science Centre, University of Helsinki, Helsinki, Finland
- Molecular Mechanisms of Plant Adaptation Group, Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany
- Author for correspondence: Roosa A. E. Laitinen

25
Cuevas J, Reslow F, Crossa J, Ortiz R. Modeling genotype × environment interaction for single and multitrait genomic prediction in potato (Solanum tuberosum L.). G3 (Bethesda) 2022; 13:6883526. [PMID: 36477309 PMCID: PMC9911059 DOI: 10.1093/g3journal/jkac322] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/09/2022] [Revised: 11/01/2022] [Accepted: 11/28/2022] [Indexed: 12/13/2022]
Abstract
In this study, we extend research on genomic prediction (GP) to polysomic polyploid plant species with the main objective to investigate single-trait (ST) and multitrait (MT) multienvironment (ME) models using field trial data from 3 locations in Sweden [Helgegården (HEL), Mosslunda (MOS), Umeå (UM)] over 2 years (2020, 2021) of 253 potato cultivars and breeding clones for 5 tuber weight traits and 2 tuber flesh quality characteristics. This research investigated the GP of 4 genome-based prediction models with genotype × environment interactions (GEs): (1) ST reaction norm model (M1), (2) ST model considering covariances between environments (M2), (3) ST M2 extended to include a random vector that utilizes the environmental covariances (M3), and (4) MT model with GE (M4). Several prediction problems were analyzed to assess the GP accuracy of each of the 4 models. Results of the prediction of traits in HEL, the high yield potential testing site in 2021, show that the best-predicted traits were tuber flesh starch (%), weight of tuber above 60 or below 40 mm in size, and the total tuber weight. In terms of GP accuracy, model M4 gave the best prediction accuracy in 3 traits, namely tuber weight of 40-50 or above 60 mm in size, and total tuber weight, and very similar accuracy in the starch trait. For MOS in 2021, the best-predicted traits were starch, weight of tubers above 60, 50-60, or below 40 mm in size, and the total tuber weight. MT model M4 was the best GP model based on its accuracy when some cultivars are observed in some traits. For the GP accuracy of traits in UM in 2021, the best-predicted traits were the weight of tubers above 60, 50-60, or below 40 mm in size, and the best model was MT M4, followed by models ST M3 and M2.
Affiliation(s)
- Jaime Cuevas
- Departamento de Energía, Universidad Autónoma del Estado de Quintana Roo, Chetumal, Quintana Roo 77019, México
- Fredrik Reslow
- Department of Plant Breeding, Swedish University of Agricultural Sciences (SLU), P.O. Box 190, Lomma SE 23436, Sweden
- Jose Crossa
- International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz Km. 45, El Batán, Texcoco 56237, Edo. de Mexico, Mexico
- Colegio de Postgraduados, Montecillos, Edo. de México 56230, México
- Rodomiro Ortiz
- Corresponding author: Sveriges Lantbruksuniversitet, Inst. för Växtförädling, Box 190, SE 23 422 Lomma, Sweden.

26
DoVale JC, Carvalho HF, Sabadin F, Fritsche-Neto R. Genotyping marker density and prediction models effects in long-term breeding schemes of cross-pollinated crops. Theor Appl Genet 2022; 135:4523-4539. [PMID: 36261658 DOI: 10.1007/s00122-022-04236-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2022] [Accepted: 10/09/2022] [Indexed: 06/16/2023]
Abstract
In genomic recurrent selection, the more markers, the better because they buffer the linkage disequilibrium losses caused by recombination over cycles, and consequently, provide higher responses to selection. Reductions of genotyping marker density have been extensively evaluated as potential strategies to reduce the genotyping costs of genomic selection (GS). Low-density marker panels are appealing in GS because they entail lower multicollinearity and computing time and allow more individuals to be genotyped for the same cost. However, statistical models used in GS are usually evaluated with empirical data, using "static" training sets and populations. This may be adequate for making predictions during a breeding program's initial cycles but not for the long-term. Moreover, studies that focus on long selective breeding cycles generally do not consider GS models with the effect of dominance, which is particularly important for breeding outcomes in cross-pollinated crops. Hence, dominance effects are important and unexplored in GS for long-term programs involving allogamous species. To address it, we employed two approaches: analysis of empirical maize datasets and simulations of long-term breeding applying phenotypic and genomic recurrent selection (intrapopulation and reciprocal schemes). In both schemes, we simulated twenty breeding cycles and assessed the effect of marker density reduction on the population mean, the best crosses, additive variance, selective accuracy, and response to selection with models [additive, additive-dominant, general (GCA), and this plus specific combining ability (GCA + SCA)]. Our results indicate that marker reduction based on linkage disequilibrium levels provides useful predictions only within a cycle, as accuracy significantly decreases over cycles. In the long-term, without training set updating, high-marker density provides the best responses to selection. 
The model to be used depends on the breeding scheme: additive for intrapopulation and additive-dominant or GCA + SCA for reciprocal.
Affiliation(s)
- Júlio César DoVale
- Department of Crop Science, Federal University of Ceará, Fortaleza, CE, Brazil
- Felipe Sabadin
- Virginia Tech: Virginia Polytechnic Institute and State University, Blacksburg, VA, USA

27
Jones HE, Wilson PB. Progress and opportunities through use of genomics in animal production. Trends Genet 2022; 38:1228-1252. [PMID: 35945076 DOI: 10.1016/j.tig.2022.06.014] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/10/2022] [Revised: 06/08/2022] [Accepted: 06/17/2022] [Indexed: 01/24/2023]
Abstract
The rearing of farmed animals is a vital component of global food production systems, but its impact on the environment, human health, animal welfare, and biodiversity is being increasingly challenged. Developments in genetic and genomic technologies have had a key role in improving the productivity of farmed animals for decades. Advances in genome sequencing, annotation, and editing offer a means not only to continue that trend, but also, when combined with advanced data collection, analytics, cloud computing, appropriate infrastructure, and regulation, to take precision livestock farming (PLF) and conservation to an advanced level. Such an approach could generate substantial additional benefits in terms of reducing use of resources, health treatments, and environmental impact, while also improving animal health and welfare.
Affiliation(s)
- Huw E Jones
- UK Genetics for Livestock and Equines (UKGLE) Committee, Department for Environment, Food and Rural Affairs, Nobel House, 17 Smith Square, London, SW1P 3JR, UK
- Nottingham Trent University, Brackenhurst Campus, Brackenhurst Lane, Southwell, NG25 0QF, UK
- Philippe B Wilson
- UK Genetics for Livestock and Equines (UKGLE) Committee, Department for Environment, Food and Rural Affairs, Nobel House, 17 Smith Square, London, SW1P 3JR, UK
- Nottingham Trent University, Brackenhurst Campus, Brackenhurst Lane, Southwell, NG25 0QF, UK

28
Lu T, Forgetta V, Richards JB, Greenwood CMT. Genetic determinants of polygenic prediction accuracy within a population. Genetics 2022; 222:6762086. [PMID: 36250789 PMCID: PMC9713421 DOI: 10.1093/genetics/iyac158] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/30/2022] [Accepted: 10/10/2022] [Indexed: 11/15/2022]
Abstract
Genomic risk prediction is on the emerging path toward personalized medicine. However, the accuracy of polygenic prediction varies strongly in different individuals. Based on up to 352,277 European ancestry participants in the UK Biobank, we constructed polygenic risk scores for 15 physiological and biochemical quantitative traits. We identified a total of 185 polygenic prediction variability quantitative trait loci for 11 traits by Levene's test among 254,376 unrelated individuals. We validated the effects of prediction variability quantitative trait loci using an independent test set of 58,927 individuals. For instance, a score aggregating 51 prediction variability quantitative trait locus variants for triglycerides had the strongest Spearman correlation of 0.185 (P-value <1.0 × 10⁻³⁰⁰) with the squared prediction errors. We found a strong enrichment of complex genetic effects conferred by prediction variability quantitative trait loci compared to risk loci identified in genome-wide association studies, including 89 prediction variability quantitative trait loci exhibiting dominance effects. Incorporation of dominance effects into polygenic risk scores significantly improved polygenic prediction for triglycerides, low-density lipoprotein cholesterol, vitamin D, and platelet. In conclusion, we have discovered and profiled genetic determinants of polygenic prediction variability for 11 quantitative biomarkers. These findings may assist interpretation of genomic risk prediction in various contexts and encourage novel approaches for constructing polygenic risk scores with complex genetic effects.
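Levene's test, which this abstract names as the method for detecting loci where prediction variability differs, compares the spread of a quantity across groups. The following is a minimal pure-Python sketch of the W statistic, assuming the groups are genotype classes at a single variant and the values are prediction errors. The abstract specifies neither the centring (mean or median) nor the exact input, so mean centring, the function name `levene_w`, and the toy data are illustrative choices only.

```python
from statistics import mean


def levene_w(groups):
    """Levene's W statistic using absolute deviations from each group mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Z_ij = |y_ij - mean_i|: spread of each observation around its group mean.
    z = [[abs(y - mean(g)) for y in g] for g in groups]
    z_means = [mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical prediction errors for two genotype classes.
same_spread = levene_w([[-1, 0, 1], [-1, 0, 1]])
wider_spread = levene_w([[-1, 0, 1], [-3, 0, 3]])
print(same_spread, wider_spread)
```

A larger W indicates stronger evidence that the error variance differs between groups; in practice W is compared against an F distribution with (k - 1, N - k) degrees of freedom.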
Affiliation(s)
- Tianyuan Lu
- Centre for Clinical Epidemiology, Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, QC H3T 1E2, Canada
- Quantitative Life Sciences Program, McGill University, Montreal, QC H3A 0G4, Canada
- Vincenzo Forgetta
- Centre for Clinical Epidemiology, Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, QC H3T 1E2, Canada
- John Brent Richards
- Centre for Clinical Epidemiology, Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, QC H3T 1E2, Canada
- Department of Human Genetics, McGill University, Montreal, QC H3A 0G4, Canada
- Department of Twin Research and Genetic Epidemiology, King's College London, London WC2R 2LS, UK
- Celia M T Greenwood
- Centre for Clinical Epidemiology, Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, QC H3T 1E2, Canada
- Department of Human Genetics, McGill University, Montreal, QC H3A 0G4, Canada
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, QC H3A 0G4, Canada
- Gerald Bronfman Department of Oncology, McGill University, Montreal, QC H3A 0G4, Canada

29
Lorenzi A, Bauland C, Mary-Huard T, Pin S, Palaffre C, Guillaume C, Lehermeier C, Charcosset A, Moreau L. Genomic prediction of hybrid performance: comparison of the efficiency of factorial and tester designs used as training sets in a multiparental connected reciprocal design for maize silage. Theor Appl Genet 2022; 135:3143-3160. [PMID: 35918515 DOI: 10.1007/s00122-022-04176-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2022] [Accepted: 07/06/2022] [Indexed: 06/15/2023]
Abstract
Calibrating a genomic selection model on a sparse factorial design rather than on tester designs is advantageous for some traits, and equivalent for others. In maize breeding, the selection of the candidate inbred lines is based on topcross evaluations using a limited number of testers. Then, a subset of single-crosses between these selected lines is evaluated to identify the best hybrid combinations. Genomic selection enables the prediction of all possible single-crosses between candidate lines but raises the question of defining the best training set design. Previous simulation results have shown the potential of using a sparse factorial design instead of tester designs as the training set. To validate this result, a 363-hybrid factorial design was obtained by crossing 90 dent and flint inbred lines from six segregating families. Two tester designs were also obtained by crossing the same inbred lines to two testers of the opposite group. These designs were evaluated for silage in eight environments and used to predict independent performances of a 951-hybrid factorial design. For the same number of hybrids and lines, the factorial design was as efficient as the tester designs, and, for some traits, outperformed them. All available designs were used as both training and validation sets to evaluate their efficiency. When the objective was to predict single-crosses between untested lines, we showed an advantage of increasing the number of lines involved in the training set, by (1) allocating each of them to a different tester for the tester design, or (2) reducing the number of hybrids per line for the factorial design. Our results confirm the potential of sparse factorial designs for genomic hybrid breeding.
Affiliation(s)
- Alizarine Lorenzi
- Génétique Quantitative et Evolution - Le Moulon, INRAE, CNRS, AgroParisTech, Université Paris-Saclay, 91190, Gif-sur-Yvette, France
- Cyril Bauland
- Génétique Quantitative et Evolution - Le Moulon, INRAE, CNRS, AgroParisTech, Université Paris-Saclay, 91190, Gif-sur-Yvette, France
- Tristan Mary-Huard
- Génétique Quantitative et Evolution - Le Moulon, INRAE, CNRS, AgroParisTech, Université Paris-Saclay, 91190, Gif-sur-Yvette, France
- MIA, INRAE, AgroParisTech, Université Paris-Saclay, 75005, Paris, France
- Sophie Pin
- Génétique Quantitative et Evolution - Le Moulon, INRAE, CNRS, AgroParisTech, Université Paris-Saclay, 91190, Gif-sur-Yvette, France
- Carine Palaffre
- UE 0394 SMH, INRAE, 2297 Route de l'INRA, 40390, Saint-Martin-de-Hinx, France
- Alain Charcosset
- Génétique Quantitative et Evolution - Le Moulon, INRAE, CNRS, AgroParisTech, Université Paris-Saclay, 91190, Gif-sur-Yvette, France
- Laurence Moreau
- Génétique Quantitative et Evolution - Le Moulon, INRAE, CNRS, AgroParisTech, Université Paris-Saclay, 91190, Gif-sur-Yvette, France

30
Nagai R, Kinukawa M, Watanabe T, Ogino A, Kurogi K, Adachi K, Satoh M, Uemoto Y. Genomic dissection of repeatability considering additive and non-additive genetic effects for semen production traits in beef and dairy bulls. J Anim Sci 2022; 100:6647626. [PMID: 35860946 DOI: 10.1093/jas/skac241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2022] [Accepted: 07/19/2022] [Indexed: 11/13/2022]
Abstract
The low heritability and moderate repeatability of semen production traits in beef and dairy bulls suggest that non-additive genetic effects, such as dominance and epistatic effects, play an important role in semen production and should therefore be considered in genetic improvement programs. In this study, the repeatability of semen production traits in Japanese Black bulls (JB) as beef bulls and Holstein bulls (HOL) as dairy bulls was evaluated by considering additive and non-additive genetic effects using the Illumina BovineSNP50 BeadChip. We also evaluated the advantage of using more complete models that include non-additive genetic effects by comparing the rank of genotyped animals and the phenotype prediction ability of each model. In total, 65,463 records for 615 genotyped JB and 48,653 records for 845 genotyped HOL were used to estimate additive and non-additive (dominance and epistatic) variance components for semen volume (VOL), sperm concentration (CON), sperm motility (MOT), MOT after freeze-thawing (aMOT), and sperm number (NUM). In the model including both additive and non-additive genetic effects, the broad-sense heritability (0.17-0.43) was more than twice as high as the narrow-sense heritability (0.04-0.11) for all traits and breeds, and the differences between the broad-sense heritability and repeatability were very small for VOL, NUM, and CON in both breeds. A large proportion of permanent environmental variance was explained by epistatic variance. The epistatic variance as a proportion of total phenotypic variance was 0.07-0.33 for all traits and breeds. In addition, heterozygosity showed significant positive relationships with NUM, MOT, and aMOT in JB and NUM in HOL, when the heterozygosity rate was included as a covariate. In a comparison of models, the inclusion of non-additive genetic effects resulted in a re-ranking of the top genotyped bulls for the additive effects. 
Adjusting for non-additive genetic effects could be expected to produce a more accurate breeding value, even if the models fit the data similarly well. However, including non-additive genetic effects did not improve the ability of any model to predict phenotypic values for any trait or breed compared with the predictive ability of a model that includes only additive effects. Consequently, although non-additive genetic effects, especially epistatic effects, play an important role in semen production traits, they do not improve prediction accuracy in beef and dairy bulls.
Affiliation(s)
- Rintaro Nagai
- Graduate School of Agricultural Science, Tohoku University, Sendai, Miyagi 980-8572, Japan
- Masashi Kinukawa
- Maebashi Institute of Animal Science, Livestock Improvement Association of Japan, Inc., Maebashi 371-0121, Japan
- Toshio Watanabe
- Maebashi Institute of Animal Science, Livestock Improvement Association of Japan, Inc., Maebashi 371-0121, Japan
- Atsushi Ogino
- Maebashi Institute of Animal Science, Livestock Improvement Association of Japan, Inc., Maebashi 371-0121, Japan
- Kazuhito Kurogi
- Cattle Breeding Department, Livestock Improvement Association of Japan, Inc., Tokyo 135-0041, Japan
- Kazunori Adachi
- Cattle Breeding Department, Livestock Improvement Association of Japan, Inc., Tokyo 135-0041, Japan
- Masahiro Satoh
- Graduate School of Agricultural Science, Tohoku University, Sendai, Miyagi 980-8572, Japan
- Yoshinobu Uemoto
- Graduate School of Agricultural Science, Tohoku University, Sendai, Miyagi 980-8572, Japan

31
Kenny D, Sleator RD, Murphy CP, Evans RD, Berry DP. Detection of Genomic Imprinting for Carcass Traits in Cattle Using Imputed High-Density Genotype Data. Front Genet 2022; 13:951087. [PMID: 35910233 PMCID: PMC9334527 DOI: 10.3389/fgene.2022.951087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2022] [Accepted: 06/16/2022] [Indexed: 12/03/2022]
Abstract
Genomic imprinting is an epigenetic phenomenon defined as the silencing of an allele, at least partially, at a given locus based on the sex of the transmitting parent. The objective of the present study was to detect the presence of SNP-phenotype imprinting associations for carcass weight (CW), carcass conformation (CC) and carcass fat (CF) in cattle. The data used comprised carcass data, along with imputed, high-density genotype data on 618,837 single nucleotide polymorphisms (SNPs) from 23,687 cattle; all animal genotypes were phased with respect to parent of origin. Based on the phased genotypes and a series of single-locus linear models, 24, 339, and 316 SNPs demonstrated imprinting associations with CW, CC, and CF, respectively. Regardless of the trait in question, no known imprinted gene was located within 0.5 Mb of the SNPs demonstrating imprinting associations in the present study. Since all imprinting associations detected herein were at novel loci, further investigation of these regions may be warranted. Nonetheless, knowledge of these associations might be useful for improving the accuracy of genomic evaluations for these traits, as well as mate allocation systems to exploit the effects of genomic imprinting.
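Phasing genotypes by parent of origin matters because it lets a single-locus model tell the two heterozygotes apart. The sketch below shows one common way to code such covariates; it is an assumption for illustration, since the abstract does not state the parametrization the authors used. The function name `encode_phased` and the 0/1 allele coding are hypothetical.

```python
def encode_phased(paternal: int, maternal: int) -> dict:
    """Covariates for one animal at one biallelic SNP (alleles coded 0 or 1)."""
    return {
        "additive": paternal + maternal,         # allele count: 0, 1 or 2
        "dominance": int(paternal != maternal),  # 1 for either heterozygote
        "imprinting": paternal - maternal,       # +1 or -1 separates the heterozygotes
    }


# The two heterozygotes share additive and dominance codes
# but differ in the imprinting covariate.
pat_het = encode_phased(1, 0)
mat_het = encode_phased(0, 1)
print(pat_het, mat_het)
```

With unphased genotypes the two heterozygotes are indistinguishable, so the imprinting covariate cannot be formed and a parent-of-origin effect cannot be estimated.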
Collapse
Affiliation(s)
- David Kenny
- Animal and Grassland Research and Innovation Centre, Teagasc, Moorepark, Co. Cork, Ireland
- Department of Biological Sciences, Munster Technological University, Bishopstown Campus, Co. Cork, Ireland
| | - Roy D. Sleator
- Department of Biological Sciences, Munster Technological University, Bishopstown Campus, Co. Cork, Ireland
| | - Craig P. Murphy
- Department of Biological Sciences, Munster Technological University, Bishopstown Campus, Co. Cork, Ireland
| | - Ross D. Evans
- Irish Cattle Breeding Federation, Highfield House, Bandon, Co. Cork, Ireland
| | - Donagh P. Berry
- Animal and Grassland Research and Innovation Centre, Teagasc, Moorepark, Co. Cork, Ireland
- *Correspondence: Donagh P. Berry,
| |
|
32
|
Wade AR, Duruflé H, Sanchez L, Segura V. eQTLs are key players in the integration of genomic and transcriptomic data for phenotype prediction. BMC Genomics 2022; 23:476. [PMID: 35764918 PMCID: PMC9238188 DOI: 10.1186/s12864-022-08690-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2021] [Accepted: 06/11/2022] [Indexed: 11/10/2022]
Abstract
Background: Multi-omics represent a promising link between phenotypes and genome variation. Few studies have yet addressed their integration to understand genetic architecture and improve predictability.
Results: Our study used 241 poplar genotypes, phenotyped in two common gardens, with xylem and cambium RNA sequenced at one site, yielding large phenotypic, genomic (SNP), and transcriptomic datasets. Prediction models for each trait were built separately for SNPs and transcripts, and compared with a third model that integrated both omics by concatenation. The advantage of integration varied across traits and, to understand such differences, an eQTL analysis was performed to characterize the interplay between the genome and transcriptome and to classify the predicting features into cis or trans relationships. A strong, significant negative correlation was found between the change in predictability and the change in predictor ranking for trans eQTLs for traits evaluated at the site of transcriptomic sampling.
Conclusions: Consequently, beneficial integration happens when the redundancy of predictors is decreased, likely leaving the stage to other less prominent but complementary predictors. An additional gene ontology (GO) enrichment analysis appeared to corroborate this statistical output. To our knowledge, this is a novel finding delineating a promising method to explore data integration.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-022-08690-7.
|
33
|
Mohd Saad NS, Neik TX, Thomas WJW, Amas JC, Cantila AY, Craig RJ, Edwards D, Batley J. Advancing designer crops for climate resilience through an integrated genomics approach. Current Opinion in Plant Biology 2022; 67:102220. [PMID: 35489163 DOI: 10.1016/j.pbi.2022.102220] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/13/2021] [Revised: 03/15/2022] [Accepted: 03/25/2022] [Indexed: 06/14/2023]
Abstract
Climate change and exponential population growth are exposing an immediate need to develop future crops that are highly resilient and adaptable to changing environments, in order to maintain global food security in the next decade. Rigorous selection over a long domestication history has rendered cultivated crops genetically disadvantaged, raising concerns about their ability to adapt to these new challenges and limiting their usefulness in breeding programmes. As a result, future crop improvement efforts must rely on integrating various genomic strategies, ranging from high-throughput sequencing to machine learning, in order to exploit germplasm diversity and overcome bottlenecks created by domestication, expansive multi-dimensional phenotypes, arduous breeding processes, complex traits and big data.
Affiliation(s)
- Nur Shuhadah Mohd Saad
- UWA School of Biological Sciences and the UWA Institute of Agriculture, University of Western Australia, Crawley, WA, Australia
| | - Ting Xiang Neik
- Sunway College Kuala Lumpur, Bandar Sunway, 47500, Selangor, Malaysia
| | - William J W Thomas
- UWA School of Biological Sciences and the UWA Institute of Agriculture, University of Western Australia, Crawley, WA, Australia
| | - Junrey C Amas
- UWA School of Biological Sciences and the UWA Institute of Agriculture, University of Western Australia, Crawley, WA, Australia
| | - Aldrin Y Cantila
- UWA School of Biological Sciences and the UWA Institute of Agriculture, University of Western Australia, Crawley, WA, Australia
| | - Ryan J Craig
- UWA School of Biological Sciences and the UWA Institute of Agriculture, University of Western Australia, Crawley, WA, Australia
| | - David Edwards
- UWA School of Biological Sciences and the UWA Institute of Agriculture, University of Western Australia, Crawley, WA, Australia
| | - Jacqueline Batley
- UWA School of Biological Sciences and the UWA Institute of Agriculture, University of Western Australia, Crawley, WA, Australia.
| |
|
34
|
Wang X, Shi S, Wang G, Luo W, Wei X, Qiu A, Luo F, Ding X. Using machine learning to improve the accuracy of genomic prediction of reproduction traits in pigs. J Anim Sci Biotechnol 2022; 13:60. [PMID: 35578371 PMCID: PMC9112588 DOI: 10.1186/s40104-022-00708-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/15/2021] [Accepted: 03/13/2022] [Indexed: 12/02/2022]
Abstract
Background: Recently, machine learning (ML) has become attractive in genomic prediction, but its superiority over conventional (ss)GBLUP methods and the choice of optimal ML methods need to be investigated.
Results: In this study, 2566 Chinese Yorkshire pigs with reproduction trait records were genotyped with the GenoBaits Porcine SNP 50 K and PorcineSNP50 panels. Four ML methods, namely support vector regression (SVR), kernel ridge regression (KRR), random forest (RF) and Adaboost.R2, were implemented. The utility of ML methods in genomic prediction was explored through 20 replicates of fivefold cross-validation (CV) and one prediction for younger individuals. In CV, ML methods significantly outperformed the conventional methods genomic BLUP (GBLUP), single-step GBLUP (ssGBLUP) and the Bayesian method BayesHE, improving on their genomic prediction accuracy by 19.3%, 15.0% and 20.8%, respectively. In addition, ML methods yielded smaller mean squared error (MSE) and mean absolute error (MAE) in all scenarios. ssGBLUP yielded an improvement of 3.8% on average in accuracy compared to that of GBLUP, and the accuracy of BayesHE was close to that of GBLUP. In genomic prediction of younger individuals, RF and Adaboost.R2_KRR performed better than GBLUP and BayesHE, while ssGBLUP performed comparably with RF. ssGBLUP yielded slightly higher accuracy and lower MSE than Adaboost.R2_KRR in the prediction of total number of piglets born, while for number of piglets born alive, Adaboost.R2_KRR performed significantly better than ssGBLUP. Among ML methods, Adaboost.R2_KRR consistently performed well in our study. Our findings also demonstrated that optimal hyperparameters are useful for ML methods. After tuning hyperparameters in CV and in predicting genomic outcomes of younger individuals, the average improvement was 14.3% and 21.8% over those using default hyperparameters, respectively.
Conclusion: Our findings demonstrated that ML methods had better overall prediction performance than conventional genomic selection methods, and could be new options for genomic prediction. Among ML methods, Adaboost.R2_KRR consistently performed well in our study, and tuning hyperparameters is necessary for ML methods. The optimal hyperparameters depend on the character of traits, datasets, etc.
Supplementary Information: The online version contains supplementary material available at 10.1186/s40104-022-00708-0.
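The evaluation scheme named in this abstract (20 replicates of fivefold cross-validation, scored by accuracy and MSE) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: accuracy is taken here to be the Pearson correlation between observed and predicted values, the data are simulated, and `simple_regression` is a hypothetical stand-in for any model (GBLUP, KRR, RF and so on) that maps training data to test predictions.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def kfold_indices(n, k, rng):
    """Shuffle 0..n-1 and deal the indices into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def repeated_cv(x, y, fit_predict, k=5, replicates=20, seed=1):
    """Average correlation and MSE over `replicates` rounds of k-fold CV."""
    rng = random.Random(seed)
    accs, errs = [], []
    for _ in range(replicates):
        for test in kfold_indices(len(y), k, rng):
            held_out = set(test)
            train = [i for i in range(len(y)) if i not in held_out]
            pred = fit_predict([x[i] for i in train],
                               [y[i] for i in train],
                               [x[i] for i in test])
            obs = [y[i] for i in test]
            accs.append(pearson(obs, pred))
            errs.append(mse(obs, pred))
    return sum(accs) / len(accs), sum(errs) / len(errs)


def simple_regression(x_train, y_train, x_test):
    """Hypothetical stand-in model: least-squares line on one predictor."""
    n = len(x_train)
    mx, my = sum(x_train) / n, sum(y_train) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x_train, y_train))
             / sum((a - mx) ** 2 for a in x_train))
    return [my + slope * (v - mx) for v in x_test]


# Simulated toy data (an assumption): one predictor with a true effect of 2.
data_rng = random.Random(0)
x = [data_rng.gauss(0, 1) for _ in range(200)]
y = [2.0 * v + data_rng.gauss(0, 1) for v in x]

accuracy, error = repeated_cv(x, y, simple_regression)
print(f"mean correlation: {accuracy:.3f}, mean MSE: {error:.3f}")
```

Reshuffling the folds in every replicate is what makes the 20 replicates informative: each one gives a different train/test partition, so the averages are less sensitive to a single lucky or unlucky split.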
Affiliation(s)
- Xue Wang
- Key Laboratory of Animal Genetics and Breeding of Ministry of Agriculture and Rural Affairs, National Engineering Laboratory of Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China
| | - Shaolei Shi
- Key Laboratory of Animal Genetics and Breeding of Ministry of Agriculture and Rural Affairs, National Engineering Laboratory of Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China
| | - Guijiang Wang
- Hebei Province Animal Husbandry and Improved Breeds Work Station, Shijiazhuang, Hebei, China
| | - Wenxue Luo
- Hebei Province Animal Husbandry and Improved Breeds Work Station, Shijiazhuang, Hebei, China
| | - Xia Wei
- Zhangjiakou Dahao Heshan New Agricultural Development Co., Ltd, Zhangjiakou, Hebei, China
| | - Ao Qiu
- Key Laboratory of Animal Genetics and Breeding of Ministry of Agriculture and Rural Affairs, National Engineering Laboratory of Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China
| | - Fei Luo
- Hebei Province Animal Husbandry and Improved Breeds Work Station, Shijiazhuang, Hebei, China
| | - Xiangdong Ding
- Key Laboratory of Animal Genetics and Breeding of Ministry of Agriculture and Rural Affairs, National Engineering Laboratory of Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China.
| |
|
35
|
Mathew B, Hauptmann A, Léon J, Sillanpää MJ. NeuralLasso: Neural Networks Meet Lasso in Genomic Prediction. Frontiers in Plant Science 2022; 13:800161. [PMID: 35574107 PMCID: PMC9100816 DOI: 10.3389/fpls.2022.800161] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/22/2021] [Accepted: 03/18/2022] [Indexed: 06/15/2023]
Abstract
Prediction of complex traits based on genome-wide marker information is of central importance for both animal and plant breeding. Numerous models have been proposed for the prediction of complex traits, and considerable effort is still devoted to improving their prediction accuracy, because various genetic factors such as additive, dominance and epistatic effects can influence the prediction accuracy of such models. Recently, machine learning (ML) methods have been widely applied for prediction in both animal and plant breeding programs. In this study, we propose a new algorithm for genomic prediction which is based on neural networks but incorporates classical elements of LASSO. Our new method is able to account for local epistasis (higher-order interaction between neighboring markers) in the prediction. We compare the prediction accuracy of our new method with the most commonly used prediction methods, such as BayesA, BayesB, Bayesian Lasso (BL), genomic BLUP and Elastic Net (EN), using the heterogeneous stock mouse and rice field data sets.
Affiliation(s)
- Boby Mathew
- Bayer CropScience, Monheim am Rhein, Germany
- Institute of Crop Science and Resource Conservation, University of Bonn, Bonn, Germany
| | - Andreas Hauptmann
- Research Unit of Mathematical Sciences, University of Oulu, Oulu, Finland
- Department of Computer Science, University College London, London, United Kingdom
| | - Jens Léon
- Institute of Crop Science and Resource Conservation, University of Bonn, Bonn, Germany
| | - Mikko J. Sillanpää
- Research Unit of Mathematical Sciences, University of Oulu, Oulu, Finland
| |
|
36
|
Building a Calibration Set for Genomic Prediction, Characteristics to Be Considered, and Optimization Approaches. Methods in Molecular Biology (Clifton, N.J.) 2022; 2467:77-112. [PMID: 35451773 DOI: 10.1007/978-1-0716-2205-6_3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 10/25/2022]
Abstract
The efficiency of genomic selection strongly depends on the prediction accuracy of the genetic merit of candidates. Numerous papers have shown that the composition of the calibration set is a key contributor to prediction accuracy. A poorly defined calibration set can result in low accuracies, whereas an optimized one can considerably increase accuracy compared to random sampling of the same size. Alternatively, optimizing the calibration set can be a way of decreasing the costs of phenotyping by enabling similar levels of accuracy compared to random sampling but with fewer phenotypic units. We present here the different factors that have to be considered when designing a calibration set, and review the different criteria proposed in the literature. We classified these criteria into two groups: model-free criteria based on relatedness, and criteria derived from the linear mixed model. We introduce criteria targeting specific prediction objectives including the prediction of highly diverse panels, biparental families, or hybrids. We also review different ways of updating the calibration set, and different procedures for optimizing phenotyping experimental designs.
|
37
|
Gill M, Anderson R, Hu H, Bennamoun M, Petereit J, Valliyodan B, Nguyen HT, Batley J, Bayer PE, Edwards D. Machine learning models outperform deep learning models, provide interpretation and facilitate feature selection for soybean trait prediction. BMC Plant Biology 2022; 22:180. [PMID: 35395721 PMCID: PMC8991976 DOI: 10.1186/s12870-022-03559-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 12/30/2021] [Accepted: 03/21/2022] [Indexed: 05/26/2023]
Abstract
Recent growth in crop genomic and trait data has opened opportunities for the application of novel approaches to accelerate crop improvement. Machine learning and deep learning are at the forefront of prediction-based data analysis. However, few approaches for genotype-to-phenotype prediction compare machine learning with deep learning and further interpret the models that support the predictions. This study uses genome-wide molecular markers and traits across 1110 soybean individuals to develop accurate prediction models. For 13/14 sets of predictions, XGBoost or random forest outperformed deep learning models in prediction performance. Top-ranked SNPs by F-score were identified from XGBoost and, on further investigation, were found to overlap with significantly associated loci identified from GWAS and previous literature. Feature importance rankings were used to reduce marker input by up to 90%, and subsequent models maintained or improved their prediction performance. These findings support interpretable machine learning as an approach for genomic-based prediction of traits in soybean and other crops.
Affiliation(s)
- Mitchell Gill
- School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
| | - Robyn Anderson
- School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
| | - Haifei Hu
- School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
| | - Mohammed Bennamoun
- Department of Computer Science and Software Engineering, The University of Western Australia, Perth, WA, Australia
| | - Jakob Petereit
- School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
| | - Babu Valliyodan
- Division of Plant Sciences and National Center for Soybean Biotechnology, University of Missouri, Columbia, MO, 65211, USA
- Department of Agriculture and Environmental Sciences, Lincoln University, Jefferson City, MO, 65101, USA
| | - Henry T Nguyen
- Division of Plant Sciences and National Center for Soybean Biotechnology, University of Missouri, Columbia, MO, 65211, USA
| | - Jacqueline Batley
- School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
| | - Philipp E Bayer
- School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
| | - David Edwards
- School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia.
| |
|
38
|
Bermann M, Cesarani A, Misztal I, Lourenco D. Past, present, and future developments in single-step genomic models. Italian Journal of Animal Science 2022. [DOI: 10.1080/1828051x.2022.2053366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Matias Bermann
- Department of Animal and Dairy Science, University of Georgia, Athens, GA, USA
| | - Alberto Cesarani
- Department of Animal and Dairy Science, University of Georgia, Athens, GA, USA
- Dipartimento di Agraria, Università degli Studi di Sassari, Sassari, Italy
| | - Ignacy Misztal
- Dipartimento di Agraria, Università degli Studi di Sassari, Sassari, Italy
| | - Daniela Lourenco
- Department of Animal and Dairy Science, University of Georgia, Athens, GA, USA
| |
|
39
|
Puglisi D, Visioni A, Ozkan H, Kara İ, Lo Piero AR, Rachdad FE, Tondelli A, Valè G, Cattivelli L, Fricano A. High accuracy of genome-enabled prediction of belowground and physiological traits in barley seedlings. G3 Genes|Genomes|Genetics 2022; 12:6517783. [PMID: 35099521 PMCID: PMC8895982 DOI: 10.1093/g3journal/jkac022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2021] [Accepted: 01/21/2022] [Indexed: 11/24/2022]
Abstract
In plants, the study of belowground traits is gaining momentum due to their importance for yield formation and the uptake of water and nutrients. In several cereal crops, seminal root number and seminal root angle are proxy traits of the root system architecture at the mature stages, which in turn contributes to modulating the uptake of water and nutrients. Along with seminal root number and seminal root angle, experimental evidence indicates that the response of transpiration rate to evaporative demand or vapor pressure deficit is a key physiological trait that might be targeted to improve drought tolerance, because reducing the water flux to leaves to limit transpiration rate at high levels of vapor pressure deficit allows better management of soil moisture. In the present study, we first examined the phenotypic diversity of seminal root number, seminal root angle, and transpiration rate at the seedling stage in a panel of 8-way Multiparent Advanced Generation Inter-Crosses lines of winter barley and correlated these traits with grain yield measured in different site-by-season combinations. Second, phenotypic and genotypic data of the Multiparent Advanced Generation Inter-Crosses population were combined to fit and cross-validate different genomic prediction models for these belowground and physiological traits. Genomic prediction models for seminal root number were fitted using threshold and log-normal models, treating these data as an ordinal discrete variable and as count data, respectively, while for seminal root angle and transpiration rate, genomic prediction was implemented using models based on extended genomic best linear unbiased predictors. The results presented in this study show that genome-enabled prediction models of seminal root number, seminal root angle, and transpiration rate data have high predictive ability and that the best models investigated in the present study include first-order additive × additive epistatic interaction effects. Our analyses indicate that, beyond grain yield, genomic prediction models might be used to predict belowground and physiological traits and pave the way to practical applications for barley improvement.
Affiliation(s)
- Damiano Puglisi
- Dipartimento di Agricoltura, Alimentazione e Ambiente (Di3A), Università di Catania, 95123 Catania, Italy
| | - Andrea Visioni
- Biodiversity and Crop Improvement Program, International Center for Agricultural Research in the Dry Areas, 6299 Rabat, Morocco
| | - Hakan Ozkan
- Faculty of Agriculture, Department of Field Crops, University of Cukurova, 01330 Adana, Turkey
| | - İbrahim Kara
- Bahri Dagdas International Agricultural Research Institute, Km Karatay/Konya 42020, Turkey
| | - Angela Roberta Lo Piero
- Dipartimento di Agricoltura, Alimentazione e Ambiente (Di3A), Università di Catania, 95123 Catania, Italy
| | - Fatima Ezzahra Rachdad
- Biodiversity and Crop Improvement Program, International Center for Agricultural Research in the Dry Areas, 6299 Rabat, Morocco
- Faculty of Sciences Ben M’sik, Department of Biology, Environment and Ecology Laboratory, Hassan II University of Casablanca, 7955 Casablanca, Morocco
| | - Alessandro Tondelli
- Council for Agricultural Research and Economics—Research Centre for Genomics and Bioinformatics, 29017 Fiorenzuola d’Arda (PC), Italy
| | - Giampiero Valè
- DiSIT, Dipartimento di Scienze e Innovazione Tecnologica, Università del Piemonte Orientale, 13100 Vercelli, Italy
| | - Luigi Cattivelli
- Council for Agricultural Research and Economics—Research Centre for Genomics and Bioinformatics, 29017 Fiorenzuola d’Arda (PC), Italy
| | - Agostino Fricano
- Council for Agricultural Research and Economics—Research Centre for Genomics and Bioinformatics, 29017 Fiorenzuola d’Arda (PC), Italy
| |
|
40
|
Ogawa S, Kimata M, Tomiyama M, Satoh M. Heritability and genetic correlation estimates of semen production traits with litter traits and pork production traits in purebred Duroc pigs. J Anim Sci 2022; 100:6535633. [PMID: 35201314 PMCID: PMC9030147 DOI: 10.1093/jas/skac055] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/13/2022] [Accepted: 02/23/2022] [Indexed: 11/30/2022]
Abstract
We estimated heritabilities of semen production traits and their genetic correlations with litter traits and pork production traits in purebred Duroc pigs. Semen production traits were semen volume, sperm concentration, proportion of morphologically normal sperm, total number of sperm, and total number of morphologically normal sperm. Litter traits at farrowing were total number born, number born alive, number stillborn, total litter weight at birth, mean litter weight at birth, and piglet survival rate at birth. Litter traits at weaning were litter size at weaning, total litter weight at weaning, mean litter weight at weaning, and piglet survival rate from birth to weaning. Pork production traits were average daily gain, backfat thickness, and loin muscle area. We analyzed 45,913 semen collection records of 896 boars, 6,950 farrowing performance records of 1,400 sows, 2,237 weaning performance records of 586 sows, and individual growth performance records of 9,550 animals measured at approximately 5 mo of age. Heritabilities were estimated using a single-trait animal model. Genetic correlations were estimated using a 2-trait animal model. Estimated heritabilities of semen production traits ranged from 0.20 for sperm concentration to 0.29 for semen volume and were equal to or higher than those of litter traits, ranging from 0.06 for number stillborn and piglet survival rate at birth to 0.25 for mean litter weight at birth, but lower than those of pork production traits, ranging from 0.50 for average daily gain to 0.63 for backfat thickness. In many cases, the absolute values of estimated genetic correlations between semen production traits and other traits were smaller than 0.3. These estimated genetic parameters provide useful information for establishing a comprehensive pig breeding scheme. In total, genetic parameters were estimated for 5 semen production traits, 10 litter traits, and 3 pork production traits.
Affiliation(s)
- S Ogawa
- Graduate School of Agricultural Science, Tohoku University, Sendai, Miyagi 980-8572, Japan
| | - M Kimata
- CIMCO Corporation, Koto-ku, Tokyo 136-0071, Japan
| | - M Tomiyama
- CIMCO Corporation, Koto-ku, Tokyo 136-0071, Japan
| | - M Satoh
- Graduate School of Agricultural Science, Tohoku University, Sendai, Miyagi 980-8572, Japan
| |
|
41
|
Roth M, Beugnot A, Mary-Huard T, Moreau L, Charcosset A, Fiévet JB. Improving genomic predictions with inbreeding and nonadditive effects in two admixed maize hybrid populations in single and multienvironment contexts. Genetics 2022; 220:6527635. [PMID: 35150258 PMCID: PMC8982028 DOI: 10.1093/genetics/iyac018] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/27/2021] [Accepted: 01/28/2022] [Indexed: 11/12/2022]
Abstract
Genetic admixture, resulting from the recombination between structural groups, is frequently encountered in breeding populations. In hybrid breeding, crossing admixed lines can generate substantial nonadditive genetic variance and contrasting levels of inbreeding, which can impact trait variation. This study aimed to test recent methodological developments for the modeling of inbreeding and nonadditive effects in order to increase prediction accuracy in admixed populations. Using two maize (Zea mays L.) populations of hybrids admixed between dent and flint heterotic groups, we compared a suite of five genomic prediction models, with or without parameters accounting for inbreeding and nonadditive effects, using the natural and orthogonal interaction approach in single and multienvironment contexts. In both populations, variance decompositions showed the strong impact of inbreeding on plant yield, height, and flowering time, which was supported by the superiority of prediction models incorporating this effect (+0.038 in predictive ability for mean yield). In most cases dominance variance was reduced when inbreeding was accounted for. The model including additivity, dominance, epistasis, and inbreeding effects appeared to be the most robust for prediction across traits and populations (+0.054 in predictive ability for mean yield). In a multienvironment context, we found that the inclusion of nonadditive and inbreeding effects was advantageous when predicting hybrids not yet observed in any environment. Overall, comparing variance decompositions was helpful to guide model selection for genomic prediction. Finally, we recommend the use of models including inbreeding and nonadditive parameters following the natural and orthogonal interaction approach to increase prediction accuracy in admixed populations.
Affiliation(s)
- Morgane Roth
- Plant Breeding Research Division, Agroscope, Wädenswil, 8820 Zurich, Switzerland
- Corresponding author: INRAE GAFL, 67 Allée des Chênes 84140 Montfavet, France.
| | - Aurélien Beugnot
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, UMR GQE-Le Moulon, 91190 Gif-sur-Yvette, France
| | - Tristan Mary-Huard
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, UMR GQE-Le Moulon, 91190 Gif-sur-Yvette, France
- Université Paris-Saclay, INRAE, AgroParisTech, UMR MIA-Paris Paris, 75005 Paris, France
| | - Laurence Moreau
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, UMR GQE-Le Moulon, 91190 Gif-sur-Yvette, France
| | - Alain Charcosset
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, UMR GQE-Le Moulon, 91190 Gif-sur-Yvette, France
| | - Julie B Fiévet
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, UMR GQE-Le Moulon, 91190 Gif-sur-Yvette, France
| |
|
42
|
Raben TG, Lello L, Widen E, Hsu SDH. From Genotype to Phenotype: Polygenic Prediction of Complex Human Traits. Methods Mol Biol 2022; 2467:421-446. [PMID: 35451785 DOI: 10.1007/978-1-0716-2205-6_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Decoding the genome confers the capability to predict characteristics of the organism (phenotype) from DNA (genotype). We describe the present status and future prospects of genomic prediction of complex traits in humans. Some highly heritable complex phenotypes such as height and other quantitative traits can already be predicted with reasonable accuracy from DNA alone. For many diseases, including important common conditions such as coronary artery disease, breast cancer, type I and II diabetes, individuals with outlier polygenic scores (e.g., top few percent) have been shown to have 5 or even 10 times higher risk than average. Several psychiatric conditions such as schizophrenia and autism also fall into this category. We discuss related topics such as the genetic architecture of complex traits, sibling validation of polygenic scores, and applications to adult health, in vitro fertilization (embryo selection), and genetic engineering.
Affiliation(s)
| | - Louis Lello
- Michigan State University, East Lansing, MI, USA
- Genomic Prediction, North Brunswick, NJ, USA
| | - Erik Widen
- Michigan State University, East Lansing, MI, USA
| | - Stephen D H Hsu
- Michigan State University, East Lansing, MI, USA.
- Genomic Prediction, North Brunswick, NJ, USA.
| |
|
43
|
Martini JWR, Gao N, Crossa J. Incorporating Omics Data in Genomic Prediction. Methods Mol Biol 2022; 2467:341-357. [PMID: 35451782 DOI: 10.1007/978-1-0716-2205-6_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
In this chapter, we discuss the motivation for integrating other types of omics data into genomic prediction methods. We give an overview of literature investigating the performance of omics-enhanced predictions, and highlight potential pitfalls when applying these methods in breeding. We emphasize that the statistical methods available for genomic data can be transferred to the general omics case. However, when using a framework of omic relationship matrices, the standardization of the variables may be more relevant than it is for a genomic relationship matrix based on single-nucleotide polymorphisms.
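The point about standardization can be illustrated with a small sketch. It assumes one common construction of a relationship matrix, K = ZZ'/p, where Z holds n individuals by p centred features; the chapter's own formulation may differ. The toy values and the function names `center_and_scale` and `relationship_matrix` are hypothetical. Unlike SNP genotypes coded 0/1/2, omics features such as transcript abundances can sit on very different scales, so without scaling a single high-variance feature can dominate K.

```python
from math import sqrt


def center_and_scale(x, scale=True):
    """Centre each column; optionally scale it to unit population variance.

    Assumes every column varies (a constant column would give sd == 0).
    """
    n, p = len(x), len(x[0])
    z = [[0.0] * p for _ in range(n)]
    for j in range(p):
        col = [row[j] for row in x]
        mean = sum(col) / n
        sd = sqrt(sum((v - mean) ** 2 for v in col) / n) if scale else 1.0
        for i in range(n):
            z[i][j] = (col[i] - mean) / sd
    return z


def relationship_matrix(z):
    """K = Z Z' / p for an n x p matrix Z of centred features."""
    n, p = len(z), len(z[0])
    return [[sum(z[i][m] * z[k][m] for m in range(p)) / p for k in range(n)]
            for i in range(n)]


# Toy data (an assumption): 4 individuals, 2 features on very different scales.
features = [
    [1.0, 100.0],
    [2.0, 300.0],
    [3.0, 200.0],
    [4.0, 400.0],
]

k_raw = relationship_matrix(center_and_scale(features, scale=False))
k_std = relationship_matrix(center_and_scale(features, scale=True))

n = len(features)
mean_diag_raw = sum(k_raw[i][i] for i in range(n)) / n
mean_diag_std = sum(k_std[i][i] for i in range(n)) / n
print(mean_diag_raw, mean_diag_std)
```

With centring only, the mean diagonal of K is the average feature variance, which the second feature drives almost entirely. After standardization each feature contributes equally and the mean diagonal is 1.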
Affiliation(s)
- Johannes W R Martini
- International Maize and Wheat Improvement Center (CIMMYT), Veracruz, CP, Mexico.
| | - Ning Gao
- School of Life Sciences, Sun Yat-Sen University, Guangzhou, China
| | - José Crossa
- International Maize and Wheat Improvement Center (CIMMYT), Veracruz, CP, Mexico
| |
|
44
|
Lara LADC, Pocrnic I, Oliveira TDP, Gaynor RC, Gorjanc G. Temporal and genomic analysis of additive genetic variance in breeding programmes. Heredity (Edinb) 2022; 128:21-32. [PMID: 34912044 PMCID: PMC8733024 DOI: 10.1038/s41437-021-00485-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/01/2020] [Revised: 10/24/2021] [Accepted: 11/01/2021] [Indexed: 11/13/2022] Open
Abstract
Genetic variance is a central parameter in quantitative genetics and breeding. Assessing changes in genetic variance over time as well as the genome is therefore of high interest. Here, we extend a previously proposed framework for temporal analysis of genetic variance using the pedigree-based model, to a new framework for temporal and genomic analysis of genetic variance using marker-based models. To this end, we describe the theory of partitioning genetic variance into genic variance and within-chromosome and between-chromosome linkage-disequilibrium, and how to estimate these variance components from a marker-based model fitted to observed phenotype and marker data. The new framework involves three steps: (i) fitting a marker-based model to data, (ii) sampling realisations of marker effects from the fitted model and for each sample calculating realisations of genetic values and (iii) calculating the variance of sampled genetic values by time and genome partitions. Analysing time partitions indicates breeding programme sustainability, while analysing genome partitions indicates contributions from chromosomes and chromosome pairs and linkage-disequilibrium. We demonstrate the framework with a simulated breeding programme involving a complex trait. Results show good concordance between simulated and estimated variances, provided that the fitted model is capturing genetic complexity of a trait. We observe a reduction of genetic variance due to selection and drift changing allele frequencies, and due to selection inducing negative linkage-disequilibrium.
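Steps (ii) and (iii) of the framework summarised above can be illustrated with a minimal sketch. It is not the authors' implementation: the genotype matrix, the chromosome labels, and the random draws standing in for sampled marker effects are all hypothetical, and the function name `partition_genetic_variance` is introduced here for illustration only. For one sample of marker effects α, the variance of genetic values g = Xα equals the sum over all marker pairs of α_j α_k Cov(x_j, x_k); the diagonal terms give the genic variance and the off-diagonal terms give the linkage-disequilibrium contributions within and between chromosomes.

```python
import numpy as np


def partition_genetic_variance(marker_cov, alpha, chrom):
    """Split Var(X @ alpha) into genic, within- and between-chromosome LD parts.

    marker_cov : (p, p) covariance matrix of marker genotypes
    alpha      : (p,) one sampled realisation of marker effects
    chrom      : (p,) chromosome label of each marker
    """
    alpha = np.asarray(alpha, dtype=float)
    chrom = np.asarray(chrom)
    # Element (j, k) is alpha_j * alpha_k * Cov(x_j, x_k).
    weighted = marker_cov * np.outer(alpha, alpha)
    diagonal = np.eye(len(alpha), dtype=bool)
    same_chrom = chrom[:, None] == chrom[None, :]
    genic = weighted[diagonal].sum()
    ld_within = weighted[same_chrom & ~diagonal].sum()
    ld_between = weighted[~same_chrom].sum()
    return {
        "genic": genic,
        "ld_within": ld_within,
        "ld_between": ld_between,
        "total": genic + ld_within + ld_between,
    }


# Hypothetical toy data: 200 individuals, 6 markers on 2 chromosomes.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(200, 6))   # genotypes coded 0/1/2
chrom = np.array([1, 1, 1, 2, 2, 2])
cov = np.cov(X, rowvar=False)           # marker covariance (ddof=1)

# Stand-in for samples of marker effects drawn from a fitted model (assumption).
alpha_samples = rng.normal(size=(100, 6))

# One variance partition per sample; summarising across samples would
# describe the uncertainty of each component.
parts = [partition_genetic_variance(cov, a, chrom) for a in alpha_samples]
mean_genic = np.mean([p["genic"] for p in parts])
```

Applying the same function to subsets of individuals grouped by time (for example, year of birth) would give the temporal view described in the abstract; that grouping variable is not part of this sketch.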
Affiliation(s)
- Letícia A de C Lara
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Edinburgh, UK.
| | - Ivan Pocrnic
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Edinburgh, UK
| | - Thiago de P Oliveira
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Edinburgh, UK
| | - R Chris Gaynor
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Edinburgh, UK
| | - Gregor Gorjanc
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Edinburgh, UK
| |
|
45
|
Wolfe MD, Chan AW, Kulakow P, Rabbi I, Jannink JL. Genomic mating in outbred species: predicting cross usefulness with additive and total genetic covariance matrices. Genetics 2021; 219:iyab122. [PMID: 34740244 PMCID: PMC8570794 DOI: 10.1093/genetics/iyab122] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2021] [Accepted: 07/13/2021] [Indexed: 11/14/2022] Open
Abstract
Diverse crops are both outbred and clonally propagated. Breeders typically use truncation selection of parents and invest significant time, land, and money evaluating the progeny of crosses to find exceptional genotypes. We developed and tested genomic mate selection criteria suitable for organisms of arbitrary homozygosity level where the full-sibling progeny are of direct interest as future parents and/or cultivars. We extended cross variance and covariance variance prediction to include dominance effects and predicted the multivariate selection index genetic variance of crosses based on haplotypes of proposed parents, marker effects, and recombination frequencies. We combined the predicted mean and variance into usefulness criteria for parent and variety development. We present an empirical study of cassava (Manihot esculenta), a staple tropical root crop. We assessed the potential to predict the multivariate genetic distribution (means, variances, and trait covariances) of 462 cassava families in terms of additive and total value using cross-validation. Most variance (89%) and covariance (70%) prediction accuracy estimates were greater than zero. The usefulness of crosses was accurately predicted with good correspondence between the predicted and the actual mean performance of family members breeders selected for advancement as new parents and candidate varieties. We also used a directional dominance model to quantify significant inbreeding depression for most traits. We predicted 47,083 possible crosses of 306 parents and contrasted them to those previously tested to show how mate selection can reveal the new potential within the germplasm. We enable breeders to consider the potential of crosses to produce future parents (progeny with top breeding values) and varieties (progeny with top own performance).
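The idea of combining a predicted cross mean and variance into a usefulness criterion can be sketched as follows. The abstract does not state the formula; the sketch assumes the commonly used form UC = mean + i × standard deviation, where i is the standardized selection intensity under truncation selection on a normally distributed trait. The function names and the two example crosses are hypothetical and are not taken from the study.

```python
from statistics import NormalDist


def selection_intensity(proportion_selected):
    """Standardized selection intensity for truncation selection.

    Assumes a normally distributed trait: i = pdf(z) / p, where z is the
    truncation point that leaves a proportion p in the upper tail.
    """
    if not 0.0 < proportion_selected < 1.0:
        raise ValueError("proportion_selected must be strictly between 0 and 1")
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - proportion_selected)
    return std_normal.pdf(threshold) / proportion_selected


def usefulness_criterion(pred_mean, pred_variance, proportion_selected):
    """Assumed form: predicted mean plus intensity times predicted SD."""
    intensity = selection_intensity(proportion_selected)
    return pred_mean + intensity * pred_variance ** 0.5


# Hypothetical crosses: (predicted family mean, predicted family variance).
crosses = {"A x B": (10.0, 4.0), "C x D": (10.5, 1.0)}

# Rank crosses by usefulness when the top 5% of each family is selected.
ranked = sorted(
    crosses,
    key=lambda name: usefulness_criterion(*crosses[name], 0.05),
    reverse=True,
)
```

In this toy case the cross with the lower mean but larger variance ranks first, which is the kind of trade-off a mean-only criterion would miss.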
Affiliation(s)
- Marnin D Wolfe
- Section on Plant Breeding and Genetics, School of Integrative Plant Sciences, Cornell University, Ithaca, NY 14850, USA
| | - Ariel W Chan
- Section on Plant Breeding and Genetics, School of Integrative Plant Sciences, Cornell University, Ithaca, NY 14850, USA
| | - Peter Kulakow
- International Institute of Tropical Agriculture (IITA), Ibadan, Nigeria
| | - Ismail Rabbi
- International Institute of Tropical Agriculture (IITA), Ibadan, Nigeria
| | - Jean-Luc Jannink
- Section on Plant Breeding and Genetics, School of Integrative Plant Sciences, Cornell University, Ithaca, NY 14850, USA
- USDA-ARS, Ithaca, NY 14850, USA
| |
|
46
|
Merrick LF, Carter AH. Comparison of genomic selection models for exploring predictive ability of complex traits in breeding programs. THE PLANT GENOME 2021; 14:e20158. [PMID: 34719886 DOI: 10.1002/tpg2.20158] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/27/2021] [Accepted: 08/23/2021] [Indexed: 06/13/2023]
Abstract
Traits with a complex unknown genetic architecture are common in breeding programs. However, they pose a challenge for selection due to a combination of complex environmental and pleiotropic effects that impede the ability to create mapping populations to characterize the trait's genetic basis. One such trait, seedling emergence of wheat (Triticum aestivum L.) from deep planting, presents a unique opportunity to explore the best method to use and implement genomic selection (GS) models to predict a complex trait. Seventeen GS models were compared using two training populations, one consisting of 473 genotypes from a diverse association mapping panel phenotyped from 2015 to 2019 and the other consisting of 643 breeding lines phenotyped in 2015 and 2020 in Lind, WA, with 40,368 markers. There were only a few significant differences between GS models, with support vector machines reaching the highest accuracy of 0.56 in a single breeding line trial using cross-validations. However, the consistent moderate accuracy of the parametric models indicates little advantage of using nonparametric models within individual years, but the nonparametric models show a slight increase in accuracy when combining years for complex traits. There was an increase in accuracy using cross-validations from 0.40 to 0.41 using diversity panel lines to breeding lines. Overall, our study showed that breeders can accurately predict and implement GS for a complex trait by using nonparametric machine learning models within their own breeding programs with increased accuracy as they combine training populations over the years.
Affiliation(s)
- Lance F Merrick
- Dep. of Crop and Soil Sciences, Washington State Univ., Pullman, WA, 99164, USA
| | - Arron H Carter
- Dep. of Crop and Soil Sciences, Washington State Univ., Pullman, WA, 99164, USA
| |
|
47
|
Merrick LF, Carter AH. Comparison of genomic selection models for exploring predictive ability of complex traits in breeding programs. THE PLANT GENOME 2021; 14:e20158. [PMID: 34719886 DOI: 10.1101/2021.04.15.440015] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/27/2021] [Accepted: 08/23/2021] [Indexed: 05/23/2023]
Abstract
Traits with a complex unknown genetic architecture are common in breeding programs. However, they pose a challenge for selection due to a combination of complex environmental and pleiotropic effects that impede the ability to create mapping populations to characterize the trait's genetic basis. One such trait, seedling emergence of wheat (Triticum aestivum L.) from deep planting, presents a unique opportunity to explore the best method to use and implement genomic selection (GS) models to predict a complex trait. Seventeen GS models were compared using two training populations, one consisting of 473 genotypes from a diverse association mapping panel phenotyped from 2015 to 2019 and the other consisting of 643 breeding lines phenotyped in 2015 and 2020 in Lind, WA, with 40,368 markers. There were only a few significant differences between GS models, with support vector machines reaching the highest accuracy of 0.56 in a single breeding line trial using cross-validations. However, the consistent moderate accuracy of the parametric models indicates little advantage of using nonparametric models within individual years, but the nonparametric models show a slight increase in accuracy when combining years for complex traits. There was an increase in accuracy using cross-validations from 0.40 to 0.41 using diversity panel lines to breeding lines. Overall, our study showed that breeders can accurately predict and implement GS for a complex trait by using nonparametric machine learning models within their own breeding programs with increased accuracy as they combine training populations over the years.
Affiliation(s)
- Lance F Merrick
- Dep. of Crop and Soil Sciences, Washington State Univ., Pullman, WA, 99164, USA
| | - Arron H Carter
- Dep. of Crop and Soil Sciences, Washington State Univ., Pullman, WA, 99164, USA
| |
|
48
|
Bayer PE, Petereit J, Danilevicz MF, Anderson R, Batley J, Edwards D. The application of pangenomics and machine learning in genomic selection in plants. THE PLANT GENOME 2021; 14:e20112. [PMID: 34288550 DOI: 10.1002/tpg2.20112] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/14/2021] [Accepted: 05/01/2021] [Indexed: 05/10/2023]
Abstract
Genomic selection approaches have increased the speed of plant breeding, leading to growing crop yields over the last decade. However, climate change is impacting current and future yields, resulting in the need to further accelerate breeding efforts to cope with these changing conditions. Here we present approaches to accelerate plant breeding and incorporate nonadditive effects in genomic selection by applying state-of-the-art machine learning approaches. These approaches are made more powerful by the inclusion of pangenomes, which represent the entire genome content of a species. Understanding the strengths and limitations of machine learning methods, compared with more traditional genomic selection efforts, is paramount to the successful application of these methods in crop breeding. We describe examples of genomic selection and pangenome-based approaches in crop breeding, discuss machine learning-specific challenges, and highlight the potential for the application of machine learning in genomic selection. We believe that careful implementation of machine learning approaches will support crop improvement to help counter the adverse outcomes of climate change on crop production.
Affiliation(s)
- Philipp E Bayer
- School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
| | - Jakob Petereit
- School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
| | - Monica Furaste Danilevicz
- School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
| | - Robyn Anderson
- School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
| | - Jacqueline Batley
- School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
| | - David Edwards
- School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
| |
|
49
|
Mahadevaiah C, Appunu C, Aitken K, Suresha GS, Vignesh P, Mahadeva Swamy HK, Valarmathi R, Hemaprabha G, Alagarasan G, Ram B. Genomic Selection in Sugarcane: Current Status and Future Prospects. FRONTIERS IN PLANT SCIENCE 2021; 12:708233. [PMID: 34646284 PMCID: PMC8502939 DOI: 10.3389/fpls.2021.708233] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/11/2021] [Accepted: 08/24/2021] [Indexed: 05/18/2023]
Abstract
Sugarcane is a C4 and agro-industry-based crop with a high potential for biomass production. It serves as raw material for the production of sugar, ethanol, and electricity. Modern sugarcane varieties are derived from the interspecific and intergeneric hybridization between Saccharum officinarum, Saccharum spontaneum, and other wild relatives. Sugarcane breeding programmes are broadly categorized into germplasm collection and characterization, pre-breeding and genetic base-broadening, and varietal development programmes. The varietal identification through the classic breeding programme requires a minimum of 12-14 years. The precise phenotyping in sugarcane is extremely tedious due to the high propensity of lodging and suckering owing to the influence of environmental factors and crop management practices. This kind of phenotyping requires data from both plant crop and ratoon experiments conducted over locations and seasons. In this review, we explored the feasibility of genomic selection schemes for various breeding programmes in sugarcane. The genetic diversity analysis using genome-wide markers helps in the formation of core set germplasm representing the total genomic diversity present in the Saccharum gene bank. The genome-wide association studies and genomic prediction in the Saccharum gene bank are helpful to identify the complete genomic resources for cane yield, commercial cane sugar, tolerances to biotic and abiotic stresses, and other agronomic traits. The implementation of genomic selection in pre-breeding, genetic base-broadening programmes assist in precise introgression of specific genes and recurrent selection schemes enhance the higher frequency of favorable alleles in the population with a considerable reduction in breeding cycles and population size. The integration of environmental covariates and genomic prediction in multi-environment trials assists in the prediction of varietal performance for different agro-climatic zones. 
This review also directed its focus on enhancing the genetic gain over time, cost, and resource allocation at various stages of breeding programmes.
Affiliation(s)
| | - Chinnaswamy Appunu
- Division of Crop Improvement, ICAR-Sugarcane Breeding Institute, Coimbatore, India
| | - Karen Aitken
- CSIRO (Commonwealth Scientific and Industrial Research Organization), St. Lucia, QLD, Australia
| | | | - Palanisamy Vignesh
- Division of Crop Improvement, ICAR-Sugarcane Breeding Institute, Coimbatore, India
| | | | | | - Govind Hemaprabha
- Division of Crop Improvement, ICAR-Sugarcane Breeding Institute, Coimbatore, India
| | - Ganesh Alagarasan
- Division of Crop Improvement, ICAR-Sugarcane Breeding Institute, Coimbatore, India
| | - Bakshi Ram
- Division of Crop Improvement, ICAR-Sugarcane Breeding Institute, Coimbatore, India
| |
|
50
|
Gozalo-Marcilla M, Buntjer J, Johnsson M, Batista L, Diez F, Werner CR, Chen CY, Gorjanc G, Mellanby RJ, Hickey JM, Ros-Freixedes R. Genetic architecture and major genes for backfat thickness in pig lines of diverse genetic backgrounds. Genet Sel Evol 2021; 53:76. [PMID: 34551713 PMCID: PMC8459476 DOI: 10.1186/s12711-021-00671-w] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2021] [Accepted: 09/07/2021] [Indexed: 01/23/2023] Open
Abstract
Background: Backfat thickness is an important carcass composition trait for pork production and is commonly included in swine breeding programmes. In this paper, we report the results of a large genome-wide association study for backfat thickness using data from eight lines of diverse genetic backgrounds. Methods: Data comprised 275,590 pigs from eight lines with diverse genetic backgrounds (breeds included Large White, Landrace, Pietrain, Hampshire, Duroc, and synthetic lines) genotyped and imputed for 71,324 single-nucleotide polymorphisms (SNPs). For each line, we estimated SNP associations using a univariate linear mixed model that accounted for genomic relationships. SNPs with significant associations were identified using a threshold of p < 10⁻⁶ and used to define genomic regions of interest. The proportion of genetic variance explained by a genomic region was estimated using a ridge regression model. Results: We found significant associations with backfat thickness for 264 SNPs across 27 genomic regions. Six genomic regions were detected in three or more lines. The average estimate of the SNP-based heritability was 0.48, with estimates by line ranging from 0.30 to 0.58. The genomic regions jointly explained from 3.2 to 19.5% of the additive genetic variance of backfat thickness within a line. Individual genomic regions explained up to 8.0% of the additive genetic variance of backfat thickness within a line. Some of these 27 genomic regions also explained up to 1.6% of the additive genetic variance in lines for which the genomic region was not statistically significant. We identified 64 candidate genes with annotated functions that can be related to fat metabolism, including well-studied genes such as MC4R, IGF2, and LEPR, and more novel candidate genes such as DHCR7, FGF23, MEDAG, DGKI, and PTN.
Conclusions: Our results confirm the polygenic architecture of backfat thickness and the role of genes involved in energy homeostasis, adipogenesis, fatty acid metabolism, and insulin signalling pathways for fat deposition in pigs. The results also suggest that several less well-understood metabolic pathways contribute to backfat development, such as those of phosphate, calcium, and vitamin D homeostasis. Supplementary Information: The online version contains supplementary material available at 10.1186/s12711-021-00671-w.
Affiliation(s)
- Miguel Gozalo-Marcilla
- The Roslin Institute, The University of Edinburgh, Midlothian, UK; The Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, UK
| | - Jaap Buntjer
- The Roslin Institute, The University of Edinburgh, Midlothian, UK
| | - Martin Johnsson
- The Roslin Institute, The University of Edinburgh, Midlothian, UK; Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Uppsala, Sweden
| | - Lorena Batista
- The Roslin Institute, The University of Edinburgh, Midlothian, UK
| | - Federico Diez
- The Roslin Institute, The University of Edinburgh, Midlothian, UK; The Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, UK
| | | | - Ching-Yi Chen
- The Pig Improvement Company, Genus plc, Hendersonville, TN, USA
| | - Gregor Gorjanc
- The Roslin Institute, The University of Edinburgh, Midlothian, UK
| | - Richard J Mellanby
- The Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, UK
| | - John M Hickey
- The Roslin Institute, The University of Edinburgh, Midlothian, UK
| | - Roger Ros-Freixedes
- The Roslin Institute, The University of Edinburgh, Midlothian, UK; Departament de Ciència Animal, Universitat de Lleida - Agrotecnio-CERCA Center, Lleida, Spain
| |
|