1. Ghavi Hossein-Zadeh N. An overview of recent technological developments in bovine genomics. Vet Anim Sci 2024; 25:100382. [PMID: 39166173 PMCID: PMC11334705 DOI: 10.1016/j.vas.2024.100382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/22/2024] Open
Abstract
Cattle are regarded as highly valuable animals because of their milk, beef, dung, fur, and draft power. The scientific community has tried a number of strategies to improve the genetic makeup of bovine germplasm. To ensure higher returns for the dairy and beef industries, researchers face their greatest challenge in improving commercially important traits. One of the biggest developments of the last few decades in the creation of tools for cattle genetic improvement is the discovery of the genome. Livestock breeding is being revolutionized by genomic selection, made possible by the availability of medium- and high-density single nucleotide polymorphism (SNP) arrays coupled with sophisticated statistical techniques. High-dimensional genomic data in cattle are becoming easier to access, owing both to continuously declining genotyping costs and to a growing number of services that use genomic data to increase return on investment. The field of genomics has come a long way thanks to developments such as radiation-hybrid mapping, in situ hybridization, synteny analysis, somatic cell genetics, cytogenetic maps, molecular markers, association studies for quantitative trait loci, high-throughput SNP genotyping, whole-genome shotgun sequencing, whole-genome mapping, and genome editing. These advances have had a significant positive impact on cattle genomics. This manuscript reviews recent advances in genomic technologies for cattle breeding and future prospects in this field.
Affiliation(s)
- Navid Ghavi Hossein-Zadeh
- Department of Animal Science, Faculty of Agricultural Sciences, University of Guilan, Rasht, 41635-1314, Iran
2. Haque MA, Iqbal A, Alam MZ, Lee YM, Ha JJ, Kim JJ. Estimation of genetic correlations and genomic prediction accuracy for reproductive and carcass traits in Hanwoo cows. Journal of Animal Science and Technology 2024; 66:682-701. [PMID: 39165742 PMCID: PMC11331368 DOI: 10.5187/jast.2024.e75] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Revised: 07/04/2023] [Accepted: 07/18/2023] [Indexed: 08/22/2024]
Abstract
This study estimated the heritabilities (h²) and the genetic and phenotypic correlations between reproductive traits, including calving interval (CI), age at first calving (AFC), gestation length (GL), and number of artificial inseminations per conception (NAIPC), and carcass traits, including carcass weight (CWT), eye muscle area (EMA), backfat thickness (BF), and marbling score (MS), in Korean Hanwoo cows. In addition, the accuracy of genomic predictions of breeding values was evaluated by applying the genomic best linear unbiased prediction (GBLUP) and the weighted GBLUP (WGBLUP) methods. The phenotypic data for reproductive and carcass traits were collected from 1,544 Hanwoo cows, and all animals were genotyped using the Illumina Bovine 50K single nucleotide polymorphism (SNP) chip. The genetic parameters were estimated with a multi-trait animal model in the MTG2 program. The estimated h² for CI, AFC, GL, NAIPC, CWT, EMA, BF, and MS were 0.10, 0.13, 0.17, 0.11, 0.37, 0.35, 0.27, and 0.45, respectively, according to the GBLUP model. The GBLUP accuracy estimates ranged from 0.51 to 0.74, while the WGBLUP accuracy estimates for the traits under study ranged from 0.51 to 0.79. Strong and favorable genetic correlations were observed between GL and NAIPC (0.61), CWT and EMA (0.60), NAIPC and CWT (0.49), AFC and CWT (0.48), CI and GL (0.36), BF and MS (0.35), NAIPC and EMA (0.35), CI and BF (0.30), EMA and MS (0.28), CI and AFC (0.26), AFC and EMA (0.24), and AFC and BF (0.21). The present study identified low to moderate positive genetic correlations between reproductive traits and CWT, suggesting that a heavier body weight may lead to a longer CI, AFC, GL, and NAIPC. The moderately positive genetic correlations of CWT with AFC and NAIPC, combined with phenotypic correlations of nearly zero, suggest that genotype-environment interactions are more likely to be responsible for the phenotypic manifestation of these traits. As a result, the inclusion of these traits by breeders as selection criteria may present a good opportunity for developing a selection index to increase the response to selection and the identification of candidate animals, which can result in significantly increased profitability of production systems.
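Note: the GBLUP step named in the abstract can be illustrated with a minimal sketch. It assumes a single trait, genotypes coded 0/1/2, allele frequencies taken from the sample itself, a VanRaden-style relationship matrix, the phenotypic mean as the only fixed effect, and a known heritability. The genotypes and phenotypes are invented toy values, and the function names are hypothetical; this is not the MTG2 multi-trait analysis used in the study.

```python
from typing import List

Matrix = List[List[float]]


def genomic_relationship_matrix(genotypes: Matrix) -> Matrix:
    """G = ZZ' / (2 * sum_j p_j (1 - p_j)) for genotypes coded 0/1/2."""
    n, m = len(genotypes), len(genotypes[0])
    # Assumption: allele frequencies are estimated from the genotyped sample.
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]
    z = [[row[j] - 2.0 * freqs[j] for j in range(m)] for row in genotypes]
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    return [[sum(z[i][k] * z[j][k] for k in range(m)) / scale
             for j in range(n)] for i in range(n)]


def solve(a: Matrix, b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def gblup(genotypes: Matrix, phenotypes: List[float], h2: float) -> List[float]:
    """GEBV = G (G + lambda I)^-1 (y - mean), with lambda = (1 - h2) / h2."""
    n = len(phenotypes)
    g = genomic_relationship_matrix(genotypes)
    lam = (1.0 - h2) / h2
    mean = sum(phenotypes) / n  # simplification: mean is the only fixed effect
    centred = [y - mean for y in phenotypes]
    lhs = [[g[i][j] + (lam if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    weights = solve(lhs, centred)
    return [sum(g[i][j] * weights[j] for j in range(n)) for i in range(n)]


# Toy data (hypothetical): 5 animals, 4 SNPs, one marbling-like phenotype.
toy_genotypes = [[0, 1, 2, 1], [1, 1, 0, 2], [2, 0, 1, 1],
                 [0, 2, 1, 0], [1, 0, 2, 2]]
toy_phenotypes = [4.0, 5.5, 3.0, 6.0, 4.5]
gebv = gblup(toy_genotypes, toy_phenotypes, h2=0.45)
print([round(v, 3) for v in gebv])
```

The ratio lambda = (1 - h²) / h² controls shrinkage: a low-heritability trait such as CI (h² = 0.10 in the abstract) would pull the predictions much closer to zero than MS (h² = 0.45) does.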
Affiliation(s)
- Md Azizul Haque
- Department of Biotechnology, Yeungnam University, Gyeongsan 38541, Korea
- Asif Iqbal
- Department of Biotechnology, Yeungnam University, Gyeongsan 38541, Korea
- Yun-Mi Lee
- Department of Biotechnology, Yeungnam University, Gyeongsan 38541, Korea
- Jae-Jung Ha
- Gyeongbuk Livestock Research Institute, Yeongju 36052, Korea
- Jong-Joo Kim
- Department of Biotechnology, Yeungnam University, Gyeongsan 38541, Korea
3. Mora M, González P, Quevedo JR, Montañés E, Tusell L, Bergsma R, Piles M. Impact of multi-output and stacking methods on feed efficiency prediction from genotype using machine learning algorithms. J Anim Breed Genet 2023; 140:638-652. [PMID: 37403756 DOI: 10.1111/jbg.12815] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Revised: 05/23/2023] [Accepted: 06/23/2023] [Indexed: 07/06/2023]
Abstract
Feeding represents the largest economic cost in meat production; therefore, selection to improve traits related to feed efficiency is a goal in most livestock breeding programs. Residual feed intake (RFI), that is, the difference between the actual and the expected feed intake based on the animal's requirements, has been used as the selection criterion to improve feed efficiency since it was proposed by Koch in 1963. In growing pigs, it is computed as the residual of the multiple regression model of daily feed intake (DFI) on average daily gain (ADG), backfat thickness (BFT), and metabolic body weight (MW). Recently, prediction using single-output machine learning algorithms with information from SNPs as predictor variables has been proposed for genomic selection in growing pigs, but, as in other species, the prediction quality achieved for RFI has generally been poor. However, it has been suggested that it could be improved through multi-output or stacking methods. For this purpose, four strategies were implemented to predict RFI. Two of them correspond to the computation of RFI in an indirect way using the predicted values of its components obtained from (i) individual predictions (multiple single-output strategy) or (ii) simultaneous predictions (multi-output strategy). The other two correspond to the direct prediction of RFI using (iii) the individual predictions of its components as predictor variables jointly with the genotype (stacking strategy), or (iv) only the genotypes as predictors of RFI (single-output strategy). The single-output strategy was considered the benchmark. This research aimed to test the former three hypotheses using data recorded from 5828 growing pigs and 45,610 SNPs. For all the strategies, two different learning methods were fitted: random forest (RF) and support vector regression (SVR). A nested cross-validation (CV) with an outer 10-fold CV and an inner threefold CV for hyperparameter tuning was implemented to test all strategies. This scheme was repeated using as predictor variables different subsets with an increasing number (from 200 to 3000) of the most informative SNPs identified with RF. Results showed that the highest prediction performance was achieved with 1000 SNPs, although the stability of feature selection was poor (0.13 points out of 1). For all SNP subsets, the benchmark showed the best prediction performance. Using RF as the learner and the 1000 most informative SNPs as predictors, the mean (SD) of the 10 values obtained in the test sets were: 0.23 (0.04) for the Spearman correlation, 0.83 (0.04) for the zero-one loss, and 0.33 (0.03) for the rank distance loss. We conclude that the information on predicted components of RFI (DFI, ADG, MW, and BFT) does not improve the quality of the prediction of this trait relative to that obtained with the single-output strategy.
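Note: the definition of RFI given in the abstract, the residual of a multiple regression of DFI on ADG, BFT, and MW, can be sketched as follows. The sketch assumes ordinary least squares with an intercept and no other fixed effects. The records are invented toy values, and the function names are hypothetical rather than taken from the study.

```python
from typing import List, Sequence


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def residual_feed_intake(dfi: Sequence[float], adg: Sequence[float],
                         bft: Sequence[float], mw: Sequence[float]) -> List[float]:
    """Residuals of the OLS regression DFI ~ 1 + ADG + BFT + MW."""
    design = [[1.0, g, f, w] for g, f, w in zip(adg, bft, mw)]
    k = 4
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, dfi)) for i in range(k)]
    beta = solve(xtx, xty)
    expected = [sum(b * v for b, v in zip(beta, r)) for r in design]
    # RFI is observed intake minus the intake expected from the components.
    return [y - e for y, e in zip(dfi, expected)]


# Toy records (hypothetical) for six growing pigs.
adg = [0.80, 0.95, 0.70, 1.05, 0.88, 0.92]   # kg/day
bft = [10.0, 12.5, 9.0, 14.0, 11.0, 10.5]    # mm
mw = [18.0, 20.5, 17.2, 21.6, 19.5, 19.0]    # kg^0.75
dfi = [2.1, 2.6, 1.9, 2.9, 2.3, 2.5]         # kg/day
rfi = residual_feed_intake(dfi, adg, bft, mw)
print([round(v, 4) for v in rfi])
```

Because the regression includes an intercept, the RFI values sum to zero and are uncorrelated with ADG, BFT, and MW within the data set; a negative value marks an animal that eats less than expected for its growth, fatness, and size.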
Affiliation(s)
- Mónica Mora
- Departamento de Ciencia Animal, Universidad Politècnica de València, Valencia, Spain
- Animal Breeding and Genetics, Institute of Agrifood Research and Technology (IRTA), Barcelona, Spain
- Pablo González
- Artificial Intelligence Centre, University of Oviedo, Gijón, Spain
- Elena Montañés
- Artificial Intelligence Centre, University of Oviedo, Gijón, Spain
- Llibertat Tusell
- Animal Breeding and Genetics, Institute of Agrifood Research and Technology (IRTA), Barcelona, Spain
- Rob Bergsma
- Topigs Norsvin Research Center, Beuningen, Netherlands
- Miriam Piles
- Animal Breeding and Genetics, Institute of Agrifood Research and Technology (IRTA), Barcelona, Spain
4. Chafai N, Hayah I, Houaga I, Badaoui B. A review of machine learning models applied to genomic prediction in animal breeding. Front Genet 2023; 14:1150596. [PMID: 37745853 PMCID: PMC10516561 DOI: 10.3389/fgene.2023.1150596] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/24/2023] [Accepted: 08/22/2023] [Indexed: 09/26/2023] Open
Abstract
The advent of modern genotyping technologies has revolutionized genomic selection in animal breeding. Large marker datasets have exposed several drawbacks of traditional genomic prediction methods in terms of flexibility, accuracy, and computational power. Recently, the application of machine learning models in animal breeding has gained a lot of interest due to their tremendous flexibility and their ability to capture patterns in large noisy datasets. Here, we present a general overview of a handful of machine learning algorithms and their application in genomic prediction to provide a meta-picture of their performance in the estimation of genomic estimated breeding values, genotype imputation, and feature selection. Finally, we discuss a potential adoption of machine learning models in genomic prediction in developing countries. The results of the reviewed studies showed that machine learning models have indeed performed well in fitting large noisy data sets and modeling minor nonadditive effects in some of the studies. However, sometimes conventional methods outperformed machine learning models, which confirms that there is no universal method for genomic prediction. In summary, machine learning models have great potential for extracting patterns from single nucleotide polymorphism datasets. Nonetheless, the level of their adoption in animal breeding is still low due to data limitations, complex genetic interactions, a lack of standardization and reproducibility, and the lack of interpretability of machine learning models when trained with biological data. Consequently, there is no remarkable outperformance of machine learning methods compared to traditional methods in genomic prediction. Therefore, more research should be conducted to discover new insights that could enhance livestock breeding programs.
Affiliation(s)
- Narjice Chafai
- Laboratory of Biodiversity, Ecology, and Genome, Department of Biology, Faculty of Sciences, Mohammed V University in Rabat, Rabat, Morocco
- Ichrak Hayah
- Laboratory of Biodiversity, Ecology, and Genome, Department of Biology, Faculty of Sciences, Mohammed V University in Rabat, Rabat, Morocco
- Isidore Houaga
- Centre for Tropical Livestock Genetics and Health, The Roslin Institute, Royal (Dick) School of Veterinary Medicine, The University of Edinburgh, Edinburgh, United Kingdom
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, United Kingdom
- Bouabid Badaoui
- Laboratory of Biodiversity, Ecology, and Genome, Department of Biology, Faculty of Sciences, Mohammed V University in Rabat, Rabat, Morocco
- African Sustainable Agriculture Research Institute (ASARI), Mohammed VI Polytechnic University (UM6P), Laayoune, Morocco
5. Neshat M, Lee S, Momin MM, Truong B, van der Werf JHJ, Lee SH. An effective hyper-parameter can increase the prediction accuracy in a single-step genetic evaluation. Front Genet 2023; 14:1104906. [PMID: 37359380 PMCID: PMC10285379 DOI: 10.3389/fgene.2023.1104906] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/22/2022] [Accepted: 05/23/2023] [Indexed: 06/28/2023] Open
Abstract
The H-matrix best linear unbiased prediction (HBLUP) method has been widely used in livestock breeding programs. It can integrate all information, including pedigree, genotypes, and phenotypes on both genotyped and non-genotyped individuals, into one single evaluation that can provide reliable predictions of breeding values. The existing HBLUP method requires hyper-parameters that should be adequately optimised, as otherwise the genomic prediction accuracy may decrease. In this study, we assess the performance of HBLUP using various hyper-parameters such as blending, tuning, and scale factor in simulated and real data on Hanwoo cattle. In both simulated and cattle data, we show that blending is not necessary: the prediction accuracy decreases when using a blending hyper-parameter <1. The tuning process (adjusting genomic relationships accounting for base allele frequencies) improves prediction accuracy in the simulated data, confirming previous studies, although the improvement is not statistically significant in the Hanwoo cattle data. We also demonstrate that a scale factor, α, which determines the relationship between allele frequency and per-allele effect size, can improve the HBLUP accuracy in both simulated and real data. Our findings suggest that an optimal scale factor should be considered to increase prediction accuracy, in addition to blending and tuning processes, when using HBLUP.
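Note: the abstract does not give the formula for the scale factor α. One common parameterisation, assumed here purely for illustration, weights each centred marker by [2p(1 − p)]^α, so that each marker contributes to the genomic relationship matrix with weight [2p(1 − p)]^(2α). Under that assumption α = −0.5 recovers the usual standardised relationship matrix and α = 0 leaves the centred genotypes unscaled. The function name and toy genotypes are hypothetical, and the sketch covers only the genomic relationship matrix, not the full H-matrix.

```python
from typing import List


def scaled_grm(genotypes: List[List[float]], alpha: float) -> List[List[float]]:
    """Genomic relationship matrix with marker weights [2p(1-p)]^(2*alpha).

    Assumed parameterisation (not taken from the abstract): each centred
    genotype is multiplied by [2p(1-p)]^alpha before the cross-product.
    """
    n, m = len(genotypes), len(genotypes[0])
    grm = [[0.0] * n for _ in range(n)]
    used = 0
    for j in range(m):
        p = sum(row[j] for row in genotypes) / (2.0 * n)
        var = 2.0 * p * (1.0 - p)
        if var == 0.0:
            continue  # monomorphic markers carry no information; skip them
        weight = var ** (2.0 * alpha)
        centred = [row[j] - 2.0 * p for row in genotypes]
        for a in range(n):
            for b in range(n):
                grm[a][b] += centred[a] * centred[b] * weight
        used += 1
    if used == 0:
        raise ValueError("no polymorphic markers")
    return [[value / used for value in row] for row in grm]


# Toy genotypes (hypothetical): 4 animals, 3 SNPs coded 0/1/2.
toy_genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 1]]
grm_standardised = scaled_grm(toy_genotypes, alpha=-0.5)
grm_unscaled = scaled_grm(toy_genotypes, alpha=0.0)
print(round(grm_standardised[0][0], 4), round(grm_unscaled[0][0], 4))
```

In this parameterisation a more negative α gives rare alleles more weight, which corresponds to assuming larger per-allele effects at low-frequency markers; treating α as a hyper-parameter means trying several values and keeping the one with the best validation accuracy.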
Affiliation(s)
- Mehdi Neshat
- Australian Centre for Precision Health, University of South Australia, Adelaide, SA, Australia
- UniSA Allied Health and Human Performance, University of South Australia, Adelaide, SA, Australia
- South Australian Health and Medical Research Institute (SAHMRI), Adelaide, SA, Australia
- Soohyun Lee
- Division of Animal Breeding and Genetics, National Institute of Animal Science (NIAS), Cheonan, Republic of Korea
- Md. Moksedul Momin
- Australian Centre for Precision Health, University of South Australia, Adelaide, SA, Australia
- UniSA Allied Health and Human Performance, University of South Australia, Adelaide, SA, Australia
- South Australian Health and Medical Research Institute (SAHMRI), Adelaide, SA, Australia
- Department of Genetics and Animal Breeding, Faculty of Veterinary Medicine, Chattogram Veterinary and Animal Sciences University (CVASU), Chattogram, Bangladesh
- Buu Truong
- Australian Centre for Precision Health, University of South Australia, Adelaide, SA, Australia
- Cardiovascular Research Centre, Massachusetts General Hospital, Boston, MA, United States
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, United States
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and Massachusetts Institute of Technology (MIT), Cambridge, MA, United States
- S. Hong Lee
- Australian Centre for Precision Health, University of South Australia, Adelaide, SA, Australia
- UniSA Allied Health and Human Performance, University of South Australia, Adelaide, SA, Australia
- South Australian Health and Medical Research Institute (SAHMRI), Adelaide, SA, Australia
6. Perez BC, Bink MCAM, Svenson KL, Churchill GA, Calus MPL. Prediction performance of linear models and gradient boosting machine on complex phenotypes in outbred mice. G3 (Bethesda, Md.) 2022; 12:6528848. [PMID: 35166767 PMCID: PMC8982369 DOI: 10.1093/g3journal/jkac039] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/15/2021] [Accepted: 01/29/2022] [Indexed: 12/14/2022]
Abstract
We compared the performance of linear (GBLUP, BayesB, and elastic net) methods to a nonparametric tree-based ensemble (gradient boosting machine) method for genomic prediction of complex traits in mice. The dataset used contained genotypes for 50,112 SNP markers and phenotypes for 835 animals from 6 generations. Traits analyzed were bone mineral density, body weight at 10, 15, and 20 weeks, fat percentage, circulating cholesterol, glucose, insulin, triglycerides, and urine creatinine. The youngest generation was used as a validation subset, and predictions were based on all older generations. Model performance was evaluated by comparing predictions for animals in the validation subset against their adjusted phenotypes. Linear models outperformed gradient boosting machine for 7 out of 10 traits. For bone mineral density, cholesterol, and glucose, the gradient boosting machine model showed better prediction accuracy and lower relative root mean squared error than the linear models. Interestingly, for these 3 traits, there is evidence of a relevant portion of phenotypic variance being explained by epistatic effects. Using a subset of top markers selected from a gradient boosting machine model improved the accuracy of prediction for some of the traits when these markers were fitted in linear and gradient boosting machine models. Our results indicate that gradient boosting machine is more strongly affected by data size and decreased connectedness between reference and validation sets than the linear models. Although the linear models outperformed gradient boosting machine for the polygenic traits, our results suggest that gradient boosting machine is a competitive method to predict complex traits with assumed epistatic effects.
Affiliation(s)
- Bruno C Perez
- Hendrix Genetics B.V., Research and Technology Center (RTC), 5830 AC Boxmeer, The Netherlands
- Marco C A M Bink
- Hendrix Genetics B.V., Research and Technology Center (RTC), 5830 AC Boxmeer, The Netherlands
- Mario P L Calus
- Wageningen University & Research, Animal Breeding and Genomics, 6700 AH Wageningen, The Netherlands