1
Schreiber M, Jayakodi M, Stein N, Mascher M. Plant pangenomes for crop improvement, biodiversity and evolution. Nat Rev Genet 2024; 25:563-577. [PMID: 38378816 DOI: 10.1038/s41576-024-00691-4] [Accepted: 12/14/2023] [Indexed: 02/22/2024]
Abstract
Plant genome sequences catalogue genes and the genetic elements that regulate their expression. Such inventories further research aims as diverse as mapping the molecular basis of trait diversity in domesticated plants or inquiries into the origin of evolutionary innovations in flowering plants millions of years ago. The transformative technological progress of DNA sequencing in the past two decades has enabled researchers to sequence ever more genomes with greater ease. Pangenomes - complete sequences of multiple individuals of a species or higher taxonomic unit - have now entered the geneticists' toolkit. The genomes of crop plants and their wild relatives are being studied with translational applications in breeding in mind. But pangenomes are applicable also in ecological and evolutionary studies, as they help classify and monitor biodiversity across the tree of life, deepen our understanding of how plant species diverged and show how plants adapt to changing environments or new selection pressures exerted by human beings.
Affiliation(s)
- Mona Schreiber
  - Department of Biology, University of Marburg, Marburg, Germany
- Murukarthick Jayakodi
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
- Nils Stein
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
  - Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
- Martin Mascher
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
2
Meyenberg C, Braun V, Longin CFH, Thorwarth P. Feature engineering and parameter tuning: improving phenomic prediction ability in multi-environmental durum wheat breeding trials. Theor Appl Genet 2024; 137:188. [PMID: 39037501 PMCID: PMC11263437 DOI: 10.1007/s00122-024-04695-w] [Received: 11/27/2023] [Accepted: 07/10/2024] [Indexed: 07/23/2024]
Abstract
KEY MESSAGE Optimized phenomic selection in durum wheat uses near-infrared spectra, feature engineering and parameter tuning. Our study reports improvements in predictive ability and emphasizes customized preprocessing for different traits and models. The success of plant breeding programs depends on efficient selection decisions. Phenomic selection has been proposed as a tool to predict phenotype performance based on near-infrared spectra (NIRS) to support selection decisions. In this study, we test the performance of phenomic selection in multi-environmental trials from our durum wheat breeding program for three breeding scenarios and use feature engineering as well as parameter tuning to improve the phenomic prediction ability. In addition, we investigate the influence of genotype and environment on the phenomic prediction ability for agronomic and quality traits. Preprocessing, based on a grid search over the Savitzky-Golay filter parameters based on 756,000 genotype best linear unbiased estimate (BLUE) computations, improved the phenomic prediction ability by up to 1500% (0.02-0.3). Furthermore, we show that preprocessing should be optimized depending on the dataset, trait, and model used for prediction. The phenomic prediction scenarios in our durum breeding program resulted in low-to-moderate prediction abilities with the highest and most stable prediction results when predicting new genotypes in the same environment as used for model training. This is consistent with the finding that NIRS capture both the genotype and genotype-by-environment (G × E) interaction variance.
Affiliation(s)
- Carina Meyenberg
  - State Plant Breeding Institute, University of Hohenheim, Fruwirthstr. 21, 70599, Stuttgart, Germany
- Vincent Braun
  - State Plant Breeding Institute, University of Hohenheim, Fruwirthstr. 21, 70599, Stuttgart, Germany
- Patrick Thorwarth
  - State Plant Breeding Institute, University of Hohenheim, Fruwirthstr. 21, 70599, Stuttgart, Germany
3
Hemstrom W, Grummer JA, Luikart G, Christie MR. Next-generation data filtering in the genomics era. Nat Rev Genet 2024:10.1038/s41576-024-00738-6. [PMID: 38877133 DOI: 10.1038/s41576-024-00738-6] [Accepted: 04/25/2024] [Indexed: 06/16/2024]
Abstract
Genomic data are ubiquitous across disciplines, from agriculture to biodiversity, ecology, evolution and human health. However, these datasets often contain noise or errors and are missing information that can affect the accuracy and reliability of subsequent computational analyses and conclusions. A key step in genomic data analysis is filtering - removing sequencing bases, reads, genetic variants and/or individuals from a dataset - to improve data quality for downstream analyses. Researchers are confronted with a multitude of choices when filtering genomic data; they must choose which filters to apply and select appropriate thresholds. To help usher in the next generation of genomic data filtering, we review and suggest best practices to improve the implementation, reproducibility and reporting standards for filter types and thresholds commonly applied to genomic datasets. We focus mainly on filters for minor allele frequency, missing data per individual or per locus, linkage disequilibrium and Hardy-Weinberg deviations. Using simulated and empirical datasets, we illustrate the large effects of different filtering thresholds on common population genetics statistics, such as Tajima's D value, population differentiation (FST), nucleotide diversity (π) and effective population size (Ne).
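The two filters the abstract names first, minor allele frequency and missing data per locus, can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the 0/1/2 dosage encoding with `None` for a failed call, and the default thresholds (0.05 and 0.2) are all assumptions chosen for the example.

```python
from typing import List, Optional

# Assumed encoding: 0, 1 or 2 copies of the alternate allele; None = missing call.
Genotype = Optional[int]


def locus_missing_rate(calls: List[Genotype]) -> float:
    """Fraction of individuals with no call at this locus."""
    return sum(c is None for c in calls) / len(calls)


def minor_allele_frequency(calls: List[Genotype]) -> float:
    """MAF computed from the non-missing diploid calls at one locus."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    alt_freq = sum(observed) / (2 * len(observed))
    return min(alt_freq, 1.0 - alt_freq)


def filter_loci(matrix: List[List[Genotype]],
                min_maf: float = 0.05,
                max_missing: float = 0.2) -> List[int]:
    """Return the indices of loci (rows) that pass both thresholds."""
    kept = []
    for i, calls in enumerate(matrix):
        if locus_missing_rate(calls) > max_missing:
            continue  # too many failed calls
        if minor_allele_frequency(calls) < min_maf:
            continue  # monomorphic or too rare
        kept.append(i)
    return kept


# Toy data: 4 loci (rows) x 5 individuals (columns).
loci = [
    [0, 1, 2, 1, 0],        # common variant, complete data
    [0, 0, 0, 0, 0],        # monomorphic
    [0, None, None, 1, 2],  # 40% missing
    [0, 0, 0, 0, 1],        # rare variant, MAF = 1/10
]
```

With this toy matrix, `filter_loci(loci)` keeps loci 0 and 3, and raising `min_maf` to 0.2 also drops the rare variant. Reporting both thresholds alongside the retained locus count is one simple way to make such a filtering step reproducible.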
Affiliation(s)
- William Hemstrom
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Jared A Grummer
  - Flathead Lake Biological Station, Wildlife Biology Program and Division of Biological Sciences, University of Montana, Missoula, MT, USA
- Gordon Luikart
  - Flathead Lake Biological Station, Wildlife Biology Program and Division of Biological Sciences, University of Montana, Missoula, MT, USA
- Mark R Christie
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
  - Department of Forestry and Natural Resources, Purdue University, West Lafayette, IN, USA
4
Aguirre NC, Villalba PV, García MN, Filippi CV, Rivas JG, Martínez MC, Acuña CV, López AJ, López JA, Pathauer P, Palazzini D, Harrand L, Oberschelp J, Marcó MA, Cisneros EF, Carreras R, Martins Alves AM, Rodrigues JC, Hopp HE, Grattapaglia D, Cappa EP, Paniego NB, Marcucci Poltri SN. Comparison of ddRADseq and EUChip60K SNP genotyping systems for population genetics and genomic selection in Eucalyptus dunnii (Maiden). Front Genet 2024; 15:1361418. [PMID: 38606359 PMCID: PMC11008695 DOI: 10.3389/fgene.2024.1361418] [Received: 12/26/2023] [Accepted: 02/19/2024] [Indexed: 04/13/2024]
Abstract
Eucalyptus dunnii is one of the most important Eucalyptus species for short-fiber pulp production in regions where other species of the genus are affected by poor soil and climatic conditions. In this context, E. dunnii holds promise as a resource to address and adapt to the challenges of climate change. Despite its rapid growth and favorable wood properties for solid wood products, the advancement of its improvement remains in its early stages. In this work, we evaluated the performance of two single nucleotide polymorphism (SNP) genotyping methods for population genetics analysis and genomic selection (GS) in E. dunnii. Double digest restriction-site associated DNA sequencing (ddRADseq) was compared with the EUChip60K array in 308 individuals from a provenance-progeny trial. The compared SNP sets included 8,011 and 19,008 informative SNPs distributed along the 11 chromosomes, respectively. Although the two datasets differed in the percentage of missing data, genome coverage, minor allele frequency and estimated genetic diversity parameters, they revealed a similar genetic structure, showing two subpopulations with little differentiation between them, and low linkage disequilibrium. GS analyses were performed for eleven traits using Genomic Best Linear Unbiased Prediction (GBLUP) and a conventional pedigree-based model (ABLUP). Regardless of the SNP dataset, the predictive ability (PA) of GBLUP was better than that of ABLUP for six traits (Cellulose content, Total and Ethanolic extractives, Total and Klason lignin content and Syringyl and Guaiacyl lignin monomer ratio). When contrasting the SNP datasets used to estimate PAs, the GBLUP-EUChip60K model gave higher and significant PA values for six traits, whereas ddRADseq gave higher values for three other traits. The PAs correlated positively with narrow sense heritabilities, with the highest correlations shown by the ABLUP and GBLUP-EUChip60K.
The two genotyping methods, ddRADseq and EUChip60K, are generally comparable for population genetics and genomic prediction, demonstrating the utility of the former when subjected to rigorous SNP filtering. The results of this study provide a basis for future whole-genome studies using ddRADseq in non-model forest species for which SNP arrays have not yet been developed.
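GBLUP differs from the pedigree-based ABLUP in that the relationship matrix among individuals is estimated from markers. The abstract does not say which estimator was used, so the sketch below assumes one widely used choice, VanRaden's first method, G = ZZ' / (2 Σ p_j (1 − p_j)), where Z holds allele dosages centred by twice the allele frequency. The function name and the toy genotypes are hypothetical.

```python
from typing import List


def genomic_relationship_matrix(genotypes: List[List[int]]) -> List[List[float]]:
    """Marker-based relationship matrix (assumed estimator: VanRaden method 1).

    genotypes: one row per individual, one 0/1/2 allele dosage per marker,
    with no missing values.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Allele frequency of the counted allele at each marker.
    p = [sum(ind[j] for ind in genotypes) / (2 * n) for j in range(m)]
    # Scaling term: twice the sum of p(1 - p) over markers.
    scale = 2 * sum(pj * (1 - pj) for pj in p)
    # Centre each dosage by its expected value 2p.
    z = [[ind[j] - 2 * p[j] for j in range(m)] for ind in genotypes]
    return [
        [sum(a * b for a, b in zip(z[i], z[k])) / scale for k in range(n)]
        for i in range(n)
    ]


# Toy example: 3 individuals x 4 markers.
toy_genotypes = [
    [0, 1, 2, 1],
    [1, 1, 0, 2],
    [2, 0, 1, 1],
]
G = genomic_relationship_matrix(toy_genotypes)
```

In a GBLUP analysis, a matrix of this kind takes the place of the pedigree-derived numerator relationship matrix in the mixed model. Because the allele frequencies here come from the same sample, each row of `G` sums to zero.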
Affiliation(s)
- Martín Nahuel García
  - Instituto de Agrobiotecnología y Biología Molecular, UEDD INTA-CONICET, Hurlingham, Argentina
- Carla Valeria Filippi
  - Instituto de Agrobiotecnología y Biología Molecular, UEDD INTA-CONICET, Hurlingham, Argentina
  - Laboratorio de Bioquímica, Departamento de Biología Vegetal, Facultad de Agronomía, Universidad de la República, Montevideo, Uruguay
- Juan Gabriel Rivas
  - Instituto de Agrobiotecnología y Biología Molecular, UEDD INTA-CONICET, Hurlingham, Argentina
- María Carolina Martínez
  - Instituto de Agrobiotecnología y Biología Molecular, UEDD INTA-CONICET, Hurlingham, Argentina
- Cintia Vanesa Acuña
  - Instituto de Agrobiotecnología y Biología Molecular, UEDD INTA-CONICET, Hurlingham, Argentina
- Augusto J. López
  - Estación Experimental Agropecuaria de Bella Vista, Instituto Nacional de Tecnología Agropecuaria, Bella Vista, Argentina
- Juan Adolfo López
  - Estación Experimental Agropecuaria de Bella Vista, Instituto Nacional de Tecnología Agropecuaria, Bella Vista, Argentina
- Pablo Pathauer
  - Instituto de Recursos Biológicos, Instituto Nacional de Tecnología Agropecuaria, Hurlingham, Argentina
- Dino Palazzini
  - Instituto de Recursos Biológicos, Instituto Nacional de Tecnología Agropecuaria, Hurlingham, Argentina
- Leonel Harrand
  - Estación Experimental Agropecuaria de Concordia, Instituto Nacional de Tecnología Agropecuaria, Concordia, Argentina
- Javier Oberschelp
  - Estación Experimental Agropecuaria de Concordia, Instituto Nacional de Tecnología Agropecuaria, Concordia, Argentina
- Martín Alberto Marcó
  - Estación Experimental Agropecuaria de Concordia, Instituto Nacional de Tecnología Agropecuaria, Concordia, Argentina
- Esteban Felipe Cisneros
  - Facultad de Ciencias Forestales, Universidad Nacional de Santiago del Estero (UNSE), Santiago del Estero, Argentina
- Rocío Carreras
  - Facultad de Ciencias Forestales, Universidad Nacional de Santiago del Estero (UNSE), Santiago del Estero, Argentina
- Ana Maria Martins Alves
  - Centro de Estudos Florestais e Laboratório Associado TERRA, Instituto Superior de Agronomia, Universidade de Lisboa, Tapada da Ajuda, Lisboa, Portugal
- José Carlos Rodrigues
  - Centro de Estudos Florestais e Laboratório Associado TERRA, Instituto Superior de Agronomia, Universidade de Lisboa, Tapada da Ajuda, Lisboa, Portugal
- H. Esteban Hopp
  - Instituto de Agrobiotecnología y Biología Molecular, UEDD INTA-CONICET, Hurlingham, Argentina
- Dario Grattapaglia
  - Empresa Brasileira de Pesquisa Agropecuária (EMBRAPA), Recursos Genéticos e Biotecnologia, Brasilia, Brazil
- Eduardo Pablo Cappa
  - Instituto de Recursos Biológicos, Instituto Nacional de Tecnología Agropecuaria, Hurlingham, Argentina
  - Consejo Nacional de Investigaciones Científicas y Técnicas, Buenos Aires, Argentina
- Norma Beatriz Paniego
  - Instituto de Agrobiotecnología y Biología Molecular, UEDD INTA-CONICET, Hurlingham, Argentina
5
Freitas LA, Savegnago RP, Alves AAC, Stafuzza NB, Pedrosa VB, Rocha RA, Rosa GJM, Paz CCP. Genome-enabled prediction of indicator traits of resistance to gastrointestinal nematodes in sheep using parametric models and artificial neural networks. Res Vet Sci 2024; 166:105099. [PMID: 38091815 DOI: 10.1016/j.rvsc.2023.105099] [Received: 07/12/2023] [Revised: 11/15/2023] [Accepted: 11/19/2023] [Indexed: 01/01/2024]
Abstract
This study aimed to assess the predictive ability of parametric models and an artificial neural network method for genomic prediction of the following indicator traits of resistance to gastrointestinal nematodes in Santa Inês sheep: packed cell volume (PCV), fecal egg count (FEC), and Famacha© method (FAM). After quality control, the number of genotyped animals was 551 (PCV), 548 (FEC), and 565 (FAM), with 41,676 SNPs. The average prediction accuracy (ACC), calculated as the Pearson correlation between observed and predicted values, and the mean squared error (MSE) were obtained using genomic best linear unbiased predictor (GBLUP), BayesA, BayesB, Bayesian least absolute shrinkage and selection operator (BLASSO), and Bayesian regularized artificial neural network (three and four hidden neurons, BRANN_3 and BRANN_4, respectively) in a 5-fold cross-validation scheme. The average ACC varied from moderate to high according to the trait and model, ranging between 0.418 and 0.546 (PCV), between 0.646 and 0.793 (FEC), and between 0.414 and 0.519 (FAM). Parametric models presented nearly the same ACC and MSE for the studied traits and provided better accuracies than BRANN. The GBLUP, BayesA, BayesB and BLASSO models provided better accuracies than the BRANN_3 method, increasing by around 23% for PCV, and 18.5% for FEC. In conclusion, parametric models are suitable for genome-enabled prediction of indicator traits of resistance to gastrointestinal nematodes in sheep. Because the differences in accuracy among the parametric models were small, the GBLUP model is recommended for its lower computational cost.
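The evaluation scheme described here (accuracy as the Pearson correlation between observed and predicted values, plus mean squared error, averaged over five folds) can be sketched in a model-agnostic way. Everything below is illustrative: the function names, the fixed random seed, the toy data and the simple one-covariate regression that stands in for a genomic model are assumptions, not the study's implementation.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error of the predictions."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def k_fold_indices(n: int, k: int = 5, seed: int = 0) -> List[List[int]]:
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(y: Sequence[float],
                   fit_predict: Callable[[List[int], List[int]], List[float]],
                   k: int = 5, seed: int = 0) -> Tuple[float, float]:
    """Average per-fold accuracy and MSE.

    fit_predict(train_idx, test_idx) must train on train_idx only and
    return predictions for test_idx, in order.
    """
    accs, errs = [], []
    for test_idx in k_fold_indices(len(y), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        preds = fit_predict(train_idx, test_idx)
        obs = [y[i] for i in test_idx]
        accs.append(pearson(obs, preds))
        errs.append(mse(obs, preds))
    return sum(accs) / k, sum(errs) / k


def make_linear_model(x, y):
    """Placeholder model: least-squares line on one covariate (not a genomic model)."""
    def fit_predict(train_idx, test_idx):
        n = len(train_idx)
        mx = sum(x[i] for i in train_idx) / n
        my = sum(y[i] for i in train_idx) / n
        sxx = sum((x[i] - mx) ** 2 for i in train_idx)
        sxy = sum((x[i] - mx) * (y[i] - my) for i in train_idx)
        slope = sxy / sxx
        return [my + slope * (x[i] - mx) for i in test_idx]
    return fit_predict


# Toy data: a linear signal with a small deterministic perturbation.
x = list(range(20))
y = [2 * xi + 1 + (0.5 if xi % 2 else -0.5) for xi in x]
acc, err = cross_validate(y, make_linear_model(x, y), k=5)
```

Swapping `make_linear_model` for any function with the same `fit_predict(train_idx, test_idx)` signature lets different models be compared on identical folds, which is what makes per-model ACC and MSE values comparable.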
Affiliation(s)
- L A Freitas
  - University of Sao Paulo, Department of Genetics, Ribeirão Preto, São Paulo 14049-900, Brazil
  - University of Wisconsin, Department of Animal and Dairy Sciences, Madison 53706, USA
- R P Savegnago
  - Michigan State University, Department of Animal Science, MI 48864, USA
- A A C Alves
  - University of Wisconsin, Department of Animal and Dairy Sciences, Madison 53706, USA
- N B Stafuzza
  - Sustainable Livestock Research Center, Animal Science Institute, São José do Rio Preto, São Paulo 15130-000, Brazil
- V B Pedrosa
  - State University of Ponta Grossa, Ponta Grossa, Paraná 84030-900, Brazil
- R A Rocha
  - State University of Ponta Grossa, Ponta Grossa, Paraná 84030-900, Brazil
- G J M Rosa
  - University of Wisconsin, Department of Animal and Dairy Sciences, Madison 53706, USA
- C C P Paz
  - University of Sao Paulo, Department of Genetics, Ribeirão Preto, São Paulo 14049-900, Brazil
  - Sustainable Livestock Research Center, Animal Science Institute, São José do Rio Preto, São Paulo 15130-000, Brazil
6
Lozada DN, Sandhu KS, Bhatta M. Ridge regression and deep learning models for genome-wide selection of complex traits in New Mexican Chile peppers. BMC Genom Data 2023; 24:80. [PMID: 38110866 PMCID: PMC10726521 DOI: 10.1186/s12863-023-01179-6] [Received: 06/16/2023] [Accepted: 12/05/2023] [Indexed: 12/20/2023]
Abstract
BACKGROUND Genomewide prediction estimates the genomic breeding values of selection candidates, which can be used for population improvement and cultivar development. Ridge regression and deep learning-based selection models were implemented for yield and agronomic traits of 204 chile pepper genotypes evaluated in multi-environment trials in New Mexico, USA. RESULTS Prediction accuracy differed among models under ten-fold cross-validation, and high prediction accuracy was observed for highly heritable traits such as plant height and plant width. No model was superior across traits using 14,922 SNP markers for genomewide selection. Bayesian ridge regression had the highest average accuracy for first pod date (0.77) and total yield per plant (0.33). Multilayer perceptron (MLP) performed best for flowering time (0.76) and plant height (0.73), whereas the genomic BLUP model had the highest accuracy for plant width (0.62). Using a subset of 7,690 SNP loci obtained by grouping markers based on linkage disequilibrium coefficients improved accuracy for first pod date, ten pod weight, and total yield per plant, even under a relatively small training population size for MLP and random forest models. Genomic and ridge regression BLUP models were sufficient for optimal prediction accuracies for small training population size. Combining phenotypic selection and genomewide selection resulted in improved selection response for yield-related traits, indicating that integrated approaches can result in improved gains achieved through selection. CONCLUSIONS Accuracy values for ridge regression and deep learning prediction models demonstrate the potential of implementing genomewide selection for genetic improvement in chile pepper breeding programs. Ultimately, a large training dataset is important for improved genomic selection accuracy with the deep learning models.
Affiliation(s)
- Dennis N Lozada
  - Department of Plant and Environmental Sciences, New Mexico State University, Las Cruces, NM, 88003, USA
  - Chile Pepper Institute, New Mexico State University, Las Cruces, NM, 88003, USA
7
Nisa FU, Kaul H, Asif M, Amin I, Mrode R, Mansoor S, Mukhtar Z. Genetic insights into crossbred dairy cattle of Pakistan: exploring allele frequency, linkage disequilibrium, and effective population size at a genome-wide scale. Mamm Genome 2023; 34:602-614. [PMID: 37804434 DOI: 10.1007/s00335-023-10019-y] [Received: 06/27/2023] [Accepted: 09/13/2023] [Indexed: 10/09/2023]
Abstract
Linkage disequilibrium (LD) affects the accuracy of genomic studies. High-density genotyping platforms identify single nucleotide polymorphisms (SNPs) across animal genomes, increasing the resolution at which LD can be evaluated. This study aimed to evaluate the decay and magnitude of LD in a cohort of 81 crossbred dairy cattle using the GGP_HDv3_C Bead Chip. After quality control, 116,710 SNPs across 2520.241 Mb of autosomes were retained. LD extent was assessed between autosomal SNPs within a 10 Mb range using the r² statistic. LD declined as inter-marker distance increased. The average r² value was 0.24 for SNP pairs < 10 kb apart, decreasing to 0.13 for 50-100 kb distances. Minor allele frequency (MAF) and sample size significantly affect LD estimates. Lower MAF thresholds result in smaller r² values, while higher thresholds show increased r² values. Additionally, smaller sample sizes exhibit higher average r² values, especially for larger physical distance intervals (> 50 kb) between SNP pairs. Effective population size and inbreeding coefficient were 150 and 0.028 for the present generation, indicating a decrease in genetic diversity over time. These findings imply that the use of high-density SNP panels and customized/breed-specific SNP panels represents a highly favorable approach for conducting genome-wide association studies (GWAS) and implementing genomic selection (GS) in the Bos indicus cattle breeds, whose genomes are still largely unexplored. Furthermore, it is imperative to devise a meticulous breeding strategy tailored to each herd, aiming to enhance desired traits while simultaneously preserving genetic diversity.
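The LD decay summary reported here, mean r² per inter-marker distance class, can be illustrated with a short sketch. It assumes r² is taken as the squared Pearson correlation of 0/1/2 allele dosages, a common approach when haplotype phase is unknown; the abstract does not state which estimator the authors used, and the function names, bin edges and toy genotypes below are hypothetical.

```python
import math
from typing import List, Optional, Sequence


def genotype_r2(g1: Sequence[int], g2: Sequence[int]) -> float:
    """Squared correlation of allele dosages at two SNPs (NaN if either is monomorphic)."""
    n = len(g1)
    m1, m2 = sum(g1) / n, sum(g2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    v1 = sum((a - m1) ** 2 for a in g1)
    v2 = sum((b - m2) ** 2 for b in g2)
    if v1 == 0 or v2 == 0:
        return float("nan")
    return cov * cov / (v1 * v2)


def mean_r2_by_distance(positions: Sequence[int],
                        genotypes: Sequence[Sequence[int]],
                        bin_edges: Sequence[int]) -> List[Optional[float]]:
    """Mean r2 of SNP pairs whose distance d satisfies edge[b] <= d < edge[b + 1]."""
    n_bins = len(bin_edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            d = abs(positions[j] - positions[i])
            r2 = genotype_r2(genotypes[i], genotypes[j])
            if math.isnan(r2):
                continue
            for b in range(n_bins):
                if bin_edges[b] <= d < bin_edges[b + 1]:
                    sums[b] += r2
                    counts[b] += 1
                    break
    # None marks a distance class that received no SNP pairs.
    return [s / c if c else None for s, c in zip(sums, counts)]


# Toy data: 3 SNPs on one chromosome, 6 individuals each.
positions = [1_000, 6_000, 80_000]  # base-pair coordinates
genotypes = [
    [0, 1, 2, 1, 0, 2],
    [0, 1, 2, 1, 0, 2],  # identical to the first SNP, so r2 = 1
    [0, 2, 0, 2, 1, 1],  # uncorrelated with the first two, so r2 = 0
]
decay = mean_r2_by_distance(positions, genotypes, bin_edges=[0, 10_000, 100_000])
```

For the toy data the close pair falls in the first bin and the two distant pairs in the second, so `decay` shows high LD at short range and none at long range. Real panels need a windowed pairwise scan rather than this all-pairs loop, which scales quadratically with the number of SNPs.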
Affiliation(s)
- Fakhar Un Nisa
  - Agricultural Biotechnology Division, National Institute for Biotechnology and Genetic Engineering College (NIBGE-C), Faisalabad, Pakistan
  - Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad, Pakistan
  - Department of Animal Breeding and Genetics, University of Veterinary and Animal Sciences, Lahore, Pakistan
- Haiba Kaul
  - Department of Animal Breeding and Genetics, University of Veterinary and Animal Sciences, Lahore, Pakistan
- Muhammad Asif
  - Agricultural Biotechnology Division, National Institute for Biotechnology and Genetic Engineering College (NIBGE-C), Faisalabad, Pakistan
  - Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad, Pakistan
- Imran Amin
  - Agricultural Biotechnology Division, National Institute for Biotechnology and Genetic Engineering College (NIBGE-C), Faisalabad, Pakistan
  - Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad, Pakistan
- Raphael Mrode
  - Animal Biosciences, International Livestock Research Institute, Nairobi, Kenya
  - Animal and Veterinary Sciences, Scotland's Rural College, Edinburgh, UK
- Shahid Mansoor
  - Agricultural Biotechnology Division, National Institute for Biotechnology and Genetic Engineering College (NIBGE-C), Faisalabad, Pakistan
  - Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad, Pakistan
  - International Centre for Chemical and Biological Sciences, University of Karachi, Karachi, Pakistan
- Zahid Mukhtar
  - Agricultural Biotechnology Division, National Institute for Biotechnology and Genetic Engineering College (NIBGE-C), Faisalabad, Pakistan
  - Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad, Pakistan
8
Jurado-Ruiz F, Rousseau D, Botía JA, Aranzana MJ. GenoDrawing: An Autoencoder Framework for Image Prediction from SNP Markers. Plant Phenomics 2023; 5:0113. [PMID: 38239740 PMCID: PMC10795539 DOI: 10.34133/plantphenomics.0113] [Received: 07/04/2023] [Accepted: 10/23/2023] [Indexed: 01/22/2024]
Abstract
Advancements in genome sequencing have facilitated whole-genome characterization of numerous plant species, providing an abundance of genotypic data for genomic analysis. Genomic selection and neural networks (NNs), particularly deep learning, have been developed to predict complex traits from dense genotypic data. Autoencoders, NN models that extract features from images in an unsupervised manner, have proven useful for plant phenotyping. This study introduces an autoencoder framework, GenoDrawing, for predicting and retrieving apple images from a low-depth single-nucleotide polymorphism (SNP) array, potentially useful in predicting traits that are difficult to define. GenoDrawing demonstrates proficiency in its task using a small dataset of shape-related SNPs. Results indicate that the use of SNPs associated with visual traits has a substantial impact on the generated images, consistent with biological interpretation. While using substantial SNPs is crucial, incorporating additional, unrelated SNPs results in performance degradation for simple NN architectures that cannot easily identify the most important inputs. The proposed GenoDrawing method is a practical framework for exploring genomic prediction in fruit tree phenotyping, particularly beneficial for small to medium breeding companies to predict economically substantial heritable traits. Although GenoDrawing has limitations, it sets the groundwork for future research in image prediction from genomic markers. Future studies should focus on using stronger models for image reproduction, SNP information extraction, and dataset balance in terms of phenotypes for more precise outcomes.
Affiliation(s)
- Federico Jurado-Ruiz
  - Center for Research in Agricultural Genomics (CRAG), 08193 Barcelona, Cerdanyola, Spain
- David Rousseau
  - Université d’Angers, LARIS, INRAe UMR IRHS, 49000 Angers, France
- Juan A. Botía
  - Department of Information and Communication Engineering, University of Murcia, 30071 Murcia, Spain
- Maria José Aranzana
  - Center for Research in Agricultural Genomics (CRAG), 08193 Barcelona, Cerdanyola, Spain
  - IRTA (Institut de Recerca i Tecnologia Agroalimentàries), Barcelona, Spain
9
Weber SE, Chawla HS, Ehrig L, Hickey LT, Frisch M, Snowdon RJ. Accurate prediction of quantitative traits with failed SNP calls in canola and maize. Front Plant Sci 2023; 14:1221750. [PMID: 37936929 PMCID: PMC10627008 DOI: 10.3389/fpls.2023.1221750] [Received: 05/12/2023] [Accepted: 10/05/2023] [Indexed: 11/09/2023]
Abstract
In modern plant breeding, genomic selection is becoming the gold standard to select superior genotypes in large breeding populations that are only partially phenotyped. Many breeding programs commonly rely on single-nucleotide polymorphism (SNP) markers to capture genome-wide data for selection candidates. For this purpose, SNP arrays with moderate to high marker density represent a robust and cost-effective tool to generate reproducible, easy-to-handle, high-throughput genotype data from large-scale breeding populations. However, SNP arrays are prone to technical errors that lead to failed allele calls. To overcome this problem, failed calls are often imputed, based on the assumption that failed SNP calls are purely technical. However, this ignores the biological causes of failed calls, for example deletions, and there is increasing evidence that gene presence-absence and other kinds of genome structural variants can play a role in phenotypic expression. Because deletions are frequently not in linkage disequilibrium with their flanking SNPs, imputation of missing SNP calls can potentially obscure valuable marker-trait associations. In this study, we analyze published datasets for canola and maize using four parametric and two machine learning models and demonstrate that failed allele calls in genomic prediction are highly predictive for important agronomic traits. We present two statistical pipelines, based on population structure and linkage disequilibrium, that enable the filtering of failed SNP calls that are likely to have biological causes. For the population and trait examined, prediction accuracy based on these filtered failed allele calls was competitive with standard SNP-based prediction, underlining the potential value of missing data in genomic prediction approaches.
The combination of SNPs with all failed allele calls or the filtered allele calls did not outperform predictions with only SNP-based prediction due to redundancy in genomic relationship estimates.
Affiliation(s)
- Sven E. Weber
  - Department of Plant Breeding, Justus Liebig University, Giessen, Germany
- Lennard Ehrig
  - Department of Plant Breeding, Justus Liebig University, Giessen, Germany
- Lee T. Hickey
  - Centre for Crop Science, Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St Lucia, QLD, Australia
- Matthias Frisch
  - Department of Biometry and Population Genetics, Justus Liebig University, Giessen, Germany
- Rod J. Snowdon
  - Department of Plant Breeding, Justus Liebig University, Giessen, Germany
10
Sadeqi MB, Ballvora A, Dadshani S, Léon J. Genetic Parameter and Hyper-Parameter Estimation Underlie Nitrogen Use Efficiency in Bread Wheat. Int J Mol Sci 2023; 24:14275. [PMID: 37762585 PMCID: PMC10531695 DOI: 10.3390/ijms241814275] [Received: 08/08/2023] [Revised: 09/07/2023] [Accepted: 09/14/2023] [Indexed: 09/29/2023]
Abstract
Estimation and prediction play a key role in breeding programs. Currently, phenotyping of complex traits such as nitrogen use efficiency (NUE) in wheat is still expensive, requires high-throughput technologies and is very time consuming compared to genotyping. Therefore, researchers are trying to predict phenotypes based on marker information. Genetic parameters such as population structure, genomic relationship matrix, marker density and sample size are major factors that increase the performance and accuracy of a model. They also play an important role in adjusting the statistically significant false discovery rate (FDR) threshold in estimation. In parallel, there are many genetic hyper-parameters that are hidden and not represented in the given genomic selection (GS) model but have significant effects on the results, such as panel size, number of markers, minor allele frequency, call rate of each marker, number of cross validations and batch size in the training set of the genomic file. The main challenge is to ensure the reliability and accuracy of predicted breeding values (BVs). Our study has confirmed the results of bias-variance tradeoff and adaptive prediction error for the ensemble-learning-based model STACK, which has the highest performance when estimating genetic parameters and hyper-parameters in a given GS model compared to other models.
Affiliation(s)
- Mohammad Bahman Sadeqi
  - INRES-Plant Breeding, Rheinische Friedrich-Wilhelms-Universität Bonn, 53113 Bonn, Germany
- Agim Ballvora
  - INRES-Plant Breeding, Rheinische Friedrich-Wilhelms-Universität Bonn, 53113 Bonn, Germany
- Said Dadshani
  - INRES-Plant Nutrition, Rheinische Friedrich-Wilhelms-Universität Bonn, 53113 Bonn, Germany
- Jens Léon
  - INRES-Plant Breeding, Rheinische Friedrich-Wilhelms-Universität Bonn, 53113 Bonn, Germany
11
Alves AAC, Fernandes AFA, Lopes FB, Breen V, Hawken R, Gianola D, Rosa GJDM. (Quasi) multitask support vector regression with heuristic hyperparameter optimization for whole-genome prediction of complex traits: a case study with carcass traits in broilers. G3 (Bethesda) 2023; 13:jkad109. [PMID: 37216670 PMCID: PMC10411556 DOI: 10.1093/g3journal/jkad109] [Received: 03/13/2023] [Revised: 03/13/2023] [Accepted: 04/24/2023] [Indexed: 05/24/2023]
Abstract
This study investigates nonlinear kernels for multitrait (MT) genomic prediction using support vector regression (SVR) models. We assessed the predictive ability delivered by single-trait (ST) and MT models for 2 carcass traits (CT1 and CT2) measured in purebred broiler chickens. The MT models also included information on indicator traits measured in vivo [growth and feed efficiency trait (FE)]. We proposed an approach termed (quasi) multitask SVR (QMTSVR), with hyperparameter optimization performed via genetic algorithm. ST and MT Bayesian shrinkage and variable selection models [genomic best linear unbiased predictor (GBLUP), BayesC (BC), and reproducing kernel Hilbert space (RKHS) regression] were employed as benchmarks. MT models were trained using 2 validation designs (CV1 and CV2), which differ in whether information on secondary traits is available in the testing set. Models' predictive ability was assessed with prediction accuracy (ACC; i.e. the correlation between predicted and observed values, divided by the square root of phenotype accuracy), standardized root-mean-squared error (RMSE*), and inflation factor (b). To account for potential bias in CV2-style predictions, we also computed a parametric estimate of accuracy (ACCpar). Predictive ability metrics varied according to trait, model, and validation design (CV1 or CV2), ranging from 0.71 to 0.84 for ACC, 0.78 to 0.92 for RMSE*, and between 0.82 and 1.34 for b. The highest ACC and smallest RMSE* were achieved with QMTSVR-CV2 in both traits. We observed that for CT1, model/validation design selection was sensitive to the choice of accuracy metric (ACC or ACCpar). Nonetheless, the higher predictive accuracy of QMTSVR over MTGBLUP and MTBC was replicated across accuracy metrics, as was the similar performance of the proposed method and the MTRKHS model.
Results showed that the proposed approach is competitive with conventional MT Bayesian regression models using either Gaussian or spike-slab multivariate priors.
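The three predictive-ability metrics named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract only defines ACC, so the forms used here for RMSE* (RMSE divided by the standard deviation of the observed values) and for b (slope of the regression of observed on predicted values) are assumptions, and the function and argument names are hypothetical.

```python
import math


def prediction_metrics(observed, predicted, phenotype_accuracy):
    """Return ACC, RMSE* and b for one validation set.

    phenotype_accuracy is the quantity whose square root divides the
    correlation, as described in the abstract.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n

    # Sample (co)variances of observed and predicted values.
    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted)) / (n - 1)
    var_obs = sum((o - mean_obs) ** 2 for o in observed) / (n - 1)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted) / (n - 1)

    correlation = cov / math.sqrt(var_obs * var_pred)
    acc = correlation / math.sqrt(phenotype_accuracy)

    # Assumption: RMSE is standardized by the SD of the observed values.
    rmse = math.sqrt(sum((o - p) ** 2
                         for o, p in zip(observed, predicted)) / n)
    rmse_star = rmse / math.sqrt(var_obs)

    # Assumption: b is the slope of observed regressed on predicted;
    # b > 1 suggests deflated predictions, b < 1 inflated predictions.
    b = cov / var_pred

    return {"ACC": acc, "RMSE*": rmse_star, "b": b}
```

Under these assumed definitions, predictions that are perfectly correlated with the observations but only half their size give a correlation of 1 and a slope b of 2.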
Affiliation(s)
| | | | | | - Vivian Breen
- Cobb-Vantress Inc., Siloam Springs, AR 72761, USA
| | | | - Daniel Gianola
- Department of Animal and Dairy Sciences, University of Wisconsin-Madison, Madison, WI 53706, USA
| | | |
|
12
|
Wolf MJ, Neumann GB, Kokuć P, Yin T, Brockmann GA, König S, May K. Genetic evaluations for endangered dual-purpose German Black Pied cattle using 50K SNPs, a breed-specific 200K chip, and whole-genome sequencing. J Dairy Sci 2023; 106:3345-3358. [PMID: 37028956 DOI: 10.3168/jds.2022-22665] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2022] [Accepted: 12/16/2022] [Indexed: 04/09/2023]
Abstract
Genetic evaluations of local cattle breeds are hampered by small reference groups or biased by the use of SNP effects estimated in other, larger populations. Against this background, there is a lack of studies addressing the possible advantage of whole-genome sequences (WGS) or consideration of specific variants from WGS data in genomic predictions for local breeds with small population size. Consequently, the aim of this study was to compare genetic parameters and accuracies of genomic estimated breeding values (GEBV) for 305-d production traits, fat-to-protein ratio (FPR), and somatic cell score (SCS) at the first test date after calving and conformation traits of the endangered German Black Pied cattle (DSN) breed using 4 different marker panels: (1) the commercial 50K Illumina BovineSNP50 BeadChip, (2) a customized 200K chip designed for DSN (DSN200K) which considers the most important variants for DSN from WGS, (3) randomly generated 200K chips based on WGS data, and (4) a WGS panel. The same number of animals was considered for all marker panel analyses (i.e., 1,811 genotyped or sequenced cows for conformation traits, 2,383 cows for lactation production traits, and 2,420 cows for FPR and SCS). Mixed models for the estimation of genetic parameters directly included the respective genomic relationship matrix from the different marker panels plus the trait-specific fixed effects. For the calculation of GEBV accuracies, we applied repeated random subsampling validation. In the process of separate cross-validations per trait, we created a validation set including 20% of cows with masked phenotypes, and a training set comprising 80% of the cows. The cows were selected randomly in a procedure with 10 replicates considering replacements in the different scenarios. The accuracy was defined as the correlation between the direct GEBV and the phenotypes with subtracted corresponding fixed effects for the cows in the validation set.
For FPR and SCS, as well as for lactation production traits, heritabilities were largest based on WGS data, but the increase compared with the 50K or DSN200K applications was quite small in the range from 0.01 to 0.03. Also, for most of the conformation traits, heritabilities were largest based on WGS and DSN200K data, but the increase was in the range of the corresponding standard error. Accordingly, GEBV accuracies for most of the studied traits were highest based on WGS data or when utilizing the DSN200K chip, but the accuracy differences across the marker panels were quite small and nonsignificant. In conclusion, WGS data and the DSN200K chip only contributed to minor improvements in genomic predictions, still justifying the use of the commercial 50K chip. Nevertheless, WGS and the DSN200K chip harbor breed-specific variants, which are valuable for studying causal genetic mechanisms in the endangered DSN population.
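The validation scheme described in this abstract (10 replicates, 20% of cows masked as a validation set, accuracy as the correlation between GEBV and fixed-effect-adjusted phenotypes) might be sketched as below. This is an illustration under assumptions, not the study's pipeline: `fit_predict` is a hypothetical callable standing in for whatever genomic model is fitted on the training cows, and validation sets are drawn independently in each replicate.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def repeated_subsampling_accuracy(ids, adjusted_phenotype, fit_predict,
                                  n_replicates=10, validation_fraction=0.2,
                                  seed=0):
    """Mean validation accuracy over repeated random 80/20 splits.

    ids                list of animal identifiers
    adjusted_phenotype dict: id -> phenotype minus its fixed effects
    fit_predict        hypothetical callable (training_ids, validation_ids)
                       -> dict: id -> GEBV; it must only use phenotypes
                       of the training animals (validation ones are masked)
    """
    rng = random.Random(seed)
    n_val = max(2, round(len(ids) * validation_fraction))
    accuracies = []
    for _ in range(n_replicates):
        # A fresh random validation set per replicate; an animal may
        # therefore appear in the validation sets of several replicates.
        validation = rng.sample(ids, n_val)
        masked = set(validation)
        training = [i for i in ids if i not in masked]
        gebv = fit_predict(training, validation)
        accuracies.append(pearson([gebv[i] for i in validation],
                                  [adjusted_phenotype[i] for i in validation]))
    return sum(accuracies) / len(accuracies), accuracies
```

Keeping the model behind a callable means the same splitting logic can be reused for each marker panel being compared.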
Affiliation(s)
- Manuel J Wolf
- Institute of Animal Breeding and Genetics, Justus-Liebig-University Gießen, 35390 Gießen, Germany
| | - Guilherme B Neumann
- Animal Breeding Biology and Molecular Genetics, Albrecht Daniel Thaer-Institute for Agricultural and Horticultural Sciences, Humboldt Universität zu Berlin, 10115 Berlin, Germany
| | - Paula Kokuć
- Animal Breeding Biology and Molecular Genetics, Albrecht Daniel Thaer-Institute for Agricultural and Horticultural Sciences, Humboldt Universität zu Berlin, 10115 Berlin, Germany
| | - Tong Yin
- Institute of Animal Breeding and Genetics, Justus-Liebig-University Gießen, 35390 Gießen, Germany
| | - Gudrun A Brockmann
- Animal Breeding Biology and Molecular Genetics, Albrecht Daniel Thaer-Institute for Agricultural and Horticultural Sciences, Humboldt Universität zu Berlin, 10115 Berlin, Germany
| | - Sven König
- Institute of Animal Breeding and Genetics, Justus-Liebig-University Gießen, 35390 Gießen, Germany.
| | - Katharina May
- Institute of Animal Breeding and Genetics, Justus-Liebig-University Gießen, 35390 Gießen, Germany
| |
|
13
|
Alemu A, Batista L, Singh PK, Ceplitis A, Chawade A. Haplotype-tagged SNPs improve genomic prediction accuracy for Fusarium head blight resistance and yield-related traits in wheat. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2023; 136:92. [PMID: 37009920 PMCID: PMC10068637 DOI: 10.1007/s00122-023-04352-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2022] [Accepted: 03/21/2023] [Indexed: 06/19/2023]
Abstract
Linkage disequilibrium (LD)-based haplotyping with subsequent SNP tagging improved the genomic prediction accuracy by up to 0.07 and 0.092 for Fusarium head blight resistance and spike width, respectively, across six different models. Genomic prediction is a powerful tool to enhance genetic gain in plant breeding. However, the method is accompanied by various complications leading to low prediction accuracy. One of the major challenges arises from the complex dimensionality of marker data. To overcome this issue, we applied two pre-selection methods for SNP markers, viz. LD-based haplotype-tagging and GWAS-based trait-linked marker identification. Six different models were tested with preselected SNPs to predict the genomic estimated breeding values (GEBVs) of four traits measured in 419 winter wheat genotypes. Ten different sets of haplotype-tagged SNPs were selected by adjusting the level of LD thresholds. In addition, various sets of trait-linked SNPs were identified with different scenarios from the training-test combined and only from the training populations. The BRR and RR-BLUP models developed from haplotype-tagged SNPs had a higher prediction accuracy for FHB and SPW by 0.07 and 0.092, respectively, compared to the corresponding models developed without marker pre-selection. The highest prediction accuracy for SPW and FHB was achieved with tagged SNPs pruned at weak LD thresholds (r² < 0.5), while stringent LD was required for spike length (SPL) and flag leaf area (FLA). Trait-linked SNPs identified only from training populations failed to improve the prediction accuracy of the four studied traits. Pre-selection of SNPs via LD-based haplotype-tagging could play a vital role in optimizing genomic selection and reducing genotyping costs. Furthermore, the method could pave the way for developing low-cost genotyping methods through customized genotyping platforms targeting key SNP markers tagged to essential haplotype blocks.
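To make the idea of thinning markers by an LD threshold concrete, the sketch below applies a simple greedy pruning rule. It is a simplified stand-in and not the haplotype-tagging procedure used in the study: r² is approximated here by the squared Pearson correlation of genotype dosages (0/1/2), and a SNP is kept only if its r² with every SNP already kept is below the threshold. Function names and the data layout are hypothetical.

```python
def r_squared(g1, g2):
    """Squared Pearson correlation between two SNP dosage vectors (0/1/2)."""
    n = len(g1)
    m1, m2 = sum(g1) / n, sum(g2) / n
    s12 = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    s11 = sum((a - m1) ** 2 for a in g1)
    s22 = sum((b - m2) ** 2 for b in g2)
    if s11 == 0 or s22 == 0:
        # A monomorphic SNP carries no LD information; treat r2 as 0.
        return 0.0
    return (s12 * s12) / (s11 * s22)


def ld_prune(genotypes, threshold=0.5):
    """Greedy LD pruning.

    genotypes: list of SNPs, each a list of dosages across individuals.
    Returns the indices of SNPs kept, in their original order.
    """
    kept = []
    for j, snp in enumerate(genotypes):
        # Keep SNP j only if it is in weak LD with all SNPs kept so far.
        if all(r_squared(snp, genotypes[k]) < threshold for k in kept):
            kept.append(j)
    return kept
```

A lower threshold removes more markers; in practice the comparison is usually restricted to SNPs within a window on the same chromosome, which this sketch omits for brevity.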
Affiliation(s)
- Admas Alemu
- Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
| | | | - Pawan K Singh
- International Maize and Wheat Improvement Center, Texcoco, Mexico
| | | | - Aakash Chawade
- Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden.
| |
|
14
|
Wu PY, Ou JH, Liao CT. Sample size determination for training set optimization in genomic prediction. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2023; 136:57. [PMID: 36912999 PMCID: PMC10011335 DOI: 10.1007/s00122-023-04254-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2022] [Accepted: 11/07/2022] [Indexed: 06/18/2023]
Abstract
A practical approach is developed to determine a cost-effective optimal training set for selective phenotyping in a genomic prediction study. An R function is provided to facilitate the application of the approach. Genomic prediction (GP) is a statistical method used to select quantitative traits in animal or plant breeding. For this purpose, a statistical prediction model is first built that uses phenotypic and genotypic data in a training set. The trained model is then used to predict genomic estimated breeding values (GEBVs) for individuals within a breeding population. Setting the sample size of the training set usually takes into account time and space constraints that are inevitable in an agricultural experiment. However, the determination of the sample size remains an unresolved issue for a GP study. By applying the logistic growth curve to identify prediction accuracy for the GEBVs and the training set size, a practical approach was developed to determine a cost-effective optimal training set for a given genome dataset with known genotypic data. Three real genome datasets were used to illustrate the proposed approach. An R function is provided to facilitate widespread application of this approach to sample size determination, which can help breeders to identify a set of genotypes with an economical sample size for selective phenotyping.
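The role a logistic growth curve can play in choosing a training set size is illustrated below. This is a sketch under stated assumptions rather than the authors' R function: the three-parameter form accuracy(n) = upper / (1 + a * exp(-rate * n)) and the rule of picking the smallest n that reaches a chosen fraction of the asymptote are both assumptions, and all parameter names and values are hypothetical.

```python
import math


def logistic_accuracy(n, upper, a, rate):
    """Assumed logistic curve linking training set size n to accuracy."""
    return upper / (1.0 + a * math.exp(-rate * n))


def size_reaching_fraction(fraction, a, rate):
    """Training set size at which accuracy equals fraction * upper.

    Solving upper / (1 + a * exp(-rate * n)) = fraction * upper for n
    gives n = ln(a * fraction / (1 - fraction)) / rate.
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    n = math.log(a * fraction / (1.0 - fraction)) / rate
    # A non-positive solution means the target is already met at n = 0.
    return max(0.0, n)
```

With a fitted curve in hand, the returned size can be compared with the phenotyping budget to judge how much accuracy is given up by a smaller training set.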
Affiliation(s)
- Po-Ya Wu
- Department of Agronomy, National Taiwan University, Taipei, Taiwan
- Institute for Quantitative Genetics and Genomics of Plants, Heinrich Heine University, Düsseldorf, Germany
| | - Jen-Hsiang Ou
- Department of Agronomy, National Taiwan University, Taipei, Taiwan
- Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
| | - Chen-Tuo Liao
- Department of Agronomy, National Taiwan University, Taipei, Taiwan.
| |
|
15
|
Fernández-González J, Akdemir D, Isidro Y Sánchez J. A comparison of methods for training population optimization in genomic selection. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2023; 136:30. [PMID: 36892603 PMCID: PMC9998580 DOI: 10.1007/s00122-023-04265-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Accepted: 11/21/2022] [Indexed: 06/18/2023]
Abstract
Maximizing CDmean and Avg_GRM_self were the best criteria for training set optimization. A training set size of 50-55% (targeted) or 65-85% (untargeted) is needed to obtain 95% of the accuracy. With the advent of genomic selection (GS) as a widespread breeding tool, mechanisms to efficiently design an optimal training set for GS models became more relevant, since they allow maximizing the accuracy while minimizing the phenotyping costs. The literature describes many training set optimization methods, but a comprehensive comparison among them is lacking. This work aimed to provide an extensive benchmark among optimization methods and optimal training set size by testing a wide range of them in seven datasets, six different species, different genetic architectures, population structure, heritabilities, and with several GS models to provide some guidelines about their application in breeding programs. Our results showed that targeted optimization (uses information from the test set) performed better than untargeted (does not use test set data), especially when heritability was low. The mean coefficient of determination was the best targeted method, although it was computationally intensive. Minimizing the average relationship within the training set was the best strategy for untargeted optimization. Regarding the optimal training set size, maximum accuracy was obtained when the training set was the entire candidate set. Nevertheless, 50-55% of the candidate set was enough to reach 95-100% of the maximum accuracy in the targeted scenario, while 65-85% was needed for untargeted optimization. Our results also suggested that a diverse training set makes GS robust against population structure, while including clustering information was less effective. The choice of the GS model did not have a significant influence on the prediction accuracies.
Affiliation(s)
- Javier Fernández-González
- Centro de Biotecnologia y Genómica de Plantas (CBGP, UPM-INIA), Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnologia Agraria y Alimentaria (INIA), Campus de Montegancedo-UPM, 28223, Madrid, Spain.
| | - Deniz Akdemir
- CIBMTR (Center for International Blood and Marrow Transplant Research), National Marrow Donor Program/Be The Match, Minneapolis, USA
| | - Julio Isidro Y Sánchez
- Centro de Biotecnologia y Genómica de Plantas (CBGP, UPM-INIA), Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnologia Agraria y Alimentaria (INIA), Campus de Montegancedo-UPM, 28223, Madrid, Spain.
| |
|
16
|
Schneider H, Heise J, Tetens J, Thaller G, Wellmann R, Bennewitz J. Genomic dominance variance analysis of health and milk production traits in German Holstein cattle. J Anim Breed Genet 2023. [PMID: 36872841 DOI: 10.1111/jbg.12765] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/13/2022] [Accepted: 02/12/2023] [Indexed: 03/07/2023]
Abstract
Genomic analyses commonly explore the additive genetic variance of traits. The non-additive variance, however, is usually small but often significant in dairy cattle. This study aimed at dissecting the genetic variance of eight health traits that recently entered the total merit index in Germany and the somatic cell score (SCS), as well as four milk production traits by analysing additive and dominance variance components. The heritabilities were low for all health traits (between 0.033 for mastitis and 0.099 for SCS), and moderate for the milk production traits (between 0.261 for milk energy yield and 0.351 for milk yield). For all traits, the contribution of dominance variance to the phenotypic variance was low, varying between 0.018 for ovarian cysts and 0.078 for milk yield. Inbreeding depression, inferred from the SNP-based observed homozygosity, was significant only for the milk production traits. The contribution of dominance variance to the genetic variance was larger for the health traits, ranging from 0.233 for ovarian cysts to 0.551 for mastitis, encouraging further studies that aim at discovering QTLs based on their additive and dominance effects.
Affiliation(s)
- Helen Schneider
- Institute of Animal Science, University of Hohenheim, Stuttgart, Germany
| | - Johannes Heise
- Vereinigte Informationssysteme Tierhaltung w.V. (VIT), Verden, Germany
| | - Jens Tetens
- Department of Animal Sciences, University of Göttingen, Göttingen, Germany
| | - Georg Thaller
- Institute of Animal Breeding and Husbandry, Christian-Albrechts University of Kiel, Kiel, Germany
| | - Robin Wellmann
- Institute of Animal Science, University of Hohenheim, Stuttgart, Germany
| | - Jörn Bennewitz
- Institute of Animal Science, University of Hohenheim, Stuttgart, Germany
| |
|
17
|
Toro-Ospina AM, Faria RA, Dominguez-Castaño P, Santana ML, Gonzalez LG, Espasandin AC, Silva JAIV. Genotype-environment interaction for milk production of Gyr cattle in Brazil and Colombia. Genes Genomics 2023; 45:135-143. [PMID: 35689753 DOI: 10.1007/s13258-022-01273-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2021] [Accepted: 05/18/2022] [Indexed: 01/25/2023]
Abstract
BACKGROUND Genotype by environment interactions (G × E) can play an important role in cattle populations and should be included in breeding programs in order to select the best animals for different environments. OBJECTIVE The aim of this study was to investigate the G × E for milk production of Gyr cattle in Brazil and Colombia by applying a reaction norm model that used genomic information, and to identify genomic regions associated with milk production in the two countries. METHODS The Brazilian and Colombian database included 464 animals (273 cows and 33 sires from Brazil and 158 cows from Colombia) and 27,505 SNPs. A two-trait animal model was used for milk yield adjusted to 305 days in Brazil and Colombia as a function of country of origin, which included genomic information obtained with a single-step genomic reaction norm model. The GIBBS3F90 and POSTGSF90 programs were used. RESULTS The results obtained indicate G × E based on the reranking of bulls between Brazil and Colombia, demonstrating environmental differences between the two countries. The findings highlight the importance of considering the environment when choosing breeding animals in order to ensure the adequate performance of their progeny. Within this context, the reranking of bulls and the different SNPs associated with milk production in the two countries suggest that G × E is an important effect that should be included in the genetic evaluation of Dairy Gyr cattle in Brazil and Colombia. CONCLUSION The Gyr breeding program can be optimized by choosing a selection environment that will allow maximum genetic progress in milk production in different environments within and between countries.
Affiliation(s)
- Alejandra Maria Toro-Ospina
- FMVZ, Faculdade de Ciências Agrárias e Veterinárias-UNESP, Jaboticabal, DMNA, Fazenda Experimental Lageado, Rua José Barbosa de Barros, nº 1780, Botucatu, São Paulo, 18.618-307, Brazil.
| | - Ricardo Antonio Faria
- FMVZ, Faculdade de Ciências Agrárias e Veterinárias-UNESP, Jaboticabal, DMNA, Fazenda Experimental Lageado, Rua José Barbosa de Barros, nº 1780, Botucatu, São Paulo, 18.618-307, Brazil
| | - Pablo Dominguez-Castaño
- FMVZ, Faculdade de Ciências Agrárias e Veterinárias-UNESP, Jaboticabal, DMNA, Fazenda Experimental Lageado, Rua José Barbosa de Barros, nº 1780, Botucatu, São Paulo, 18.618-307, Brazil.,Facultad de Medicina Veterinaria, Fundación Universitaria Agraria de Colombia-UNIAGRARIA, Bogotá, Colombia
| | | | | | | | | |
|
18
|
Tang Z, Yin L, Yin D, Zhang H, Fu Y, Zhou G, Zhao Y, Wang Z, Liu X, Li X, Zhao S. Development and application of an efficient genomic mating method to maximize the production performances of three-way crossbred pigs. Brief Bioinform 2023; 24:6961793. [PMID: 36575830 DOI: 10.1093/bib/bbac587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2022] [Revised: 11/24/2022] [Accepted: 11/30/2022] [Indexed: 12/29/2022] Open
Abstract
Creating synthetic lines is the standard mating mode for commercial pig production. Traditional mating performance was evaluated through a strictly designed cross-combination test at the 'breed level' to maximize the benefits of production. The Duroc-Landrace-Yorkshire (DLY) three-way crossbred production system became the most widely used breeding scheme for pigs. Here, we proposed an 'individual level' genomic mating procedure that can be applied to commercial pig production, with efficient algorithms for estimating marker effects and for allocating the appropriate boar-sow pairs; it is freely accessible to the public in our HIBLUP software at https://www.hiblup.com/tutorials#genomic-mating. A total of 875 Duroc boars, 350 Landrace-Yorkshire sows and 3573 DLY pigs were used to carry out the genomic mating to assess the production benefits theoretically. The results showed that genomic mating significantly improved the performances of progeny across different traits compared with random mating; for example, the feed conversion rate, days from 30 to 120 kg and eye muscle area could be improved by -0.12, -4.64 d and 2.65 cm², respectively, which was consistent with the real experimental validations. Overall, our findings indicated that genomic mating is an effective strategy to improve the performances of progeny by maximizing their total genetic merit with consideration of both additive and dominant effects. Also, a herd of boars from a richer genetic source will increase the effectiveness of genomic mating further.
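The two steps named in this abstract, scoring a boar-sow pair from marker effects and then allocating pairs, can be illustrated with a toy sketch. It is not HIBLUP's algorithm: the effect coding (an additive effect per copy of the counted allele and a dominance effect for heterozygous progeny), the one-boar-per-sow constraint, and the brute-force search are all simplifying assumptions, and every name below is hypothetical.

```python
from itertools import permutations


def expected_progeny_merit(sire, dam, additive, dominance):
    """Expected total genetic merit of progeny from one mating.

    sire, dam: genotype dosages (0/1/2) per marker.
    additive:  assumed effect per copy of the counted allele.
    dominance: assumed effect expressed by heterozygous progeny.
    """
    total = 0.0
    for gs, gd, a, d in zip(sire, dam, additive, dominance):
        ps, pd = gs / 2.0, gd / 2.0          # transmission probabilities
        p_het = ps * (1 - pd) + (1 - ps) * pd  # P(progeny heterozygous)
        # Expected allele count in progeny is ps + pd.
        total += a * (ps + pd) + d * p_het
    return total


def best_allocation(boars, sows, additive, dominance):
    """Brute-force search assigning a distinct boar to every sow.

    Only practical for a handful of animals; requires at least as many
    boars as sows. Returns (total merit, list of (boar, sow) indices).
    """
    best_total, best_pairs = None, None
    for chosen in permutations(range(len(boars)), len(sows)):
        total = sum(
            expected_progeny_merit(boars[b], sows[s], additive, dominance)
            for s, b in enumerate(chosen)
        )
        if best_total is None or total > best_total:
            best_total = total
            best_pairs = [(b, s) for s, b in enumerate(chosen)]
    return best_total, best_pairs
```

With purely dominance effects, the search pairs opposite homozygotes so that all progeny are heterozygous, which is the kind of gain a breed-level or purely additive ranking would miss. For realistic herd sizes an assignment solver would replace the exhaustive search.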
Affiliation(s)
- Zhenshuang Tang
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan 430070, PR China
| | - Lilin Yin
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan 430070, PR China.,Frontiers Science Center for Animal Breeding and Sustainable Production, Wuhan 430070, PR China
| | - Dong Yin
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan 430070, PR China
| | - Haohao Zhang
- School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430070, PR China
| | - Yuhua Fu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan 430070, PR China.,Frontiers Science Center for Animal Breeding and Sustainable Production, Wuhan 430070, PR China
| | - Guangliang Zhou
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan 430070, PR China
| | - Yunxiang Zhao
- School of Life Sciences and Engineering, Foshan University, Foshan 528225, PR China
| | - Zhiquan Wang
- Wuhan Yingzi Gene Technology Co. LTD, Wuhan 430070, PR China
| | - Xiaolei Liu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan 430070, PR China.,Frontiers Science Center for Animal Breeding and Sustainable Production, Wuhan 430070, PR China.,Hubei Hongshan Laboratory, Wuhan 430070, PR China
| | - Xinyun Li
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan 430070, PR China.,Frontiers Science Center for Animal Breeding and Sustainable Production, Wuhan 430070, PR China
| | - Shuhong Zhao
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan 430070, PR China.,Frontiers Science Center for Animal Breeding and Sustainable Production, Wuhan 430070, PR China.,Hubei Hongshan Laboratory, Wuhan 430070, PR China
| |
|
19
|
Sánchez-Roncancio C, García B, Gallardo-Hidalgo J, Yáñez JM. GWAS on Imputed Whole-Genome Sequence Variants Reveal Genes Associated with Resistance to Piscirickettsia salmonis in Rainbow Trout (Oncorhynchus mykiss). Genes (Basel) 2022; 14:114. [PMID: 36672855 PMCID: PMC9859203 DOI: 10.3390/genes14010114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2022] [Revised: 12/27/2022] [Accepted: 12/28/2022] [Indexed: 12/31/2022] Open
Abstract
Genome-wide association studies (GWAS) allow the identification of associations between genetic variants and important phenotypes in domestic animals, including disease-resistance traits. Whole Genome Sequencing (WGS) data can help increase the resolution and statistical power of association mapping. Here, we conducted GWAS to assess resistance to the facultative intracellular bacterium Piscirickettsia salmonis, which affects farmed rainbow trout, Oncorhynchus mykiss, in Chile, using imputed genotypes at the sequence level, and searched for candidate genes located in genomic regions associated with the trait. A total of 2130 rainbow trout were intraperitoneally challenged with P. salmonis under controlled conditions and genotyped using a 57K single nucleotide polymorphism (SNP) panel. Genotype imputation was performed in all the genotyped animals using WGS data from 102 individuals. A total of 488,979 imputed WGS variants were available in the 2130 individuals after quality control. GWAS revealed genome-wide significant quantitative trait loci (QTL) in Omy02, Omy03, Omy25, Omy26 and Omy27 for time to death and in Omy26 for binary survival. Twenty-four (24) candidate genes associated with P. salmonis resistance were identified, which were mainly related to phagocytosis, innate immune response, inflammation, oxidative response, lipid metabolism and apoptotic process. Our results provide further knowledge on the genetic variants and genes associated with resistance to intracellular bacterial infection in rainbow trout.
Affiliation(s)
- Charles Sánchez-Roncancio
- Doctorado en Acuicultura, Programa Cooperativo: Universidad de Chile. Universidad Católica del Norte. Pontificia Universidad Católica de Valparaíso, Chile
- Center for Research and Innovation in Aquaculture (CRIA), Universidad de Chile, Santiago 8820808, Chile
| | - Baltasar García
- Center for Research and Innovation in Aquaculture (CRIA), Universidad de Chile, Santiago 8820808, Chile
- Facultad de Ciencias Veterinarias y Pecuarias, Universidad de Chile, La Pintana, Santiago 8820808, Chile
| | - Jousepth Gallardo-Hidalgo
- Center for Research and Innovation in Aquaculture (CRIA), Universidad de Chile, Santiago 8820808, Chile
- Facultad de Ciencias Veterinarias y Pecuarias, Universidad de Chile, La Pintana, Santiago 8820808, Chile
| | - José M. Yáñez
- Center for Research and Innovation in Aquaculture (CRIA), Universidad de Chile, Santiago 8820808, Chile
- Facultad de Ciencias Veterinarias y Pecuarias, Universidad de Chile, La Pintana, Santiago 8820808, Chile
- Núcleo Milenio de Salmonidos Invasores Australes (INVASAL), Concepcion 4030000, Chile
| |
|
20
|
Atanda SA, Steffes J, Lan Y, Al Bari MA, Kim JH, Morales M, Johnson JP, Saludares R, Worral H, Piche L, Ross A, Grusak M, Coyne C, McGee R, Rao J, Bandillo N. Multi-trait genomic prediction improves selection accuracy for enhancing seed mineral concentrations in pea. THE PLANT GENOME 2022; 15:e20260. [PMID: 36193571 DOI: 10.1002/tpg2.20260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2022] [Accepted: 08/10/2022] [Indexed: 06/16/2023]
Abstract
Multi-trait genomic selection (MT-GS) has the potential to improve predictive ability by maximizing the use of information across related genotypes and genetically correlated traits. In this study, we extended the use of the sparse phenotyping method into the MT-GS framework by split testing of entries to maximize borrowing of information across genotypes and predict missing phenotypes for targeted traits without additional phenotyping expenditure. Using 300 advanced breeding lines from the North Dakota State University (NDSU) pulse breeding program and ∼200 USDA accessions that were evaluated for 10 nutritional traits, our results show that the proposed sparse-phenotyping-aided MT-GS can further improve predictive ability by >12% across traits compared with univariate (UNI) genomic selection. The proposed strategy departed from the previous reports that weak genetic correlation is a limitation to the advantage of MT-GS over UNI genomic selection, which was evident in the partially balanced phenotyping-enabled MT-GS. Our results point to heritability and genetic correlation between traits as possible metrics to optimize and further improve the estimation of model parameters, and ultimately, prediction performance. Overall, our study offers a new approach to optimize prediction performance using MT-GS and further highlights a strategy to maximize the efficiency of GS in a plant breeding program. The sparse-testing-aided MT-GS proposed in this study can be further extended to multi-environment, multi-trait GS to improve prediction performance and further reduce the cost of phenotyping and the time-consuming data collection process.
Affiliation(s)
| | - Jenna Steffes
- Dep. of Plant Sciences, North Dakota State Univ., Fargo, ND, 58108-6050, USA
| | - Yang Lan
- Dep. of Plant Sciences, North Dakota State Univ., Fargo, ND, 58108-6050, USA
| | - Md Abdullah Al Bari
- Dep. of Plant Sciences, North Dakota State Univ., Fargo, ND, 58108-6050, USA
| | - Jeong-Hwa Kim
- Dep. of Plant Sciences, North Dakota State Univ., Fargo, ND, 58108-6050, USA
| | - Mario Morales
- Dep. of Plant Sciences, North Dakota State Univ., Fargo, ND, 58108-6050, USA
| | - Josephine P Johnson
- Dep. of Plant Sciences, North Dakota State Univ., Fargo, ND, 58108-6050, USA
| | - Rica Saludares
- Dep. of Plant Sciences, North Dakota State Univ., Fargo, ND, 58108-6050, USA
| | - Hannah Worral
- North Central Research Extension Center, NDSU, 5400 Hwy. 83, South Minot, ND, 58701, USA
| | - Lisa Piche
- Dep. of Plant Sciences, North Dakota State Univ., Fargo, ND, 58108-6050, USA
| | - Andrew Ross
- Dep. of Plant Sciences, North Dakota State Univ., Fargo, ND, 58108-6050, USA
| | - Mike Grusak
- Edward T. Schafer Agricultural Research Center, USDA-ARS, 1616 Albrecht Blvd. N, Fargo, ND, 58102-2765, USA
| | - Clarice Coyne
- USDA-ARS Plant Germplasm Introduction and Testing, Washington State Univ., Pullman, WA, 99164, USA
| | - Rebecca McGee
- USDA-ARS, Grain Legume Genetics and Physiology Research, Pullman, WA, 99164, USA
- Dep. of Horticulture, Washington State Univ., Pullman, WA, 99164, USA
| | - Jiajia Rao
- Dep. of Plant Sciences, North Dakota State Univ., Fargo, ND, 58108-6050, USA
| | - Nonoy Bandillo
- Dep. of Plant Sciences, North Dakota State Univ., Fargo, ND, 58108-6050, USA
| |
Collapse
|
21
|
Brzáková M, Bauer J, Steyn Y, Šplíchal J, Fulínová D. The prediction accuracies of linear-type traits in Czech Holstein cattle when using ssGBLUP or wssGBLUP. J Anim Sci 2022; 100:skac369. [PMID: 36334266 PMCID: PMC9746800 DOI: 10.1093/jas/skac369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2022] [Accepted: 11/04/2022] [Indexed: 11/07/2022]
Abstract
The aim of this study was to assess the contribution of the weighted single-step genomic best linear unbiased prediction (wssGBLUP) method compared to the single-step genomic best linear unbiased prediction (ssGBLUP) method for genomic evaluation of 25 linear-type traits in the Czech Holstein cattle population. The nationwide database of linear-type traits with 699,681 records combined with deregressed proofs from Interbull (MACE method) was used as the input data. Genomic breeding values (GEBVs) were predicted based on these phenotypes with the ssGBLUP and wssGBLUP methods using the BLUPF90 software. A bull validation test was employed, based on comparing GEBVs of young bulls (N = 334) with no progeny in 2016. A minimum of 50 daughters with their own performance in 2020 was chosen to verify the contribution to the GEBV prediction, GEBV reliability, validation reliabilities (R²), and regression coefficients (b₁). The results showed that the differences between the two methods were negligible. The low benefit of wssGBLUP may be due to the inclusion of a small number of SNPs; therefore, most predictions rely on polygenic relationships between animals. Nevertheless, the benefits of wssGBLUP analysis should be assessed with respect to specific population structures and given traits.
Affiliation(s)
- Michaela Brzáková
- Department of Genetics and Breeding of Farm Animals, Institute of Animal Science, Prague-Uhříněves 104 00, Czech Republic
- Jiří Bauer
- Czech-Moravian Breeders’ Corporation, Hradištko 252 09, Czech Republic
- Yvette Steyn
- Department of Animal and Dairy Science, University of Georgia, Athens, GA, USA
- Jiří Šplíchal
- Czech-Moravian Breeders’ Corporation, Hradištko 252 09, Czech Republic
- Daniela Fulínová
- Czech-Moravian Breeders’ Corporation, Hradištko 252 09, Czech Republic
|
22
|
Rembe M, Zhao Y, Wendler N, Oldach K, Korzun V, Reif JC. The Potential of Genome-Wide Prediction to Support Parental Selection, Evaluated with Data from a Commercial Barley Breeding Program. PLANTS 2022; 11:plants11192564. [PMID: 36235430 PMCID: PMC9571379 DOI: 10.3390/plants11192564] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2022] [Revised: 09/18/2022] [Accepted: 09/23/2022] [Indexed: 11/29/2022]
Abstract
Parental selection is at the beginning and contributes significantly to the success of any breeding work. The value of a cross is reflected in the potential of its progeny population. Breeders invest substantial resources in evaluating progeny to select the best performing genotypes as candidates for variety development. Several proposals have been made to use genomics to support parental selection. These have mostly been evaluated using theoretical considerations or simulation studies. However, evaluations using experimental data have rarely been conducted. In this study, we tested the potential of genomic prediction for predicting the progeny mean, variance, and usefulness criterion using data from an applied breeding population for winter barley. For three traits with genetic architectures at varying levels of complexity, ear emergence, plant height, and grain yield, progeny mean, variance, and usefulness criterion were predicted and validated in scenarios resembling situations in which the described tools shall be used in plant breeding. While the population mean could be predicted with moderate to high prediction abilities amounting to 0.64, 0.21, and 0.39 in ear emergence, plant height, and grain yield, respectively, the prediction of family variance appeared difficult, as reflected in low prediction abilities of 0.41, 0.11, and 0.14, for ear emergence, plant height, and grain yield, respectively. We have shown that identifying superior crosses remains a challenging task and suggest that the success of predicting the usefulness criterion depends strongly on the complexity of the underlying trait.
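The abstract does not spell out the usefulness criterion. A commonly used form is UC = μ + i·h·σ_G: the progeny mean plus the expected response to selection within the cross. The sketch below assumes that form, truncation selection on a normally distributed trait, and invented numbers for two hypothetical crosses; the function names are illustrative and are not taken from the study.

```python
from statistics import NormalDist


def selection_intensity(selected_fraction):
    """Standardized selection differential i for truncation selection
    on a normally distributed trait (assumption of this sketch)."""
    nd = NormalDist()
    z = nd.inv_cdf(1.0 - selected_fraction)  # truncation point
    return nd.pdf(z) / selected_fraction


def usefulness_criterion(progeny_mean, genetic_sd, selected_fraction, h=1.0):
    """UC = mean + i * h * sigma_G; h = 1 treats selection on genetic values."""
    i = selection_intensity(selected_fraction)
    return progeny_mean + i * h * genetic_sd


# Hypothetical crosses: A has the higher mean, B the larger variance.
uc_a = usefulness_criterion(progeny_mean=100.0, genetic_sd=2.0, selected_fraction=0.10)
uc_b = usefulness_criterion(progeny_mean=99.0, genetic_sd=4.0, selected_fraction=0.10)
print(f"UC(A) = {uc_a:.2f}, UC(B) = {uc_b:.2f}")
```

With these made-up inputs, the cross with the lower mean ranks higher once its larger variance is counted, which is why the difficulty of predicting family variance reported in the abstract matters for ranking crosses.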
Affiliation(s)
- Maximilian Rembe
- Department of Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), D-06466 Gatersleben, Germany
- Yusheng Zhao
- Department of Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), D-06466 Gatersleben, Germany
- Neele Wendler
- KWS LOCHOW GmbH, Ferdinand-von-Lochow-Str. 5, 29303 Bergen, Germany
- Klaus Oldach
- KWS LOCHOW GmbH, Ferdinand-von-Lochow-Str. 5, 29303 Bergen, Germany
- Viktor Korzun
- KWS SAAT SE & Co. KGaA, Grimsehlstr. 31, 37574 Einbeck, Germany
- Jochen C. Reif
- Department of Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), D-06466 Gatersleben, Germany
- Correspondence:
|
23
|
Zuffo LT, DeLima RO, Lübberstedt T. Combining datasets for maize root seedling traits increases the power of GWAS and genomic prediction accuracies. JOURNAL OF EXPERIMENTAL BOTANY 2022; 73:5460-5473. [PMID: 35608947 PMCID: PMC9467658 DOI: 10.1093/jxb/erac236] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/11/2021] [Accepted: 06/06/2022] [Indexed: 05/13/2023]
Abstract
The identification of genomic regions associated with root traits and the genomic prediction of untested genotypes can increase the rate of genetic gain in maize breeding programs targeting root traits. Here, we combined two maize association panels with different genetic backgrounds to identify single nucleotide polymorphisms (SNPs) associated with root traits using a genome-wide association study (GWAS) and to assess the potential of genomic prediction for these traits in maize. For this, we evaluated 377 lines from the Ames panel and 302 from the Backcrossed Germplasm Enhancement of Maize (BGEM) panel in a combined panel of 679 lines. The lines were genotyped with 232 460 SNPs, and four root traits were collected from 14-day-old seedlings. We identified 30 SNPs significantly associated with root traits in the combined panel, whereas only two and six SNPs were detected in the Ames and BGEM panels, respectively. Those 38 SNPs were in linkage disequilibrium with 35 candidate genes. In addition, we found higher prediction accuracy in the combined panel than in the Ames or BGEM panel. We conclude that combining association panels appears to be a useful strategy to identify candidate genes associated with root traits in maize and improve the efficiency of genomic prediction.
Affiliation(s)
- Leandro Tonello Zuffo
- Corteva Agriscience, Rio Verde, GO, Brazil
- Department of Agronomy, Universidade Federal de Viçosa, Viçosa, MG, Brazil
- Department of Agronomy, Iowa State University, Ames, IA, USA
|
24
|
Breen EJ, MacLeod IM, Ho PN, Haile-Mariam M, Pryce JE, Thomas CD, Daetwyler HD, Goddard ME. BayesR3 enables fast MCMC blocked processing for largescale multi-trait genomic prediction and QTN mapping analysis. Commun Biol 2022; 5:661. [PMID: 35790806 PMCID: PMC9256732 DOI: 10.1038/s42003-022-03624-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/31/2021] [Accepted: 06/22/2022] [Indexed: 01/26/2023]
Abstract
Bayesian methods, such as BayesR, for predicting the genetic value or risk of individuals from their genotypes, such as Single Nucleotide Polymorphisms (SNP), are often implemented using a Markov Chain Monte Carlo (MCMC) process. However, the generation of Markov chains is computationally slow. We introduce a form of blocked Gibbs sampling for estimating SNP effects from Markov chains that greatly reduces computational time by sampling each SNP effect iteratively n-times from conditional block posteriors. Subsequent iteration over all blocks m-times produces chains of length m × n. We use this strategy to solve large-scale genomic prediction and fine-mapping problems using the Bayesian MCMC mixed-effects genetic model, BayesR3. We validate the method using simulated data, followed by analysis of empirical dairy cattle data using high dimension milk mid infra-red spectra data as an example of “omics” data and show its use to increase the precision of mapping variants affecting milk, fat, and protein yields relative to a univariate analysis of milk, fat, and protein. BayesR3 samples the polymorphisms affecting complex traits at reduced computational cost to predict the genetic value, breeding value, or individual risk of genotypes.
|
25
|
Atanda SA, Govindan V, Singh R, Robbins KR, Crossa J, Bentley AR. Sparse testing using genomic prediction improves selection for breeding targets in elite spring wheat. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2022; 135:1939-1950. [PMID: 35348821 PMCID: PMC9205816 DOI: 10.1007/s00122-022-04085-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 01/10/2022] [Accepted: 03/16/2022] [Indexed: 06/08/2023]
Abstract
Sparse testing using genomic prediction can be efficiently used to increase the number of testing environments while maintaining selection intensity in the early yield testing stage without increasing the breeding budget. Sparse testing using genomic prediction enables expanded use of selection environments in early-stage yield testing without increasing phenotyping cost. We evaluated different sparse testing strategies in the yield testing stage of a CIMMYT spring wheat breeding pipeline characterized by multiple populations each with small family sizes of 1-9 individuals. Our results indicated that a substantial overlap between lines across environments should be used to achieve optimal prediction accuracy. As sparse testing leverages information generated within and across environments, the genetic correlations between environments and genomic relationships of lines across environments were the main drivers of prediction accuracy in multi-environment yield trials. Including information from previous evaluation years did not consistently improve the prediction performance. Genomic best linear unbiased prediction was found to be the best predictor of true breeding value, and therefore, we propose that it should be used as a selection decision metric in the early yield testing stages. We also propose it as a proxy for assessing prediction performance to mirror breeder's advancement decisions in a breeding program so that it can be readily applied for advancement decisions by breeding programs.
Affiliation(s)
- Velu Govindan
- International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico
- Ravi Singh
- International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico
- Kelly R Robbins
- Section of Plant Breeding and Genetics, School of Integrative Plant Sciences, Cornell University, Ithaca, NY, USA
- Jose Crossa
- International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico
- Alison R Bentley
- International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico.
|
26
|
Genomic prediction of cotton fibre quality and yield traits using Bayesian regression methods. Heredity (Edinb) 2022; 129:103-112. [PMID: 35523950 PMCID: PMC9338257 DOI: 10.1038/s41437-022-00537-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/31/2021] [Revised: 04/05/2022] [Accepted: 04/07/2022] [Indexed: 01/26/2023]
Abstract
Genomic selection or genomic prediction (GP) has increasingly become an important molecular breeding technology for crop improvement. GP aims to utilise genome-wide marker data to predict genomic breeding value for traits of economic importance. Though GP studies have been widely conducted in various crop species such as wheat and maize, its application in cotton, an essential renewable textile fibre crop, is still significantly underdeveloped. We aim to develop a new GP-based breeding system that can improve the efficiency of our cotton breeding program. This article presents a GP study on cotton fibre quality and yield traits using 1385 breeding lines from the Commonwealth Scientific and Industrial Research Organisation (CSIRO, Australia) cotton breeding program, which were genotyped using a high-density SNP chip that generated 12,296 informative SNPs. The aim of this study was twofold: (1) to identify the models and data sources (i.e. genomic and pedigree) that produce the highest prediction accuracies; and (2) to assess the effectiveness of GP as a selection tool in the CSIRO cotton breeding program. The prediction analyses were conducted under various scenarios using different Bayesian predictive models. Results highlighted that the model combining genomic and pedigree information resulted in the best cross-validated prediction accuracies: 0.76 for fibre length, 0.65 for fibre strength, and 0.64 for lint yield. Overall, this work represents the largest-scale genomic selection study based on cotton breeding trial data. Prediction accuracies reported in our study indicate the potential of GP as a breeding tool for cotton. The study highlighted the importance of incorporating pedigree and environmental factors in GP models to optimise the prediction performance.
|
27
|
Meher PK, Rustgi S, Kumar A. Performance of Bayesian and BLUP alphabets for genomic prediction: analysis, comparison and results. Heredity (Edinb) 2022; 128:519-530. [PMID: 35508540 DOI: 10.1038/s41437-022-00539-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 10/31/2021] [Revised: 04/19/2022] [Accepted: 04/19/2022] [Indexed: 11/09/2022]
Abstract
We evaluated the performances of three BLUP and five Bayesian methods for genomic prediction by using nine actual and 54 simulated datasets. The genomic prediction accuracy was measured using Pearson's correlation coefficient between the genomic estimated breeding value (GEBV) and the observed phenotypic data using a fivefold cross-validation approach with 100 replications. The Bayesian alphabets performed better for the traits governed by a few genes/QTLs with relatively larger effects. On the contrary, the BLUP alphabets (GBLUP and CBLUP) exhibited higher genomic prediction accuracy for the traits controlled by several small-effect QTLs. Additionally, Bayesian methods performed better for the highly heritable traits and, for other traits, performed at par with the BLUP methods. Further, genomic BLUP (GBLUP) was identified as the least biased method for the GEBV estimation. Among the Bayesian methods, the Bayesian ridge regression and Bayesian LASSO were less biased than other Bayesian alphabets. Nonetheless, genomic prediction accuracy increased with an increase in trait heritability, irrespective of the sample size, marker density, and the QTL type (major/minor effect). In sum, this study provides valuable information regarding the choice of the selection method for genomic prediction in different breeding programs.
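The accuracy measure described above (Pearson's correlation between predicted genetic values and observed phenotypes under replicated k-fold cross-validation) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: ridge regression on marker codes stands in for the BLUP-type models, the shrinkage parameter `lam` is arbitrary, the data are simulated, and all function names are hypothetical.

```python
import numpy as np


def ridge_fit_predict(X_train, y_train, X_test, lam=1.0):
    """Ridge regression on centered markers, solved in its dual (n x n) form."""
    mu_x = X_train.mean(axis=0)
    mu_y = y_train.mean()
    Xc = X_train - mu_x
    n = Xc.shape[0]
    alpha = np.linalg.solve(Xc @ Xc.T + lam * np.eye(n), y_train - mu_y)
    beta = Xc.T @ alpha  # marker effects
    return mu_y + (X_test - mu_x) @ beta


def cv_accuracy(X, y, k=5, reps=3, lam=1.0, seed=0):
    """Mean Pearson correlation between predictions and held-out phenotypes."""
    rng = np.random.default_rng(seed)
    all_idx = np.arange(len(y))
    accuracies = []
    for _ in range(reps):
        for test_idx in np.array_split(rng.permutation(len(y)), k):
            train_idx = np.setdiff1d(all_idx, test_idx)
            pred = ridge_fit_predict(X[train_idx], y[train_idx], X[test_idx], lam)
            accuracies.append(np.corrcoef(pred, y[test_idx])[0, 1])
    return float(np.mean(accuracies))


# Simulated example: 200 lines, 50 markers coded 0/1/2, heritability near 0.5.
rng = np.random.default_rng(1)
X = rng.integers(0, 3, size=(200, 50)).astype(float)
g = X @ rng.normal(size=50)                      # true genetic values
y = g + rng.normal(scale=g.std(), size=200)      # phenotype = genetics + noise
acc = cv_accuracy(X, y, k=5, reps=3)
print(f"mean cross-validated accuracy: {acc:.2f}")
```

The study used 100 replications; `reps=3` is chosen here only to keep the example fast.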
Affiliation(s)
- Prabina Kumar Meher
- Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi-12, India.
- Sachin Rustgi
- Department of Plant and Environmental Sciences, Clemson University Pee Dee Research and Education Center, Darlington, SC, USA.
- Anuj Kumar
- Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi-12, India
|
28
|
Li J, Wang Y, Mukiibi R, Karisa B, Plastow GS, Li C. Integrative analyses of genomic and metabolomic data reveal genetic mechanisms associated with carcass merit traits in beef cattle. Sci Rep 2022; 12:3389. [PMID: 35232965 PMCID: PMC8888742 DOI: 10.1038/s41598-022-06567-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/26/2021] [Accepted: 02/01/2022] [Indexed: 11/09/2022]
Abstract
Improvement of carcass merit traits is a priority for the beef industry. Discovering DNA variants and genes associated with variation in these traits and understanding biological functions/processes underlying their associations are of paramount importance for more effective genetic improvement of carcass merit traits in beef cattle. This study integrates 10,488,742 imputed whole genome DNA variants, 31 plasma metabolites, and animal phenotypes to identify genes and biological functions/processes that are associated with carcass merit traits including hot carcass weight (HCW), rib eye area (REA), average backfat thickness (AFAT), lean meat yield (LMY), and carcass marbling score (CMAR) in a population of 493 crossbred beef cattle. Regression analyses were performed to identify plasma metabolites associated with the carcass merit traits, and the results showed that 4 (3-hydroxybutyric acid, acetic acid, citric acid, and choline), 6 (creatinine, L-glutamine, succinic acid, pyruvic acid, L-lactic acid, and 3-hydroxybutyric acid), 4 (fumaric acid, methanol, D-glucose, and glycerol), 2 (L-lactic acid and creatinine), and 5 (succinic acid, fumaric acid, lysine, glycine, and choline) plasma metabolites were significantly associated with HCW, REA, AFAT, LMY, and CMAR (P-value < 0.1), respectively. Combining the results of metabolome-genome wide association studies using the 10,488,742 imputed SNPs, 103, 160, 83, 43, and 109 candidate genes were identified as significantly associated with HCW, REA, AFAT, LMY, and CMAR (P-value < 1 × 10⁻⁵), respectively. By applying functional enrichment analyses for candidate genes of each trait, 26, 24, 26, 24, and 28 significant cellular and molecular functions were predicted for HCW, REA, AFAT, LMY, and CMAR, respectively. 
Among the five topmost significantly enriched biological functions for carcass merit traits, molecular transport and small molecule biochemistry were two top biological functions associated with all carcass merit traits. Lipid metabolism was the most significant biological function for LMY and CMAR and it was also the second and fourth highest biological function for REA and HCW, respectively. Candidate genes and enriched biological functions identified by the integrative analyses of metabolites with phenotypic traits and DNA variants could help interpret the results of previous genome-wide association studies for carcass merit traits. Our integrative study also revealed additional potential novel genes associated with these economically important traits. Therefore, our study improves understanding of the molecular and biological functions/processes that influence carcass merit traits, which could help develop strategies to enhance genomic prediction of carcass merit traits with incorporation of metabolomic data. Similarly, this information could guide management practices, such as nutritional interventions, with the purpose of boosting specific carcass merit traits.
Affiliation(s)
- Jiyuan Li
- Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB, Canada
- Yining Wang
- Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB, Canada; Lacombe Research and Development Centre, Agriculture and Agri-Food Canada, Lacombe, AB, Canada
- Robert Mukiibi
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, Scotland, UK
- Brian Karisa
- Results Driven Agriculture Research, Edmonton, AB, Canada
- Graham S Plastow
- Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB, Canada.
- Changxi Li
- Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB, Canada; Lacombe Research and Development Centre, Agriculture and Agri-Food Canada, Lacombe, AB, Canada.
|
29
|
Roth M, Beugnot A, Mary-Huard T, Moreau L, Charcosset A, Fiévet JB. Improving genomic predictions with inbreeding and nonadditive effects in two admixed maize hybrid populations in single and multienvironment contexts. Genetics 2022; 220:6527635. [PMID: 35150258 PMCID: PMC8982028 DOI: 10.1093/genetics/iyac018] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/27/2021] [Accepted: 01/28/2022] [Indexed: 11/12/2022]
Abstract
Genetic admixture, resulting from the recombination between structural groups, is frequently encountered in breeding populations. In hybrid breeding, crossing admixed lines can generate substantial nonadditive genetic variance and contrasted levels of inbreeding which can impact trait variation. This study aimed at testing recent methodological developments for the modeling of inbreeding and nonadditive effects in order to increase prediction accuracy in admixed populations. Using two maize (Zea mays L.) populations of hybrids admixed between dent and flint heterotic groups, we compared a suite of five genomic prediction models incorporating (or not) parameters accounting for inbreeding and nonadditive effects with the natural and orthogonal interaction approach in single and multienvironment contexts. In both populations, variance decompositions showed the strong impact of inbreeding on plant yield, height, and flowering time which was supported by the superiority of prediction models incorporating this effect (+0.038 in predictive ability for mean yield). In most cases dominance variance was reduced when inbreeding was accounted for. The model including additivity, dominance, epistasis, and inbreeding effects appeared to be the most robust for prediction across traits and populations (+0.054 in predictive ability for mean yield). In a multienvironment context, we found that the inclusion of nonadditive and inbreeding effects was advantageous when predicting hybrids not yet observed in any environment. Overall, comparing variance decompositions was helpful to guide model selection for genomic prediction. Finally, we recommend the use of models including inbreeding and nonadditive parameters following the natural and orthogonal interaction approach to increase prediction accuracy in admixed populations.
Affiliation(s)
- Morgane Roth
- Plant Breeding Research Division, Agroscope, Wädenswil, 8820 Zurich, Switzerland. Corresponding author: INRAE GAFL, 67 Allée des Chênes 84140 Montfavet, France.
- Aurélien Beugnot
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, UMR GQE-Le Moulon, 91190 Gif-sur-Yvette, France
- Tristan Mary-Huard
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, UMR GQE-Le Moulon, 91190 Gif-sur-Yvette, France; Université Paris-Saclay, INRAE, AgroParisTech, UMR MIA-Paris Paris, 75005 Paris, France
- Laurence Moreau
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, UMR GQE-Le Moulon, 91190 Gif-sur-Yvette, France
- Alain Charcosset
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, UMR GQE-Le Moulon, 91190 Gif-sur-Yvette, France
- Julie B Fiévet
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, UMR GQE-Le Moulon, 91190 Gif-sur-Yvette, France
|
30
|
Genomic Predictions of Phenotypes and Pseudo-Phenotypes for Viral Nervous Necrosis Resistance, Cortisol Concentration, Antibody Titer and Body Weight in European Sea Bass. Animals (Basel) 2022; 12:ani12030367. [PMID: 35158690 PMCID: PMC8833701 DOI: 10.3390/ani12030367] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/16/2021] [Revised: 01/27/2022] [Accepted: 01/30/2022] [Indexed: 11/16/2022]
Abstract
Simple Summary
Selective breeding programs based on genomic data are still not a common practice in aquaculture, although genomic selection has been widely demonstrated to be advantageous when trait phenotyping is a difficult task. In this study, we investigated the accuracy of predicting the phenotype and the estimated breeding value (EBV) of three Bayesian models and a Random Forest algorithm exploiting the information of a genome-wide SNP panel for European sea bass. The genomic predictions were developed for mortality caused by viral nervous necrosis, post-stress cortisol concentration, antibody titer against nervous necrosis virus and body weight. Selective breeding based on genomic data is a possible option for improving these traits while overcoming difficulties related to individual phenotyping of the investigated traits. Our results evidenced that the EBV used as a pseudo-phenotype enhances the predictive performances of genomic models, and that EBV can be predicted with satisfactory accuracy. The genomic prediction of the EBV for mortality might also be used to classify the phenotype for the same trait.
Abstract
In European sea bass (Dicentrarchus labrax L.), the viral nervous necrosis mortality (MORT), post-stress cortisol concentration (HC), antibody titer (AT) against nervous necrosis virus and body weight (BW) show significant heritability, which makes selective breeding a possible option for their improvement. An experimental population (N = 650) generated by a commercial broodstock was phenotyped for the aforementioned traits and genotyped with a genome-wide SNP panel (16,075 markers). We compared the predictive accuracies of three Bayesian models (Bayes B, Bayes C and Bayesian Ridge Regression) and a machine-learning method (Random Forest). The prediction accuracy of the EBV for MORT was approximately 0.90, whereas the prediction accuracies of the EBV and the phenotype were 0.86 and 0.21 for HC, 0.79 and 0.26 for AT and 0.71 and 0.38 for BW. 
The genomic prediction of the EBV for MORT used to classify the phenotype for the same trait showed moderate classification performance. Genome-wide association studies confirmed the polygenic nature of MORT and demonstrated a complex genetic structure for HC and AT. Genomic predictions of the EBV for MORT could potentially be used to classify the phenotype of the same trait, though further investigations on a larger experimental population are needed.
|
31
|
Wang J, Yu J, Lipka AE, Zhang Z. Interpretation of Manhattan Plots and Other Outputs of Genome-Wide Association Studies. Methods Mol Biol 2022; 2481:63-80. [PMID: 35641759 DOI: 10.1007/978-1-0716-2237-7_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
With increasing marker density, estimation of recombination rate between a marker and a causal mutation using linkage analysis becomes less important. Instead, linkage disequilibrium (LD) becomes the major indicator for gene mapping through genome-wide association studies (GWAS). In addition to the linkage between the marker and the causal mutation, many other factors may contribute to the LD, including population structure and cryptic relationships among individuals. As statistical methods and software evolve to improve statistical power and computing speed in GWAS, the corresponding outputs must also evolve to facilitate the interpretation of input data, the analytical process, and final association results. In this chapter, our descriptions focus on (1) considerations in creating a Manhattan plot displaying the strength of LD and locations of markers across a genome; (2) criteria for genome-wide significance threshold and the different appearance of Manhattan plots in single-locus and multiple-locus models; (3) exploration of population structure and kinship among individuals; (4) quantile-quantile (QQ) plot; (5) LD decay across the genome and LD between the associated markers and their neighbors; (6) exploration of individual and marker information on Manhattan and QQ plots via interactive visualization using HTML. The ultimate objective of this chapter is to help users to connect input data to GWAS outputs to balance power and false positives, and connect GWAS outputs to the selection of candidate genes using LD extent.
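Two of the outputs listed above, the genome-wide significance threshold and the QQ plot, reduce to short calculations. The sketch below assumes a Bonferroni correction as the threshold criterion (the chapter may discuss others) and uses invented p-values; the function names are hypothetical and no plotting library is required.

```python
import math


def bonferroni_threshold(n_markers, alpha=0.05):
    """Per-marker p-value cutoff that controls the family-wise error at alpha."""
    return alpha / n_markers


def qq_points(p_values):
    """Pairs of (expected, observed) -log10(p) for a QQ plot.

    Under the null, the i-th smallest of n p-values is expected near (i - 0.5) / n.
    """
    n = len(p_values)
    observed = sorted(-math.log10(p) for p in p_values)
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))


# Toy example with four markers (p-values invented for illustration).
p_values = [0.5, 1e-8, 0.03, 0.2]
threshold = bonferroni_threshold(len(p_values))
significant = [i for i, p in enumerate(p_values) if p < threshold]
manhattan_heights = [-math.log10(p) for p in p_values]  # y-axis of a Manhattan plot
points = qq_points(p_values)
```

Points on the QQ plot that rise above the diagonal only at the upper end suggest true associations, while a shift along the whole range points to inflation, for example from unmodeled population structure or kinship.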
Affiliation(s)
- Jiabo Wang
- Key Laboratory of Qinghai-Tibetan Plateau Animal Genetic Resource Reservation and Utilization, Sichuan Province and Ministry of Education, Southwest Minzu University, Chengdu, Sichuan, China.
- Jianming Yu
- Department of Agronomy, Iowa State University, Ames, IA, USA
- Alexander E Lipka
- Department of Crop Sciences, University of Illinois, Urbana, IL, USA
- Zhiwu Zhang
- Department of Crop and Soil Sciences, Washington State University, Pullman, WA, USA
|
32
|
Cañas-Gutiérrez GP, Sepulveda-Ortega S, López-Hernández F, Navas-Arboleda AA, Cortés AJ. Inheritance of Yield Components and Morphological Traits in Avocado cv. Hass From "Criollo" "Elite Trees" via Half-Sib Seedling Rootstocks. FRONTIERS IN PLANT SCIENCE 2022; 13:843099. [PMID: 35685008 PMCID: PMC9171141 DOI: 10.3389/fpls.2022.843099] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/24/2021] [Accepted: 02/10/2022] [Indexed: 05/11/2023]
Abstract
Grafting induces precocity and maintains clonal integrity in fruit tree crops. However, the complex rootstock × scion interaction often precludes understanding how the tree phenotype is shaped, limiting the potential to select optimum rootstocks. Therefore, it is necessary to assess (1) how seedling progenies inherit trait variation from elite 'plus trees', and (2) whether such family superiority may be transferred after grafting to the clonal scion. To bridge this gap, we quantified additive genetic parameters (i.e., narrow-sense heritability, h², and genetic-estimated breeding values, GEBVs) across landraces, "criollo", "plus trees" of the super-food fruit tree crop avocado (Persea americana Mill.), and their open-pollinated (OP) half-sib seedling families. Specifically, we used a genomic best linear unbiased prediction (G-BLUP) model to merge phenotypic characterization of 17 morpho-agronomic traits with genetic screening of 13 highly polymorphic SSR markers in a diverse panel of 104 avocado "criollo" "plus trees." Estimated additive genetic parameters were validated at a 5-year-old common garden trial (i.e., provenance test), in which 22 OP half-sib seedlings from 82 elite "plus trees" served as rootstocks for the cv. Hass clone. Heritability (h²) scores in the "criollo" "plus trees" ranged from 0.28 to 0.51. The highest h² values were observed for ribbed petiole and adaxial veins with 0.47 (95% CI 0.2-0.8) and 0.51 (CI 0.2-0.8), respectively. The h² scores for the agronomic traits ranged from 0.34 (CI 0.2-0.6) to 0.39 (CI 0.2-0.6) for seed weight, fruit weight, and total volume, respectively. When inspecting yield variation across 5-year-old grafted avocado cv. Hass trees with elite OP half-sib seedling rootstocks, the traits total number of fruits and fruits' weight, respectively, exhibited h² scores of 0.36 (± 0.23) and 0.11 (± 0.09). 
Our results indicate that elite "criollo" "plus trees" may serve as promissory donors of seedling rootstocks for avocado cv. Hass orchards due to the inheritance of their outstanding trait values. This reinforces the feasibility to leverage natural variation from "plus trees" via OP half-sib seedling rootstock families. By jointly estimating half-sib family effects and rootstock-mediated heritability, this study promises boosting seedling rootstock breeding programs, while better discerning the consequences of grafting in fruit tree crops.
Affiliation(s)
- Gloria Patricia Cañas-Gutiérrez
- Corporación Colombiana de Investigación Agropecuaria AGROSAVIA, C.I. La Selva, Rionegro, Colombia
- Corporation for Biological Research (CIB), Unit of Phytosanity and Biological Control, Medellín, Colombia
- *Correspondence: Gloria Patricia Cañas-Gutiérrez,
| | - Stella Sepulveda-Ortega
- Corporación Colombiana de Investigación Agropecuaria AGROSAVIA, C.I. La Selva, Rionegro, Colombia
| | - Felipe López-Hernández
- Corporación Colombiana de Investigación Agropecuaria AGROSAVIA, C.I. La Selva, Rionegro, Colombia
| | | | - Andrés J. Cortés
- Corporación Colombiana de Investigación Agropecuaria AGROSAVIA, C.I. La Selva, Rionegro, Colombia
- Andrés J. Cortés,
| |
|
33
|
Martins Oliveira IC, Bernardeli A, Soler Guilhen JH, Pastina MM. Genomic Prediction of Complex Traits in an Allogamous Annual Crop: The Case of Maize Single-Cross Hybrids. Methods Mol Biol 2022; 2467:543-567. [PMID: 35451790 DOI: 10.1007/978-1-0716-2205-6_20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
For many plant and animal species, commercial products are hybrids between individuals from different genetic groups. For allogamous plant species such as maize, the breeding objective is to produce single-cross hybrid varieties from two inbred lines each selected in complementary groups. Efficient hybrid breeding requires methods that (1) quickly generate homozygous and homogeneous parental lines with high combining abilities, (2) efficiently choose among the large number of available parental lines the most promising ones, and (3) predict the performances of sets of non-phenotyped single-cross hybrids, or hybrids phenotyped in a limited number of environments, based on their relationship with another set of hybrids with known performances. The maize breeding community has been developing model-based prediction of hybrid performances well before the genomic era. This chapter (1) provides a reminder of the maize breeding scheme before the genomic era; (2) describes how genomic data were incorporated in the prediction models involved in different steps of genomic-based single-cross maize hybrid breeding; and (3) reviews factors affecting the accuracy of genomic prediction, approaches for optimizing GP-based single-cross maize hybrid breeding schemes, and ensuring the long-term sustainability of genomic selection.
Affiliation(s)
| | - Arthur Bernardeli
- Department of Agronomy, Universidade Federal de Viçosa, Viçosa-MG, Brazil
| | | | | |
|
34
|
Fernández-Paz J, Cortés AJ, Hernández-Varela CA, Mejía-de-Tafur MS, Rodriguez-Medina C, Baligar VC. Rootstock-Mediated Genetic Variance in Cadmium Uptake by Juvenile Cacao (Theobroma cacao L.) Genotypes, and Its Effect on Growth and Physiology. FRONTIERS IN PLANT SCIENCE 2021; 12:777842. [PMID: 35003163 PMCID: PMC8733334 DOI: 10.3389/fpls.2021.777842] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 09/15/2021] [Accepted: 11/12/2021] [Indexed: 05/02/2023]
Abstract
Grafting typically offers a shortcut to breed tree orchards throughout a multidimensional space of traits. Despite an overwhelming spectrum of rootstock-mediated effects on scion traits observed across several species, the exact nature and mechanisms underlying these effects in cacao (Theobroma cacao L.) plants often remain overlooked. Therefore, we aimed to explicitly quantify rootstock-mediated genetic contributions in recombinant juvenile cacao plants across target traits, specifically cadmium (Cd) uptake, and its correlation with growth and physiological traits. Content of chloroplast pigments, fluorescence of chlorophyll a, leaf gas exchange, nutrient uptake, and plant biomass were examined across ungrafted saplings and target rootstock × scion combinations in soils with contrasting levels of Cd. This panel considered a total of 320 progenies from open-pollinated half-sib families and reciprocal full-sib progenies (derived from controlled crosses between the reference genotypes IMC67 and PA121). Both family types were used as rootstocks in grafts with two commercial clones (ICS95 and CCN51) commonly grown in Colombia. A pedigree-based best linear unbiased prediction (A-BLUP) mixed model was implemented to quantify rootstock-mediated narrow-sense heritability (h²) for target traits. A Cd effect measured on rootstocks before grafting was observed in plant biomass, nutrient uptake, and content of chloroplast pigments. After grafting, damage to Photosystem II (PSII) was also evident in some rootstock × scion combinations. Differences in the specific combining ability for Cd uptake were mostly detected in ungrafted rootstocks, or 2 months after grafting with the clonal CCN51 scion. Moderate rootstock effects (h² > 0.1) were detected before grafting for five growth traits, four nutrient uptake properties, and chlorophylls and carotenoids content (h² = 0.19, 95% CI 0.05-0.61, r = 0.7).
Such rootstock effects faded (h² < 0.1) when rootstock genotypes were examined in soils without Cd, or 4 months after grafting. These results suggest a pervasive genetic conflict between the rootstock and the scion genotypes, involving the triple rootstock × scion × soil interaction with regard to Cd and nutrient uptake, early growth, and photosynthetic processes in juvenile cacao plants. Overall, building on these findings will help harness early breeding schemes of cacao rootstock genotypes compatible with commercial clonal scions and adapted to soils enriched with toxic levels of Cd.
Affiliation(s)
- Jessica Fernández-Paz
- Corporación Colombiana de Investigación Agropecuaria (AGROSAVIA) – C.I Palmira, Palmira, Colombia
- Facultad de Ciencias Agropecuarias, Universidad Nacional de Colombia Sede Palmira, Palmira, Colombia
| | - Andrés J. Cortés
- Corporación Colombiana de Investigación Agropecuaria (AGROSAVIA) – C.I La Selva, Rionegro, Colombia
- Facultad de Ciencias Agrarias – Departamento de Ciencias Forestales, Universidad Nacional de Colombia Sede Medellín, Medellín, Colombia
| | | | - Maria Sara Mejía-de-Tafur
- Facultad de Ciencias Agropecuarias, Universidad Nacional de Colombia Sede Palmira, Palmira, Colombia
| | - Caren Rodriguez-Medina
- Corporación Colombiana de Investigación Agropecuaria (AGROSAVIA) – C.I Palmira, Palmira, Colombia
| | - Virupax C. Baligar
- United States Department of Agriculture-Agricultural Research Service-Beltsville Agricultural Research Center, Beltsville, MD, United States
| |
|
35
|
Atashi H, Wilmot H, Gengler N. The pattern of linkage disequilibrium in Dual-Purpose Belgian Blue cattle. J Anim Breed Genet 2021; 139:320-329. [PMID: 34859921 DOI: 10.1111/jbg.12662] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/14/2021] [Revised: 11/14/2021] [Accepted: 11/22/2021] [Indexed: 11/27/2022]
Abstract
Quantifying the level of linkage disequilibrium (LD), the non-random association of alleles at two or more loci, is important to determine the number of markers needed for genomic selection. The aims of this study were to evaluate the extent of LD in Dual-Purpose Belgian Blue (DPBB) and to compare the level of LD in DPBB with that of Walloon Holstein. Data of 28,427 single nucleotide polymorphisms (SNP), located on 29 Bos taurus autosomes (BTA), of 639 DPBB and 398 Holstein bulls were used. The level of LD between pairwise SNPs separated by up to 10 Mb was evaluated, separately for each breed, using the squared correlation of the alleles at two loci (r²). The analysis of molecular variance showed that the percentage of variation within populations (85.48%) was higher than between populations (14.52%). However, permutation tests showed a significant genetic differentiation between the two studied populations (p < .01). The average LD found between adjacent SNP pairs in DPBB (0.16, SD = 0.22) was generally lower than in Holstein (0.23, SD = 0.27). The proportion of SNPs in useful LD (r² > 0.30) within a genomic distance of ≤0.10 Mb between SNPs was 18.58% and 28.23% in DPBB and Holstein bulls, respectively. In both breeds, the effective population size decreased over generations; however, the decline was greater in DPBB than in Holstein. Based on these results, it can be concluded that at least 68,000 SNPs are needed for implementing genomic selection in DPBB cattle with enough accuracy.
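As a rough illustration of the LD measure named in this abstract, the sketch below computes r² as the squared Pearson correlation between allele counts (0/1/2) at two SNPs and checks it against the 0.30 "useful LD" threshold quoted above. The function name `ld_r2` and the toy genotypes are hypothetical, and working from unphased allele counts rather than haplotypes is an assumption; the software and settings actually used in the study are not stated here.

```python
def ld_r2(snp_a, snp_b):
    """Squared Pearson correlation between allele counts at two loci."""
    n = len(snp_a)
    if n != len(snp_b) or n < 2:
        raise ValueError("need two equal-length genotype vectors")
    mean_a = sum(snp_a) / n
    mean_b = sum(snp_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(snp_a, snp_b))
    var_a = sum((a - mean_a) ** 2 for a in snp_a)
    var_b = sum((b - mean_b) ** 2 for b in snp_b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # monomorphic locus: r2 is undefined
    return cov * cov / (var_a * var_b)


# Toy allele counts for six animals (hypothetical data, not from the study).
snp1 = [0, 1, 2, 2, 1, 0]
snp2 = [0, 1, 2, 1, 1, 0]

r2 = ld_r2(snp1, snp2)
useful = r2 > 0.30  # threshold for "useful LD" quoted in the abstract
```

Averaging such pairwise values within bins of physical distance would give an LD-decay summary of the kind the abstract reports, though the binning scheme is not specified there.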
Affiliation(s)
- Hadi Atashi
- TERRA Research and Training Center, Gembloux Agro-Bio Tech, University of Liège, Gembloux, Belgium; Department of Animal Science, Shiraz University, Shiraz, Iran
| | - Hélène Wilmot
- TERRA Research and Training Center, Gembloux Agro-Bio Tech, University of Liège, Gembloux, Belgium; National Fund for Scientific Research (F.R.S.-FNRS), Brussels, Belgium
| | - Nicolas Gengler
- TERRA Research and Training Center, Gembloux Agro-Bio Tech, University of Liège, Gembloux, Belgium
| |
|
36
|
|
37
|
Tilhou NW, Casler MD. Subsampling and DNA pooling can increase gains through genomic selection in switchgrass. THE PLANT GENOME 2021; 14:e20149. [PMID: 34626166 DOI: 10.1002/tpg2.20149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2021] [Accepted: 07/22/2021] [Indexed: 06/13/2023]
Abstract
Genomic selection (GS) can accelerate breeding cycles in perennial crops such as the bioenergy grass switchgrass (Panicum virgatum L.). The sequencing costs of GS can be reduced by pooling DNA samples in the training population (TP), only sequencing TP phenotypic outliers, or pooling candidate population (CP) samples. These strategies were simulated for two traits (spring vigor and anthesis date) in three breeding populations. Sequencing only the outlier 50% of the TP phenotype distribution resulted in a penalty of <5% of the predictive ability, measured using cross-validation. Predictive ability also decreased when sequencing progressively fewer TP DNA pools, but TPs constructed from only two phenotypically contrasting DNA samples retained a mean of >80% predictive ability relative to individual TP sequencing. Novel group testing methods allowed greater than one CP individual to be screened per sequenced DNA sample but resulted in a predictive ability penalty. To determine the impact of reduced sequencing, genetic gain was calculated for seven GS scenarios with variable sequencing budgets. Reduced TP sequencing and most CP pooling methods were superior to individual sequence-based GS when sequencing resources were restricted (2,000 DNA samples per 5-yr cycle). Only one scenario was superior to individual sequencing when sequencing budgets were large (8,000 DNA samples per 5-yr cycle). This study highlights multiple routes for reduced sequencing costs in GS.
Affiliation(s)
- Neal Wepking Tilhou
- Department of Agronomy, University of Wisconsin, 1575 Linden Dr, Madison, WI, 53706, USA
| | - Michael D Casler
- U.S. Dairy Forage Research Center, USDA-ARS, 1925 Linden Dr, Madison, WI, 53706-1108, USA
| |
|
38
|
Singh RK, Prasad M. Big genomic data analysis leads to more accurate trait prediction in hybrid breeding for yield enhancement in crop plants. PLANT CELL REPORTS 2021; 40:2009-2011. [PMID: 34309724 DOI: 10.1007/s00299-021-02761-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2021] [Accepted: 07/20/2021] [Indexed: 06/13/2023]
Abstract
In plant breeding, 'big data' refers to the cumulative genotyping and phenotyping information obtained from a series of experimental sets or generated from a large number of accessions. A recent study supports the use of big data to enhance the accuracy of complex trait prediction during hybrid breeding of crop plants.
Affiliation(s)
- Roshan Kumar Singh
- National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067, India
| | - Manoj Prasad
- National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067, India.
- Department of Plant Sciences, University of Hyderabad, Hyderabad, 500046, Telangana, India.
| |
|
39
|
Isidro y Sánchez J, Akdemir D. Training Set Optimization for Sparse Phenotyping in Genomic Selection: A Conceptual Overview. FRONTIERS IN PLANT SCIENCE 2021; 12:715910. [PMID: 34589099 PMCID: PMC8475495 DOI: 10.3389/fpls.2021.715910] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 05/27/2021] [Accepted: 08/10/2021] [Indexed: 06/13/2023]
Abstract
Genomic selection (GS) is becoming an essential tool in breeding programs due to its role in increasing genetic gain per unit time. The design of the training set (TRS) is one of the key steps in implementing GS in plant and animal breeding programs, mainly because (i) TRS optimization is critical for the efficiency and effectiveness of GS; (ii) breeders test genotypes in multi-year and multi-location trials to select the best-performing ones, and in this framework TRS optimization can help to decrease the number of genotypes to be tested and, therefore, reduce phenotyping cost and time; and (iii) better prediction accuracies can be obtained from an optimally selected TRS than from an arbitrary TRS. Here, we review the lessons learned from TRS optimization studies in plants and their impact on crop breeding, and discuss important features for the success of TRS optimization under different scenarios. We also review the major challenges associated with the optimization of GS, including population size, the relationship between training and test set (TS), update of TRS, and the use of different packages and algorithms for TRS implementation in GS. Finally, we describe general guidelines for improving the rate of genetic improvement by maximizing the use of TRS optimization in the GS framework.
Affiliation(s)
- Julio Isidro y Sánchez
- Centro de Biotecnologia y Genómica de Plantas, Instituto Nacional de Investigación y Tecnologia Agraria y Alimentaria, Universidad Politécnica de Madrid, Campus de Montegancedo, Madrid, Spain
| | - Deniz Akdemir
- Animal and Crop Science Division, Agriculture and Food Science Centre, University College Dublin, Dublin, Ireland
| |
|
40
|
Esuma W, Ozimati A, Kulakow P, Gore MA, Wolfe MD, Nuwamanya E, Egesi C, Kawuki RS. Effectiveness of genomic selection for improving provitamin A carotenoid content and associated traits in cassava. G3 (BETHESDA, MD.) 2021; 11:jkab160. [PMID: 33963852 PMCID: PMC8496257 DOI: 10.1093/g3journal/jkab160] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/31/2021] [Accepted: 04/26/2021] [Indexed: 11/14/2022]
Abstract
Global efforts are underway to develop cassava with enhanced levels of provitamin A carotenoids to sustainably meet increasing demands for food and nutrition where the crop is a major staple. Herein, we tested the effectiveness of genomic selection (GS) for rapid improvement of cassava for total carotenoids content and associated traits. We evaluated 632 clones from Uganda's provitamin A cassava breeding pipeline and 648 West African introductions. At harvest, each clone was assessed for level of total carotenoids, dry matter content, and resistance to cassava brown streak disease (CBSD). All clones were genotyped with diversity array technology and imputed to a set of 23,431 single nucleotide polymorphism markers. We assessed the predictive ability of four genomic prediction methods in scenarios of cross-validation, across-population prediction, and inclusion of quantitative trait loci markers. Cross-validations produced the highest mean predictive ability for total carotenoids content (0.52) and the lowest for CBSD resistance (0.20), with G-BLUP outperforming the other models tested. Across-population predictions showed a low ability of the Ugandan population to predict the performance of West African clones, with the highest predictive ability recorded for total carotenoids content (0.34) and the lowest for CBSD resistance (0.12) using G-BLUP. By incorporating chromosome 1 markers associated with carotenoids content as an independent kernel in the G-BLUP model of a cross-validation scenario, predictive ability improved slightly from 0.52 to 0.58. These results reinforce ongoing efforts aimed at integrating GS into cassava breeding and demonstrate the utility of this tool for rapid genetic improvement.
Affiliation(s)
- Williams Esuma
- National Crops Resources Research Institute, Kampala, Uganda
| | - Alfred Ozimati
- National Crops Resources Research Institute, Kampala, Uganda
| | - Peter Kulakow
- International Institute for Tropical Agriculture, Ibadan, Nigeria
| | - Michael A Gore
- Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
| | - Marnin D Wolfe
- Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
| | | | - Chiedozie Egesi
- International Institute for Tropical Agriculture, Ibadan, Nigeria
- Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
| | - Robert S Kawuki
- National Crops Resources Research Institute, Kampala, Uganda
| |
|
41
|
Kim M, Nguyen TTP, Ahn JH, Kim GJ, Sim SC. Genome-wide association study identifies QTL for eight fruit traits in cultivated tomato (Solanum lycopersicum L.). HORTICULTURE RESEARCH 2021; 8:203. [PMID: 34465758 PMCID: PMC8408251 DOI: 10.1038/s41438-021-00638-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/21/2021] [Revised: 06/19/2021] [Accepted: 06/25/2021] [Indexed: 05/28/2023]
Abstract
Genome-wide association study (GWAS) is effective in identifying favorable alleles for traits of interest with high mapping resolution in crop species. In this study, we conducted GWAS to explore quantitative trait loci (QTL) for eight fruit traits using 162 tomato accessions with diverse genetic backgrounds. The eight traits included fruit weight, fruit width, fruit height, fruit shape index, pericarp thickness, locule number, fruit firmness, and brix. Phenotypic variations of these traits in the tomato collection were evaluated with three replicates in field trials over three years. We filtered 34,550 confident SNPs from the 51 K Axiom® tomato array based on < 10% missing data and > 5% minor allele frequency for association analysis. The 162 tomato accessions were divided into seven clusters and their membership coefficients were used to account for population structure along with a kinship matrix. To identify marker-trait associations (MTAs), four phenotypic data sets, one for each of the three years and one combined, were independently analyzed in the multilocus mixed model (MLMM). A total of 30 significant MTAs were detected across data sets for the eight fruit traits at P < 0.0005. The number of MTAs per trait ranged from one (brix) to seven (fruit weight and fruit width). Two SNP markers on chromosomes 1 and 2 were significantly associated with multiple traits, suggesting pleiotropic effects of QTL. Furthermore, 16 of the 30 MTAs suggest potential novel QTL for the eight fruit traits. These results facilitate genetic dissection of tomato fruit traits and provide a useful resource to develop molecular tools for improving fruit traits via marker-assisted selection and genomic selection in tomato breeding programs.
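The SNP filtering step described in this abstract (less than 10% missing data, more than 5% minor allele frequency) can be sketched as follows. This is a minimal illustration under stated assumptions: genotypes are coded as allele counts 0/1/2 with `None` for a missing call, and the function name `passes_qc`, the SNP names, and the toy data are all hypothetical rather than taken from the study.

```python
def passes_qc(genotypes, max_missing=0.10, min_maf=0.05):
    """Return True if a SNP has < max_missing missing calls and MAF > min_maf.

    genotypes: allele counts (0, 1, 2) per accession; None marks a missing call.
    """
    n = len(genotypes)
    called = [g for g in genotypes if g is not None]
    if n == 0 or not called:
        return False
    missing_rate = 1 - len(called) / n
    if missing_rate >= max_missing:
        return False
    freq = sum(called) / (2 * len(called))  # frequency of the counted allele
    maf = min(freq, 1 - freq)
    return maf > min_maf


# Hypothetical toy panel: one SNP passes, one is too rare, one has too many gaps.
snps = {
    "snp_ok": [0, 1, 2, 1, 0, 1, 2, 0, 1, 1],
    "snp_rare": [0] * 19 + [1],
    "snp_missing": [0, 1, None, None, 2, 1, 0, 1, 2, 1],
}
kept = [name for name, calls in snps.items() if passes_qc(calls)]
```

The thresholds are exposed as parameters so the same check can be rerun with stricter or looser cut-offs.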
Affiliation(s)
- Minkyung Kim
- Department of Bioresources Engineering, Sejong University, Seoul, Republic of Korea
| | | | | | - Gi-Jun Kim
- Asia Seed R&D center, Icheon, Republic of Korea
| | - Sung-Chur Sim
- Department of Bioresources Engineering, Sejong University, Seoul, Republic of Korea.
- Plant Engineering Research Institute, Sejong University, Seoul, Republic of Korea.
| |
|
42
|
Impact of Marker Pruning Strategies Based on Different Measurements of Marker Distance on Genomic Prediction in Dairy Cattle. Animals (Basel) 2021; 11:ani11071992. [PMID: 34359120 PMCID: PMC8300388 DOI: 10.3390/ani11071992] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/06/2021] [Revised: 06/27/2021] [Accepted: 06/28/2021] [Indexed: 11/16/2022]
Abstract
Simple Summary: The usefulness of genomic prediction (GP) has been widely demonstrated by breeding analyses in livestock, plants and aquatic populations. It is well known that ‘marker density’ is a critical factor affecting the accuracy of GP; however, how to properly measure ‘marker density’ in GP is yet to be determined. With population-level whole-genome sequence data or high-density single nucleotide polymorphism (SNP) data available, this question can be answered more convincingly. In this study, we investigated and discussed the impact of four ‘marker density’ measures that reflect genetic or physical distances between SNPs on the accuracy of GP in a German Holstein dairy cattle population. Our results showed that the degree of variation of physical distance between adjacent SNPs had significant effects on the accuracy of GP, while the genetic distance between SNPs had no relationship with the accuracy of GP. Therefore, for studies based on high-density SNP data, the default strategy of pruning SNPs based on genetic distance is detrimental to heritability estimation and genomic prediction. The results extend the community's knowledge of ‘marker density’ and provide useful suggestions for the application of, and research on, genomic prediction.
Abstract: With the availability of high-density single-nucleotide polymorphism (SNP) data and the development of genotype imputation methods, high-density panel-based genomic prediction (GP) has become possible in livestock breeding. It is generally considered that the genomic estimated breeding value (GEBV) accuracy increases with marker density, although studies have shown that the GEBV accuracy does not increase, or even decreases, when high-density panels are used. Therefore, in addition to the SNP number, other measurements of ‘marker density’ seem to affect the GEBV accuracy, and exploring the relationship between the GEBV accuracy and the measurements of ‘marker density’ based on high-density SNP or whole-genome sequence data is important for the field of GP. In this study, we constructed different SNP panels with certain SNP numbers (e.g., 1 k) by using the physical distance (PhyD), genetic distance (GenD) and random distance (RanD) between SNPs, respectively, based on the high-density SNP data of a German Holstein dairy cattle population. There are therefore three different panels at each SNP number level. These panels were used to construct GP models to predict fat percentage, milk yield and somatic cell score. Meanwhile, the mean ($\bar{d}$) and variance ($\sigma_d^2$) of the physical distance between SNPs and the mean ($\overline{r^2}$) and variance ($\sigma_{r^2}^2$) of the genetic distance between SNPs in each panel were used as marker density-related measurements, and their influence on the GEBV accuracy was investigated. At the same SNP number level, the $\bar{d}$ of all panels is basically the same, but $\sigma_d^2$, $\overline{r^2}$ and $\sigma_{r^2}^2$ differ. Therefore, we only investigated the effects of $\sigma_d^2$, $\overline{r^2}$ and $\sigma_{r^2}^2$ on the GEBV accuracy. The results showed that, at a given SNP number level, the GEBV accuracy was negatively correlated with $\sigma_d^2$, but not with $\overline{r^2}$ or $\sigma_{r^2}^2$. Compared with GenD and RanD, the $\sigma_d^2$ of panels constructed by PhyD is smaller. The low and moderate-density panels (< 50 k) constructed by RanD or GenD have large $\sigma_d^2$, which is not conducive to genomic prediction. The GEBV accuracy of the low and moderate-density panels constructed by PhyD is 3.8-34.8% higher than that of the low and moderate-density panels constructed by RanD and GenD. Panels with 20–30 k SNPs constructed by PhyD can achieve the same or slightly higher GEBV accuracy than that of high-density SNP panels for all three traits.
In summary, the smaller the variation in physical distance between adjacent SNPs, the higher the GEBV accuracy. The low and moderate-density panels constructed by physical distance are beneficial to genomic prediction, while pruning high-density SNP data based on genetic distance is detrimental to genomic prediction. The results provide suggestions for the development of SNP panels and for research on genomic prediction based on whole-genome sequence data.
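To make the physical-distance measures in this abstract concrete, the sketch below computes the mean and variance of the distance between adjacent SNPs for two panels of equal size, one picked to be evenly spaced and one clustered. It is only an illustration under assumptions: the selection rule in `evenly_spaced_panel` (nearest SNP to evenly spaced target positions) is a hypothetical stand-in for the study's PhyD procedure, which is not described here, and the base-pair positions are invented.

```python
from statistics import mean, pvariance


def spacing_stats(positions):
    """Mean and variance of physical distance between adjacent SNPs."""
    pos = sorted(positions)
    gaps = [b - a for a, b in zip(pos, pos[1:])]
    return mean(gaps), pvariance(gaps)


def evenly_spaced_panel(positions, k):
    """Pick the k SNPs closest to k evenly spaced target positions."""
    pos = sorted(positions)
    if not 2 <= k <= len(pos):
        raise ValueError("k must be between 2 and the number of SNPs")
    step = (pos[-1] - pos[0]) / (k - 1)
    chosen = []
    for i in range(k):
        target = pos[0] + i * step
        # Nearest SNP not already in the panel.
        best = min((p for p in pos if p not in chosen),
                   key=lambda p: abs(p - target))
        chosen.append(best)
    return sorted(chosen)


# Invented base-pair positions on one chromosome.
positions = [100, 150, 200, 1000, 2000, 2100, 3000, 3900, 4000, 5000, 5050, 6000]

even_panel = evenly_spaced_panel(positions, 7)
clustered_panel = [100, 150, 200, 1000, 5000, 5050, 6000]  # same size, same ends

mean_even, var_even = spacing_stats(even_panel)
mean_clu, var_clu = spacing_stats(clustered_panel)
```

Because both panels span the same interval with the same number of SNPs, their mean spacing is identical and only the variance separates them, which mirrors the abstract's observation that the mean distance is basically the same across panels of a given size.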
|
43
|
Gore D, Okeno T, Muasya T, Mburu J. Improved response to selection in dairy goat breeding programme through reproductive technology and genomic selection in the tropics. Small Rumin Res 2021. [DOI: 10.1016/j.smallrumres.2021.106397] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
|
44
|
Derbyshire MC, Khentry Y, Severn-Ellis A, Mwape V, Saad NSM, Newman TE, Taiwo A, Regmi R, Buchwaldt L, Denton-Giles M, Batley J, Kamphuis LG. Modeling first order additive × additive epistasis improves accuracy of genomic prediction for sclerotinia stem rot resistance in canola. THE PLANT GENOME 2021; 14:e20088. [PMID: 33629543 DOI: 10.1002/tpg2.20088] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/30/2020] [Accepted: 01/13/2021] [Indexed: 06/12/2023]
Abstract
The fungus Sclerotinia sclerotiorum infects hundreds of plant species including many crops. Resistance to this pathogen in canola (Brassica napus L. subsp. napus) is controlled by numerous quantitative trait loci (QTL). For such polygenic traits, genomic prediction may be useful for breeding as it can capture many QTL at once while also considering nonadditive genetic effects. Here, we test application of common regression models to genomic prediction of S. sclerotiorum resistance in canola in a diverse panel of 218 plants genotyped at 24,634 loci. Disease resistance was scored by infection with an aggressive isolate and monitoring over 3 wk. We found that including first-order additive × additive epistasis in linear mixed models (LMMs) improved accuracy of breeding value estimation between 3 and 40%, depending on method of assessment, and correlation between phenotypes and predicted total genetic values by 14%. Bayesian models performed similarly to or worse than genomic relationship matrix-based models for estimating breeding values or overall phenotypes from genetic values. Bayesian ridge regression, which is most similar to the genomic relationship matrix-based approach in the amount of shrinkage it applies to marker effects, was the most accurate of this family of models. This confirms several studies indicating the highly polygenic nature of sclerotinia stem rot resistance. Overall, our results highlight the use of simple epistasis terms for prediction of breeding values and total genetic values for a complex disease resistance phenotype in canola.
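One common way to add a first-order additive × additive term to a relationship-matrix model is to take the element-wise (Hadamard) product of the additive genomic relationship matrix with itself. The sketch below shows that construction under assumptions: whether the study built its epistatic kernel this way is not stated in the abstract, the additive matrix follows the widely used VanRaden-style scaling, and the function names and toy genotypes are hypothetical.

```python
def additive_grm(genotypes):
    """Additive genomic relationship matrix from allele counts (rows = plants)."""
    n, m = len(genotypes), len(genotypes[0])
    # Allele frequency per marker, estimated from the sample itself.
    p = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    # Center each marker by twice its allele frequency.
    z = [[row[j] - 2 * p[j] for j in range(m)] for row in genotypes]
    scale = 2 * sum(pj * (1 - pj) for pj in p)
    if scale == 0:
        raise ValueError("all markers are monomorphic")
    return [[sum(z[i][k] * z[j][k] for k in range(m)) / scale
             for j in range(n)] for i in range(n)]


def hadamard(a, b):
    """Element-wise product of two equally sized matrices."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# Toy allele counts: 4 plants x 5 markers (invented data).
geno = [
    [0, 1, 2, 1, 0],
    [1, 1, 0, 2, 1],
    [2, 0, 1, 0, 2],
    [1, 2, 1, 1, 0],
]

g = additive_grm(geno)      # additive kernel
g_aa = hadamard(g, g)       # additive x additive kernel (assumed construction)
```

In a mixed model, `g` and `g_aa` would enter as covariance structures for two separate random effects; fitting that model requires a variance-component solver and is outside this sketch.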
Affiliation(s)
- Mark C Derbyshire
- Centre for Crop and Disease Management, School of Molecular and Life Sciences, Curtin University, Perth, Western Australia, Australia
| | - Yuphin Khentry
- Centre for Crop and Disease Management, School of Molecular and Life Sciences, Curtin University, Perth, Western Australia, Australia
| | - Anita Severn-Ellis
- School of Biological Sciences, University of Western Australia, Perth, Western Australia, Australia
| | - Virginia Mwape
- Centre for Crop and Disease Management, School of Molecular and Life Sciences, Curtin University, Perth, Western Australia, Australia
| | - Nur Shuhadah Mohd Saad
- School of Biological Sciences, University of Western Australia, Perth, Western Australia, Australia
| | - Toby E Newman
- Centre for Crop and Disease Management, School of Molecular and Life Sciences, Curtin University, Perth, Western Australia, Australia
| | - Akeem Taiwo
- Centre for Crop and Disease Management, School of Molecular and Life Sciences, Curtin University, Perth, Western Australia, Australia
| | - Roshan Regmi
- Centre for Crop and Disease Management, School of Molecular and Life Sciences, Curtin University, Perth, Western Australia, Australia
| | - Lone Buchwaldt
- Agriculture and Agri-Food, Saskatoon, Saskatchewan, Canada
| | | | - Jacqueline Batley
- School of Biological Sciences, University of Western Australia, Perth, Western Australia, Australia
| | - Lars G Kamphuis
- Centre for Crop and Disease Management, School of Molecular and Life Sciences, Curtin University, Perth, Western Australia, Australia
| |
|
45
|
Genomic Prediction in Local Breeds: The Rendena Cattle as a Case Study. Animals (Basel) 2021; 11:ani11061815. [PMID: 34207091 PMCID: PMC8234894 DOI: 10.3390/ani11061815] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 05/26/2021] [Revised: 06/16/2021] [Accepted: 06/16/2021] [Indexed: 01/26/2023]
Abstract
Simple Summary: Although genomic selection is being used in many livestock species, it has not yet been considered in local breeds because of their smaller population size and its potentially less effective impact on the genetic evaluation of these breeds. The current research aims to investigate how genomic data can affect the accuracy of genetic predictions for beef traits in Rendena, a small local cattle breed of the North-East of Italy selected for a dual purpose. Classical animal models using only phenotypic information were compared with two models that integrated genomic data with pedigree information. The genomic models gave more accurate estimated breeding values than the ‘classical’ animal model, especially the ‘simpler’ one assuming homogeneous variances of single nucleotide polymorphisms. Our results show that genomic information can be successfully applied to breeding selection scenarios even in small local cattle breeds such as Rendena.
Abstract: The maintenance of local cattle breeds is key to selecting for efficient food production, landscape protection, and conservation of biodiversity and local cultural heritage. Rendena is an indigenous cattle breed from the alpine North-East of Italy, selected for dual purpose, but with lesser emphasis given to beef traits. In this situation, increasing accuracy for beef traits could prevent detrimental effects due to the antagonism with milk production. Our study assessed the impact of genomic information on estimated breeding values (EBVs) in Rendena performance-tested bulls. Traits considered were average daily gain, in vivo EUROP score, and in vivo estimate of dressing percentage. The final dataset contained 1691 individuals with phenotypes and 8372 animals in pedigree, 1743 of which were genotyped. Using cross-validation, three models were compared: (i) pedigree-BLUP (PBLUP); (ii) single-step GBLUP (ssGBLUP); and (iii) weighted single-step GBLUP (WssGBLUP).
Models including genomic information presented higher accuracy, especially WssGBLUP. However, the model with the best overall properties was ssGBLUP, showing higher accuracy than PBLUP and optimal values of bias and dispersion parameters. Our study demonstrated that integrating phenotypes for beef traits with genomic data can be helpful to estimate EBVs, even in a small local breed.
|
46
|
Salek Ardestani S, Jafarikia M, Sargolzaei M, Sullivan B, Miar Y. Genomic Prediction of Average Daily Gain, Back-Fat Thickness, and Loin Muscle Depth Using Different Genomic Tools in Canadian Swine Populations. Front Genet 2021; 12:665344. [PMID: 34149806 PMCID: PMC8209496 DOI: 10.3389/fgene.2021.665344] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/07/2021] [Accepted: 04/15/2021] [Indexed: 12/12/2022]
Abstract
Improving the prediction accuracy of estimated breeding values (EBVs) can lead to increased profitability for swine breeding companies. This study was performed to compare the accuracy of different popular genomic prediction methods and traditional best linear unbiased prediction (BLUP) for future performance of back-fat thickness (BFT), average daily gain (ADG), and loin muscle depth (LMD) in Canadian Duroc, Landrace, and Yorkshire swine breeds. In this study, 17,019 pigs were genotyped using Illumina 60K and Affymetrix 50K panels. After quality control and imputation steps, a total of 41,304, 48,580, and 49,102 single-nucleotide polymorphisms remained for Duroc (n = 6,649), Landrace (n = 5,362), and Yorkshire (n = 5,008) breeds, respectively. The breeding values of animals in the validation groups (n = 392–774) were predicted before performance test using BLUP, BayesC, BayesCπ, genomic BLUP (GBLUP), and single-step GBLUP (ssGBLUP) methods. The prediction accuracies were obtained using the correlation between the predicted breeding values and their deregressed EBVs (dEBVs) after performance test. The genomic prediction methods showed higher prediction accuracies than traditional BLUP for all scenarios. Although the accuracies of genomic prediction methods were not significantly (P > 0.05) different, ssGBLUP was the most accurate method for Duroc-ADG, Duroc-LMD, Landrace-BFT, Landrace-ADG, and Yorkshire-BFT scenarios, and BayesCπ was the most accurate method for Duroc-BFT, Landrace-LMD, and Yorkshire-ADG scenarios. Furthermore, the BayesCπ method was the least biased method for Duroc-LMD, Landrace-BFT, Landrace-ADG, Yorkshire-BFT, and Yorkshire-ADG scenarios. Our findings can be beneficial for accelerating the genetic progress of BFT, ADG, and LMD in Canadian swine populations by selecting more accurate and unbiased genomic prediction methods.
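The accuracy measure described in this abstract, the correlation between predicted breeding values and deregressed EBVs in a validation group, can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function names and toy numbers are hypothetical, and the regression slope is included only as one commonly used indicator of prediction bias (the abstract does not state how bias was measured).

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def regression_slope(predicted, observed):
    """Slope of observed values regressed on predicted values.

    A slope near 1 is usually read as little inflation or deflation of the
    predictions (assumed bias indicator, not taken from the abstract).
    """
    mean_p = sum(predicted) / len(predicted)
    mean_o = sum(observed) / len(observed)
    spo = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    spp = sum((p - mean_p) ** 2 for p in predicted)
    return spo / spp


# Hypothetical validation animals: predictions made before the performance
# test, and deregressed EBVs (dEBVs) obtained afterwards.
predicted_bv = [0.10, -0.25, 0.40, 0.05, -0.30]
debv = [0.20, -0.10, 0.55, -0.05, -0.45]

accuracy = pearson(predicted_bv, debv)
slope = regression_slope(predicted_bv, debv)
print(f"prediction accuracy: {accuracy:.3f}, regression slope: {slope:.3f}")
```

Running the same two functions on the output of each method (BLUP, GBLUP, ssGBLUP, and so on) for the same validation animals would give the kind of method-by-scenario comparison the abstract reports.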
Affiliation(s)
- Mohsen Jafarikia
  - Canadian Centre for Swine Improvement, Ottawa, ON, Canada
  - Centre for Genetic Improvement of Livestock (CGIL), Department of Animal Biosciences, University of Guelph, Guelph, ON, Canada
- Mehdi Sargolzaei
  - Department of Pathobiology, University of Guelph, Guelph, ON, Canada
  - Select Sires Inc., Plain City, OH, United States
- Brian Sullivan
  - Canadian Centre for Swine Improvement, Ottawa, ON, Canada
- Younes Miar
  - Department of Animal Science and Aquaculture, Dalhousie University, Truro, NS, Canada
|
47
|
Rice BR, Lipka AE. Diversifying maize genomic selection models. Molecular Breeding: New Strategies in Plant Improvement 2021; 41:33. [PMID: 37309328 PMCID: PMC10236107 DOI: 10.1007/s11032-021-01221-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/16/2020] [Accepted: 03/07/2021] [Indexed: 06/14/2023]
Abstract
Genomic selection (GS) is one of the most powerful tools available for maize breeding. Its use of genome-wide marker data to estimate breeding values translates to increased genetic gains with fewer breeding cycles. In this review, we cover the history of GS and highlight particular milestones during its adaptation to maize breeding. We discuss how GS can be applied to developing superior maize inbreds and hybrids. Additionally, we characterize refinements in GS models that could enable the encapsulation of non-additive genetic effects, genotype by environment interactions, and multiple levels of the biological hierarchy, all of which could ultimately result in more accurate predictions of breeding values. Finally, we suggest the stages in a maize breeding program where it would be beneficial to apply GS. Given the sophistication of the high-throughput phenotypic, genotypic, and other -omic level data currently available to the maize community, now is the time to explore the implications of their incorporation into GS models and thus ensure that genetic gains are being achieved as quickly and efficiently as possible.
Affiliation(s)
- Brian R. Rice
  - Department of Crop Sciences, University of Illinois, Urbana, IL, USA
|
48
|
An Overview of Key Factors Affecting Genomic Selection for Wheat Quality Traits. Plants 2021; 10:745. [PMID: 33920359 PMCID: PMC8069980 DOI: 10.3390/plants10040745] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/18/2021] [Revised: 04/06/2021] [Accepted: 04/08/2021] [Indexed: 11/17/2022]
Abstract
Selection for wheat (Triticum aestivum L.) grain quality is often costly and time-consuming since it requires extensive phenotyping in the last phases of development of new lines and cultivars. The development of high-throughput genotyping in the last decade enabled reliable and rapid predictions of breeding values based only on marker information. Genomic selection (GS) is a method that enables the prediction of breeding values of individuals by simultaneously incorporating all available marker information into a model. The success of GS depends on the obtained prediction accuracy, which is influenced by various molecular, genetic, and phenotypic factors, as well as the factors of the selected statistical model. The objectives of this article are to review research on GS for wheat quality done so far and to highlight the key factors affecting prediction accuracy, in order to suggest the most applicable approach in GS for wheat quality traits.
|
49
|
Knoch D, Werner CR, Meyer RC, Riewe D, Abbadi A, Lücke S, Snowdon RJ, Altmann T. Multi-omics-based prediction of hybrid performance in canola. Theoretical and Applied Genetics 2021; 134:1147-1165. [PMID: 33523261 PMCID: PMC7973648 DOI: 10.1007/s00122-020-03759-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/28/2020] [Accepted: 12/19/2020] [Indexed: 05/05/2023]
Abstract
Complementing or replacing genetic markers with transcriptomic data and using reproducing kernel Hilbert space regression based on Gaussian kernels increase hybrid prediction accuracies for complex agronomic traits in canola. In plant breeding, hybrids gained particular importance due to heterosis, the superior performance of offspring compared to their inbred parents. Since the development of new top performing hybrids requires labour-intensive and costly breeding programmes, including testing of large numbers of experimental hybrids, the prediction of hybrid performance is of utmost interest to plant breeders. In this study, we tested the effectiveness of hybrid prediction models in spring-type oilseed rape (Brassica napus L./canola) employing different omics profiles, individually and in combination. To this end, a population of 950 F1 hybrids was evaluated for seed yield and six other agronomically relevant traits in commercial field trials at several locations throughout Europe. A subset of these hybrids was also evaluated in a climatized glasshouse regarding early biomass production. For each of the 477 parental rapeseed lines, 13,201 single nucleotide polymorphisms (SNPs), 154 primary metabolites, and 19,479 transcripts were determined and used as predictive variables. Both SNP markers and transcripts effectively predict hybrid performance using (genomic) best linear unbiased prediction models (gBLUP). Compared to models using pure genetic markers, models incorporating transcriptome data resulted in significantly higher prediction accuracies for five out of seven agronomic traits, indicating that transcripts carry important information beyond genomic data. Notably, reproducing kernel Hilbert space regression based on Gaussian kernels significantly exceeded the predictive abilities of gBLUP models for six of the seven agronomic traits, demonstrating its potential for implementation in future canola breeding programmes.
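Reproducing kernel Hilbert space regression replaces the linear relationship matrix used in gBLUP with a kernel matrix computed from pairwise distances between lines. A Gaussian kernel built from feature profiles (SNP codes or transcript levels) might be sketched as below. The bandwidth `h`, the scaling by the mean squared distance, and the toy profiles are assumptions made for illustration; the abstract does not specify any of them.

```python
import math


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def gaussian_kernel(profiles, h=1.0):
    """Gaussian kernel matrix K[i][j] = exp(-h * d2[i][j] / mean_d2).

    d2 is the squared Euclidean distance between profiles i and j, and
    mean_d2 is the mean off-diagonal squared distance (assumed scaling that
    makes the hypothetical bandwidth h independent of the feature count).
    """
    n = len(profiles)
    d2 = [[squared_distance(profiles[i], profiles[j]) for j in range(n)]
          for i in range(n)]
    off_diag = [d2[i][j] for i in range(n) for j in range(n) if i != j]
    mean_d2 = sum(off_diag) / len(off_diag) if off_diag else 1.0
    if mean_d2 == 0:
        mean_d2 = 1.0  # all profiles identical: avoid division by zero
    return [[math.exp(-h * d2[i][j] / mean_d2) for j in range(n)]
            for i in range(n)]


# Hypothetical profiles for three lines (e.g. SNPs coded 0/1/2).
toy_profiles = [
    [0, 1, 2],
    [0, 1, 1],
    [2, 0, 0],
]

K = gaussian_kernel(toy_profiles, h=1.0)
for row in K:
    print([round(value, 3) for value in row])
```

The resulting matrix would take the place of the genomic relationship matrix in a mixed-model solver; lines with similar profiles get kernel values near 1, while dissimilar lines decay towards 0, which lets the model capture some non-linear signal.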
Affiliation(s)
- Dominic Knoch
  - Department of Molecular Genetics, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), 06466 Seeland, OT Gatersleben, Germany
- Christian R. Werner
  - The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG Scotland, UK
- Rhonda C. Meyer
  - Department of Molecular Genetics, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), 06466 Seeland, OT Gatersleben, Germany
- David Riewe
  - Department of Molecular Genetics, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), 06466 Seeland, OT Gatersleben, Germany
  - Institute for Ecological Chemistry, Plant Analysis and Stored Product Protection, Julius Kühn Institute (JKI), Federal Research Centre for Cultivated Plants, 14195 Berlin, Germany
- Amine Abbadi
  - NPZ Innovation GmbH, Hohenlieth, 24363 Holtsee, Germany
  - Norddeutsche Pflanzenzucht Hans-Georg Lembke KG, Hohenlieth, 24363 Holtsee, Germany
- Sophie Lücke
  - Norddeutsche Pflanzenzucht Hans-Georg Lembke KG, Hohenlieth, 24363 Holtsee, Germany
- Rod J. Snowdon
  - Department of Plant Breeding, IFZ Research Centre for Biosystems, Land Use and Nutrition, Justus Liebig University, Heinrich-Buff-Ring 26-32, 35392 Giessen, Germany
- Thomas Altmann
  - Department of Molecular Genetics, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), 06466 Seeland, OT Gatersleben, Germany
|
50
|
Mancin E, Sosa-Madrid BS, Blasco A, Ibáñez-Escriche N. Genotype Imputation to Improve the Cost-Efficiency of Genomic Selection in Rabbits. Animals (Basel) 2021; 11:803. [PMID: 33805619 PMCID: PMC8000098 DOI: 10.3390/ani11030803] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/02/2021] [Revised: 03/04/2021] [Accepted: 03/05/2021] [Indexed: 01/19/2023]
Simple Summary
Genotyping costs are still the major limitation for the uptake of genomic selection by the rabbit meat industry, as a large number of genetic markers are needed to improve the prediction of breeding values from genomic data. In this study, several genotyping strategies were examined through simulation scenarios to identify the best feasible options for implementing genomic selection in rabbit breeding programs. Most scenarios emphasized the genotyping of candidate animals with a low Single Nucleotide Polymorphism (SNP) density platform. Imputation accuracies were high for the scenarios with ancestors genotyped at high or medium SNP-densities. However, the scenario with male ancestors genotyped at high SNP-density and only dams genotyped at medium SNP-density was the best economically feasible strategy, taking into account the trade-off among genotyping costs, the accuracy of breeding values and response to selection. The results confirmed that by combining imputation with a careful selection of the animals to be genotyped, it is possible to achieve better performance than Best Linear Unbiased Prediction (BLUP) while reducing genotyping cost.
Abstract
Genomic selection uses genetic marker information to predict genomic breeding values (gEBVs), and can be a suitable tool for selecting low-heritability traits such as litter size in rabbits. However, genotyping costs in rabbits are still too high to enable genomic prediction in selective breeding programs. One method for decreasing genotyping costs is genotype imputation, where parents are genotyped at high SNP-density (HD) and the progeny are genotyped at lower SNP-density, followed by imputation to HD. The aim of this study was to identify the imputation strategies offering the best trade-off between genotyping costs and the accuracy of breeding values for litter size. A selection process, mimicking a commercial breeding rabbit selection program for litter size, was simulated. Two different Quantitative Trait Nucleotide (QTN) models (QTN_5 and QTN_44) were generated 36 times each. From these simulations, seven different scenarios (S1–S7) and a further replicate of the third scenario (S3_A) were created. Each scenario consisted of a different combination of genotyping strategies. In these scenarios, ancestors and progeny were genotyped with a mix of three different platforms, containing 200,000, 60,000, and 600 SNPs at a cost of EUR 100, 50 and 11 per animal, respectively. Imputation accuracy (IA) was measured as the Pearson’s correlation between true genotype and imputed genotype, whilst the accuracy of gEBVs was the correlation between the true breeding value and the estimated one. The relationships between IA, the accuracy of gEBVs, genotyping costs, and response to selection were examined under each QTN model. QTN_44 presented better performance, according to the results of genomic prediction, but the ranking of scenarios was the same in both QTN models. The highest IA (0.99) and accuracy of gEBVs (0.26 for QTN_44 and 0.228 for QTN_5) were observed in S1, where all ancestors were genotyped at HD and progeny at medium SNP-density (MD). Nevertheless, this was the most expensive scenario compared to the others, in which the progeny were genotyped at low SNP-density (LD). Scenarios with low average costs presented low IA, particularly when female ancestors were genotyped at LD (S5) or non-genotyped (S7). The S3_A, imputing whole-genomes, had the lowest accuracy of gEBVs (0.09), even worse than Best Linear Unbiased Prediction (BLUP). The best trade-off between genotyping costs and the accuracy of gEBVs (0.234 for QTN_44 and 0.199 for QTN_5) was in S6, in which dams were genotyped with MD whilst grand-dams were non-genotyped. However, this relationship would depend mainly on the distribution of QTN and SNP across the genome, which calls for further studies on the characterization of the rabbit genome in the Spanish lines. In summary, genomic selection with genotype imputation is feasible in the rabbit industry, considering only genotyping strategies with suitable IA, accuracy of gEBVs, genotyping costs, and response to selection.
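The two quantities traded off in this abstract, imputation accuracy as a Pearson correlation between true and imputed genotypes and the genotyping cost of a scenario, can be sketched as follows. The per-animal prices come from the abstract; the function names, the toy genotypes, and the animal counts per platform are hypothetical and are not the study's scenarios.

```python
import math

# Cost per animal (EUR) for the three platforms named in the abstract:
# HD = 200,000 SNPs, MD = 60,000 SNPs, LD = 600 SNPs.
COST_PER_ANIMAL = {"HD": 100, "MD": 50, "LD": 11}


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def imputation_accuracy(true_genotypes, imputed_genotypes):
    """Imputation accuracy as the correlation of true and imputed genotypes."""
    return pearson(true_genotypes, imputed_genotypes)


def genotyping_cost(animals_per_platform):
    """Return (total cost, mean cost per animal) in EUR for one scenario."""
    total = sum(COST_PER_ANIMAL[platform] * count
                for platform, count in animals_per_platform.items())
    n_animals = sum(animals_per_platform.values())
    return total, total / n_animals


# Hypothetical genotypes coded as allele counts (0, 1, 2) at eight SNPs.
true_gt = [0, 1, 2, 1, 0, 2, 1, 1]
imputed_gt = [0, 1, 2, 1, 1, 2, 1, 0]

# Hypothetical scenario: few ancestors at HD, dams at MD, progeny at LD.
scenario = {"HD": 10, "MD": 20, "LD": 100}

total_cost, mean_cost = genotyping_cost(scenario)
print(f"imputation accuracy: {imputation_accuracy(true_gt, imputed_gt):.3f}")
print(f"total cost: EUR {total_cost}, mean per animal: EUR {mean_cost:.2f}")
```

Computing both numbers for each candidate scenario gives one point per scenario on an accuracy-versus-cost plot, which is the kind of comparison the abstract uses to single out S6.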
Affiliation(s)
- Enrico Mancin
  - Department of Agronomy, Food, Natural Resources, Animals and Environment (DAFNAE), University of Padova, viale dell’Università 16, 35020 Legnaro, PD, Italy
- Bolívar Samuel Sosa-Madrid
  - Institute for Animal Science and Technology, Universitat Politècnica de València, 46022 Valencia, Spain
  - Correspondence: (B.S.S.-M.); (N.I.-E.); Tel.: +34-963877438 (N.I.-E.)
- Agustín Blasco
  - Institute for Animal Science and Technology, Universitat Politècnica de València, 46022 Valencia, Spain
- Noelia Ibáñez-Escriche
  - Institute for Animal Science and Technology, Universitat Politècnica de València, 46022 Valencia, Spain
  - Correspondence: (B.S.S.-M.); (N.I.-E.); Tel.: +34-963877438 (N.I.-E.)
|