Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Akdemir D, Sanchez JI, Jannink JL. Optimization of genomic selection training populations with a genetic algorithm. Genet Sel Evol 2015;47:38. [PMID: 25943105 PMCID: PMC4422310 DOI: 10.1186/s12711-015-0116-6] [Citation(s) in RCA: 79] [Impact Index Per Article: 8.8] [Received: 05/29/2014] [Accepted: 03/30/2015] [Indexed: 01/12/2023]

Cited by Other Article(s)

Couto EGO, Chaves SFS, Dias KOG, Morales-Marroquín JA, Alves-Pereira A, Motoike SY, Colombo CA, Zucchi MI. Training set optimization is a feasible alternative for perennial orphan crop domestication and germplasm management: an Acrocomia aculeata example. FRONTIERS IN PLANT SCIENCE 2024;15:1441683. [PMID: 39323537 PMCID: PMC11423296 DOI: 10.3389/fpls.2024.1441683] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2024] [Accepted: 08/14/2024] [Indexed: 09/27/2024]

Abstract

Orphan perennial native species are gaining importance as sustainability in agriculture becomes crucial to mitigate climate change. Nevertheless, issues related to the undomesticated status and lack of improved germplasm impede the evolution of formal agricultural initiatives. Acrocomia aculeata - a neotropical palm with potential for oil production - is an example. Breeding efforts can aid the species to reach its full potential and increase market competitiveness. Here, we present genomic information and training set optimization as alternatives to boost orphan perennial native species breeding using Acrocomia aculeata as an example. Furthermore, we compared three SNP calling methods and, for the first time, presented the prediction accuracies of three yield-related traits. We collected data for two years from 201 wild individuals. These trees were genotyped, and three references were used for SNP calling: the oil palm genome, de novo sequencing, and the A. aculeata transcriptome. The traits analyzed were fruit dry mass (FDM), pulp dry mass (PDM), and pulp oil content (OC). We compared the predictive ability of GBLUP and BayesB models in cross- and real validation procedures. Afterwards, we tested several optimization criteria regarding consistency and the ability to provide the optimized training set that yielded less risk in both targeted and untargeted scenarios. Using the oil palm genome as a reference and GBLUP models had better results for the genomic prediction of FDM, OC, and PDM (prediction accuracies of 0.46, 0.45, and 0.39, respectively). Using the criteria PEV, r-score and core collection methodology provides risk-averse decisions. Training set optimization is an alternative to improve decision-making while leveraging genomic information as a cost-saving tool to accelerate plant domestication and breeding. 
The optimized training set can be used as a reference for the characterization of native species populations, aiding in decisions involving germplasm collection and construction of breeding populations.

Xie Z, Weng L, He J, Feng X, Xu X, Ma Y, Bai P, Kong Q. PNNGS, a multi-convolutional parallel neural network for genomic selection. FRONTIERS IN PLANT SCIENCE 2024;15:1410596. [PMID: 39290743 PMCID: PMC11405342 DOI: 10.3389/fpls.2024.1410596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2024] [Accepted: 08/19/2024] [Indexed: 09/19/2024]

Kusmec A, Yeh CT'E, Schnable PS. Data-driven identification of environmental variables influencing phenotypic plasticity to facilitate breeding for future climates. THE NEW PHYTOLOGIST 2024. [PMID: 39183371 DOI: 10.1111/nph.19937] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Accepted: 05/20/2024] [Indexed: 08/27/2024]

Montesinos-López OA, Crespo-Herrera L, Pierre CS, Cano-Paez B, Huerta-Prado GI, Mosqueda-González BA, Ramos-Pulido S, Gerard G, Alnowibet K, Fritsche-Neto R, Montesinos-López A, Crossa J. Feature engineering of environmental covariates improves plant genomic-enabled prediction. FRONTIERS IN PLANT SCIENCE 2024;15:1349569. [PMID: 38812738 PMCID: PMC11135473 DOI: 10.3389/fpls.2024.1349569] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2023] [Accepted: 04/11/2024] [Indexed: 05/31/2024]

Bose S, Banerjee S, Kumar S, Saha A, Nandy D, Hazra S. Review of applications of artificial intelligence (AI) methods in crop research. J Appl Genet 2024;65:225-240. [PMID: 38216788 DOI: 10.1007/s13353-023-00826-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2023] [Revised: 12/23/2023] [Accepted: 12/26/2023] [Indexed: 01/14/2024]

Alemu A, Åstrand J, Montesinos-López OA, Isidro Y Sánchez J, Fernández-Gónzalez J, Tadesse W, Vetukuri RR, Carlsson AS, Ceplitis A, Crossa J, Ortiz R, Chawade A. Genomic selection in plant breeding: Key factors shaping two decades of progress. MOLECULAR PLANT 2024;17:552-578. [PMID: 38475993 DOI: 10.1016/j.molp.2024.03.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Revised: 01/22/2024] [Accepted: 03/08/2024] [Indexed: 03/14/2024]

Fernández-González J, Haquin B, Combes E, Bernard K, Allard A, Isidro Y Sánchez J. Maximizing efficiency in sunflower breeding through historical data optimization. PLANT METHODS 2024;20:42. [PMID: 38493115 PMCID: PMC10943787 DOI: 10.1186/s13007-024-01151-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Accepted: 01/30/2024] [Indexed: 03/18/2024]

Lorenzi A, Bauland C, Pin S, Madur D, Combes V, Palaffre C, Guillaume C, Touzy G, Mary-Huard T, Charcosset A, Moreau L. Portability of genomic predictions trained on sparse factorial designs across two maize silage breeding cycles. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2024;137:75. [PMID: 38453705 PMCID: PMC11341662 DOI: 10.1007/s00122-024-04566-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Accepted: 01/30/2024] [Indexed: 03/09/2024]

Abstract

KEY MESSAGE

We validated the efficiency of genomic predictions calibrated on sparse factorial training sets to predict the next generation of hybrids and tested different strategies for updating predictions along generations. Genomic selection offers new prospects for revisiting hybrid breeding schemes by replacing extensive phenotyping of individuals with genomic predictions. Finding the ideal design for training genomic prediction models is still an open question. Previous studies have shown promising predictive abilities using sparse factorial instead of tester-based training sets to predict single-cross hybrids from the same generation. This study aims to further investigate the use of factorials and their optimization to predict line general combining abilities (GCAs) and hybrid values across breeding cycles. It relies on two breeding cycles of a maize reciprocal genomic selection scheme involving multiparental connected reciprocal populations from flint and dent complementary heterotic groups selected for silage performances. Selection based on genomic predictions trained on a factorial design resulted in a significant genetic gain for dry matter yield in the new generation. Results confirmed the efficiency of sparse factorial training sets to predict candidate line GCAs and hybrid values across breeding cycles. Compared to a previous study based on the first generation, the advantage of factorial over tester training sets appeared lower across generations. Updating factorial training sets by adding single-cross hybrids between selected lines from the previous generation or a random subset of hybrids from the new generation both improved predictive abilities. The CDmean criterion helped determine the set of single-crosses to phenotype to update the training set efficiently. Our results validated the efficiency of sparse factorial designs for calibrating hybrid genomic prediction experimentally and showed the benefit of updating it along generations.

de Verdal H, Baertschi C, Frouin J, Quintero C, Ospina Y, Alvarez MF, Cao TV, Bartholomé J, Grenier C. Optimization of Multi-Generation Multi-location Genomic Prediction Models for Recurrent Genomic Selection in an Upland Rice Population. RICE (NEW YORK, N.Y.) 2023;16:43. [PMID: 37758969 PMCID: PMC10533757 DOI: 10.1186/s12284-023-00661-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Accepted: 09/19/2023] [Indexed: 09/29/2023]

Abstract

Genomic selection is a worthy breeding method to improve genetic gain in recurrent selection breeding schemes. The integration of multi-generation and multi-location information could significantly improve genomic prediction models in the context of shuttle breeding. The Cirad-CIAT upland rice breeding program applies recurrent genomic selection and seeks to optimize the scheme to increase genetic gain while reducing phenotyping efforts. We used a synthetic population (PCT27) of which S0 plants were all genotyped and advanced by selfing and bulk seed harvest to the S0:2, S0:3, and S0:4 generations. The PCT27 was then divided into two sets. The S0:2 and S0:3 progenies for PCT27A and the S0:4 progenies for PCT27B were phenotyped in two locations: Santa Rosa the target selection location, within the upland rice growing area, and Palmira, the surrogate location, far from the upland rice growing area but easier for experimentation. While the calibration used either one of the two sets phenotyped in one or two locations, the validation population was only the PCT27B phenotyped in Santa Rosa. Five scenarios of genomic prediction and 24 models were performed and compared. Training the prediction model with the PCT27B phenotyped in Santa Rosa resulted in predictive abilities ranging from 0.19 for grain zinc concentration to 0.30 for grain yield. Expanding the training set with the inclusion of the PCT27A resulted in greater predictive abilities for all traits but grain yield, with increases from 5% for plant height to 61% for grain zinc concentration. Models with the PCT27B phenotyped in two locations resulted in higher prediction accuracy when the models assumed no genotype-by-environment (G × E) interaction for flowering (0.38) and grain zinc concentration (0.27). For plant height, the model assuming a single G × E variance provided higher accuracy (0.28). 
The gain in predictive ability for grain yield was the greatest (0.25) when environment-specific variance deviation effect for G × E was considered. While the best scenario was specific to each trait, the results indicated that the gain in predictive ability provided by the multi-location and multi-generation calibration was low. Yet, this approach could lead to increased selection intensity, acceleration of the breeding cycle, and a sizable economic advantage for the program.

Liu Y, Ao M, Lu M, Zheng S, Zhu F, Ruan Y, Guan Y, Zhang A, Cui Z. Genomic selection to improve husk tightness based on genomic molecular markers in maize. FRONTIERS IN PLANT SCIENCE 2023;14:1252298. [PMID: 37828926 PMCID: PMC10566295 DOI: 10.3389/fpls.2023.1252298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Accepted: 09/04/2023] [Indexed: 10/14/2023]

Wang Q, Jiang S, Li T, Qiu Z, Yan J, Fu R, Ma C, Wang X, Jiang S, Cheng Q. G2P Provides an Integrative Environment for Multi-model genomic selection analysis to improve genotype-to-phenotype prediction. FRONTIERS IN PLANT SCIENCE 2023;14:1207139. [PMID: 37600179 PMCID: PMC10437076 DOI: 10.3389/fpls.2023.1207139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Accepted: 07/21/2023] [Indexed: 08/22/2023]

Affiliation(s)

Qian Wang: Frontiers Science Center for Molecular Design Breeding, China Agricultural University, Beijing, China; National Maize Improvement Center of China, College of Agriculture and Biotechnology, China Agricultural University, Beijing, China
Shan Jiang: Frontiers Science Center for Molecular Design Breeding, China Agricultural University, Beijing, China; National Maize Improvement Center of China, College of Agriculture and Biotechnology, China Agricultural University, Beijing, China
Tong Li: Frontiers Science Center for Molecular Design Breeding, China Agricultural University, Beijing, China; National Maize Improvement Center of China, College of Agriculture and Biotechnology, China Agricultural University, Beijing, China
Zhixu Qiu: Key Laboratory of Biology and Genetics Improvement of Maize in Arid Area of Northwest Region, Ministry of Agriculture, Northwest A&F University, Yangling, Shaanxi, China; State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Shaanxi, Yangling, China
Jun Yan: Frontiers Science Center for Molecular Design Breeding, China Agricultural University, Beijing, China; National Maize Improvement Center of China, College of Agriculture and Biotechnology, China Agricultural University, Beijing, China
Ran Fu: Frontiers Science Center for Molecular Design Breeding, China Agricultural University, Beijing, China; National Maize Improvement Center of China, College of Agriculture and Biotechnology, China Agricultural University, Beijing, China
Chuang Ma: Key Laboratory of Biology and Genetics Improvement of Maize in Arid Area of Northwest Region, Ministry of Agriculture, Northwest A&F University, Yangling, Shaanxi, China; State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Shaanxi, Yangling, China
Xiangfeng Wang: Frontiers Science Center for Molecular Design Breeding, China Agricultural University, Beijing, China; National Maize Improvement Center of China, College of Agriculture and Biotechnology, China Agricultural University, Beijing, China
Shuqin Jiang: Frontiers Science Center for Molecular Design Breeding, China Agricultural University, Beijing, China; National Maize Improvement Center of China, College of Agriculture and Biotechnology, China Agricultural University, Beijing, China
Qian Cheng: Frontiers Science Center for Molecular Design Breeding, China Agricultural University, Beijing, China; National Maize Improvement Center of China, College of Agriculture and Biotechnology, China Agricultural University, Beijing, China

Montesinos-López OA, Crespo-Herrera L, Saint Pierre C, Bentley AR, de la Rosa-Santamaria R, Ascencio-Laguna JA, Agbona A, Gerard GS, Montesinos-López A, Crossa J. Do feature selection methods for selecting environmental covariables enhance genomic prediction accuracy? Front Genet 2023;14:1209275. [PMID: 37554404 PMCID: PMC10405933 DOI: 10.3389/fgene.2023.1209275] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/20/2023] [Accepted: 07/03/2023] [Indexed: 08/10/2023]

Pégard M, Barre P, Delaunay S, Surault F, Karagić D, Milić D, Zorić M, Ruttink T, Julier B. Genome-wide genotyping data renew knowledge on genetic diversity of a worldwide alfalfa collection and give insights on genetic control of phenology traits. FRONTIERS IN PLANT SCIENCE 2023;14:1196134. [PMID: 37476178 PMCID: PMC10354441 DOI: 10.3389/fpls.2023.1196134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Accepted: 05/30/2023] [Indexed: 07/22/2023]

Abstract

China's and Europe's dependence on imported protein is a threat to the food self-sufficiency of these regions. It could be solved by growing more legumes, including alfalfa that is the highest protein producer under temperate climate. To create productive and high-value varieties, the use of large genetic diversity combined with genomic evaluation could improve current breeding programs. To study alfalfa diversity, we have used a set of 395 alfalfa accessions (i.e. populations), mainly from Europe, North and South America and China, with fall dormancy ranging from 3 to 7 on a scale of 11. Five breeders provided materials (617 accessions) that were compared to the 400 accessions. All accessions were genotyped using Genotyping-by-Sequencing (GBS) to obtain SNP allele frequency. These genomic data were used to describe genetic diversity and identify genetic groups. The accessions were phenotyped for phenology traits (fall dormancy and flowering date) at two locations (Lusignan in France, Novi Sad in Serbia) from 2018 to 2021. The QTL were detected by a Multi-Locus Mixed Model (mlmm). Subsequently, the quality of the genomic prediction for each trait was assessed. Cross-validation was used to assess the quality of prediction by testing GBLUP, Bayesian Ridge Regression (BRR), and Bayesian Lasso methods. A genetic structure with seven groups was found. Most of these groups were related to the geographical origin of the accessions and showed that European and American material is genetically distinct from Chinese material. Several QTL associated with fall dormancy were found and most of these were linked to genes. In our study, the infinitesimal methods showed a higher prediction quality than the Bayesian Lasso, and the genomic prediction achieved high (>0.75) predicting abilities in some cases. Our results are encouraging for alfalfa breeding by showing that it is possible to achieve high genomic prediction quality.

He S, Liang S, Meng L, Cao L, Ye G. Sparse Phenotyping and Haplotype-Based Models for Genomic Prediction in Rice. RICE (NEW YORK, N.Y.) 2023;16:27. [PMID: 37284992 DOI: 10.1186/s12284-023-00643-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2022] [Accepted: 05/20/2023] [Indexed: 06/08/2023]

Abstract

Multi-environment genomic selection enables plant breeders to select varieties resilient to diverse environments or particularly adapted to specific environments, which holds great potential for use in rice breeding. To realize multi-environment genomic selection, a robust training set with multi-environment phenotypic data is necessary. Considering the huge potential of genomic prediction enhanced sparse phenotyping for cost saving in multi-environment trials (MET), the establishment of a multi-environment training set could also benefit from it. Optimizing the genomic prediction methods is also crucial to enhance multi-environment genomic selection. Haplotype-based genomic prediction models can capture local epistatic effects, which could be conserved and accumulated across generations much like additive effects, thereby benefitting breeding. However, previous studies often used fixed-length haplotypes composed of a few adjacent molecular markers, disregarding the linkage disequilibrium (LD), which plays an essential role in determining the haplotype length. In our study, based on three rice populations with different sizes and compositions, we investigated the usefulness and effectiveness of multi-environment training sets with varying phenotyping intensities and different haplotype-based genomic prediction models based on LD-derived haplotype blocks for two agronomic traits, i.e., days to heading (DTH) and plant height (PH). 
Results showed that phenotyping merely 30% of the records in the multi-environment training set provides a prediction accuracy comparable to that of high phenotyping intensities; local epistatic effects very likely exist for DTH; dividing the LD-derived haplotype blocks into small segments with two or three single nucleotide polymorphisms (SNPs) helps to maintain the predictive ability of haplotype-based models in large populations; and modelling the covariances between environments improves genomic prediction accuracy. Our study provides means to improve the efficiency of multi-environment genomic selection in rice.

Wu PY, Ou JH, Liao CT. Sample size determination for training set optimization in genomic prediction. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2023;136:57. [PMID: 36912999 PMCID: PMC10011335 DOI: 10.1007/s00122-023-04254-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2022] [Accepted: 11/07/2022] [Indexed: 06/18/2023]

Ficht A, Konkin DJ, Cram D, Sidebottom C, Tan Y, Pozniak C, Rajcan I. Genomic selection for agronomic traits in a winter wheat breeding program. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2023;136:38. [PMID: 36897431 DOI: 10.1007/s00122-023-04294-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Accepted: 12/19/2022] [Indexed: 06/18/2023]

Fernández-González J, Akdemir D, Isidro Y Sánchez J. A comparison of methods for training population optimization in genomic selection. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2023;136:30. [PMID: 36892603 PMCID: PMC9998580 DOI: 10.1007/s00122-023-04265-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Accepted: 11/21/2022] [Indexed: 06/18/2023]

Abstract

Maximizing CDmean and Avg_GRM_self were the best criteria for training set optimization. A training set size of 50-55% (targeted) or 65-85% (untargeted) is needed to obtain 95% of the accuracy. With the advent of genomic selection (GS) as a widespread breeding tool, mechanisms to efficiently design an optimal training set for GS models became more relevant, since they allow maximizing the accuracy while minimizing the phenotyping costs. The literature described many training set optimization methods, but there is a lack of a comprehensive comparison among them. This work aimed to provide an extensive benchmark among optimization methods and optimal training set size by testing a wide range of them in seven datasets, six different species, different genetic architectures, population structure, heritabilities, and with several GS models to provide some guidelines about their application in breeding programs. Our results showed that targeted optimization (uses information from the test set) performed better than untargeted (does not use test set data), especially when heritability was low. The mean coefficient of determination was the best targeted method, although it was computationally intensive. Minimizing the average relationship within the training set was the best strategy for untargeted optimization. Regarding the optimal training set size, maximum accuracy was obtained when the training set was the entire candidate set. Nevertheless, a 50-55% of the candidate set was enough to reach 95-100% of the maximum accuracy in the targeted scenario, while we needed a 65-85% for untargeted optimization. Our results also suggested that a diverse training set makes GS robust against population structure, while including clustering information was less effective. The choice of the GS model did not have a significant influence on the prediction accuracies.
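
Editorial illustration (not part of the indexed abstract): the untargeted strategy named above, minimizing the average relationship within the training set, can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: the function names are hypothetical, the relationship matrix uses a VanRaden-style scaling of 0/1/2 marker counts, the diagonal is included in the average, and a simple greedy search stands in for whatever optimizer the cited study used.

```python
def vanraden_grm(markers):
    """Genomic relationship matrix from rows of 0/1/2 allele counts.

    Assumes VanRaden-style centering and scaling; at least one marker
    must be polymorphic, otherwise the scale is zero.
    """
    n, m = len(markers), len(markers[0])
    freqs = [sum(row[j] for row in markers) / (2.0 * n) for j in range(m)]
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    centered = [[row[j] - 2.0 * freqs[j] for j in range(m)] for row in markers]
    return [[sum(a * b for a, b in zip(ri, rk)) / scale for rk in centered]
            for ri in centered]


def avg_relationship(grm, idx):
    """Mean relationship among the individuals in idx (diagonal included)."""
    total = sum(grm[i][j] for i in idx for j in idx)
    return total / (len(idx) ** 2)


def greedy_min_avg_relationship(grm, size):
    """Hypothetical greedy heuristic: grow the set one individual at a time,
    always adding the candidate that keeps the average relationship lowest."""
    n = len(grm)
    # Start from the individual with the smallest self-relationship.
    selected = [min(range(n), key=lambda i: grm[i][i])]
    while len(selected) < size:
        remaining = [i for i in range(n) if i not in selected]
        best = min(remaining,
                   key=lambda i: avg_relationship(grm, selected + [i]))
        selected.append(best)
    return sorted(selected)


# Toy relationship matrix with two tight pairs: (0, 1) and (2, 3).
toy_grm = [
    [1.0, 0.9, 0.0, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.0, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.9, 1.0],
]
train = greedy_min_avg_relationship(toy_grm, size=2)
print(train)  # [0, 2]: one individual from each pair
```

On the toy matrix the heuristic picks one individual from each closely related pair, which is the diversity-seeking behaviour the abstract attributes to this criterion. A greedy search is not guaranteed to find the global optimum; it is used here only to keep the sketch short.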

Jeon D, Kang Y, Lee S, Choi S, Sung Y, Lee TH, Kim C. Digitalizing breeding in plants: A new trend of next-generation breeding based on genomic prediction. FRONTIERS IN PLANT SCIENCE 2023;14:1092584. [PMID: 36743488 PMCID: PMC9892199 DOI: 10.3389/fpls.2023.1092584] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 11/08/2022] [Accepted: 01/05/2023] [Indexed: 06/18/2023]

Gevartosky R, Carvalho HF, Costa-Neto G, Montesinos-López OA, Crossa J, Fritsche-Neto R. Enviromic-based kernels may optimize resource allocation with multi-trait multi-environment genomic prediction for tropical Maize. BMC PLANT BIOLOGY 2023;23:10. [PMID: 36604618 PMCID: PMC9814176 DOI: 10.1186/s12870-022-03975-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2022] [Accepted: 11/24/2022] [Indexed: 06/17/2023]

Abstract

BACKGROUND

Success in any genomic prediction platform is directly dependent on establishing a representative training set. This is a complex task, even in single-trait single-environment conditions, and tends to become even more intricate when additional information from envirotyping and correlated traits is considered. Here, we aimed to design optimized training sets focused on genomic prediction, considering multi-trait multi-environment trials, and to assess how those methods may increase accuracy while reducing phenotyping costs. For that, we considered single-trait multi-environment trials and multi-trait multi-environment trials for three traits: grain yield, plant height, and ear height, two datasets, and two cross-validation schemes. Next, two strategies for designing optimized training sets were conceived: the first considering only the genomic by environment by trait interaction (GET), and the second including large-scale environmental data (W, enviromics) as genomic by enviromic by trait interaction (GWT). The effective number of individuals (genotypes × environments × traits) was assumed to be those that represent at least 98% of each kernel (GET or GWT) variation; those individuals were then selected by a genetic algorithm based on prediction error variance criteria to compose an optimized training set for genomic prediction purposes.

RESULTS

The combined use of genomic and enviromic data efficiently designs optimized training sets for genomic prediction, improving the response to selection per dollar invested by up to 145% when compared to the model without enviromic data, and even more when compared to cross validation scheme with 70% of training set or pure phenotypic selection. Prediction models that include G × E or enviromic data + G × E yielded better prediction ability.

CONCLUSIONS

Our findings indicate that a genomic by enviromic by trait interaction kernel associated with genetic algorithms is efficient and can be proposed as a promising approach to designing optimized training sets for genomic prediction when the variance-covariance matrix of traits is available. Additionally, great improvements in the genetic gains per dollar invested were observed, suggesting that a good allocation of resources can be deployed by using the proposed approach.

Ballén-Taborda C, Lyerly J, Smith J, Howell K, Brown-Guedira G, Babar MA, Harrison SA, Mason RE, Mergoum M, Murphy JP, Sutton R, Griffey CA, Boyles RE. Utilizing genomics and historical data to optimize gene pools for new breeding programs: A case study in winter wheat. Front Genet 2022;13:964684. [PMID: 36276956 PMCID: PMC9585219 DOI: 10.3389/fgene.2022.964684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2022] [Accepted: 08/05/2022] [Indexed: 11/13/2022]

Abstract

With the rapid generation and preservation of both genomic and phenotypic information for many genotypes within crops and across locations, emerging breeding programs have a valuable opportunity to leverage these resources to 1) establish the most appropriate genetic foundation at program inception and 2) implement robust genomic prediction platforms that can effectively select future breeding lines. Integrating genomics-enabled breeding into cultivar development can save costs and allow resources to be reallocated towards advanced (i.e., later) stages of field evaluation, which can facilitate an increased number of testing locations and replicates within locations. In this context, a reestablished winter wheat breeding program was used as a case study to understand best practices to leverage and tailor existing genomic and phenotypic resources to determine optimal genetics for a specific target population of environments. First, historical multi-environment phenotype data, representing 1,285 advanced breeding lines, were compiled from multi-institutional testing as part of the SunGrains cooperative and used to produce GGE biplots and PCA for yield. Locations were clustered based on highly correlated line performance among the target population of environments into 22 subsets. For each of the subsets generated, EMMs and BLUPs were calculated using linear models with the ‘lme4’ R package. Second, for each subset, TPs representative of the new SC breeding lines were determined based on genetic relatedness using the ‘STPGA’ R package. Third, for each TP, phenotypic values and SNP data were incorporated into the ‘rrBLUP’ mixed models for generation of GEBVs of YLD, TW, HD and PH. Using a five-fold cross-validation strategy, an average accuracy of r = 0.42 was obtained for yield between all TPs. The validation performed with 58 SC elite breeding lines resulted in an accuracy of r = 0.62 when the TP included complete historical data. 
Lastly, QTL-by-environment interaction for 18 major effect genes across three geographic regions was examined. Lines harboring major QTL in the absence of disease could potentially underperform (e.g., Fhb1 R-gene), whereas it is advantageous to express a major QTL under biotic pressure (e.g., stripe rust R-gene). This study highlights the importance of genomics-enabled breeding and multi-institutional partnerships to accelerate cultivar development.

Li Z, Liu S, Conaty W, Zhu QH, Moncuquet P, Stiller W, Wilson I. Genomic prediction of cotton fibre quality and yield traits using Bayesian regression methods. Heredity (Edinb) 2022;129:103-112. [PMID: 35523950 PMCID: PMC9338257 DOI: 10.1038/s41437-022-00537-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/31/2021] [Revised: 04/05/2022] [Accepted: 04/07/2022] [Indexed: 01/26/2023]

Building a Calibration Set for Genomic Prediction, Characteristics to Be Considered, and Optimization Approaches. Methods in Molecular Biology (Clifton, N.J.) 2022;2467:77-112. [PMID: 35451773 DOI: 10.1007/978-1-0716-2205-6_3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 10/25/2022]

Bartholomé J, Prakash PT, Cobb JN. Genomic Prediction: Progress and Perspectives for Rice Improvement. Methods Mol Biol 2022;2467:569-617. [PMID: 35451791 DOI: 10.1007/978-1-0716-2205-6_21] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 06/14/2023]

Cazenave X, Petit B, Lateur M, Nybom H, Sedlak J, Tartarini S, Laurens F, Durel CE, Muranty H. Combining genetic resources and elite material populations to improve the accuracy of genomic prediction in apple. G3 (Bethesda, Md.) 2021;12:6459174. [PMID: 34893831 PMCID: PMC9210277 DOI: 10.1093/g3journal/jkab420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2021] [Accepted: 11/29/2021] [Indexed: 11/12/2022]

Rio S, Gallego-Sánchez L, Montilla-Bascón G, Canales FJ, Isidro Y Sánchez J, Prats E. Genomic prediction and training set optimization in a structured Mediterranean oat population. TAG. Theoretical and Applied Genetics. Theoretische und Angewandte Genetik 2021;134:3595-3609. [PMID: 34341832 DOI: 10.1007/s00122-021-03916-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/08/2021] [Accepted: 07/13/2021] [Indexed: 05/22/2023]

Dzievit MJ, Guo T, Li X, Yu J. Comprehensive analytical and empirical evaluation of genomic prediction across diverse accessions in maize. The Plant Genome 2021;14:e20160. [PMID: 34661990 DOI: 10.1002/tpg2.20160] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/08/2021] [Accepted: 08/23/2021] [Indexed: 06/13/2023]

Larkin DL, Mason RE, Moon DE, Holder AL, Ward BP, Brown-Guedira G. Predicting Fusarium Head Blight Resistance for Advanced Trials in a Soft Red Winter Wheat Breeding Program With Genomic Selection. Frontiers in Plant Science 2021;12:715314. [PMID: 34745156 PMCID: PMC8569947 DOI: 10.3389/fpls.2021.715314] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/26/2021] [Accepted: 09/27/2021] [Indexed: 06/13/2023]

Tanaka R, Lui-King J, Mandaharisoa ST, Rakotondramanana M, Ranaivo HN, Pariasca-Tanaka J, Kanegae HK, Iwata H, Wissuwa M. From gene banks to farmer's fields: using genomic selection to identify donors for a breeding program in rice to close the yield gap on smallholder farms. TAG. Theoretical and Applied Genetics. Theoretische und Angewandte Genetik 2021;134:3397-3410. [PMID: 34264372 PMCID: PMC8440315 DOI: 10.1007/s00122-021-03909-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/20/2021] [Accepted: 07/06/2021] [Indexed: 06/13/2023]

Abstract

KEY MESSAGE

Despite phenotyping the training set under unfavorable conditions on smallholder farms in Madagascar, we were able to successfully apply genomic prediction to select donors among gene bank accessions.

Poor soil fertility and low fertilizer application rates are the main reasons for the large yield gap observed for rice produced in sub-Saharan Africa. Traditional varieties that are preserved in gene banks were shown to possess traits and alleles that would improve the performance of modern varieties under such low-input conditions. How to accelerate the utilization of gene bank resources in crop improvement is an unresolved question, and here our objective was to test whether genomic prediction could aid in the selection of promising donors. A subset of the 3,024 sequenced accessions from the IRRI rice gene bank was phenotyped for yield and agronomic traits for two years in unfertilized farmers' fields in Madagascar, and based on these data, a genomic prediction model was developed. This model was applied to predict the performance of the entire set of 3,024 accessions, and the top predicted performers were sent to Madagascar for confirmatory trials. The prediction accuracies ranged from 0.10 to 0.30 for grain yield and from 0.25 to 0.63 for straw biomass, and reached 0.71 for heading date. Two accessions have subsequently been utilized as donors in rice breeding programs in Madagascar. Despite having conducted phenotypic evaluations under challenging conditions on smallholder farms, our results are encouraging as the prediction accuracy realized in on-farm experiments was in the range of accuracies achieved in on-station studies. Thus, we could provide clear empirical evidence on the value of genomic selection in identifying suitable genetic resources for crop improvement, if genotypic data are available.

Isidro y Sánchez J, Akdemir D. Training Set Optimization for Sparse Phenotyping in Genomic Selection: A Conceptual Overview. Frontiers in Plant Science 2021;12:715910. [PMID: 34589099 PMCID: PMC8475495 DOI: 10.3389/fpls.2021.715910] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 05/27/2021] [Accepted: 08/10/2021] [Indexed: 06/13/2023]

Zhang W, Boyle K, Brule-Babel A, Fedak G, Gao P, Djama ZR, Polley B, Cuthbert R, Randhawa H, Graf R, Jiang F, Eudes F, Fobert PR. Evaluation of Genomic Prediction for Fusarium Head Blight Resistance with a Multi-Parental Population. Biology 2021;10:biology10080756. [PMID: 34439988 PMCID: PMC8389552 DOI: 10.3390/biology10080756] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/30/2021] [Revised: 08/01/2021] [Accepted: 08/02/2021] [Indexed: 12/12/2022]

Abstract

Simple Summary

Genomic selection is a promising approach to select superior wheat lines with better resistance to Fusarium head blight. The accuracy of genomic selection is determined by many factors. In this study, we found that a large training population, genomic selection models incorporating biological information, and multi-environment modelling led to considerably better predictabilities. A training population designed using the mean coefficient of determination (CDmean) could increase the accuracy of prediction. Relatedness between the training population (TP) and the testing population is key to the accuracy of genomic selection across populations.

Abstract

Fusarium head blight (FHB) resistance is quantitatively inherited, controlled by multiple minor effect genes, and highly affected by the interaction of genotype and environment. This makes genomic selection (GS), which uses genome-wide molecular marker data to predict genetic breeding values, a promising approach to select superior lines with better resistance. However, various factors can affect the accuracy of GS, and a better understanding of how these factors affect GS accuracy could ensure the success of applying GS to improve FHB resistance in wheat. In this study, we performed a comprehensive evaluation of factors that affect GS accuracy with a multi-parental population designed for FHB resistance. We found that larger sample sizes gave better accuracies. Training populations designed by CDmean-based optimization algorithms gave significantly higher accuracies than random sampling, while the mean of the prediction error variance (PEVmean) had the poorest performance. Different genomic selection models performed similarly in terms of accuracy. Including previously known large effect quantitative trait loci (QTL) as fixed effects in the GS model considerably improved the predictability. Multi-trait models had almost no effect, while the multi-environment model outperformed the single-environment model for prediction across different environments. When comparing within- and across-family prediction, better accuracies were obtained when the training population was more closely related to the testing population. However, achieving good accuracies for GS prediction across populations is still a challenging issue for GS application.

Affiliation(s)

Wentao Zhang (corresponding author), Aquatic and Crop Resources Development, National Research Council of Canada, Saskatoon, SK S7N 0W9, Canada
Kerry Boyle, Aquatic and Crop Resources Development, National Research Council of Canada, Saskatoon, SK S7N 0W9, Canada
Anita Brule-Babel, Department of Plant Science, Agriculture Building, University of Manitoba, Winnipeg, MB R3T 2N2, Canada
George Fedak, Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON K1A 0C6, Canada
Peng Gao, Aquatic and Crop Resources Development, National Research Council of Canada, Saskatoon, SK S7N 0W9, Canada
Zeinab Robleh Djama, Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON K1A 0C6, Canada
Brittany Polley, Aquatic and Crop Resources Development, National Research Council of Canada, Saskatoon, SK S7N 0W9, Canada
Richard Cuthbert, Swift Current Research and Development Centre, Agriculture and Agri-Food Canada, Swift Current, SK S9H 3X2, Canada
Harpinder Randhawa, Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, AB T1J 4B1, Canada
Robert Graf, Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, AB T1J 4B1, Canada
Fengying Jiang, Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, AB T1J 4B1, Canada
Francois Eudes, Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, AB T1J 4B1, Canada
Pierre R. Fobert (corresponding author), Aquatic and Crop Resources Development, National Research Council of Canada, Ottawa, ON K1A 0R6, Canada

Griot R, Allal F, Phocas F, Brard-Fudulea S, Morvezen R, Haffray P, François Y, Morin T, Bestin A, Bruant JS, Cariou S, Peyrou B, Brunier J, Vandeputte M. Optimization of Genomic Selection to Improve Disease Resistance in Two Marine Fishes, the European Sea Bass (Dicentrarchus labrax) and the Gilthead Sea Bream (Sparus aurata). Front Genet 2021;12:665920. [PMID: 34335683 PMCID: PMC8317601 DOI: 10.3389/fgene.2021.665920] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/09/2021] [Accepted: 06/25/2021] [Indexed: 11/13/2022]

Abstract

Disease outbreaks are a major threat to the aquaculture industry, and can be controlled by selective breeding. With the development of high-throughput genotyping technologies, genomic selection may become accessible even in minor species. Training population size and marker density are among the main drivers of prediction accuracy, and both have a high impact on the cost of genomic selection. In this study, we assessed the impact of training population size as well as marker density on the prediction accuracy of disease resistance traits in European sea bass (Dicentrarchus labrax) and gilthead sea bream (Sparus aurata). We performed a challenge to nervous necrosis virus (NNV) in two sea bass cohorts, a challenge to Vibrio harveyi in one sea bass cohort and a challenge to Photobacterium damselae subsp. piscicida in one sea bream cohort. Challenged individuals were genotyped on 57K-60K SNP chips. Markers were sampled to design virtual SNP chips of 1K, 3K, 6K, and 10K markers. Similarly, challenged individuals were randomly sampled to vary training population size from 50 to 800 individuals. The accuracy of genomic-based (GBLUP model) and pedigree-based (PBLUP model) estimated breeding values (EBV) was computed for each training population size using Monte-Carlo cross-validation. Genomic-based breeding values were also computed using the virtual chips to study the effect of marker density. For resistance to Viral Nervous Necrosis (VNN), as one major QTL was detected, the opportunity for marker-assisted selection was investigated by adding a QTL effect in both genomic and pedigree prediction models. As training population size increased, accuracy increased to reach values in the range of 0.51-0.65 for full density chips. The accuracy could still increase with more individuals in the training population, as the accuracy plateau was not reached. When using only the 6K density chip, accuracy reached at least 90% of that obtained with the full density chip. Adding the QTL effect increased the accuracy of the PBLUP model to values higher than those of the GBLUP model without the QTL effect. This work sets a framework for the practical implementation of genomic selection to improve the resistance to major diseases in European sea bass and gilthead sea bream.
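
The two resampling steps in this abstract, varying the training population size under Monte-Carlo cross-validation and drawing lower-density virtual chips, can be sketched in Python as follows. This is an illustration under assumptions the abstract does not confirm: markers are drawn uniformly at random, and every individual not drawn for training serves as validation. Both function names are hypothetical.

```python
import random


def monte_carlo_splits(n_individuals, train_size, n_replicates, seed=0):
    """Yield (train, validation) index lists from repeated random draws."""
    if not 0 < train_size < n_individuals:
        raise ValueError("train_size must be between 1 and n_individuals - 1")
    rng = random.Random(seed)
    everyone = range(n_individuals)
    for _ in range(n_replicates):
        train = sorted(rng.sample(everyone, train_size))
        in_train = set(train)
        # Assumption: all individuals left out of training form the validation set.
        validation = [i for i in everyone if i not in in_train]
        yield train, validation


def sample_virtual_chip(n_markers, chip_size, seed=0):
    """Pick a random subset of marker indices to mimic a lower-density chip."""
    return sorted(random.Random(seed).sample(range(n_markers), chip_size))
```

Looping `train_size` over a grid of sizes and averaging a chosen accuracy measure over the replicates gives an accuracy-versus-size curve of the kind the abstract describes.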

Cersonsky RK, Helfrecht BA, Engel EA, Kliavinek S, Ceriotti M. Improving sample and feature selection with principal covariates regression. Machine Learning: Science and Technology 2021. [DOI: 10.1088/2632-2153/abfe7c] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 11/12/2022]

Fritsche-Neto R, Galli G, Borges KLR, Costa-Neto G, Alves FC, Sabadin F, Lyra DH, Morais PPP, Braatz de Andrade LR, Granato I, Crossa J. Optimizing Genomic-Enabled Prediction in Small-Scale Maize Hybrid Breeding Programs: A Roadmap Review. Frontiers in Plant Science 2021;12:658267. [PMID: 34276721 PMCID: PMC8281958 DOI: 10.3389/fpls.2021.658267] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/25/2021] [Accepted: 05/10/2021] [Indexed: 06/13/2023]

Abstract

The usefulness of genomic prediction (GP) for many animal and plant breeding programs has been highlighted by many studies in the last 20 years. In maize breeding programs, mostly dedicated to delivering more highly adapted and productive hybrids, this approach has proved successful for both large- and small-scale breeding programs worldwide. Here, we present some of the strategies developed to improve the accuracy of GP in tropical maize, focusing on its use under the low-budget, small-scale conditions faced by most of the hybrid breeding programs in developing countries. We highlight the most important outcomes obtained by the University of São Paulo (USP, Brazil) and how they can improve the accuracy of prediction in tropical maize hybrids. Our roadmap starts with the efforts for germplasm characterization, moving on to the practices for mating design, and the selection of the genotypes that are used to compose the training population in field phenotyping trials. Factors including population structure and the importance of non-additive effects (dominance and epistasis) controlling the desired trait are also outlined. Finally, we explain how the source of the molecular markers, environmental information, and the modeling of genotype-environment interaction can affect the accuracy of GP. Results of 7 years of research in a public maize hybrid breeding program under tropical conditions are discussed, and with the great advances that have been made, we find that what is yet to come is exciting. The use of open-source software for the quality control of molecular markers, implementing GP, and envirotyping pipelines may reduce costs in a computationally efficient manner. We conclude that exploring new models/tools using high-throughput phenotyping data along with large-scale envirotyping may bring more resolution and realism when predicting genotype performances. Despite the initial costs, mostly for genotyping, the GP platforms in combination with these other data sources can be a cost-effective approach for predicting the performance of maize hybrids for a large set of growing conditions.

Atanda SA, Olsen M, Crossa J, Burgueño J, Rincent R, Dzidzienyo D, Beyene Y, Gowda M, Dreher K, Boddupalli PM, Tongoona P, Danquah EY, Olaoye G, Robbins KR. Scalable Sparse Testing Genomic Selection Strategy for Early Yield Testing Stage. Frontiers in Plant Science 2021;12:658978. [PMID: 34239521 PMCID: PMC8259603 DOI: 10.3389/fpls.2021.658978] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/26/2021] [Accepted: 05/25/2021] [Indexed: 06/08/2023]

Puglisi D, Delbono S, Visioni A, Ozkan H, Kara İ, Casas AM, Igartua E, Valè G, Piero ARL, Cattivelli L, Tondelli A, Fricano A. Genomic Prediction of Grain Yield in a Barley MAGIC Population Modeling Genotype per Environment Interaction. Frontiers in Plant Science 2021;12:664148. [PMID: 34108982 PMCID: PMC8183822 DOI: 10.3389/fpls.2021.664148] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/04/2021] [Accepted: 04/26/2021] [Indexed: 06/12/2023]

Abstract

Multi-parent Advanced Generation Inter-crosses (MAGIC) lines have mosaic genomes that are generated by shuffling the genetic material of the founder parents following pre-defined crossing schemes. In cereal crops, these experimental populations have been extensively used to investigate the genetic bases of several traits and dissect the genetic bases of epistasis. In plants, genomic prediction models are usually fitted using either diverse panels of mostly unrelated accessions or individuals of biparental families, and several empirical analyses have been conducted to evaluate the predictive ability of models fitted to these populations using different traits. In this paper, we constructed, genotyped and evaluated a barley MAGIC population of 352 individuals developed with a diverse set of eight founder parents showing contrasting phenotypes for grain yield. We combined phenotypic and genotypic information of this MAGIC population to fit several genomic prediction models, which were cross-validated to examine the predictive ability of these models while varying the sizes of training populations. Moreover, several methods to optimize the composition of the training population were also applied to this MAGIC population and cross-validated to estimate the resulting predictive ability. Finally, extensive phenotypic data generated in field trials organized across an ample range of water regimes and climatic conditions in the Mediterranean were used to fit and cross-validate multi-environment genomic prediction models including G×E interaction, using both genomic best linear unbiased prediction and reproducing kernel Hilbert space along with a non-linear Gaussian kernel. Overall, our empirical analyses showed that genomic prediction models trained with a limited number of MAGIC lines can be used to predict grain yield with values of predictive ability that vary from 0.25 to 0.60 and that, beyond QTL mapping and analysis of epistatic effects, MAGIC populations might be used to successfully fit genomic prediction models. We concluded that for grain yield, the single-environment genomic prediction models examined in this study are equivalent in terms of predictive ability while, in general, multi-environment models that explicitly split marker effects into main and environment-specific effects outperform simpler multi-environment models.
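
As an illustration of the non-linear Gaussian kernel mentioned in this abstract, the Python sketch below builds a kernel matrix from a marker matrix. The `bandwidth` parameter and the division of the squared distance by the number of markers are assumptions made here for scaling, not details taken from the paper.

```python
import math


def gaussian_kernel(markers, bandwidth=1.0):
    """Return the n x n matrix K[i][j] = exp(-bandwidth * d_ij**2 / p).

    markers: list of n equal-length rows of numeric marker codes (e.g. 0/1/2).
    d_ij**2 is the squared Euclidean distance between rows i and j; dividing
    by the number of markers p is a scaling convention assumed here.
    """
    n = len(markers)
    p = len(markers[0])
    kernel = [[1.0] * n for _ in range(n)]  # a line is at distance 0 from itself
    for i in range(n):
        for j in range(i + 1, n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(markers[i], markers[j]))
            kernel[i][j] = kernel[j][i] = math.exp(-bandwidth * sq_dist / p)
    return kernel
```

Unlike a linear genomic relationship matrix, this kernel decays smoothly towards zero as two lines become more dissimilar, and `bandwidth` controls how fast that decay is.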

Akdemir D, Rio S, Isidro y Sánchez J. TrainSel: An R Package for Selection of Training Populations. Front Genet 2021;12:655287. [PMID: 34025720 PMCID: PMC8138169 DOI: 10.3389/fgene.2021.655287] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/18/2021] [Accepted: 03/31/2021] [Indexed: 01/01/2023]

David O, Le Rouzic A, Dillmann C. Optimization of sampling designs for pedigrees and association studies. Biometrics 2021;78:1056-1066. [PMID: 33876835 DOI: 10.1111/biom.13476] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2019] [Revised: 03/10/2021] [Accepted: 04/02/2021] [Indexed: 11/29/2022]

Lopez-Cruz M, de Los Campos G. Optimal breeding-value prediction using a sparse selection index. Genetics 2021;218:6179494. [PMID: 33748861 PMCID: PMC8128408 DOI: 10.1093/genetics/iyab030] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 09/30/2020] [Accepted: 02/13/2021] [Indexed: 02/06/2023]

Beche E, Gillman JD, Song Q, Nelson R, Beissinger T, Decker J, Shannon G, Scaboo AM. Genomic prediction using training population design in interspecific soybean populations. Molecular Breeding: New Strategies in Plant Improvement 2021;41:15. [PMID: 37309481 PMCID: PMC10236090 DOI: 10.1007/s11032-021-01203-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/02/2020] [Accepted: 01/11/2021] [Indexed: 06/14/2023]

Abstract

Agronomically important traits generally have complex genetic architecture, where many genes have a small and largely additive effect. Genomic prediction has been demonstrated to increase genetic gain and efficiency in plant breeding programs beyond marker-assisted selection and phenotypic selection. The objective of this study was to evaluate the impact of allelic origin, marker density, training population size, and cross-validation schemes on the accuracy of genomic prediction models in an interspecific soybean nested association mapping (NAM) panel. Three cross-validation schemes were used: (a) Within-Family (WF): training and prediction are performed exclusively within each family; (b) Across All families (AF): all the individuals from the three families are randomly assigned to either the training or validation set; (c) Leave one Family out (LFO): each family is predicted using a training set that contains the other two families. Predictive abilities increased with training population size up to 350 individuals, but no significant gains were noted beyond 250 individuals in the training population. The number of markers had a limited impact on the observed predictive ability across traits; increasing the number of markers used in the model above 1000 revealed no significant increases in prediction accuracy. Predictive abilities for AF were not significantly different from those of the WF method, and predictive abilities across populations for the WF method ranged from 0.58 to 0.70 for maturity, protein, meal, and oil. Our results also showed encouraging prediction accuracies for grain yield (0.58-0.69) using the WF method. Partitioning genomic prediction between G. max and G. soja alleles revealed useful information to select material with a larger allele contribution from both parents and could accelerate allele introgression from exotic germplasm into the elite soybean gene pool.
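
The three cross-validation schemes defined in this abstract differ only in how individuals are assigned to training and validation sets, which the following Python sketch makes concrete. The function names and the default 80/20 training fraction are assumptions for illustration; the abstract does not state the proportions used.

```python
import random
from collections import defaultdict


def group_by_family(families):
    """Map each family label to the list of individual indices in it."""
    members = defaultdict(list)
    for i, fam in enumerate(families):
        members[fam].append(i)
    return dict(members)


def _random_split(indices, train_fraction, rng):
    """Shuffle the indices and cut them into (train, validation) lists."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return sorted(shuffled[:cut]), sorted(shuffled[cut:])


def within_family_splits(families, train_fraction=0.8, seed=0):
    """WF: one (train, validation) split per family, using only that family."""
    rng = random.Random(seed)
    return {fam: _random_split(idx, train_fraction, rng)
            for fam, idx in group_by_family(families).items()}


def across_families_split(families, train_fraction=0.8, seed=0):
    """AF: pool all individuals and split at random, ignoring family."""
    return _random_split(range(len(families)), train_fraction,
                         random.Random(seed))


def leave_one_family_out_splits(families):
    """LFO: predict each family from a training set of all other families."""
    members = group_by_family(families)
    splits = {}
    for fam, idx in members.items():
        train = sorted(i for other, other_idx in members.items()
                       if other != fam for i in other_idx)
        splits[fam] = (train, sorted(idx))
    return splits
```

With three families, `leave_one_family_out_splits` returns three splits in which the validation family contributes nothing to training, which is the hardest of the three settings because training and validation individuals are the least related.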

Supplementary Information

The online version contains supplementary material available at 10.1007/s11032-021-01203-6.

Kadam DC, Rodriguez OR, Lorenz AJ. Optimization of training sets for genomic prediction of early-stage single crosses in maize. TAG. Theoretical and Applied Genetics. Theoretische und Angewandte Genetik 2021;134:687-699. [PMID: 33398385 DOI: 10.1007/s00122-020-03722-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/20/2020] [Accepted: 11/03/2020] [Indexed: 06/12/2023]

Abstract

Training population optimization algorithms are useful for efficiently training genomic prediction models for single-cross performance, especially if the population is extended beyond only realized crosses to all possible single crosses. Genomic prediction of single-cross performance could allow effective evaluation of all possible single crosses between all inbreds developed in a hybrid breeding program. The objectives of the present study were to investigate the effect of different levels of relatedness on genomic predictive ability of single crosses, evaluate the usefulness of a deterministic formula to forecast prediction accuracy in advance, and determine the potential for training set (TRS) optimization based on the mean prediction error variance (PEVmean) and mean coefficient of determination (CDmean) criteria. We used 481 single crosses made by crossing 89 random recombinant inbred lines (RILs) belonging to the Iowa stiff stalk synthetic group with 103 random RILs belonging to the non-stiff stalk synthetic heterotic group. As expected, predictive ability was enhanced by ensuring close relationships between TRSs and target sets, even when TRS sizes were smaller. We found that designing a TRS based on PEVmean or CDmean criteria is useful for increasing the efficiency of genomic prediction of maize single crosses. We went further and extended the sampling space from that of all observed single crosses to all possible single crosses, providing a much larger genetic space within which to design a training population. Using all possible single crosses increased the advantage of the PEVmean and CDmean methods based on expected prediction accuracy. This finding suggests that it may be worthwhile using an optimization algorithm to select a training population from all possible single crosses to maximize efficiency in training accurate models for hybrid genomic prediction.
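
For reference, the two criteria named in this abstract can be written out under a standard GBLUP mixed model. This is a sketch in per-individual form, not the authors' exact formulation, which the abstract does not give; contrast-based variants of the same criteria also exist. The symbols are notation assumed here: $\mathbf{K}$ is the genomic relationship matrix over candidate and target individuals, $\mathbf{Z}$ links the phenotyped training records to individuals, and $T$ is the target set.

```latex
\begin{align}
\mathbf{y} &= \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{u} + \mathbf{e},
  \qquad \mathbf{u} \sim N(\mathbf{0}, \sigma_g^2 \mathbf{K}),
  \quad \mathbf{e} \sim N(\mathbf{0}, \sigma_e^2 \mathbf{I}),
  \quad \lambda = \sigma_e^2 / \sigma_g^2 \\
\mathbf{M} &= \mathbf{I} - \mathbf{X}(\mathbf{X}^\top \mathbf{X})^{-}\mathbf{X}^\top \\
\mathbf{C} &= \left(\mathbf{Z}^\top \mathbf{M} \mathbf{Z} + \lambda \mathbf{K}^{-1}\right)^{-1} \\
\mathrm{PEV}_i &= \sigma_e^2 \, C_{ii},
  \qquad \mathrm{CD}_i = 1 - \frac{\lambda \, C_{ii}}{K_{ii}} \\
\mathrm{PEVmean} &= \frac{1}{|T|} \sum_{i \in T} \mathrm{PEV}_i,
  \qquad \mathrm{CDmean} = \frac{1}{|T|} \sum_{i \in T} \mathrm{CD}_i
\end{align}
```

Because $\mathbf{Z}$ depends on which individuals are phenotyped, each candidate training set gives its own $\mathbf{C}$; an optimization algorithm then searches for the set that minimizes PEVmean or maximizes CDmean for the target individuals.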

Yu X, Leiboff S, Li X, Guo T, Ronning N, Zhang X, Muehlbauer GJ, Timmermans MC, Schnable PS, Scanlon MJ, Yu J. Genomic prediction of maize microphenotypes provides insights for optimizing selection and mining diversity. Plant Biotechnology Journal 2020;18:2456-2465. [PMID: 32452105 PMCID: PMC7680549 DOI: 10.1111/pbi.13420] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 09/22/2019] [Revised: 05/05/2020] [Accepted: 05/13/2020] [Indexed: 05/25/2023]

Pégard M, Segura V, Muñoz F, Bastien C, Jorge V, Sanchez L. Favorable Conditions for Genomic Evaluation to Outperform Classical Pedigree Evaluation Highlighted by a Proof-of-Concept Study in Poplar. Frontiers in Plant Science 2020;11:581954. [PMID: 33193528 PMCID: PMC7655903 DOI: 10.3389/fpls.2020.581954] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/10/2020] [Accepted: 09/22/2020] [Indexed: 06/11/2023]

Abstract

Forest trees like poplar are particular in many ways compared to other domesticated species. They have long juvenile phases, ongoing crop-wild gene flow, extensive outcrossing, and slow growth. All these particularities tend to make the running of breeding programs and evaluation stages costly in both time and resources. Perennials like trees are therefore good candidates for the implementation of genomic selection (GS), which is a good way to accelerate the breeding process by unchaining selection from phenotypic evaluation without affecting precision. In this study, we compared GS to traditional pedigree-based evaluation, and evaluated under which conditions genomic evaluation outperforms classical pedigree evaluation. Several conditions were evaluated, such as the constitution of the training population by cross-validation and the implementation of multi-trait, single-trait, additive and non-additive models with different estimation methods (G-BLUP or weighted G-BLUP). Finally, the impact of marker densification was tested through four marker density sets. The population under study corresponds to a pedigree of 24 parents and 1,011 offspring, structured into 35 full-sib families. Four evaluation batches were planted in the same location and seven traits were evaluated on 1- and 2-year-old trees. The quality of prediction was reported by the accuracy, the Spearman rank correlation and the prediction bias, and tested with a cross-validation and an independent individual test set. Our results show that genomic evaluation performance could be comparable to the already well-optimized pedigree-based evaluation under certain conditions. Genomic evaluation appeared to be advantageous when using an independent test set and a set of less precise phenotypes. Genome-based methods showed advantages over pedigree counterparts when ranking candidates at the within-family level, for most of the families. Our study also showed that looking at ranking criteria such as the Spearman rank correlation can reveal benefits of genomic selection hidden by biased predictions.

Heslot N, Feoktistov V. Optimization of Selective Phenotyping and Population Design for Genomic Prediction. Journal of Agricultural, Biological and Environmental Statistics 2020. [DOI: 10.1007/s13253-020-00415-1] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 01/06/2023]

Roth M, Muranty H, Di Guardo M, Guerra W, Patocchi A, Costa F. Genomic prediction of fruit texture and training population optimization towards the application of genomic selection in apple. Horticulture Research 2020;7:148. [PMID: 32922820 PMCID: PMC7459338 DOI: 10.1038/s41438-020-00370-5] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 01/10/2020] [Revised: 07/18/2020] [Accepted: 07/24/2020] [Indexed: 05/11/2023]

Preservation of Genetic Variation in a Breeding Population for Long-Term Genetic Gain. G3: Genes, Genomes, Genetics 2020;10:2753-2762. [PMID: 32513654 PMCID: PMC7407475 DOI: 10.1534/g3.120.401354] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 12/17/2022]

Verges VL, Lyerly J, Dong Y, Van Sanford DA. Training Population Design With the Use of Regional Fusarium Head Blight Nurseries to Predict Independent Breeding Lines for FHB Traits. FRONTIERS IN PLANT SCIENCE 2020;11:1083. [PMID: 32765564 PMCID: PMC7381120 DOI: 10.3389/fpls.2020.01083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2020] [Accepted: 06/30/2020] [Indexed: 06/11/2023]

Abstract

Fusarium head blight (FHB) is a devastating disease in cereals around the world. Because it is quantitatively inherited and technically difficult to reproduce, breeding to increase resistance in wheat germplasm is difficult and slow. Genomic selection (GS) is a form of marker-assisted selection (MAS) that simultaneously estimates all locus, haplotype, or marker effects across the entire genome to calculate genomic estimated breeding values (GEBVs). Since its inception, many studies have demonstrated the utility of GS approaches to breeding for disease resistance in crops. In this study, the Uniform Northern (NUS) and Uniform Southern (SUS) soft red winter wheat scab nurseries (a total of 452 lines) were evaluated as possible training populations (TP) to predict FHB traits in breeding lines of the UK (University of Kentucky) wheat breeding program. DON was best predicted by the SUS; Fusarium damaged kernels (FDK), FHB rating, and two indices, DSK index and DK index, were best predicted by the NUS. The highest prediction accuracies were obtained when the NUS and SUS were combined, reaching up to 0.5 for almost all traits except FHB rating. The highest prediction accuracies were obtained with larger TP sizes (300-400), and there were no significant effects of TP optimization method for any trait, although at small TP sizes the PEVmean algorithm worked better than the other methods. To select for lines with tolerance to DON accumulation, a primary breeding target for many breeders, we compared selection based on DON BLUEs with selection based on DON GEBVs, DSK GEBVs, and DK GEBVs. At selection intensities (SI) of 30-40%, the DSK index showed the best performance, with a 4-6% increase over direct selection for DON. Our results confirm the usefulness of regional nurseries as a source of lines to predict GEBVs for local breeding programs, and show that an index that includes DON together with FDK and FHB rating could be an excellent choice to identify lines with low DON content and overall improved FHB resistance.

Sallam AH, Conley E, Prakapenka D, Da Y, Anderson JA. Improving Prediction Accuracy Using Multi-allelic Haplotype Prediction and Training Population Optimization in Wheat. G3 (BETHESDA, MD.) 2020;10:2265-2273. [PMID: 32371453 PMCID: PMC7341132 DOI: 10.1534/g3.120.401165] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 02/18/2020] [Accepted: 04/29/2020] [Indexed: 02/01/2023]

Ben-Sadoun S, Rincent R, Auzanneau J, Oury FX, Rolland B, Heumez E, Ravel C, Charmet G, Bouchet S. Economical optimization of a breeding scheme by selective phenotyping of the calibration set in a multi-trait context: application to bread making quality. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2020;133:2197-2212. [PMID: 32303775 DOI: 10.1007/s00122-020-03590-4] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 11/19/2019] [Accepted: 03/31/2020] [Indexed: 05/27/2023]

Seye AI, Bauland C, Charcosset A, Moreau L. Revisiting hybrid breeding designs using genomic predictions: simulations highlight the superiority of incomplete factorials between segregating families over topcross designs. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2020;133:1995-2010. [PMID: 32185420 DOI: 10.1007/s00122-020-03573-5] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 08/23/2019] [Accepted: 02/28/2020] [Indexed: 06/10/2023]

Abstract

Simulations showed that hybrid performances obtained from an incomplete factorial between segregating families of two heterotic groups allow genomic predictions of hybrid value to be calibrated more efficiently than tester-based designs. Genomic selection offers new opportunities to revisit hybrid breeding by replacing extensive phenotyping of hybrid combinations with genomic predictions. A key question remains: identifying the best design to calibrate genomic prediction models. We proposed to use single-cross hybrids obtained from an incomplete factorial design between segregating populations and compared this strategy with a conventional approach based on topcross evaluation. Two multiparental segregating populations of lines, each specific to one heterotic group, were simulated. Hybrids considered as training sets were generated either (1) using a parental line from the opposite group as tester or (2) following an incomplete factorial design. Different specific combining ability (SCA) proportions were simulated by considering different levels of group divergence and dominance effects for the simulated QTL. For the incomplete factorial design, for the same number of hybrids, we considered different numbers of parental lines and different contributions of lines (one to four) to calibration hybrids. For different training set sizes, we evaluated the prediction accuracies of new hybrids and the genetic gains over three generations. At a given training set size, the factorial design was as efficient (in terms of accuracy) as the tester design in additive scenarios, but significantly outperformed the tester design when SCA was present. The number of contributions of each parental line to the incomplete factorial design had a small impact on accuracies. Our simulations confirmed experimental results and showed that calibrating models on hybrids between two multiparental populations is a cost-efficient way to perform genomic predictions in both groups, opening prospects for revisiting reciprocal recurrent selection schemes.

BWGS: A R package for genomic selection and its application to a wheat breeding programme. PLoS One 2020;15:e0222733. [PMID: 32240182 PMCID: PMC7141418 DOI: 10.1371/journal.pone.0222733] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Indexed: 11/19/2022] Open

Abstract

We developed an integrated R library called BWGS to enable easy computation of Genomic Estimates of Breeding Values (GEBV) for genomic selection. BWGS, for BreedWheat Genomic Selection, was developed in the framework of a cooperative private-public partnership project called Breedwheat (https://breedwheat.fr) and relies on existing R libraries, all freely available from CRAN servers. The two main functions make it possible to run 1) replicated random cross-validations within a training set of genotyped and phenotyped lines and 2) GEBV prediction for a set of genotyped-only lines. Options are available for 1) missing data imputation, 2) marker and training set selection and 3) genomic prediction with 15 different methods, either parametric or semi-parametric. The usefulness and efficiency of BWGS are illustrated using a population of wheat lines from a real breeding programme. Adjusted yield data from historical trials (highly unbalanced design) were used for testing the options of BWGS. In total, 760 candidate lines with adjusted phenotypes and genotypes for 47 839 robust SNP were used. With a simple desktop computer, we obtained results comparable with previously published results on wheat genomic selection. As predicted by theory, the factors that most influence predictive ability, for a given trait of moderate heritability, are the size of the training population and a minimum number of markers for capturing the information of every QTL. Missing data up to 40%, if randomly distributed, do not degrade predictive ability once imputed, and up to 80% randomly distributed missing data are still acceptable once imputed with the Expectation-Maximization method of the package rrBLUP. It is worth noting that selecting the markers most associated with the trait does improve predictive ability, compared with the whole set of markers, but only when marker selection is made on the whole population. When marker selection is made only on the sampled training set, this advantage nearly disappears, since it was clearly due to overfitting. Few differences are observed between the 15 prediction models with this dataset. Although non-parametric methods that are supposed to capture non-additive effects have slightly better predictive accuracy, differences remain small. Finally, the GEBV from the 15 prediction models are all highly correlated with each other. These results are encouraging for an efficient use of genomic selection in applied breeding programmes, and BWGS is a simple and powerful toolbox to apply in breeding programmes or training activities.
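The "replicated random cross-validation" mentioned in this abstract can be sketched as follows. This is not BWGS code and does not use its API: it is a minimal standard-library Python sketch on simulated data, with plain ridge regression on marker scores standing in for the package's prediction methods. All names (`replicated_random_cv`, `ridge_fit`, `solve`), the penalty `lam`, and the simulation settings are assumptions made for illustration.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def ridge_fit(X, y, lam):
    """Marker effects from ridge regression: (X'X + lam*I)^-1 X'y."""
    p = len(X[0])
    xtx = [[sum(row[j] * row[k] for row in X) + (lam if j == k else 0.0)
            for k in range(p)] for j in range(p)]
    xty = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(p)]
    return solve(xtx, xty)


def replicated_random_cv(X, y, n_test, n_rep, lam, seed=0):
    """Repeat a random train/test split n_rep times.

    Returns one predictive ability per replicate, i.e. the correlation
    between predicted and observed phenotypes of the held-out lines.
    """
    rng = random.Random(seed)
    idx = list(range(len(y)))
    abilities = []
    for _ in range(n_rep):
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        mu = sum(y[i] for i in train) / len(train)  # training mean only
        beta = ridge_fit([X[i] for i in train], [y[i] - mu for i in train], lam)
        predicted = [mu + sum(b * x for b, x in zip(beta, X[i])) for i in test]
        abilities.append(pearson(predicted, [y[i] for i in test]))
    return abilities


# Simulated toy data (assumption): 60 lines, 10 markers coded -1/0/1.
rng = random.Random(1)
n_lines, n_markers = 60, 10
X = [[rng.choice([-1.0, 0.0, 1.0]) for _ in range(n_markers)] for _ in range(n_lines)]
effects = [rng.gauss(0.0, 1.0) for _ in range(n_markers)]
y = [sum(e * x for e, x in zip(effects, row)) + rng.gauss(0.0, 0.5) for row in X]

abilities = replicated_random_cv(X, y, n_test=12, n_rep=20, lam=1.0)
mean_ability = sum(abilities) / len(abilities)
print(f"mean predictive ability over {len(abilities)} replicates: {mean_ability:.2f}")
```

The overfitting effect described in the abstract would arise if a preprocessing step such as marker selection were run once on all lines before the loop. In this sketch, everything that uses phenotypes (the mean and the marker effects) is computed from the training split inside each replicate.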
