1
Nazzicari N, Franguelli N, Ferrari B, Pecetti L, Annicchiarico P. The Effect of Genome Parametrization and SNP Marker Subsetting on Genomic Selection in Autotetraploid Alfalfa. Genes (Basel) 2024; 15:449. [PMID: 38674384 PMCID: PMC11050091 DOI: 10.3390/genes15040449] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2024] [Revised: 03/21/2024] [Accepted: 03/27/2024] [Indexed: 04/28/2024] Open
Abstract
BACKGROUND Alfalfa, the most economically important forage legume worldwide, features modest genetic progress due to long selection cycles and the extent of the non-additive genetic variance associated with its autotetraploid genome. METHODS To improve the efficiency of genomic selection in alfalfa, we explored the effects of genome parametrization (as tetraploid and diploid dosages, plus allele ratios) and SNP marker subsetting (all available SNPs, only genic regions, and only non-genic regions) on genomic regressions, together with various levels of filtering on reading depth and missing rates. We used genotyping by sequencing-generated data and focused on traits of different genetic complexity, i.e., dry biomass yield in moisture-favorable (FE) and drought stress (SE) environments, leaf size, and the onset of flowering, which were assessed in 143 genotyped plants from a genetically broad European reference population and their phenotyped half-sib progenies. RESULTS On average, the allele ratio improved the predictive ability compared with other genome parametrizations (+7.9% vs. tetraploid dosage, +12.6% vs. diploid dosage), while using all the SNPs offered an advantage compared with any specific SNP subsetting (+3.7% vs. genic regions, +7.6% vs. non-genic regions). However, when focusing on specific traits, different combinations of genome parametrization and subsetting achieved better performances. We also released Legpipe2, an SNP calling pipeline tailored for reduced representation (GBS, RAD) in medium-sized genotyping experiments.
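The three genome parametrizations named in this abstract can be illustrated with a minimal sketch. It assumes per-SNP read counts for the alternative allele, naive nearest-class rounding for the tetraploid dosage, and a pseudo-diploid coding that merges all heterozygotes into one class. The function names and the rounding rule are illustrative assumptions, not the procedure used in the study or in Legpipe2.

```python
from typing import Optional

PLOIDY = 4  # autotetraploid, as in alfalfa


def allele_ratio(alt_reads: int, total_reads: int) -> Optional[float]:
    """Fraction of reads carrying the alternative allele (None if no reads)."""
    if total_reads <= 0:
        return None
    return alt_reads / total_reads


def tetraploid_dosage(ratio: float) -> int:
    """Nearest dosage class in 0..4 (hypothetical naive caller, rounds half up)."""
    return int(ratio * PLOIDY + 0.5)


def diploid_dosage(ratio: float) -> int:
    """Pseudo-diploid coding: 0 and 2 for homozygotes, 1 for any heterozygote."""
    dosage = tetraploid_dosage(ratio)
    if dosage == 0:
        return 0
    if dosage == PLOIDY:
        return 2
    return 1


if __name__ == "__main__":
    # Illustrative (alt_reads, total_reads) pairs, not data from the study.
    for alt, total in [(0, 20), (5, 20), (11, 20), (20, 20)]:
        ratio = allele_ratio(alt, total)
        print(alt, total, ratio, tetraploid_dosage(ratio), diploid_dosage(ratio))
```

The allele ratio keeps the raw read proportion as a continuous predictor, whereas both dosage codings discard part of that information by forcing each call into a discrete class.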
Affiliation(s)
- Nelson Nazzicari
  - Council for Agricultural Research and Economics (CREA), Research Center for Animal Production and Aquaculture, Viale Piacenza 29, 26900 Lodi, Italy
2
Yadav S, Ross EM, Wei X, Liu S, Nguyen LT, Powell O, Hickey LT, Deomano E, Atkin F, Voss-Fels KP, Hayes BJ. Use of continuous genotypes for genomic prediction in sugarcane. THE PLANT GENOME 2024; 17:e20417. [PMID: 38066702 DOI: 10.1002/tpg2.20417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Revised: 10/30/2023] [Accepted: 11/14/2023] [Indexed: 03/22/2024]
Abstract
Genomic selection in sugarcane faces challenges due to limited genomic tools and high genomic complexity, particularly because of its high and variable ploidy. The classification of genotypes for single nucleotide polymorphisms (SNPs) becomes difficult due to the wide range of possible allele dosages. Previous genomic studies in sugarcane used pseudo-diploid genotyping, grouping all heterozygotes into a single class. In this study, we investigate the use of continuous genotypes as a proxy for allele-dosage in genomic prediction models. The hypothesis is that continuous genotypes could better reflect allele dosage at SNPs linked to mutations affecting target traits, resulting in phenotypic variation. The dataset included genotypes of 1318 clones at 58K SNP markers, with about 26K markers filtered using standard quality controls. Predictions for tonnes of cane per hectare (TCH), commercial cane sugar (CCS), and fiber content (Fiber) were made using parametric, non-parametric, and Bayesian methods. Continuous genotypes increased accuracy by 5%-7% for CCS and Fiber. The pseudo-diploid parametrization performed better for TCH. Reproducing kernel Hilbert spaces model with Gaussian kernel and AK4 (arc-cosine kernel with hidden layer 4) kernel outperformed other methods for TCH and CCS, suggesting that non-additive effects might influence these traits. The prevalence of low-dosage markers in the study may have limited the benefits of approximating allele-dosage information with continuous genotypes in genomic prediction models. Continuous genotypes simplify genomic prediction in polyploid crops, allowing additional markers to be used without adhering to pseudo-diploid inheritance. The approach can particularly benefit high ploidy species or emerging crops with unknown ploidy.
Affiliation(s)
- Seema Yadav
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, Queensland, Australia
- Elizabeth M Ross
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, Queensland, Australia
- Xianming Wei
  - Sugar Research Australia, Mackay, Queensland, Australia
- Shouye Liu
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
- Loan To Nguyen
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, Queensland, Australia
- Owen Powell
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, Queensland, Australia
- Lee T Hickey
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, Queensland, Australia
- Emily Deomano
  - Sugar Research Australia, Indooroopilly, Queensland, Australia
- Felicity Atkin
  - Sugar Research Australia, Meringa Gordonvale, Queensland, Australia
- Kai P Voss-Fels
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, Queensland, Australia
  - Department of Grapevine Breeding, Hochschule Geisenheim University, Geisenheim, Germany
- Ben J Hayes
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, Queensland, Australia
3
Song H, Zhang Q, Hu H. polyGBLUP: a modified genomic best linear unbiased prediction improved the genomic prediction efficiency for autopolyploid species. Brief Bioinform 2024; 25:bbae106. [PMID: 38517695 PMCID: PMC10959164 DOI: 10.1093/bib/bbae106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 12/22/2023] [Accepted: 02/26/2024] [Indexed: 03/24/2024] Open
Abstract
Given how widespread autopolyploid species are in nature, it is crucial to develop genomic selection methods that account for different allele dosages in autopolyploid breeding. However, no method has been developed to handle autopolyploid data regardless of ploidy level. In this study, we developed a modified genomic best linear unbiased prediction (GBLUP) model, polyGBLUP, by constructing additive and dominance genomic relationship matrices based on different allele dosages. polyGBLUP can carry out genomic prediction for autopolyploid species regardless of ploidy level. In comprehensive simulations and in analyses of real data from autotetraploid blueberry and guinea grass and autohexaploid sweet potato, polyGBLUP achieved higher prediction accuracy than GBLUP, and its advantage was more pronounced at higher ploidy levels. Furthermore, when a dominance effect was added to polyGBLUP (polyGDBLUP), the greater the degree of dominance, the clearer the advantage of polyGDBLUP over the diploid models in terms of prediction accuracy, bias, mean squared error and mean absolute error. For real data, polyGBLUP outperformed GBLUP in the blueberry and sweet potato populations but for only some traits in the guinea grass population, owing to the high correlation between the diploid and polyploid genomic relationship matrices. In addition, polyGDBLUP did not produce higher prediction accuracy than polyGBLUP for most traits in the real data because dominance genetic variance was not captured for these traits. polyGBLUP is a promising method for genomic prediction in autopolyploid species.
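To make the idea of a dosage-based additive genomic relationship matrix concrete, here is a minimal sketch. It assumes a VanRaden-style construction generalized to an arbitrary ploidy, in which dosages are centered on their expected value (ploidy times allele frequency) and scaled by the summed binomial variance across markers. This is an illustrative assumption and not necessarily the exact formulation implemented in polyGBLUP; the function name `additive_grm` is hypothetical.

```python
from typing import List


def additive_grm(dosages: List[List[int]], ploidy: int) -> List[List[float]]:
    """Additive relationship matrix from an individuals x markers dosage table.

    Assumed formula (VanRaden-style, generalized to any ploidy):
        G = Z Z' / (ploidy * sum_j p_j * (1 - p_j)),  with Z = X - ploidy * p
    """
    n_ind = len(dosages)
    n_mrk = len(dosages[0])

    # Allele frequency per marker: total allele copies over ploidy * individuals.
    freqs = [
        sum(row[j] for row in dosages) / (ploidy * n_ind) for j in range(n_mrk)
    ]

    # Center each dosage on its expectation under the estimated frequency.
    centered = [
        [row[j] - ploidy * freqs[j] for j in range(n_mrk)] for row in dosages
    ]

    scale = ploidy * sum(p * (1.0 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all markers are monomorphic; matrix is undefined")

    return [
        [sum(a * b for a, b in zip(row_i, row_k)) / scale for row_k in centered]
        for row_i in centered
    ]


if __name__ == "__main__":
    # Three hypothetical autotetraploid individuals scored at two markers.
    toy = [[0, 4], [2, 2], [4, 0]]
    for row in additive_grm(toy, ploidy=4):
        print(row)
```

Because the ploidy enters only through the centering and scaling terms, the same function applies to tetraploid or hexaploid dosage tables without modification.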
Affiliation(s)
- Hailiang Song
  - Fisheries Science Institute, Beijing Academy of Agriculture and Forestry Sciences & Beijing Key Laboratory of Fisheries Biotechnology, Beijing 100068, China
  - Key Laboratory of Sturgeon Genetics and Breeding, Ministry of Agriculture and Rural Affairs, Hangzhou, 311799, China
- Qin Zhang
  - Shandong Provincial Key Laboratory of Animal Biotechnology and Disease Control and Prevention, Shandong Agricultural University, Taian 271001, China
- Hongxia Hu
  - Fisheries Science Institute, Beijing Academy of Agriculture and Forestry Sciences & Beijing Key Laboratory of Fisheries Biotechnology, Beijing 100068, China
  - Key Laboratory of Sturgeon Genetics and Breeding, Ministry of Agriculture and Rural Affairs, Hangzhou, 311799, China
4
Martins FB, Aono AH, Moraes ADCL, Ferreira RCU, Vilela MDM, Pessoa-Filho M, Rodrigues-Motta M, Simeão RM, de Souza AP. Genome-wide family prediction unveils molecular mechanisms underlying the regulation of agronomic traits in Urochloa ruziziensis. FRONTIERS IN PLANT SCIENCE 2023; 14:1303417. [PMID: 38148869 PMCID: PMC10749977 DOI: 10.3389/fpls.2023.1303417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Accepted: 11/15/2023] [Indexed: 12/28/2023]
Abstract
Tropical forage grasses, particularly those belonging to the Urochloa genus, play a crucial role in cattle production and serve as the main food source for animals in tropical and subtropical regions. The majority of these species are apomictic and tetraploid, highlighting the significance of U. ruziziensis, a sexual diploid species that can be tetraploidized for use in interspecific crosses with apomictic species. As a means to support breeding programs, our study investigates the feasibility of genome-wide family prediction in U. ruziziensis families to predict agronomic traits. Fifty half-sibling families were assessed for green matter yield, dry matter yield, regrowth capacity, leaf dry matter, and stem dry matter across different clippings established in contrasting seasons with varying available water capacity. Genotyping was performed using a genotyping-by-sequencing approach based on DNA samples from family pools. In addition to conventional genomic prediction methods, machine learning and feature selection algorithms were employed to reduce the necessary number of markers for prediction and enhance predictive accuracy across phenotypes. To explore the regulation of agronomic traits, our study evaluated the significance of selected markers for prediction using a tree-based approach, potentially linking these regions to quantitative trait loci (QTLs). In a multiomic approach, genes from the species transcriptome were mapped and correlated to those markers. A gene coexpression network was modeled with gene expression estimates from a diverse set of U. ruziziensis genotypes, enabling a comprehensive investigation of molecular mechanisms associated with these regions. The heritabilities of the evaluated traits ranged from 0.44 to 0.92. A total of 28,106 filtered SNPs were used to predict phenotypic measurements, achieving a mean predictive ability of 0.762. 
By employing feature selection techniques, we could reduce the dimensionality of SNP datasets, revealing potential genotype-phenotype associations. The functional annotation of genes near these markers revealed associations with auxin transport and biosynthesis of lignin, flavonol, and folic acid. Further exploration with the gene coexpression network uncovered associations with DNA metabolism, stress response, and circadian rhythm. These genes and regions represent important targets for expanding our understanding of the metabolic regulation of agronomic traits and offer valuable insights applicable to species breeding. Our work represents an innovative contribution to molecular breeding techniques for tropical forages, presenting a viable marker-assisted breeding approach and identifying target regions for future molecular studies on these agronomic traits.
Affiliation(s)
- Felipe Bitencourt Martins
  - Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, São Paulo, Brazil
- Alexandre Hild Aono
  - Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, São Paulo, Brazil
- Aline da Costa Lima Moraes
  - Department of Plant Biology, Biology Institute, University of Campinas (UNICAMP), Campinas, São Paulo, Brazil
- Marco Pessoa-Filho
  - Embrapa Cerrados, Brazilian Agricultural Research Corporation, Brasília, Brazil
- Rosangela Maria Simeão
  - Embrapa Gado de Corte, Brazilian Agricultural Research Corporation, Campo Grande, Mato Grosso, Brazil
- Anete Pereira de Souza
  - Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, São Paulo, Brazil
  - Department of Plant Biology, Biology Institute, University of Campinas (UNICAMP), Campinas, São Paulo, Brazil
5
Njuguna JN, Clark LV, Lipka AE, Anzoua KG, Bagmet L, Chebukin P, Dwiyanti MS, Dzyubenko E, Dzyubenko N, Ghimire BK, Jin X, Johnson DA, Kjeldsen JB, Nagano H, de Bem Oliveira I, Peng J, Petersen KK, Sabitov A, Seong ES, Yamada T, Yoo JH, Yu CY, Zhao H, Munoz P, Long SP, Sacks EJ. Impact of genotype-calling methodologies on genome-wide association and genomic prediction in polyploids. THE PLANT GENOME 2023; 16:e20401. [PMID: 37903749 DOI: 10.1002/tpg2.20401] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Revised: 09/17/2023] [Accepted: 09/23/2023] [Indexed: 11/01/2023]
Abstract
Discovery and analysis of genetic variants underlying agriculturally important traits are key to molecular breeding of crops. Reduced representation approaches have provided cost-efficient genotyping using next-generation sequencing. However, accurate genotype calling from next-generation sequencing data is challenging, particularly in polyploid species due to their genome complexity. Recently developed Bayesian statistical methods implemented in available software packages, polyRAD, EBG, and updog, incorporate error rates and population parameters to accurately estimate allelic dosage across any ploidy. We used empirical and simulated data to evaluate the three Bayesian algorithms and demonstrated their impact on the power of genome-wide association study (GWAS) analysis and the accuracy of genomic prediction. We further incorporated uncertainty in allelic dosage estimation by testing continuous genotype calls and comparing their performance to discrete genotypes in GWAS and genomic prediction. We tested the genotype-calling methods using data from two autotetraploid species, Miscanthus sacchariflorus and Vaccinium corymbosum, and performed GWAS and genomic prediction. In the empirical study, the tested Bayesian genotype-calling algorithms differed in their downstream effects on GWAS and genomic prediction, with some showing advantages over others. Through subsequent simulation studies, we observed that at low read depth, polyRAD was advantageous in its effect on GWAS power and limit of false positives. Additionally, we found that continuous genotypes increased the accuracy of genomic prediction, by reducing genotyping error, particularly at low sequencing depth. Our results indicate that by using the Bayesian algorithm implemented in polyRAD and continuous genotypes, we can accurately and cost-efficiently implement GWAS and genomic prediction in polyploid crops.
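The contrast between discrete and continuous genotype calls can be sketched as follows. The sketch assumes that a Bayesian caller returns posterior probabilities over the dosage classes 0 to ploidy, that the discrete call is the most probable class, and that the continuous call is the posterior mean dosage. The function names are hypothetical and do not correspond to the polyRAD, EBG, or updog interfaces.

```python
from typing import Sequence


def discrete_genotype(posterior: Sequence[float]) -> int:
    """Most probable dosage class (ties resolved toward the lower dosage)."""
    return max(range(len(posterior)), key=lambda k: posterior[k])


def continuous_genotype(posterior: Sequence[float]) -> float:
    """Posterior mean dosage: each class weighted by its probability."""
    return sum(k * p for k, p in enumerate(posterior))


if __name__ == "__main__":
    # Hypothetical posterior for one tetraploid genotype at low read depth:
    # the caller cannot separate one copy from two copies.
    uncertain = [0.0, 0.5, 0.5, 0.0, 0.0]
    print(discrete_genotype(uncertain), continuous_genotype(uncertain))
```

With an ambiguous posterior the discrete call commits to a single class, while the continuous value falls between classes and so carries the calling uncertainty into downstream GWAS or genomic prediction.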
Affiliation(s)
- Joyce N Njuguna
  - Department of Crop Sciences, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
- Lindsay V Clark
  - Research Scientific Computing, Seattle Children's Research Institute, Seattle, Washington, USA
- Alexander E Lipka
  - Department of Crop Sciences, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
- Kossonou G Anzoua
  - Field Science Center for Northern Biosphere, Hokkaido University, Sapporo, Japan
- Larisa Bagmet
  - Vavilov All-Russian Institute of Plant Genetic Resources, St. Petersburg, Russian Federation
- Pavel Chebukin
  - FSBSI "FSC of Agricultural Biotechnology of the Far East named after A.K. Chaiki", Ussuriysk, Russian Federation
- Maria S Dwiyanti
  - Field Science Center for Northern Biosphere, Hokkaido University, Sapporo, Japan
- Elena Dzyubenko
  - Vavilov All-Russian Institute of Plant Genetic Resources, St. Petersburg, Russian Federation
- Nicolay Dzyubenko
  - Vavilov All-Russian Institute of Plant Genetic Resources, St. Petersburg, Russian Federation
- Bimal Kumar Ghimire
  - Department of Crop Science, College of Sanghuh Life Science, Konkuk University, Seoul, South Korea
- Xiaoli Jin
  - Agronomy Department, Key Laboratory of Crop Germplasm Research of Zhejiang Province, Zhejiang University, Hangzhou, China
- Douglas A Johnson
  - USDA-ARS Forage and Range Research Lab, Utah State University, Logan, Utah, USA
- Hironori Nagano
  - Field Science Center for Northern Biosphere, Hokkaido University, Sapporo, Japan
- Junhua Peng
  - Spring Valley Agriscience Co. Ltd., Jinan, China
- Andrey Sabitov
  - Vavilov All-Russian Institute of Plant Genetic Resources, St. Petersburg, Russian Federation
- Eun Soo Seong
  - Division of Bioresource Sciences, Kangwon National University, Chuncheon, South Korea
- Toshihiko Yamada
  - Field Science Center for Northern Biosphere, Hokkaido University, Sapporo, Japan
- Ji Hye Yoo
  - Bioherb Research Institute, Kangwon National University, Chuncheon, South Korea
- Chang Yeon Yu
  - Bioherb Research Institute, Kangwon National University, Chuncheon, South Korea
- Hua Zhao
  - Key Laboratory of Horticultural Plant Biology of Ministry of Education, Huazhong Agricultural University, Wuhan, China
- Patricio Munoz
  - Horticultural Science Department, University of Florida, Gainesville, Florida, USA
- Stephen P Long
  - Department of Crop Sciences, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
- Erik J Sacks
  - Department of Crop Sciences, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
6
Chen C, Powell O, Dinglasan E, Ross EM, Yadav S, Wei X, Atkin F, Deomano E, Hayes BJ. Genomic prediction with machine learning in sugarcane, a complex highly polyploid clonally propagated crop with substantial non-additive variation for key traits. THE PLANT GENOME 2023; 16:e20390. [PMID: 37728221 DOI: 10.1002/tpg2.20390] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2022] [Revised: 08/01/2023] [Accepted: 08/29/2023] [Indexed: 09/21/2023]
Abstract
Sugarcane has a complex, highly polyploid genome with multi-species ancestry. Additive models for genomic prediction of clonal performance might not capture interactions between genes and alleles from different ploidies and ancestral species. As such, genomic prediction in sugarcane presents an interesting case for machine learning (ML) methods, which are purportedly able to deal with high levels of complexity in prediction. Here, we investigated deep learning (DL) neural networks, including multilayer networks (MLP) and convolution neural networks (CNN), and an ensemble machine learning approach, random forest (RF), for genomic prediction in sugarcane. The data set used was 2912 sugarcane clones, scored for 26,086 genome wide single nucleotide polymorphism markers, with final assessment trial data for total cane harvested (TCH), commercial cane sugar (CCS), and fiber content (Fiber). The clones in the latest trial (2017) were used as a validation set. We compared prediction accuracy of these methods to genomic best linear unbiased prediction (GBLUP) extended to include dominance and epistatic effects. The prediction accuracies from GBLUP models were up to 0.37 for TCH, 0.43 for CCS, and 0.48 for Fiber, while the optimized ML models had prediction accuracies of 0.35 for TCH, 0.38 for CCS, and 0.48 for Fiber. Both RF and DL neural network models have comparable predictive ability with the additive GBLUP model but are less accurate than the extended GBLUP model.
Affiliation(s)
- Chensong Chen
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Queensland, Australia
- Owen Powell
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Queensland, Australia
- Eric Dinglasan
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Queensland, Australia
- Elizabeth M Ross
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Queensland, Australia
- Seema Yadav
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Queensland, Australia
- Ben J Hayes
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Queensland, Australia
7
da Costa Lima Moraes A, Mollinari M, Ferreira RCU, Aono A, de Castro Lara LA, Pessoa-Filho M, Barrios SCL, Garcia AAF, do Valle CB, de Souza AP, Vigna BBZ. Advances in genomic characterization of Urochloa humidicola: exploring polyploid inheritance and apomixis. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2023; 136:238. [PMID: 37919432 DOI: 10.1007/s00122-023-04485-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Accepted: 10/11/2023] [Indexed: 11/04/2023]
Abstract
KEY MESSAGE We present the highest-density genetic map for the hexaploid Urochloa humidicola. SNP markers expose genetic organization, reproduction, and species origin, aiding polyploid and tropical forage research. Tropical forage grasses are an important feed source for animals, with Urochloa humidicola, also known as Koronivia grass, being one of the main pasture grasses for poorly drained soils in the tropics. However, genetic and genomic resources for this species are lacking due to its genomic complexity, including high heterozygosity, evidence of segmental allopolyploidy, and reproduction by apomixis. These complexities hinder the application of marker-assisted selection (MAS) in breeding programs. Here, we developed the highest-density linkage map currently available for the hexaploid tropical forage grass U. humidicola. This map was constructed using a biparental F1 population generated from a cross between the female parent H031 (CIAT 26146), the only known sexual genotype for the species, and the apomictic male parent H016 (BRS cv. Tupi). The linkage analysis included 4873 single nucleotide polymorphism (SNP) markers with allele dosage information. It allowed mapping of the ASGR locus and apospory phenotype to linkage group 3, in a region syntenic with chromosome 3 of Urochloa ruziziensis and chromosome 1 of Setaria italica. We also identified hexaploid haplotypes for all individuals, assessed the meiotic configuration, and estimated the level of preferential pairing in the parents during meiosis, which revealed the autopolyploid origin of sexual H031, in contrast to apomictic H016, which showed allopolyploid behavior in the preferential pairing analysis. These results provide new information regarding the genetic organization, mode of reproduction, and allopolyploid origin of U. humidicola, potential SNP markers associated with apomixis for MAS, and resources for research on polyploids and tropical forage grasses.
Affiliation(s)
- Aline da Costa Lima Moraes
  - Department of Plant Biology, Biology Institute, University of Campinas (UNICAMP), Campinas, São Paulo, Brazil
- Marcelo Mollinari
  - Department of Horticultural Science, Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA
- Alexandre Aono
  - Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, São Paulo, Brazil
- Anete Pereira de Souza
  - Department of Plant Biology, Biology Institute, University of Campinas (UNICAMP), Campinas, São Paulo, Brazil
  - Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, São Paulo, Brazil
8
Polyploid SNP Genotyping Using the MassARRAY System. Methods Mol Biol 2023; 2638:93-113. [PMID: 36781637 DOI: 10.1007/978-1-0716-3024-2_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/15/2023]
Abstract
Molecular marker discovery and genotyping are major challenges in polyploid breeding programs incorporating molecular biology tools. In this context, this work describes a method for single nucleotide polymorphism (SNP) genotyping in polyploid crops using matrix-assisted laser desorption ionization (MALDI) time-of-flight (TOF) mass spectrometry, the MassARRAY System.
9
Tomaszewska P, Vorontsova MS, Renvoize SA, Ficinski SZ, Tohme J, Schwarzacher T, Castiblanco V, de Vega JJ, Mitchell RAC, Heslop-Harrison JS (Pat). Complex polyploid and hybrid species in an apomictic and sexual tropical forage grass group: genomic composition and evolution in Urochloa (Brachiaria) species. ANNALS OF BOTANY 2023; 131:87-108. [PMID: 34874999 PMCID: PMC9904353 DOI: 10.1093/aob/mcab147] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 07/24/2021] [Accepted: 12/06/2021] [Indexed: 05/25/2023]
Abstract
BACKGROUND AND AIMS Diploid and polyploid Urochloa (including Brachiaria, Panicum and Megathyrsus species) C4 tropical forage grasses originating from Africa are important for food security and the environment, often being planted in marginal lands worldwide. We aimed to characterize the nature of their genomes, the repetitive DNA and the genome composition of polyploids, leading to a model of the evolutionary pathways within the group including many apomictic species. METHODS Some 362 forage grass accessions from international germplasm collections were studied, and ploidy was determined using an optimized flow cytometry method. Whole-genome survey sequencing and molecular cytogenetic analysis were used to identify chromosomes and genomes in Urochloa accessions belonging to the 'brizantha' and 'humidicola' agamic complexes and U. maxima. KEY RESULTS Genome structures are complex and variable, with multiple ploidies and genome compositions within the species, and no clear geographical patterns. Sequence analysis of nine diploid and polyploid accessions enabled identification of abundant genome-specific repetitive DNA motifs. In situ hybridization with a combination of repetitive DNA and genomic DNA probes identified evolutionary divergence and allowed us to discriminate the different genomes present in polyploids. CONCLUSIONS We suggest a new coherent nomenclature for the genomes present. We develop a model of evolution at the whole-genome level in diploid and polyploid accessions showing processes of grass evolution. We support the retention of narrow species concepts for Urochloa brizantha, U. decumbens and U. ruziziensis, and do not consider diploids and polyploids of single species as cytotypes. 
The results and model will be valuable in making rational choices of parents for new hybrids, assist in use of the germplasm for breeding and selection of Urochloa with improved sustainability and agronomic potential, and assist in measuring and conserving biodiversity in grasslands.
Affiliation(s)
- Joseph Tohme
  - International Center for Tropical Agriculture (CIAT), A.A. 6713, Cali, Colombia
- Trude Schwarzacher
  - Department of Genetics and Genome Biology, University of Leicester, Leicester, UK
  - Key Laboratory of Plant Resources Conservation and Sustainable Utilization/Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
- J S (Pat) Heslop-Harrison
  - Department of Genetics and Genome Biology, University of Leicester, Leicester, UK
  - Key Laboratory of Plant Resources Conservation and Sustainable Utilization/Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
10
Annicchiarico P, Nazzicari N, Bouizgaren A, Hayek T, Laouar M, Cornacchione M, Basigalup D, Monterrubio Martin C, Brummer EC, Pecetti L. Alfalfa genomic selection for different stress-prone growing regions. THE PLANT GENOME 2022; 15:e20264. [PMID: 36222346 DOI: 10.1002/tpg2.20264] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/24/2022] [Accepted: 08/25/2022] [Indexed: 06/16/2023]
Abstract
Alfalfa (Medicago sativa L.) selection for stress-prone regions has high priority for sustainable crop-livestock systems. This study assessed the genomic selection (GS) ability to predict alfalfa breeding values for drought-prone agricultural sites of Algeria, Morocco, and Argentina; managed-stress (MS) environments of Italy featuring moderate or intense drought; and one Tunisian site irrigated with moderately saline water. Additional aims were to investigate genotype × environment interaction (GEI) patterns and the effect on GS predictions of three single-nucleotide polymorphism (SNP) calling procedures, 12 statistical models that exclude or incorporate GEI, and allele dosage information. Our study included 127 genotypes from a Mediterranean reference population originated from three geographically contrasting populations, genotyped via genotyping-by-sequencing and phenotyped based on multi-year biomass dry matter yield of their dense-planted half-sib progenies. The GEI was very large, as shown by 27-fold greater additive genetic variance × environment interaction relative to the additive genetic variance and low genetic correlation for progeny yield responses across environments. The predictive ability of GS (using at least 37,969 SNP markers) exceeded 0.20 for moderate MS (representing Italian stress-prone sites) and the sites of Algeria and Argentina while being quite low for the Tunisian site and intense MS. Predictions of GS were complicated by rapid linkage disequilibrium decay. The weighted GBLUP model, GEI incorporation into GS models, and SNP calling based on a mock reference genome exhibited a predictive ability advantage for some environments. Our results support the specific breeding for each target region and suggest a positive role for GS in most regions when considering the challenges associated with phenotypic selection.
Collapse
Affiliation(s)
- Paolo Annicchiarico
  - Consiglio per la Ricerca in Agricoltura e l'Analisi dell'Economia Agraria, Centro di ricerca Zootecnia e Acquacoltura, 29 viale Piacenza, Lodi, 26900, Italy
- Nelson Nazzicari
  - Consiglio per la Ricerca in Agricoltura e l'Analisi dell'Economia Agraria, Centro di ricerca Zootecnia e Acquacoltura, 29 viale Piacenza, Lodi, 26900, Italy
- Abdelaziz Bouizgaren
  - Institut National de la Recherche Agronomique, Centre Régional de Marrakech, BP 533, Marrakech, 40000, Morocco
- Taoufik Hayek
  - Institut des Régions Arides, Route du Jorf, Médenine, 4119, Tunisia
- Meriem Laouar
  - Ecole Nationale Supérieure Agronomique, Dép. de Productions Végétales. Laboratoire d'Amélioration Intégrative des Productions Végétales (C2711100), Rue Hassen Badi, El Harrach 16200, Alger, Algérie
- Monica Cornacchione
  - Instituto Nacional de Tecnología Agropecuaria, Estación Experimental Santiago del Estero, Jujuy 850, Santiago del Estero, 4200, Argentina
- Daniel Basigalup
  - Instituto Nacional de Tecnología Agropecuaria, Estación Experimental Manfredi, Ruta Nacional no. 9 km 636, Manfredi, Córdoba, X5988, Argentina
- Cristina Monterrubio Martin
  - Consiglio per la Ricerca in Agricoltura e l'Analisi dell'Economia Agraria, Centro di ricerca Zootecnia e Acquacoltura, 29 viale Piacenza, Lodi, 26900, Italy
- Edward Charles Brummer
  - Plant Breeding Center, Dep. of Plant Sciences, Univ. of California, Davis, CA, 95616, USA
- Luciano Pecetti
  - Consiglio per la Ricerca in Agricoltura e l'Analisi dell'Economia Agraria, Centro di ricerca Zootecnia e Acquacoltura, 29 viale Piacenza, Lodi, 26900, Italy
11
Mbo Nkoulou LF, Ngalle HB, Cros D, Adje COA, Fassinou NVH, Bell J, Achigan-Dako EG. Perspective for genomic-enabled prediction against black sigatoka disease and drought stress in polyploid species. Front Plant Sci 2022; 13:953133. [PMID: 36388523 PMCID: PMC9650417 DOI: 10.3389/fpls.2022.953133] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/25/2022] [Accepted: 09/28/2022] [Indexed: 06/16/2023]
Abstract
Genomic selection (GS) in plant breeding is explored as a promising tool to solve the problems related to biotic and abiotic threats. Polyploid plants like bananas (Musa spp.) face the problems of drought and black sigatoka disease (BSD), which restrict their production. Conventional plant breeding is experiencing difficulties, particularly phenotyping costs and long generation intervals. To overcome these difficulties, GS is explored as an alternative with great potential for reducing the cost and time of the selection process. So far, GS has not had the same success in polyploid plants as in diploid plants because of the complexity of their genomes. In this review, we present the main constraints to the application of GS in polyploid plants and the prospects for overcoming these constraints. Particular emphasis is placed on breeding for BSD and drought, two major threats to banana production, with banana used in this review as a model polyploid plant. It emerges that the difficulty of obtaining good-quality markers is the first challenge for GS in polyploid plants, because the main tools used were developed for diploid species. In addition, there is the major challenge of mastering genetic interactions such as dominance and epistasis effects as well as genotype-by-environment interaction, which are very common in polyploid plants. To get around these challenges, we present bioinformatics tools as well as artificial intelligence approaches, including machine learning. Furthermore, a scheme for applying GS to banana for BSD and drought is proposed. This review is of paramount importance for breeding programs that seek to reduce the selection cycle of polyploids despite the complexity of their genomes.
Affiliation(s)
- Luther Fort Mbo Nkoulou
  - Genetics, Biotechnology, and Seed Science Unit (GBioS), Department of Plant Sciences, Faculty of Agronomic Sciences, University of Abomey Calavi, Cotonou, Benin
  - Unit of Genetics and Plant Breeding (UGAP), Department of Plant Biology, Faculty of Sciences, University of Yaoundé 1, Yaoundé, Cameroon
  - Institute of Agricultural Research for Development, Centre de Recherche Agricole de Mbalmayo (CRAM), Mbalmayo, Cameroon
- Hermine Bille Ngalle
  - Unit of Genetics and Plant Breeding (UGAP), Department of Plant Biology, Faculty of Sciences, University of Yaoundé 1, Yaoundé, Cameroon
- David Cros
  - Centre de Coopération Internationale en Recherche Agronomique pour le Développement (CIRAD), Unité Mixte de Recherche (UMR) Amélioration Génétique et Adaptation des Plantes méditerranéennes et tropicales (AGAP) Institut, Montpellier, France
  - Unité Mixte de Recherche (UMR) Amélioration Génétique et Adaptation des Plantes méditerranéennes et tropicales (AGAP) Institut, University of Montpellier, Centre de Coopération Internationale en Recherche Agronomique pour le Développement (CIRAD), Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE), Institut Agro, Montpellier, France
- Charlotte O. A. Adje
  - Genetics, Biotechnology, and Seed Science Unit (GBioS), Department of Plant Sciences, Faculty of Agronomic Sciences, University of Abomey Calavi, Cotonou, Benin
- Nicodeme V. H. Fassinou
  - Genetics, Biotechnology, and Seed Science Unit (GBioS), Department of Plant Sciences, Faculty of Agronomic Sciences, University of Abomey Calavi, Cotonou, Benin
- Joseph Bell
  - Unit of Genetics and Plant Breeding (UGAP), Department of Plant Biology, Faculty of Sciences, University of Yaoundé 1, Yaoundé, Cameroon
- Enoch G. Achigan-Dako
  - Genetics, Biotechnology, and Seed Science Unit (GBioS), Department of Plant Sciences, Faculty of Agronomic Sciences, University of Abomey Calavi, Cotonou, Benin
12
Meena MR, Appunu C, Arun Kumar R, Manimekalai R, Vasantha S, Krishnappa G, Kumar R, Pandey SK, Hemaprabha G. Recent Advances in Sugarcane Genomics, Physiology, and Phenomics for Superior Agronomic Traits. Front Genet 2022; 13:854936. [PMID: 35991570 PMCID: PMC9382102 DOI: 10.3389/fgene.2022.854936] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 01/14/2022] [Accepted: 05/26/2022] [Indexed: 11/13/2022]
Abstract
Advances in sugarcane breeding have contributed significantly to improvements in agronomic traits and crop yield. However, the growing global demand for sugar and biofuel in the context of climate change requires further improvements in cane and sugar yields. Attempts to achieve the desired rates of genetic gain in sugarcane by conventional breeding means are difficult as many agronomic traits are genetically complex and polygenic, with each gene exerting small effects. Unlike those of many other crops, the sugarcane genome is highly heterozygous due to its autopolyploid nature, which further hinders the development of a comprehensive genetic map. Despite these limitations, many superior agronomic traits/genes for higher cane yield, sugar production, and disease/pest resistance have been identified through the mapping of quantitative trait loci, genome-wide association studies, and transcriptome approaches. Improvements in traits controlled by one or two loci are relatively easy to achieve; however, this is not the case for traits governed by many genes. Many desirable phenotypic traits are controlled by quantitative trait nucleotides (QTNs) with small and variable effects. Assembling these desired QTNs by conventional breeding methods is time consuming and inefficient due to genetic drift. However, recent developments in genomic selection (GS) have allowed sugarcane researchers to select and accumulate desirable alleles imparting superior traits, as GS is based on genomic estimated breeding values, which substantially increases the selection efficiency and genetic gain in sugarcane breeding programs. Next-generation sequencing techniques coupled with genome-editing technologies have provided new vistas in harnessing the sugarcane genome to look for desirable agronomic traits such as erect canopy, leaf angle, prolonged greening, high biomass, deep root system, and the non-flowering nature of the crop.
Many desirable cane-yielding traits, such as single cane weight, numbers of tillers, numbers of millable canes, as well as cane quality traits, such as sucrose and sugar yield, have been explored using these recent biotechnological tools. This review will focus on the recent advances in sugarcane genomics related to genetic gain and the identification of favorable alleles for superior agronomic traits for further utilization in sugarcane breeding programs.
Affiliation(s)
- Mintu Ram Meena
  - Regional Centre, ICAR-Sugarcane Breeding Institute, Karnal, India
  - *Correspondence: Mintu Ram Meena; Chinnaswamy Appunu
- Chinnaswamy Appunu
  - ICAR-Sugarcane Breeding Institute, Coimbatore, India
  - *Correspondence: Mintu Ram Meena; Chinnaswamy Appunu
- R. Arun Kumar
  - ICAR-Sugarcane Breeding Institute, Coimbatore, India
- S. Vasantha
  - ICAR-Sugarcane Breeding Institute, Coimbatore, India
- Ravinder Kumar
  - Regional Centre, ICAR-Sugarcane Breeding Institute, Karnal, India
- S. K. Pandey
  - Regional Centre, ICAR-Sugarcane Breeding Institute, Karnal, India
- G. Hemaprabha
  - ICAR-Sugarcane Breeding Institute, Coimbatore, India
13
Correr FH, Furtado A, Franco Garcia AA, Henry RJ, Rodrigues Alves Margarido G. Allele expression biases in mixed-ploid sugarcane accessions. Sci Rep 2022; 12:8778. [PMID: 35610293 PMCID: PMC9130122 DOI: 10.1038/s41598-022-12725-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2021] [Accepted: 04/27/2022] [Indexed: 11/16/2022]
Abstract
Allele-specific expression (ASE) represents differences in the magnitude of expression between alleles of the same gene. This is not straightforward for polyploids, especially autopolyploids, as knowledge about the dose of each allele is required for accurate estimation of ASE. This is the case for the genomically complex Saccharum species, characterized by high levels of ploidy and aneuploidy. We used a Beta-Binomial model to test for allelic imbalance in Saccharum, with adaptations for mixed-ploid organisms. The hierarchical Beta-Binomial model was used to test if allele expression followed the expectation based on genomic allele dosage. The highest frequencies of ASE occurred in sugarcane hybrids, suggesting a possible influence of interspecific hybridization in these genotypes. For all accessions, genes showing ASE (ASEGs) were less frequent than those with balanced allelic expression. These genes were related to a broad range of processes, mostly associated with general metabolism, organelles, responses to stress and responses to stimuli. In addition, the frequency of ASEGs in high-level functional terms was similar among the genotypes, with a few genes associated with more specific biological processes. We hypothesize that ASE in Saccharum is largely a genotype-specific phenomenon, as a large number of ASEGs were exclusive to individual accessions.
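As a rough illustration of the kind of test described in this abstract, the sketch below checks whether reference-allele read counts at one SNP depart from the ratio expected from genomic allele dosage, using a plain (non-hierarchical) Beta-Binomial. The function names, the fixed overdispersion value `rho`, and the example counts are assumptions made for illustration only; the authors' hierarchical model and its mixed-ploidy adaptations are not reproduced here.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def betabinom_pmf(x, n, mean, rho):
    """P(X = x) for a Beta-Binomial with the given mean and overdispersion rho in (0, 1)."""
    # rho = 1 / (alpha + beta + 1), so alpha + beta = (1 - rho) / rho.
    scale = (1.0 - rho) / rho
    alpha, beta = mean * scale, (1.0 - mean) * scale
    log_choose = lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
    return exp(log_choose + log_beta(x + alpha, n - x + beta) - log_beta(alpha, beta))


def ase_p_value(ref_reads, total_reads, ref_dosage, ploidy, rho=0.05):
    """Two-sided p-value for allelic imbalance relative to the dosage expectation.

    The expected reference-read fraction is ref_dosage / ploidy (assumed, hypothetical
    parametrization). The p-value sums the probabilities of all outcomes that are no
    more likely than the observed count.
    """
    if not 0 < ref_dosage < ploidy:
        raise ValueError("the SNP must be heterozygous: 0 < ref_dosage < ploidy")
    expected = ref_dosage / ploidy
    observed = betabinom_pmf(ref_reads, total_reads, expected, rho)
    tail = sum(
        p
        for p in (betabinom_pmf(x, total_reads, expected, rho) for x in range(total_reads + 1))
        if p <= observed * (1.0 + 1e-9)
    )
    return min(1.0, tail)
```

For example, with a reference dosage of 3 out of 10 copies, `ase_p_value(30, 100, 3, 10)` describes reads that match the dosage expectation, whereas `ase_p_value(70, 100, 3, 10)` describes a strong excess of reference reads and yields a much smaller p-value.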
Affiliation(s)
- Fernando Henrique Correr
  - Department of Genetics, University of São Paulo, "Luiz de Queiroz" College of Agriculture, Av Pádua Dias, 11, Piracicaba, 13418-900, Brazil
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane, 4072, Australia
- Agnelo Furtado
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane, 4072, Australia
- Antonio Augusto Franco Garcia
  - Department of Genetics, University of São Paulo, "Luiz de Queiroz" College of Agriculture, Av Pádua Dias, 11, Piracicaba, 13418-900, Brazil
- Robert James Henry
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane, 4072, Australia
- Gabriel Rodrigues Alves Margarido
  - Department of Genetics, University of São Paulo, "Luiz de Queiroz" College of Agriculture, Av Pádua Dias, 11, Piracicaba, 13418-900, Brazil
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane, 4072, Australia
14
Ferreira RCU, da Costa Lima Moraes A, Chiari L, Simeão RM, Vigna BBZ, de Souza AP. An Overview of the Genetics and Genomics of the Urochloa Species Most Commonly Used in Pastures. Front Plant Sci 2021; 12:770461. [PMID: 34966402 PMCID: PMC8710810 DOI: 10.3389/fpls.2021.770461] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/03/2021] [Accepted: 11/17/2021] [Indexed: 06/14/2023]
Abstract
Pastures based on perennial monocotyledonous plants are the principal source of nutrition for ruminant livestock in tropical and subtropical areas across the globe. The Urochloa genus comprises important species used in pastures, and these mainly include Urochloa brizantha, Urochloa decumbens, Urochloa humidicola, and Urochloa ruziziensis. Despite their economic relevance, there is an absence of genomic-level information for these species, and this lack is mainly due to genomic complexity, including polyploidy, high heterozygosity, and genomes with a high repeat content, which hinders advances in molecular approaches to genetic improvement. Next-generation sequencing techniques have enabled the recent release of reference genomes, genetic linkage maps, and transcriptome sequences, and this information helps improve our understanding of the genetic architecture and molecular mechanisms involved in relevant traits, such as the apomictic reproductive mode. However, more concerted research efforts are still needed to characterize germplasm resources and identify molecular markers and genes associated with target traits. In addition, the implementation of genomic selection and gene editing is needed to reduce the breeding time and expenditure. In this review, we highlight the importance and characteristics of the four main species of Urochloa used in pastures and discuss the current findings from genetic and genomic studies and research gaps that should be addressed in future research.
Affiliation(s)
- Aline da Costa Lima Moraes
  - Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
- Lucimara Chiari
  - Embrapa Gado de Corte, Brazilian Agricultural Research Corporation, Campo Grande, Brazil
- Rosangela Maria Simeão
  - Embrapa Gado de Corte, Brazilian Agricultural Research Corporation, Campo Grande, Brazil
- Anete Pereira de Souza
  - Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
  - Department of Plant Biology, Biology Institute, University of Campinas (UNICAMP), Campinas, Brazil
15
Strategies to Increase Prediction Accuracy in Genomic Selection of Complex Traits in Alfalfa (Medicago sativa L.). Cells 2021; 10:cells10123372. [PMID: 34943880 PMCID: PMC8699225 DOI: 10.3390/cells10123372] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/19/2021] [Revised: 11/19/2021] [Accepted: 11/24/2021] [Indexed: 12/27/2022]
Abstract
Agronomic traits such as biomass yield and abiotic stress tolerance are genetically complex and challenging to improve through conventional breeding approaches. Genomic selection (GS) is an alternative approach in which genome-wide markers are used to determine the genomic estimated breeding value (GEBV) of individuals in a population. In alfalfa (Medicago sativa L.), previous results indicated that low to moderate prediction accuracy values (<70%) were obtained for complex traits, such as yield and abiotic stress resistance. There is a need to increase the prediction accuracy in order to employ GS in breeding programs. In this paper we reviewed different statistical models and their applications in polyploid crops, such as alfalfa and potato. Specifically, we used empirical data on alfalfa yield under salt stress to investigate approaches that use DNA marker importance values derived from machine learning models, and genome-wide association study (GWAS) marker-trait association scores based on different GWASpoly models, in weighted GBLUP analyses. This approach increased prediction accuracies from 50% to more than 80% for alfalfa yield under salt stress. Finally, we extended the weighted GBLUP approach to potato, analyzed 13 phenotypic traits, and obtained similar results. This is the first report on alfalfa to use variable importance and GWAS-assisted approaches to increase the prediction accuracy of GS, thus helping to select superior alfalfa lines based on their GEBVs.
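To make the idea of a weighted GBLUP analysis more concrete, here is a minimal sketch of a marker-weighted genomic relationship matrix, assuming a VanRaden-style construction extended to ploidy k: G_w = Z D Z' / sum_j(w_j k p_j (1 - p_j)), where Z holds centered allele dosages and D is a diagonal matrix of marker weights. The function name, the normalization, and the toy dosages are assumptions for illustration; in the abstract the weights come from machine-learning importance values or GWAS scores, and the paper's exact procedure may differ.

```python
def weighted_grm(dosages, weights, ploidy=4):
    """Marker-weighted genomic relationship matrix (illustrative sketch).

    dosages: list of rows, one per individual, each a list of allele dosages
             (integers from 0 to ploidy), one per marker.
    weights: list of non-negative marker weights, one per marker.
    """
    n_ind = len(dosages)
    n_mrk = len(weights)
    # Allele frequency of the counted allele at each marker.
    freqs = [sum(row[j] for row in dosages) / (ploidy * n_ind) for j in range(n_mrk)]
    # Center each dosage by its expected value, ploidy * p_j.
    centered = [[row[j] - ploidy * freqs[j] for j in range(n_mrk)] for row in dosages]
    # Weighted scaling term; it must be positive (at least one weighted polymorphic marker).
    denom = sum(w * ploidy * p * (1.0 - p) for w, p in zip(weights, freqs))
    if denom <= 0:
        raise ValueError("no polymorphic marker carries a positive weight")
    return [
        [sum(w * a * b for w, a, b in zip(weights, zi, zj)) / denom for zj in centered]
        for zi in centered
    ]
```

Because the weights also appear in the denominator, multiplying all of them by the same constant leaves the matrix unchanged, so only their relative sizes matter. The resulting matrix would then replace the unweighted relationship matrix in a standard GBLUP mixed model.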
16
Martins FB, Moraes ACL, Aono AH, Ferreira RCU, Chiari L, Simeão RM, Barrios SCL, Santos MF, Jank L, do Valle CB, Vigna BBZ, de Souza AP. A Semi-Automated SNP-Based Approach for Contaminant Identification in Biparental Polyploid Populations of Tropical Forage Grasses. Front Plant Sci 2021; 12:737919. [PMID: 34745171 PMCID: PMC8569613 DOI: 10.3389/fpls.2021.737919] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/07/2021] [Accepted: 09/20/2021] [Indexed: 06/13/2023]
Abstract
Artificial hybridization plays a fundamental role in plant breeding programs since it generates new genotypic combinations that can result in desirable phenotypes. Depending on the species and mode of reproduction, controlled crosses may be challenging, and contaminating individuals can be introduced accidentally. In this context, the identification of such contaminants is important to avoid compromising further selection cycles, as well as genetic and genomic studies. The main objective of this work was to propose an automated multivariate methodology for the detection and classification of putative contaminants, including apomictic clones (ACs), self-fertilized individuals, half-siblings (HSs), and full contaminants (FCs), in biparental polyploid progenies of tropical forage grasses. We established a pipeline to identify contaminants in genotyping-by-sequencing (GBS) data encoded as allele dosages of single nucleotide polymorphism (SNP) markers by integrating principal component analysis (PCA), genotypic analysis (GA) measures based on Mendelian segregation, and clustering analysis (CA). The combination of these methods allowed for the correct identification of all contaminants in all simulated progenies and the detection of putative contaminants in three real progenies of tropical forage grasses, providing an easy and promising methodology for the identification of contaminants in biparental progenies of tetraploid and hexaploid species. The proposed pipeline was made available through the polyCID Shiny app and can be easily coupled with traditional genetic approaches, such as linkage map construction, thereby increasing the efficiency of breeding programs.
Affiliation(s)
- Felipe Bitencourt Martins
  - Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), São Paulo, Brazil
- Aline Costa Lima Moraes
  - Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), São Paulo, Brazil
- Alexandre Hild Aono
  - Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), São Paulo, Brazil
- Lucimara Chiari
  - Embrapa Gado de Corte, Brazilian Agricultural Research Corporation, Campo Grande, Brazil
- Rosangela Maria Simeão
  - Embrapa Gado de Corte, Brazilian Agricultural Research Corporation, Campo Grande, Brazil
- Liana Jank
  - Embrapa Gado de Corte, Brazilian Agricultural Research Corporation, Campo Grande, Brazil
- Anete Pereira de Souza
  - Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), São Paulo, Brazil
  - Department of Plant Biology, Biology Institute, University of Campinas (UNICAMP), São Paulo, Brazil
17
Rios EF, Andrade MHML, Resende MFR, Kirst M, de Resende MDV, de Almeida Filho JE, Gezan SA, Munoz P. Genomic prediction in family bulks using different traits and cross-validations in pine. G3 (Bethesda) 2021; 11:6321952. [PMID: 34544139 PMCID: PMC8496210 DOI: 10.1093/g3journal/jkab249] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/09/2021] [Accepted: 07/02/2021] [Indexed: 11/13/2022]
Abstract
Genomic prediction integrates statistical, genomic, and computational tools to improve the estimation of breeding values and increase genetic gain. Due to the broad diversity in mating systems, breeding schemes, propagation methods, and units of selection, no universal genomic prediction approach can be applied in all crops. In a genome-wide family prediction (GWFP) approach, the family is the basic unit of selection. We tested GWFP in two loblolly pine (Pinus taeda L.) datasets: a breeding population composed of 63 full-sib families (5–20 individuals per family), and a simulated population with the same pedigree structure. In both populations, phenotypic and genomic data were pooled at the family level in silico. Marker effects were estimated to compute genomic estimated breeding values (GEBV) at the individual and family (GWFP) levels. Fewer than six individuals per family produced inaccurate estimates of family phenotypic performance and allele frequency. Tested across different scenarios, GWFP predictive ability was higher than that of GEBV in both populations. Validation sets composed of families with phenotypic mean and variance similar to those of the training population yielded predictions consistently higher and more accurate than other validation sets. Results revealed potential for applying GWFP in breeding programs whose selection unit is the family, and for systems where families can serve as training sets. The GWFP approach is well suited for crops that are routinely genotyped and phenotyped at the plot level, but it can be extended to other breeding programs. The higher predictive ability obtained with GWFP would motivate the application of genomic prediction in these situations.
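The in silico pooling step described in this abstract can be sketched as follows: individual records are grouped by family, phenotypes are averaged, and genotypes are converted to family-level allele frequencies. The record layout, the function name, and the decision to drop families with fewer than six members are assumptions made for illustration (the threshold echoes the abstract's finding, but the authors' actual filtering is not stated here).

```python
from collections import defaultdict
from statistics import mean


def pool_by_family(records, ploidy=2, min_size=6):
    """Pool individual data to the family level (illustrative sketch).

    records: iterable of (family_id, phenotype, dosages), where dosages is a list
             of allele counts (0..ploidy), one per marker.
    Returns {family_id: (mean_phenotype, [allele frequency per marker])} for
    families with at least min_size individuals.
    """
    grouped = defaultdict(list)
    for family_id, phenotype, dosages in records:
        grouped[family_id].append((phenotype, dosages))

    pooled = {}
    for family_id, members in grouped.items():
        if len(members) < min_size:
            continue  # too few individuals for reliable family-level estimates
        n_markers = len(members[0][1])
        freqs = [
            sum(dosages[j] for _, dosages in members) / (ploidy * len(members))
            for j in range(n_markers)
        ]
        pooled[family_id] = (mean(pheno for pheno, _ in members), freqs)
    return pooled
```

The pooled allele frequencies and family means can then be used in place of individual dosages and phenotypes when fitting a genomic prediction model at the family level.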
Affiliation(s)
- Esteban F Rios
  - Agronomy Department, University of Florida, Gainesville, FL 32611, USA
- Marcio F R Resende
  - Horticultural Sciences Department, University of Florida, Gainesville, FL 32611, USA
- Matias Kirst
  - School of Forest Resources and Conservation, University of Florida, Gainesville, FL 32611, USA
- Marcos D V de Resende
  - EMBRAPA Café/Department of Statistics, Federal University of Viçosa, Avenida PH Rolfs S/N, Viçosa 36570-000, Brazil
- Patricio Munoz
  - Horticultural Sciences Department, University of Florida, Gainesville, FL 32611, USA
18
Ferrão LFV, Amadeu RR, Benevenuto J, de Bem Oliveira I, Munoz PR. Genomic Selection in an Outcrossing Autotetraploid Fruit Crop: Lessons From Blueberry Breeding. Front Plant Sci 2021; 12:676326. [PMID: 34194453 PMCID: PMC8236943 DOI: 10.3389/fpls.2021.676326] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 03/05/2021] [Accepted: 05/12/2021] [Indexed: 05/17/2023]
Abstract
Blueberry (Vaccinium corymbosum and hybrids) is a specialty crop with expanding production and consumption worldwide. The blueberry breeding program at the University of Florida (UF) has greatly contributed to expanding production areas by developing low-chilling cultivars better adapted to subtropical and Mediterranean climates of the globe. The breeding program has historically focused on recurrent phenotypic selection. As an autopolyploid, outcrossing, perennial, long juvenile phase crop, blueberry breeding cycles are costly and time consuming, which results in low genetic gains per unit of time. Motivated by applying molecular markers for a more accurate selection in the early stages of breeding, we performed pioneering genomic selection studies and optimization for its implementation in the blueberry breeding program. We have also addressed some complexities of sequence-based genotyping and model parametrization for an autopolyploid crop, providing empirical contributions that can be extended to other polyploid species. We herein revisited some of our previous genomic selection studies and showed for the first time its application in an independent validation set. In this paper, our contribution is three-fold: (i) summarize previous results on the relevance of model parametrizations, such as diploid or polyploid methods, and inclusion of dominance effects; (ii) assess the importance of sequence depth of coverage and genotype dosage calling steps; (iii) demonstrate the real impact of genomic selection on leveraging breeding decisions by using an independent validation set. Altogether, we propose a strategy for using genomic selection in blueberry, with the potential to be applied to other polyploid species of a similar background.
Affiliation(s)
- Luís Felipe V. Ferrão
  - Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
- Rodrigo R. Amadeu
  - Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
- Juliana Benevenuto
  - Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
- Ivone de Bem Oliveira
  - Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
  - Hortifrut North America, Inc., Estero, FL, United States
- Patricio R. Munoz
  - Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
19
Voss-Fels KP, Wei X, Ross EM, Frisch M, Aitken KS, Cooper M, Hayes BJ. Strategies and considerations for implementing genomic selection to improve traits with additive and non-additive genetic architectures in sugarcane breeding. Theor Appl Genet 2021; 134:1493-1511. [PMID: 33587151 DOI: 10.1007/s00122-021-03785-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/11/2020] [Accepted: 01/27/2021] [Indexed: 05/14/2023]
Abstract
Simulations highlight the potential of genomic selection to substantially increase genetic gain for complex traits in sugarcane. The success rate depends on the trait genetic architecture and the implementation strategy. Genomic selection (GS) has the potential to increase the rate of genetic gain in sugarcane beyond the levels achieved by conventional phenotypic selection (PS). To assess different implementation strategies, we simulated two different GS-based breeding strategies and compared genetic gain and genetic variance over five breeding cycles to standard PS. GS scheme 1 followed routines similar to those of conventional PS but included three rapid recurrent genomic selection (RRGS) steps. GS scheme 2 also included three RRGS steps but did not include a progeny assessment stage and therefore differed more fundamentally from PS. Under an additive trait model, both simulated GS schemes achieved annual genetic gains of 2.6-2.7%, about 1.9 times the gain of standard phenotypic selection (1.4%). For a complex non-additive trait model, the expected annual rates of genetic gain were lower for all breeding schemes; however, the rates for the GS schemes (1.5-1.6%) were still greater than for PS (1.1%). Investigating cost-benefit ratios with regard to numbers of genotyped clones showed that substantial benefits could be achieved when only 1500 clones were genotyped per 10-year breeding cycle for the additive genetic model. Our results show that under a complex non-additive genetic model, the success rate of GS depends on the implementation strategy, the number of genotyped clones and the stage of the breeding program, likely reflecting how changes in QTL allele frequencies change additive genetic variance and therefore the efficiency of selection. These results are encouraging and motivate further work to facilitate the adoption of GS in sugarcane breeding.
Affiliation(s)
- Kai P Voss-Fels
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia, QLD, 4072, Australia
- Xianming Wei
  - Sugar Research Australia, Mackay, QLD, 4741, Australia
- Elizabeth M Ross
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia, QLD, 4072, Australia
- Matthias Frisch
  - Institute of Agronomy and Plant Breeding II, Justus Liebig University, Giessen, Germany
- Karen S Aitken
  - Agriculture and Food, CSIRO, QBP, St. Lucia, QLD, 4067, Australia
- Mark Cooper
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia, QLD, 4072, Australia
- Ben J Hayes
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia, QLD, 4072, Australia
20
Simeão RM, Resende MDV, Alves RS, Pessoa-Filho M, Azevedo ALS, Jones CS, Pereira JF, Machado JC. Genomic Selection in Tropical Forage Grasses: Current Status and Future Applications. Front Plant Sci 2021; 12:665195. [PMID: 33995461 PMCID: PMC8120112 DOI: 10.3389/fpls.2021.665195] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/07/2021] [Accepted: 04/06/2021] [Indexed: 05/06/2023]
Abstract
The world population is expected to be larger and wealthier over the next few decades and will require more animal products, such as milk and beef. Tropical regions have great potential to meet this growing global demand, where pasturelands play a major role in supporting increased animal production. Better forage is required in consonance with improved sustainability as the planted area should not increase and larger areas cultivated with one or a few forage species should be avoided. Although conventional tropical forage breeding has successfully released well-adapted and high-yielding cultivars over the last few decades, genetic gains from these programs have been low in view of the growing food demand worldwide. To guarantee their future impact on livestock production, breeding programs should leverage genotyping, phenotyping, and envirotyping strategies to increase genetic gains. Genomic selection (GS) and genome-wide association studies play a primary role in this process, with the advantage of increasing genetic gain due to greater selection accuracy, reduced cycle time, and increased number of individuals that can be evaluated. This strategy provides solutions to bottlenecks faced by conventional breeding methods, including long breeding cycles and difficulties in evaluating complex traits. Initial results from implementing GS in tropical forage grasses (TFGs) are promising, with notable improvements over phenotypic selection alone. However, the practical impact of GS in TFG breeding programs remains unclear. The development of appropriately sized training populations is essential for the evaluation and validation of selection markers based on estimated breeding values. Large panels of single-nucleotide polymorphism markers in different tropical forage species are required for multiple application targets at a reduced cost.
In this context, this review highlights the current challenges, achievements, availability, and development of genomic resources and statistical methods for the implementation of GS in TFGs. Additionally, the prediction accuracies from recent experiments and the potential to harness diversity from genebanks are discussed. Although GS in TFGs is still incipient, the advances in genomic tools and statistical models will speed up its implementation in the foreseeable future. All TFG breeding programs should be prepared for these changes.
Affiliation(s)
- Rodrigo S. Alves
- Instituto Nacional de Ciência e Tecnologia do Café, Universidade Federal de Viçosa, Viçosa, Brazil
- Chris S. Jones
- International Livestock Research Institute, Nairobi, Kenya
|
21
|
Gerard D. Pairwise linkage disequilibrium estimation for polyploids. Mol Ecol Resour 2021; 21:1230-1242. [PMID: 33559321 DOI: 10.1111/1755-0998.13349] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/28/2020] [Revised: 01/18/2021] [Accepted: 02/01/2021] [Indexed: 12/31/2022]
Abstract
Many tasks in statistical genetics involve pairwise estimation of linkage disequilibrium (LD). The study of LD in diploids is mature. However, in polyploids, the field lacks a comprehensive characterization of LD. Polyploids also exhibit greater levels of genotype uncertainty than diploids, yet no methods currently exist to estimate LD in polyploids in the presence of such genotype uncertainty. Furthermore, most LD estimation methods do not quantify the level of uncertainty in their LD estimates. Our study contains three major contributions. (i) We characterize haplotypic and composite measures of LD in polyploids. These composite measures of LD turn out to be functions of common statistical measures of association. (ii) We derive procedures to estimate haplotypic and composite LD in polyploids in the presence of genotype uncertainty. We do this by estimating LD directly from genotype likelihoods, which may be obtained from many genotyping platforms. (iii) We derive standard errors of all LD estimators that we discuss. We validate our methods on both real and simulated data. Our methods are implemented in the R package ldsep, available on the Comprehensive R Archive Network https://cran.r-project.org/package=ldsep.
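As an illustration of the statement that composite LD measures are functions of common measures of association, the following is a minimal Python sketch that computes a squared Pearson correlation between allele dosages at two loci. It assumes the dosages are known without error, so it does not reproduce the genotype-likelihood procedures or standard errors described in the abstract, and it is not the ldsep implementation. The function name `composite_r2` is hypothetical.

```python
def composite_r2(dos_a, dos_b):
    """Squared Pearson correlation between dosages at two loci.

    dos_a, dos_b: per-individual allele dosages (0..ploidy), assumed
    to be known exactly (an assumption made for this sketch only).
    """
    n = len(dos_a)
    if n != len(dos_b) or n < 2:
        raise ValueError("need two equally long vectors with at least 2 individuals")
    mean_a = sum(dos_a) / n
    mean_b = sum(dos_b) / n
    # Sums of squares and cross-products around the means.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dos_a, dos_b))
    var_a = sum((a - mean_a) ** 2 for a in dos_a)
    var_b = sum((b - mean_b) ** 2 for b in dos_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a monomorphic locus has no defined correlation")
    return cov * cov / (var_a * var_b)


if __name__ == "__main__":
    # Tetraploid dosages for five individuals at two loci (made-up values).
    print(composite_r2([0, 1, 2, 3, 4], [0, 1, 2, 3, 4]))
```

Working from dosages rather than phased haplotypes is what makes such a measure "composite": no assumption about how alleles are arranged on homologous chromosomes is needed.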
Affiliation(s)
- David Gerard
- Department of Mathematics and Statistics, American University, Washington, DC, USA
|
22
|
Abstract
A suitable pairwise relatedness estimation is key to genetic studies. Several methods have been proposed to compute relatedness in autopolyploids based on molecular data. However, unlike diploids, autopolyploids still need further study in scenarios with many linked molecular markers with known dosage. In this study, we provide guidelines for plant geneticists and breeders to obtain trustworthy pairwise relatedness estimates. To this end, we simulated populations considering different ploidy levels, meiotic pairing patterns, numbers of loci and alleles, and inbreeding levels. Analyses were performed to assess the accuracy of distinct methods and to demonstrate the usefulness of molecular markers in practical situations. Overall, our results suggest that at least 100 effective biallelic molecular markers are required for good pairwise relatedness estimation if correlation-based methods are used. For this number of loci, current methods based on multiallelic markers perform worse than those based on biallelic ones. Estimating relatedness in cases of inbreeding or close relationships (such as parent-offspring, full-sibs, or half-sibs) is more challenging. Methods to estimate pairwise relatedness based on molecular markers, for different ploidy levels or pedigrees, were implemented in the AGHmatrix R package.
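To make the idea of marker-based pairwise relatedness from biallelic dosages concrete, here is a minimal Python sketch of a VanRaden-style additive relationship matrix extended to an arbitrary ploidy. This is an assumption chosen for illustration: the abstract does not specify the estimators it compares, and the code is not taken from the AGHmatrix package. The function name `relationship_matrix` and the toy data are hypothetical.

```python
def relationship_matrix(dosages, ploidy=4):
    """VanRaden-style additive relationship matrix from allele dosages.

    dosages: list of individuals, each a list of dosages (0..ploidy)
    at biallelic markers with no missing values (sketch assumption).
    """
    n = len(dosages)
    m = len(dosages[0])
    # Allele frequency per marker, estimated from the sample itself.
    freqs = [sum(row[j] for row in dosages) / (ploidy * n) for j in range(m)]
    # Scaling term: ploidy * sum of p * (1 - p) over markers.
    scale = ploidy * sum(p * (1 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all markers are monomorphic")
    # Center each dosage by its expected value, ploidy * p.
    centered = [[row[j] - ploidy * freqs[j] for j in range(m)] for row in dosages]
    return [
        [sum(a * b for a, b in zip(ci, cj)) / scale for cj in centered]
        for ci in centered
    ]


if __name__ == "__main__":
    # Three tetraploid individuals scored at two markers (made-up values).
    for row in relationship_matrix([[0, 4], [4, 0], [2, 2]]):
        print(row)
```

With only two markers the estimates are far too noisy to be useful; the abstract's guideline of at least 100 effective biallelic markers concerns exactly this sampling variance.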
|
23
|
Singh RK, Prasad A, Muthamilarasan M, Parida SK, Prasad M. Breeding and biotechnological interventions for trait improvement: status and prospects. Planta 2020; 252:54. [PMID: 32948920 PMCID: PMC7500504 DOI: 10.1007/s00425-020-03465-4] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 05/25/2020] [Accepted: 09/12/2020] [Indexed: 05/06/2023]
Abstract
The present review describes the molecular tools and strategies deployed in the trait discovery and improvement of major crops. The prospects and challenges associated with these approaches are discussed. Crop improvement relies on modulating the genes and genomic regions underlying key traits, either directly or indirectly. Direct approaches include overexpression, RNA interference, genome editing, etc., while breeding largely constitutes the indirect approach. With the advent of the latest tools and technologies, these strategies could hasten the improvement of crop species. Next-generation sequencing, high-throughput genotyping, precision editing, use of space technology for accelerated growth, etc. have provided a new dimension to crop improvement programmes that work towards delivering better varieties to cope with these challenges. Also, studies have widened from understanding the response of plants to a single stress to combined stresses, which provides insights into the molecular mechanisms regulating tolerance to more than one stress at a given point of time. Altogether, next-generation genetics and genomics have made tremendous progress in delivering improved varieties; however, scope still exists to expand their horizon to other species that remain underutilized. In this context, the present review systematically analyses the different genomics approaches that are deployed for trait discovery and improvement in major species and that could serve as a roadmap for executing similar strategies in other crop species. The application, pros and cons, and scope for improvement of each approach are discussed with examples, and altogether, the review provides comprehensive coverage of the advances in genomics to meet the ever-growing demands for agricultural produce.
Affiliation(s)
- Roshan Kumar Singh
- National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067, India
- Ashish Prasad
- National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067, India
- Mehanathan Muthamilarasan
- Department of Plant Sciences, School of Life Sciences, University of Hyderabad, Hyderabad, Telangana, 500046, India
- Swarup K Parida
- National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067, India
- Manoj Prasad
- National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067, India
|
24
|
Medina CA, Hawkins C, Liu XP, Peel M, Yu LX. Genome-Wide Association and Prediction of Traits Related to Salt Tolerance in Autotetraploid Alfalfa (Medicago sativa L.). Int J Mol Sci 2020; 21:ijms21093361. [PMID: 32397526 PMCID: PMC7247575 DOI: 10.3390/ijms21093361] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/08/2020] [Revised: 05/05/2020] [Accepted: 05/06/2020] [Indexed: 05/28/2023] Open
Abstract
Soil salinity is a growing problem in world production agriculture. Continued improvement in crop salt tolerance will require the implementation of innovative breeding strategies such as marker-assisted selection (MAS) and genomic selection (GS). Genetic analyses for yield and vigor traits under salt stress in alfalfa breeding populations with three different phenotypic datasets were assessed. Genotyping-by-sequencing (GBS) markers with allele dosage and phenotypic data were analyzed by genome-wide association studies (GWAS) and GS using different models. GWAS identified 27 single nucleotide polymorphism (SNP) markers associated with salt tolerance. Mapping SNP markers against the Medicago truncatula reference genome revealed several putative candidate genes based on their roles in response to salt stress. Additionally, eight GS models were used to estimate breeding values of the training population under salt stress. Prediction accuracies and root mean square errors were used to determine the best prediction model. The machine learning methods (support vector machine and random forest) performed best, with a prediction accuracy of 0.793 for yield. The marker loci and candidate genes identified, along with optimized GS prediction models, were shown to be useful in the improvement of alfalfa with enhanced salt tolerance. DNA markers and the outcome of the GS will be made available to the alfalfa breeding community in efforts to accelerate genetic gains in the development of biotic stress-tolerant and more productive modern-day alfalfa cultivars.
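The two model-comparison criteria named in this abstract can be sketched in a few lines of Python. The sketch assumes that prediction accuracy means the Pearson correlation between observed and predicted values, which is a common convention in genomic selection but is not stated in the abstract. The function names and the numbers in the example are hypothetical and are not data from the study.

```python
from math import sqrt


def prediction_accuracy(observed, predicted):
    """Pearson correlation between observed and predicted values."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long vectors with at least 2 values")
    mo = sum(observed) / n
    mp = sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    so = sqrt(sum((o - mo) ** 2 for o in observed))
    sp = sqrt(sum((p - mp) ** 2 for p in predicted))
    if so == 0 or sp == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (so * sp)


def rmse(observed, predicted):
    """Root mean square error of the predictions."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two equally long, non-empty vectors")
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


if __name__ == "__main__":
    obs = [1.0, 2.0, 3.0, 4.0]    # made-up phenotypes
    pred = [2.0, 4.0, 6.0, 8.0]   # made-up predictions
    print(prediction_accuracy(obs, pred), rmse(obs, pred))
```

The example shows why both criteria are reported: predictions that rank individuals perfectly have a correlation of 1 yet can still carry a large RMSE when they are on the wrong scale.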
Affiliation(s)
- Cesar Augusto Medina
- United States Department of Agriculture-Agricultural Research Service, Plant Germplasm Introduction and Testing Research, Prosser, WA 99350, USA
- Charles Hawkins
- United States Department of Agriculture-Agricultural Research Service, Plant Germplasm Introduction and Testing Research, Prosser, WA 99350, USA
- Current address: Department of Plant Biology, Carnegie Institution for Science, Stanford, CA 94305, USA
- Xiang-Ping Liu
- United States Department of Agriculture-Agricultural Research Service, Plant Germplasm Introduction and Testing Research, Prosser, WA 99350, USA
- Current address: College of Animal Science & Veterinary Medicine, Heilongjiang Bayi Agricultural University, Daqing 163316, Heilongjiang, China
- Michael Peel
- United States Department of Agriculture-Agricultural Research Service, Forage and Range Research Lab, Logan, UT 84322, USA
- Long-Xi Yu
- United States Department of Agriculture-Agricultural Research Service, Plant Germplasm Introduction and Testing Research, Prosser, WA 99350, USA
|
26
|
Gerard D, Ferrão LFV. Priors for genotyping polyploids. Bioinformatics (Oxford, England) 2020; 36:1795-1800. [PMID: 32176767 DOI: 10.1101/751784] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/04/2019] [Revised: 11/01/2019] [Accepted: 11/12/2019] [Indexed: 05/29/2023]
Abstract
MOTIVATION Empirical Bayes techniques to genotype polyploid organisms usually either (i) assume technical artifacts are known a priori or (ii) estimate technical artifacts simultaneously with the prior genotype distribution. Case (i) is unappealing as it places the onus on the researcher to estimate these artifacts, or to ensure that there are no systematic biases in the data. However, as we demonstrate with a few empirical examples, case (ii) makes choosing the class of prior genotype distributions extremely important. Choosing a class that is either too flexible or too restrictive results in poor genotyping performance. RESULTS We propose two classes of prior genotype distributions that are of intermediate levels of flexibility: the class of proportional normal distributions and the class of unimodal distributions. We provide a complete characterization of and optimization details for the class of unimodal distributions. We demonstrate, using both simulated and real data, that using these classes results in superior genotyping performance. AVAILABILITY AND IMPLEMENTATION Genotyping methods that use these priors are implemented in the updog R package available on the Comprehensive R Archive Network: https://cran.r-project.org/package=updog. All code needed to reproduce the results of this article is available on GitHub: https://github.com/dcgerard/reproduce_prior_sims. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- David Gerard
- Department of Mathematics and Statistics, American University, Washington, DC 20016, USA
|
27
|
Deo TG, Ferreira RCU, Lara LAC, Moraes ACL, Alves-Pereira A, de Oliveira FA, Garcia AAF, Santos MF, Jank L, de Souza AP. High-Resolution Linkage Map With Allele Dosage Allows the Identification of Regions Governing Complex Traits and Apospory in Guinea Grass (Megathyrsus maximus). Frontiers in Plant Science 2020; 11:15. [PMID: 32161603 PMCID: PMC7054243 DOI: 10.3389/fpls.2020.00015] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/16/2019] [Accepted: 01/08/2020] [Indexed: 05/11/2023]
Abstract
Forage grasses are mainly used in animal feed to fatten cattle and dairy herds, and guinea grass (Megathyrsus maximus) is considered one of the most productive of the tropical forage crops that reproduce by seeds. Due to the recent process of domestication, this species has several genomic complexities, such as autotetraploidy and aposporous apomixis. Consequently, approaches that relate phenotypic and genotypic data are incipient. In this context, we built a linkage map with allele dosage and generated novel information of the genetic architecture of traits that are important for the breeding of M. maximus. From a full-sib progeny, a linkage map containing 858 single nucleotide polymorphism (SNP) markers with allele dosage information expected for an autotetraploid was obtained. The high genetic variability of the progeny allowed us to map 10 quantitative trait loci (QTLs) related to agronomic traits, such as regrowth capacity and total dry matter, and 36 QTLs related to nutritional quality, which were distributed among all homology groups (HGs). Various overlapping regions associated with the quantitative traits suggested QTL hotspots. In addition, we were able to map one locus that controls apospory (apo-locus) in HG II. A total of 55 different gene families involved in cellular metabolism and plant growth were identified from markers adjacent to the QTLs and APOSPORY locus using the Panicum virgatum genome as a reference in comparisons with the genomes of Arabidopsis thaliana and Oryza sativa. Our results provide a better understanding of the genetic basis of reproduction by apomixis and traits important for breeding programs that considerably influence animal productivity as well as the quality of meat and milk.
Affiliation(s)
- Thamiris G. Deo
- Center for Molecular Biology and Genetic Engineering, University of Campinas, Campinas, Brazil
- Rebecca C. U. Ferreira
- Center for Molecular Biology and Genetic Engineering, University of Campinas, Campinas, Brazil
- Letícia A. C. Lara
- Genetics Department, Escola Superior de Agricultura “Luiz de Queiroz,” University of São Paulo, Piracicaba, Brazil
- Aline C. L. Moraes
- Plant Biology Department, Biology Institute, University of Campinas, Campinas, Brazil
- Fernanda A. de Oliveira
- Center for Molecular Biology and Genetic Engineering, University of Campinas, Campinas, Brazil
- Antonio A. F. Garcia
- Genetics Department, Escola Superior de Agricultura “Luiz de Queiroz,” University of São Paulo, Piracicaba, Brazil
- Mateus F. Santos
- Embrapa Beef Cattle, Brazilian Agricultural Research Corporation, Campo Grande, Brazil
- Liana Jank
- Embrapa Beef Cattle, Brazilian Agricultural Research Corporation, Campo Grande, Brazil
- Anete P. de Souza
- Center for Molecular Biology and Genetic Engineering, University of Campinas, Campinas, Brazil
- Plant Biology Department, Biology Institute, University of Campinas, Campinas, Brazil
|
28
|
Zingaretti LM, Gezan SA, Ferrão LFV, Osorio LF, Monfort A, Muñoz PR, Whitaker VM, Pérez-Enciso M. Exploring Deep Learning for Complex Trait Genomic Prediction in Polyploid Outcrossing Species. Frontiers in Plant Science 2020; 11:25. [PMID: 32117371 PMCID: PMC7015897 DOI: 10.3389/fpls.2020.00025] [Citation(s) in RCA: 56] [Impact Index Per Article: 14.0] [Received: 10/22/2019] [Accepted: 01/10/2020] [Indexed: 05/21/2023]
Abstract
Genomic prediction (GP) is the procedure whereby the genetic merits of untested candidates are predicted using genome wide marker information. Although numerous examples of GP exist in plants and animals, applications to polyploid organisms are still scarce, partly due to limited genome resources and the complexity of this system. Deep learning (DL) techniques comprise a heterogeneous collection of machine learning algorithms that have excelled at many prediction tasks. A potential advantage of DL for GP over standard linear model methods is that DL can potentially take into account all genetic interactions, including dominance and epistasis, which are expected to be of special relevance in most polyploids. In this study, we evaluated the predictive accuracy of linear and DL techniques in two important small fruits or berries: strawberry and blueberry. The two datasets contained a total of 1,358 allopolyploid strawberry (2n=8x=112) and 1,802 autopolyploid blueberry (2n=4x=48) individuals, genotyped for 9,908 and 73,045 single nucleotide polymorphism (SNP) markers, respectively, and phenotyped for five agronomic traits each. DL depends on numerous parameters that influence performance and optimizing hyperparameter values can be a critical step. Here we show that interactions between hyperparameter combinations should be expected and that the number of convolutional filters and regularization in the first layers can have an important effect on model performance. In terms of genomic prediction, we did not find an advantage of DL over linear model methods, except when the epistasis component was important. Linear Bayesian models were better than convolutional neural networks for the full additive architecture, whereas the opposite was observed under strong epistasis. However, by using a parameterization capable of taking into account these non-linear effects, Bayesian linear models can match or exceed the predictive accuracy of DL. 
A semiautomatic implementation of the DL pipeline is available at https://github.com/lauzingaretti/deepGP/.
Affiliation(s)
- Laura M. Zingaretti
- Centre for Research in Agricultural Genomics (CRAG) CSIC-IRTA-UAB-UB, Campus UAB, Barcelona, Spain
- Salvador Alejandro Gezan
- School of Forest Resources and Conservation, University of Florida, Gainesville, FL, United States
- Luis Felipe V. Ferrão
- Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
- Luis F. Osorio
- IFAS Gulf Coast Research and Education Center, University of Florida, Wimauma, FL, United States
- Amparo Monfort
- Centre for Research in Agricultural Genomics (CRAG) CSIC-IRTA-UAB-UB, Campus UAB, Barcelona, Spain
- Institut de Recerca i Tecnologia Agroalimentàries (IRTA), Barcelona, Spain
- Patricio R. Muñoz
- Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
- Vance M. Whitaker
- IFAS Gulf Coast Research and Education Center, University of Florida, Wimauma, FL, United States
- Miguel Pérez-Enciso
- Centre for Research in Agricultural Genomics (CRAG) CSIC-IRTA-UAB-UB, Campus UAB, Barcelona, Spain
- ICREA, Passeig de Lluís Companys 23, Barcelona, Spain
|