1
Endelman JB, Kante M, Lindqvist-Kreuze H, Kilian A, Shannon LM, Caraza-Harter MV, Vaillancourt B, Mailloux K, Hamilton JP, Buell CR. Targeted genotyping-by-sequencing of potato and data analysis with R/polyBreedR. The Plant Genome 2024:e20484. [PMID: 38887158 DOI: 10.1002/tpg2.20484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2024] [Revised: 04/30/2024] [Accepted: 05/10/2024] [Indexed: 06/20/2024]
Abstract
Mid-density targeted genotyping-by-sequencing (GBS) combines trait-specific markers with thousands of genomic markers at an attractive price for linkage mapping and genomic selection. A 2.5K targeted GBS assay for potato (Solanum tuberosum L.) was developed using the DArTag technology and later expanded to 4K targets. Genomic markers were selected from the potato Infinium single nucleotide polymorphism (SNP) array to maximize genome coverage and polymorphism rates. The DArTag and SNP array platforms produced equivalent dendrograms in a test set of 298 tetraploid samples, and 83% of the common markers showed good quantitative agreement, with RMSE (root mean squared error) <0.5. DArTag is suited for genomic selection candidates in the clonal evaluation trial, coupled with imputation to a higher density platform for the training population. Using the software polyBreedR, an R package for the manipulation and analysis of polyploid marker data, the RMSE for imputation by linkage analysis was 0.15 in a small half-diallel population (N = 85), which was significantly lower than the RMSE of 0.42 with the random forest method. Regarding high-value traits, the DArTag markers for resistance to potato virus Y, golden cyst nematode, and potato wart appeared to track their targets successfully, as did multi-allelic markers for maturity and tuber shape. In summary, the potato DArTag assay is a transformative and publicly available technology for potato breeding and genetics.
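As a rough illustration of the agreement metric quoted in this abstract, the following Python sketch computes the RMSE between allele dosages called on two platforms for a single marker. The function name, the sample values, and the 0 to 4 tetraploid dosage scale are assumptions made for illustration; they are not taken from the paper or from R/polyBreedR.

```python
from math import sqrt


def rmse(reference, test):
    """Root mean squared error between two equal-length dosage vectors."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(r - t) ** 2 for r, t in zip(reference, test)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical dosages for one marker across six tetraploid samples
# (assumed scale: 0 to 4 copies of the alternate allele).
array_dosage = [0, 1, 2, 4, 3, 2]                # e.g. SNP array calls
dartag_dosage = [0.1, 1.2, 1.8, 3.9, 3.0, 2.4]   # e.g. read-based estimates

marker_rmse = rmse(array_dosage, dartag_dosage)
print(f"RMSE = {marker_rmse:.3f}")
print("good agreement" if marker_rmse < 0.5 else "poor agreement")
```

Under these assumptions, a marker would count as showing good quantitative agreement when its RMSE falls below the 0.5 threshold mentioned in the abstract.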
Affiliation(s)
- Jeffrey B Endelman
  - Department of Plant & Agroecosystem Sciences, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Moctar Kante
  - Genetics, Genomics and Crop Improvement, International Potato Center, Lima, Peru
- Andrzej Kilian
  - Diversity Arrays Technology Pty Ltd., University of Canberra, Bruce, Australian Capital Territory, Australia
- Laura M Shannon
  - Department of Horticultural Science, University of Minnesota, Saint Paul, Minnesota, USA
- Maria V Caraza-Harter
  - Department of Plant & Agroecosystem Sciences, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Brieanne Vaillancourt
  - Center for Applied Genetic Technologies, University of Georgia, Athens, Georgia, USA
- Kathrine Mailloux
  - Center for Applied Genetic Technologies, University of Georgia, Athens, Georgia, USA
- John P Hamilton
  - Center for Applied Genetic Technologies, University of Georgia, Athens, Georgia, USA
- C Robin Buell
  - Center for Applied Genetic Technologies, University of Georgia, Athens, Georgia, USA
2
Qiao Y, Jewett EM, McManus KF, Freyman WA, Curran JE, Williams-Blangero S, Blangero J, Williams AL. Reconstructing parent genomes using siblings and other relatives. bioRxiv: the preprint server for biology 2024:2024.05.10.593578. [PMID: 38798596 PMCID: PMC11118276 DOI: 10.1101/2024.05.10.593578] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
Reconstructing the DNA of ancestors from their descendants has the potential to empower phenotypic analyses (including association and genetic nurture studies), improve pedigree reconstruction, and shed light on the ancestral population and phenotypes of ancestors. We developed HAPI-RECAP, a method that reconstructs the DNA of parents from full siblings and their relatives. This tool leverages HAPI2's output, a new phasing approach that applies to siblings (and optionally one or both parents) and reliably infers parent haplotypes but does not link the ungenotyped parents' DNA across chromosomes or between segments flanking ambiguities. By combining IBD between the reconstructed parents and the relatives, HAPI-RECAP resolves the source parent of these segments. Moreover, the method exploits crossovers the children inherited and sex-specific genetic maps to infer the reconstructed parents' sexes. We validated these methods on research participants from both 23andMe, Inc. and the San Antonio Mexican American Family Studies. Given data for one parent, HAPI2 reconstructs large fractions of the missing parent's DNA, between 77.6% and 99.97% among all families, and 90.3% on average in three- and four-child families. When reconstructing both parents, HAPI-RECAP inferred between 33.2% and 96.6% of the parents' genotypes, averaging 70.6% in four-child families. Reconstructed genotypes have average error rates < 10^-3, comparable to those from direct genotyping. HAPI-RECAP inferred the parent sexes 100% correctly given IBD-linked segments and can also reconstruct parents without any IBD. As datasets grow in size, more families will be implicitly collected; HAPI-RECAP holds promise to enable high quality parent genotype reconstruction.
Affiliation(s)
- Ying Qiao
  - Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Joanne E. Curran
  - South Texas Diabetes and Obesity Institute and Department of Human Genetics, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX 78520, USA
- Sarah Williams-Blangero
  - South Texas Diabetes and Obesity Institute and Department of Human Genetics, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX 78520, USA
- John Blangero
  - South Texas Diabetes and Obesity Institute and Department of Human Genetics, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX 78520, USA
- Amy L. Williams
  - Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
  - 23andMe, Inc., Sunnyvale, CA 94086, USA
3
Ou JH, Rönneburg T, Carlborg Ö, Honaker CF, Siegel PB, Rubin CJ. Complex genetic architecture of the chicken Growth1 QTL region. PLoS One 2024; 19:e0295109. [PMID: 38739572 PMCID: PMC11090294 DOI: 10.1371/journal.pone.0295109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2023] [Accepted: 04/05/2024] [Indexed: 05/16/2024] Open
Abstract
The genetic complexity of polygenic traits represents a captivating and intricate facet of biological inheritance. Unlike Mendelian traits controlled by a single gene, polygenic traits are influenced by multiple genetic loci, each exerting a modest effect on the trait. This cumulative impact of numerous genes, interactions among them, environmental factors, and epigenetic modifications results in a multifaceted architecture of genetic contributions to complex traits. Given the well-characterized genome, diverse traits, and range of genetic resources, chicken (Gallus gallus) was employed as a model organism to dissect the intricate genetic makeup of a previously identified major quantitative trait locus (QTL) for body weight on chromosome 1. A multigenerational advanced intercross line (AIL) of 3215 chickens whose genomes had been sequenced to an average of 0.4x was analyzed using genome-wide association study (GWAS) and variance-heterogeneity GWAS (vGWAS) to identify markers associated with 8-week body weight. Additionally, epistatic interactions were studied using the natural and orthogonal interaction (NOIA) model. Six genetic modules, two from GWAS and four from vGWAS, were strongly associated with the studied trait. We found evidence of both additive and non-additive interactions between these modules and constructed a putative local epistasis network for the region. Our screens for functional alleles revealed a missense variant in the gene ribonuclease H2 subunit B (RNASEH2B), which has previously been associated with growth-related traits in chickens and Darwin's finches. In addition, one of the most strongly associated SNPs identified is located in a non-coding region upstream of the long non-coding RNA, ENSGALG00000053256, previously suggested as a candidate gene for regulating chicken body weight.
By studying large numbers of individuals from a family material using approaches to capture both additive and non-additive effects, this study advances our understanding of genetic complexities in a highly polygenic trait and has practical implications for poultry breeding and agriculture.
Affiliation(s)
- Jen-Hsiang Ou
  - Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Tilman Rönneburg
  - Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Örjan Carlborg
  - Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Christa Ferst Honaker
  - Virginia Polytechnic Institute and State University, School of Animal Sciences, Blacksburg, Virginia, United States of America
- Paul B. Siegel
  - Virginia Polytechnic Institute and State University, School of Animal Sciences, Blacksburg, Virginia, United States of America
- Carl-Johan Rubin
  - Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
  - Institute of Marine Research, Bergen, Norway
4
Wragg D, Zhang W, Peterson S, Yerramilli M, Mellanby R, Schoenebeck JJ, Clements DN. A cautionary tale of low-pass sequencing and imputation with respect to haplotype accuracy. Genet Sel Evol 2024; 56:6. [PMID: 38216889 PMCID: PMC10785484 DOI: 10.1186/s12711-024-00875-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Accepted: 01/03/2024] [Indexed: 01/14/2024] Open
Abstract
BACKGROUND Low-pass whole-genome sequencing and imputation offer significant cost savings, enabling substantial increases in sample size and statistical power. This approach is particularly promising in livestock breeding, providing an affordable means of screening individuals for deleterious alleles or calculating genomic breeding values. Consequently, it may also be of value in companion animal genomics to support pedigree breeding. We sought to evaluate in dogs the impact of low coverage sequencing and reference-guided imputation on genotype concordance and association analyses. RESULTS DNA isolated from saliva of 30 Labrador retrievers was sequenced at low (0.9X and 3.8X) and high (43.5X) coverage, and down-sampled from 43.5X to 9.6X and 17.4X. Genotype imputation was performed using a diverse reference panel (1021 dogs), and two subsets of the former panel (256 dogs each) where one had an excess of Labrador retrievers relative to other breeds. We observed little difference in imputed genotype concordance between reference panels. Association analyses for a locus acting as a disease proxy were performed using single-marker (GEMMA) and haplotype-based (XP-EHH) tests. GEMMA results were highly correlated (r ≥ 0.97) between 43.5X and ≥ 3.8X depths of coverage, while for 0.9X the correlation was lower (r ≤ 0.8). XP-EHH results were less well correlated, with r ranging from 0.58 (0.9X) to 0.88 (17.4X). Across a random sample of 10,000 genomic regions averaging 17 kb in size, we observed a median of three haplotypes per dog across the sequencing depths, with 5% of the regions returning more than eight haplotypes. Inspection of one such region revealed genotype and phasing inconsistencies across sequencing depths. CONCLUSIONS We demonstrate that saliva-derived canine DNA is suitable for whole-genome sequencing, highlighting the feasibility of client-based sampling. 
Low-pass sequencing and imputation require caution as incorrect allele assignments result when the subject possesses alleles that are absent in the reference panel. Larger panels have the capacity for greater allelic diversity, which should reduce the potential for imputation error. Although low-pass sequencing can accurately impute allele dosage, we highlight issues with phasing accuracy that impact haplotype-based analyses. Consequently, if accurately phased genotypes are required for analyses, we advocate sequencing at high depth (> 20X).
Affiliation(s)
- David Wragg
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus, Midlothian, EH25 9RG, UK
- Wengang Zhang
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus, Midlothian, EH25 9RG, UK
- Sarah Peterson
  - IDEXX Laboratories Inc, One IDEXX Drive, Westbrook, ME, 04092, USA
- Richard Mellanby
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus, Midlothian, EH25 9RG, UK
  - IDEXX Laboratories Inc, One IDEXX Drive, Westbrook, ME, 04092, USA
- Jeffrey J Schoenebeck
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus, Midlothian, EH25 9RG, UK
- Dylan N Clements
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus, Midlothian, EH25 9RG, UK
5
Divilov K, Merz N, Schoolfield B, Green TJ, Langdon C. Genome-wide allele frequency studies in Pacific oyster families identify candidate genes for tolerance to ostreid herpesvirus 1 (OsHV-1). BMC Genomics 2023; 24:631. [PMID: 37872508 PMCID: PMC10594793 DOI: 10.1186/s12864-023-09744-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2023] [Accepted: 10/14/2023] [Indexed: 10/25/2023] Open
Abstract
BACKGROUND Host genetics influences the development of infectious diseases in many agricultural animal species. Identifying genes associated with disease development has the potential to make selective breeding for disease tolerance more likely to succeed through the selection of different genes in diverse signaling pathways. In this study, four families of Pacific oysters (Crassostrea gigas) were identified to be segregating for a quantitative trait locus (QTL) on chromosome 8. This QTL was previously found to be associated with basal antiviral gene expression and survival to ostreid herpesvirus 1 (OsHV-1) mortality events in Tomales Bay, California. Individuals from these four families were phenotyped and genotyped in an attempt to find candidate genes associated with the QTL on chromosome 8. RESULTS Genome-wide allele frequencies of oysters from each family prior to being planted in Tomales Bay were compared with the allele frequencies of oysters from respective families that survived an OsHV-1 mortality event. Six significant unique QTL were identified in two families in these genome-wide allele frequency studies, all of which were located on chromosome 8. Three QTL were assigned to candidate genes (ABCA1, PIK3R1, and WBP2) that have been previously associated with antiviral innate immunity in vertebrates. CONCLUSION The identification of vertebrate antiviral innate immunity genes as candidate genes involved in molluscan antiviral innate immunity reinforces the similarities between the innate immune systems of these two groups. Causal variant identification in these candidate genes will enable future functional studies of these genes in an effort to better understand their antiviral modes of action.
Affiliation(s)
- Konstantin Divilov
  - Department of Fisheries, Wildlife, and Conservation Sciences, Coastal Oregon Marine Experiment Station, Oregon State University, Hatfield Marine Science Center, Newport, OR, 97365, USA
- Noah Merz
  - Department of Fisheries, Wildlife, and Conservation Sciences, Coastal Oregon Marine Experiment Station, Oregon State University, Hatfield Marine Science Center, Newport, OR, 97365, USA
- Blaine Schoolfield
  - Department of Fisheries, Wildlife, and Conservation Sciences, Coastal Oregon Marine Experiment Station, Oregon State University, Hatfield Marine Science Center, Newport, OR, 97365, USA
- Timothy J Green
  - Centre for Shellfish Research, Vancouver Island University, Nanaimo, BC, V9R 5S5, Canada
- Chris Langdon
  - Department of Fisheries, Wildlife, and Conservation Sciences, Coastal Oregon Marine Experiment Station, Oregon State University, Hatfield Marine Science Center, Newport, OR, 97365, USA
6
Vu NT, Phuc TH, Nguyen NH, Van Sang N. Effects of common full-sib families on accuracy of genomic prediction for tagging weight in striped catfish Pangasianodon hypophthalmus. Front Genet 2023; 13:1081246. [PMID: 36685869 PMCID: PMC9845282 DOI: 10.3389/fgene.2022.1081246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Accepted: 12/06/2022] [Indexed: 01/06/2023] Open
Abstract
Common full-sib families (c²) make up a substantial proportion of total phenotypic variation in traits of commercial importance in aquaculture species, and omission or inclusion of the c² resulted in possible changes in genetic parameter estimates and re-ranking of estimated breeding values. However, the impacts of common full-sib families on accuracy of genomic prediction for commercial traits of economic importance are not well known in many species, including aquatic animals. This research explored the impacts of common full-sib families on accuracy of genomic prediction for tagging weight in a population of striped catfish comprising 11,918 fish traced back to the base population (four generations), in which 560 individuals had genotype records of 14,154 SNPs. Our single-step genomic best linear unbiased prediction (ssGBLUP) showed that the accuracy of genomic prediction for tagging weight was reduced by 96.5%-130.3% when the common full-sib families were included in statistical models. The reduction in the prediction accuracy was smaller in multivariate analysis than in univariate models. Imputation of missing genotypes somewhat reduced the upward biases in the prediction accuracy for tagging weight. It is therefore suggested that genomic evaluation models for traits recorded during the early phase of growth development should account for the common full-sib families to minimise possible biases in the accuracy of genomic prediction and hence, selection response.
Affiliation(s)
- Nguyen Thanh Vu
  - School of Science, Technology and Engineering, University of the Sunshine Coast, Sippy Downs, QLD, Australia
  - Center for Bio-Innovation, University of the Sunshine Coast, Maroochydore, QLD, Australia
  - Research Institute for Aquaculture No. 2, Ho Chi Minh City, Vietnam
- Tran Huu Phuc
  - Research Institute for Aquaculture No. 2, Ho Chi Minh City, Vietnam
- Nguyen Hong Nguyen
  - School of Science, Technology and Engineering, University of the Sunshine Coast, Sippy Downs, QLD, Australia
  - Center for Bio-Innovation, University of the Sunshine Coast, Maroochydore, QLD, Australia
  - *Correspondence: Nguyen Hong Nguyen; Nguyen Van Sang
- Nguyen Van Sang
  - Research Institute for Aquaculture No. 2, Ho Chi Minh City, Vietnam
  - *Correspondence: Nguyen Hong Nguyen; Nguyen Van Sang
7
Niehoff T, Pook T, Gholami M, Beissinger T. Imputation of low-density marker chip data in plant breeding: Evaluation of methods based on sugar beet. The Plant Genome 2022; 15:e20257. [PMID: 36258672 DOI: 10.1002/tpg2.20257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2022] [Accepted: 08/02/2022] [Indexed: 06/16/2023]
Abstract
Low-density genotyping followed by imputation reduces genotyping costs while still providing high-density marker information. An increased marker density has the potential to improve the outcome of all applications that are based on genomic data. This study investigates techniques for 1k to 20k genomic marker imputation for plant breeding programs with sugar beet (Beta vulgaris L. ssp. vulgaris) as an example crop, where these are realistic marker numbers for modern breeding applications. The generally accepted 'gold standard' for imputation, Beagle 5.1, was compared with the recently developed software AlphaPlantImpute2 which is designed specifically for plant breeding. For Beagle 5.1 and AlphaPlantImpute2, the imputation strategy as well as the imputation parameters were optimized in this study. We found that the imputation accuracy of Beagle could be tremendously improved (0.22 to 0.67) by tuning parameters, mainly by lowering the values for the parameter for the effective population size and increasing the number of iterations performed. Separating the phasing and imputation steps also improved accuracies when optimized parameters were used (0.67 to 0.82). We also found that the imputation accuracy of Beagle decreased when more low-density lines were included for imputation. AlphaPlantImpute2 produced very high accuracies without optimization (0.89) and was generally less responsive to optimization. Overall, AlphaPlantImpute2 performed relatively better for imputation whereas Beagle was better for phasing. Combining both tools yielded the highest accuracies.
Affiliation(s)
- Tobias Niehoff
  - Animal Breeding and Genomics, Wageningen Univ. & Research, Postbox 338, 6700AH, Wageningen, The Netherlands
  - Dep. of Crop Sciences, Division of Plant Breeding Methodology, Univ. of Göttingen, Göttingen, 37075, Germany
- Torsten Pook
  - Animal Breeding and Genomics, Wageningen Univ. & Research, Postbox 338, 6700AH, Wageningen, The Netherlands
  - Dep. of Animal Sciences, Animal Breeding and Genetics Group, Univ. of Göttingen, Göttingen, 37075, Germany
  - Center for Integrated Breeding Research, Univ. of Göttingen, Göttingen, 37075, Germany
- Mahmood Gholami
  - RD-SBCE-BTA, KWS SAAT SE & Co. KGaA, Grimsehlstr. 31, Einbeck, 37574, Germany
- Timothy Beissinger
  - Dep. of Crop Sciences, Division of Plant Breeding Methodology, Univ. of Göttingen, Göttingen, 37075, Germany
  - Center for Integrated Breeding Research, Univ. of Göttingen, Göttingen, 37075, Germany
8
Mendelian imputation of parental genotypes improves estimates of direct genetic effects. Nat Genet 2022; 54:897-905. [PMID: 35681053 PMCID: PMC9197765 DOI: 10.1038/s41588-022-01085-0] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 02/02/2021] [Accepted: 04/28/2022] [Indexed: 01/22/2023]
Abstract
Effects estimated by genome-wide association studies (GWASs) include effects of alleles in an individual on that individual (direct genetic effects), indirect genetic effects (for example, effects of alleles in parents on offspring through the environment) and bias from confounding. Within-family genetic variation is random, enabling unbiased estimation of direct genetic effects when parents are genotyped. However, parental genotypes are often missing. We introduce a method that imputes missing parental genotypes and estimates direct genetic effects. Our method, implemented in the software package snipar (single-nucleotide imputation of parents), gives more precise estimates of direct genetic effects than existing approaches. Using 39,614 individuals from the UK Biobank with at least one genotyped sibling/parent, we estimate the correlation between direct genetic effects and effects from standard GWASs for nine phenotypes, including educational attainment (r = 0.739, standard error (s.e.) = 0.086) and cognitive ability (r = 0.490, s.e. = 0.086). Our results demonstrate substantial confounding bias in standard GWASs for some phenotypes.
9
Genotyping, the Usefulness of Imputation to Increase SNP Density, and Imputation Methods and Tools. Methods in Molecular Biology (Clifton, N.J.) 2022; 2467:113-138. [PMID: 35451774 DOI: 10.1007/978-1-0716-2205-6_4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/19/2022]
Abstract
Imputation has become standard practice in modern genetic research to increase genome coverage and improve the accuracy of genomic selection and genome-wide association studies: a large number of samples can be genotyped at lower density (and lower cost) and then imputed up to denser marker panels or to sequence level, using information from a limited reference population. Most genotype imputation algorithms use information from relatives and population linkage disequilibrium. A number of imputation software packages were developed originally for human genetics and, more recently, for animal and plant genetics, taking into account pedigree information and very sparse SNP arrays or genotyping-by-sequencing data. In comparison to human populations, the population structures in farmed species and their limited effective sizes allow accurate imputation of high-density genotypes or sequences from very low-density SNP panels and a limited set of reference individuals. Whatever the imputation method, the imputation accuracy, measured by the correct imputation rate or the correlation between true and imputed genotypes, increases with the relatedness of the individual to be imputed to its more densely genotyped ancestors and with its own genotype density. Increasing the imputation accuracy raises the genomic selection accuracy whatever the genomic evaluation method. Given the marker densities, the most important factors affecting imputation accuracy are the size of the reference population and the relationship between individuals in the reference and target populations.
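The two accuracy measures named in this abstract, the correct imputation rate and the correlation between true and imputed genotypes, can be sketched in a few lines of Python. This is a minimal illustration under assumed conventions (diploid genotypes coded 0, 1, 2 as allele counts, made-up example data, hypothetical function names), not the procedure of any particular imputation tool.

```python
from math import sqrt


def correct_imputation_rate(true_g, imputed_g):
    """Fraction of genotypes that were imputed exactly right."""
    if len(true_g) != len(imputed_g) or not true_g:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(t == i for t, i in zip(true_g, imputed_g))
    return matches / len(true_g)


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric vectors."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / sqrt(sxx * syy)


# Hypothetical genotypes at eight masked markers for one individual
# (assumed coding: 0, 1, 2 copies of the alternate allele).
true_g = [0, 1, 2, 1, 0, 2, 1, 0]
imputed_g = [0, 1, 2, 2, 0, 2, 1, 1]

print("correct imputation rate:", correct_imputation_rate(true_g, imputed_g))
print("correlation:", round(pearson_r(true_g, imputed_g), 3))
```

The two measures can diverge: the correct imputation rate treats every mismatch equally, whereas the correlation is sensitive to how far an imputed value lies from the true one and is undefined for monomorphic markers.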
10
Bredeson JV, Lyons JB, Oniyinde IO, Okereke NR, Kolade O, Nnabue I, Nwadili CO, Hřibová E, Parker M, Nwogha J, Shu S, Carlson J, Kariba R, Muthemba S, Knop K, Barton GJ, Sherwood AV, Lopez-Montes A, Asiedu R, Jamnadass R, Muchugi A, Goodstein D, Egesi CN, Featherston J, Asfaw A, Simpson GG, Doležel J, Hendre PS, Van Deynze A, Kumar PL, Obidiegwu JE, Bhattacharjee R, Rokhsar DS. Chromosome evolution and the genetic basis of agronomically important traits in greater yam. Nat Commun 2022; 13:2001. [PMID: 35422045 PMCID: PMC9010478 DOI: 10.1038/s41467-022-29114-w] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 03/25/2021] [Accepted: 02/08/2022] [Indexed: 12/14/2022] Open
Abstract
The nutrient-rich tubers of the greater yam, Dioscorea alata L., provide food and income security for millions of people around the world. Despite its global importance, however, greater yam remains an orphan crop. Here, we address this resource gap by presenting a highly contiguous chromosome-scale genome assembly of D. alata combined with a dense genetic map derived from African breeding populations. The genome sequence reveals an ancient allotetraploidization in the Dioscorea lineage, followed by extensive genome-wide reorganization. Using the genomic tools, we find quantitative trait loci for resistance to anthracnose, a damaging fungal pathogen of yam, and several tuber quality traits. Genomic analysis of breeding lines reveals both extensive inbreeding as well as regions of extensive heterozygosity that may represent interspecific introgression during domestication. These tools and insights will enable yam breeders to unlock the potential of this staple crop and take full advantage of its adaptability to varied environments.
Affiliation(s)
- Jessen V Bredeson
  - Department of Molecular & Cell Biology, University of California, Berkeley, CA, 94720, USA
- Jessica B Lyons
  - Department of Molecular & Cell Biology, University of California, Berkeley, CA, 94720, USA
  - Innovative Genomics Institute, Berkeley, CA, USA
- Ibukun O Oniyinde
  - International Institute of Tropical Agriculture, PMB 5320, Oyo Road, Ibadan, Nigeria
- Nneka R Okereke
  - National Root Crops Research Institute (NRCRI), Umudike, Nigeria
- Olufisayo Kolade
  - International Institute of Tropical Agriculture, PMB 5320, Oyo Road, Ibadan, Nigeria
- Ikenna Nnabue
  - National Root Crops Research Institute (NRCRI), Umudike, Nigeria
- Eva Hřibová
  - Institute of Experimental Botany of the Czech Academy of Sciences, Centre of the Region Haná for Biotechnological and Agricultural Research, Šlechtitelů 31, CZ-77900, Olomouc, Czech Republic
- Matthew Parker
  - School of Life Sciences, University of Dundee, Dundee, UK
- Jeremiah Nwogha
  - National Root Crops Research Institute (NRCRI), Umudike, Nigeria
- Robert Kariba
  - World Agroforestry (CIFOR-ICRAF), Nairobi, Kenya
  - African Orphan Crops Consortium, Nairobi, Kenya
- Samuel Muthemba
  - World Agroforestry (CIFOR-ICRAF), Nairobi, Kenya
  - African Orphan Crops Consortium, Nairobi, Kenya
- Katarzyna Knop
  - School of Life Sciences, University of Dundee, Dundee, UK
- Anna V Sherwood
  - School of Life Sciences, University of Dundee, Dundee, UK
  - Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Antonio Lopez-Montes
  - International Institute of Tropical Agriculture, PMB 5320, Oyo Road, Ibadan, Nigeria
  - International Trade Center, Accra, Ghana
- Robert Asiedu
  - International Institute of Tropical Agriculture, PMB 5320, Oyo Road, Ibadan, Nigeria
- Ramni Jamnadass
  - World Agroforestry (CIFOR-ICRAF), Nairobi, Kenya
  - African Orphan Crops Consortium, Nairobi, Kenya
- Alice Muchugi
  - World Agroforestry (CIFOR-ICRAF), Nairobi, Kenya
  - African Orphan Crops Consortium, Nairobi, Kenya
- Chiedozie N Egesi
  - International Institute of Tropical Agriculture, PMB 5320, Oyo Road, Ibadan, Nigeria
  - National Root Crops Research Institute (NRCRI), Umudike, Nigeria
  - Cornell University, Ithaca, NY, 14850, USA
- Asrat Asfaw
  - International Institute of Tropical Agriculture, PMB 5320, Oyo Road, Ibadan, Nigeria
- Gordon G Simpson
  - School of Life Sciences, University of Dundee, Dundee, UK
  - James Hutton Institute, Dundee, UK
- Jaroslav Doležel
  - Institute of Experimental Botany of the Czech Academy of Sciences, Centre of the Region Haná for Biotechnological and Agricultural Research, Šlechtitelů 31, CZ-77900, Olomouc, Czech Republic
- Prasad S Hendre
  - World Agroforestry (CIFOR-ICRAF), Nairobi, Kenya
  - African Orphan Crops Consortium, Nairobi, Kenya
- Pullikanti Lava Kumar
  - International Institute of Tropical Agriculture, PMB 5320, Oyo Road, Ibadan, Nigeria
- Jude E Obidiegwu
  - National Root Crops Research Institute (NRCRI), Umudike, Nigeria
- Ranjana Bhattacharjee
  - International Institute of Tropical Agriculture, PMB 5320, Oyo Road, Ibadan, Nigeria
- Daniel S Rokhsar
  - Department of Molecular & Cell Biology, University of California, Berkeley, CA, 94720, USA
  - Innovative Genomics Institute, Berkeley, CA, USA
  - DOE Joint Genome Institute, Berkeley, CA, USA
  - Okinawa Institute of Science and Technology, Onna, Okinawa, Japan
  - Chan-Zuckerberg BioHub, 499 Illinois St., San Francisco, CA, 94158, USA
| |
|
11
|
Vu NT, Phuc TH, Oanh KTP, Sang NV, Trang TT, Nguyen NH. Accuracies of genomic predictions for disease resistance of striped catfish to Edwardsiella ictaluri using artificial intelligence algorithms. G3-GENES GENOMES GENETICS 2021; 12:6408442. [PMID: 34788431 PMCID: PMC8727988 DOI: 10.1093/g3journal/jkab361] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/05/2021] [Accepted: 10/10/2021] [Indexed: 02/04/2023]
Abstract
Assessments of genomic prediction accuracies using artificial intelligence (AI) algorithms (i.e., machine and deep learning methods) are currently not available or very limited in aquaculture species. The principal aim of this study was to examine the predictive performance of these new methods for disease resistance to Edwardsiella ictaluri in a population of striped catfish Pangasianodon hypophthalmus and to make comparisons with four common methods, i.e., pedigree-based best linear unbiased prediction (PBLUP), genomic-based best linear unbiased prediction (GBLUP), single-step GBLUP (ssGBLUP) and a nonlinear Bayesian approach (notably BayesR). Our analyses using machine learning (i.e., ML-KAML) and deep learning (i.e., DL-MLP and DL-CNN) together with the four common methods (PBLUP, GBLUP, ssGBLUP, and BayesR) were conducted for two main disease resistance traits (i.e., survival status coded as 0 and 1, and survival time, i.e., the number of days that the animals remained alive after the challenge test) in a pedigree consisting of 560 individual animals (490 offspring and 70 parents) genotyped for 14,154 single nucleotide polymorphisms (SNPs). The results using 6,470 SNPs after quality control showed that machine learning methods outperformed PBLUP, GBLUP, and ssGBLUP, increasing the prediction accuracies for both traits by 9.1–15.4%. However, the prediction accuracies obtained from machine learning methods were comparable to those estimated using BayesR. Imputation of missing genotypes using AlphaFamImpute increased the prediction accuracies by 5.3–19.2% in all the methods and data used. On the other hand, there were insignificant decreases (0.3–5.6%) in the prediction accuracies for both survival status and survival time when multivariate models were used in comparison to univariate analyses.
Interestingly, the genomic prediction accuracies based on only highly significant SNPs (P < 0.00001, 318–400 SNPs for survival status and 1,362–1,589 SNPs for survival time) were somewhat lower (0.3–15.6%) than those obtained from the whole set of 6,470 SNPs. In most of our analyses, the accuracies of genomic prediction were somewhat higher for survival time than for survival status (0/1 data). It is concluded that although there are prospects for the application of genomic selection to increase disease resistance to E. ictaluri in striped catfish breeding programs, these methods should be evaluated further in independent families/populations as more data accumulate in future generations, to avoid possible biases in the genetic parameter estimates and prediction accuracies for the disease resistance traits studied in this population of striped catfish P. hypophthalmus.
Affiliation(s)
- Nguyen Thanh Vu
- School of Science, Technology and Engineering, University of the Sunshine Coast, Sippy Downs, QLD, Australia
- Genecology Research Center, University of the Sunshine Coast, Sippy Downs, QLD, Australia
- Research Institute for Aquaculture No.2, Ho Chi Minh 710000, Vietnam
| | - Tran Huu Phuc
- Research Institute for Aquaculture No.2, Ho Chi Minh 710000, Vietnam
| | - Kim Thi Phuong Oanh
- Institute of Genome Research, Vietnam Academy of Science and Technology, Hanoi, Vietnam
| | - Nguyen Van Sang
- Research Institute for Aquaculture No.2, Ho Chi Minh 710000, Vietnam
| | - Trinh Thi Trang
- School of Science, Technology and Engineering, University of the Sunshine Coast, Sippy Downs, QLD, Australia
- Genecology Research Center, University of the Sunshine Coast, Sippy Downs, QLD, Australia
- Vietnam National University of Agriculture, Gia Lam 131000, Vietnam
| | - Nguyen Hong Nguyen
- School of Science, Technology and Engineering, University of the Sunshine Coast, Sippy Downs, QLD, Australia
- Genecology Research Center, University of the Sunshine Coast, Sippy Downs, QLD, Australia
| |
|