1
Kan-Lingwood NY, Sagi L, Mazie S, Shahar N, Zecherle Bitton L, Templeton A, Rubenstein D, Bouskila A, Bar-David S. Genotyping Error Detection and Customised Filtration for SNP Datasets. Mol Ecol Resour 2025; 25:e14033. [PMID: 39435526 DOI: 10.1111/1755-0998.14033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 08/23/2024] [Accepted: 09/02/2024] [Indexed: 10/23/2024]
Abstract
A major challenge in analysing single-nucleotide polymorphism (SNP) genotype datasets is detecting and filtering errors that bias analyses and lead to misinterpretation of ecological and evolutionary processes. Here, we present a comprehensive method to estimate and minimise genotyping error rates (deviations from the 'true' genotype) in any SNP dataset using triplicates (three repeats of the same sample) in a four-step filtration pipeline. The approach involves: (1) SNP filtering by missing data; (2) SNP filtering by error rates; (3) sample filtering by missing data; and (4) detection of recaptured individuals using estimated SNP error rates. The modular pipeline is provided as an R script that allows customised adjustments. We demonstrate the applicability of the method using non-invasive sampling from the Asiatic wild ass (Equus hemionus) population in Israel. We genotyped 756 samples using 625 SNPs, of which 255 were triplicates of 85 samples. The average SNP error rate, calculated from the number of mismatching genotypes across triplicates, was 0.0034 before filtration and was reduced to 0.00174 following filtration. Evaluating genetic distance (GD) and relatedness (r) between triplicates before and after filtration (expected to be at the minimum and maximum, respectively) showed a significant reduction in the average GD, from 58.1 to 25.3 (p = 0.0002), and a significant increase in relatedness, from r = 0.98 to r = 0.991 (p = 0.00587). We demonstrate how error rate estimation enhances recapture detection and improves genotype quality.
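The abstract does not give the exact error-rate formula, so the following Python sketch is only one plausible reading of "the number of mismatching genotypes across triplicates" and of step (2), SNP filtering by error rates. It assumes that the majority genotype within each triplicate stands in for the 'true' genotype. The function names, the `MISSING` marker and the `max_error` threshold are hypothetical and are not taken from the authors' R script.

```python
from collections import Counter

MISSING = None  # hypothetical marker for a failed genotype call


def snp_error_rate(triplicates):
    """Estimate one SNP's error rate from replicate genotype calls.

    `triplicates` is a list of 3-tuples, one per sample, holding the
    genotype called in each repeat (e.g. "AG") or MISSING.
    A call counts as a mismatch when it differs from the most common
    genotype of its triplicate (an assumed stand-in for the true one).
    """
    mismatches = 0
    compared = 0
    for calls in triplicates:
        observed = [g for g in calls if g is not MISSING]
        if len(observed) < 2:
            continue  # a single call cannot be checked against anything
        consensus, _ = Counter(observed).most_common(1)[0]
        mismatches += sum(1 for g in observed if g != consensus)
        compared += len(observed)
    return mismatches / compared if compared else float("nan")


def filter_snps(data, max_error=0.05):
    """Keep SNPs whose estimated error rate is at or below max_error.

    SNPs with no comparable calls have a NaN rate and are dropped,
    because NaN never satisfies the comparison.
    """
    return {snp: reps for snp, reps in data.items()
            if snp_error_rate(reps) <= max_error}
```

For example, a SNP with triplicates `("AA", "AA", "AA")` and `("AG", "AG", "GG")` has one mismatching call out of six, giving an estimated rate of 1/6 under these assumptions.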
Affiliation(s)
- Noa Yaffa Kan-Lingwood
  - Mitrani Department of Desert Ecology, Ben-Gurion University of the Negev, The Swiss Institute for Dryland Environmental & Energy Research, Midreshet Ben-Gurion, Israel
- Liran Sagi
  - Life Science Department, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Shahar Mazie
  - The Alexander Silberman Institute of Life Science, The Hebrew University, Jerusalem, Israel
- Naama Shahar
  - Mitrani Department of Desert Ecology, Ben-Gurion University of the Negev, The Swiss Institute for Dryland Environmental & Energy Research, Midreshet Ben-Gurion, Israel
- Lilith Zecherle Bitton
  - Mitrani Department of Desert Ecology, Ben-Gurion University of the Negev, The Swiss Institute for Dryland Environmental & Energy Research, Midreshet Ben-Gurion, Israel
- Alan Templeton
  - Department of Biology, Washington University, St. Louis, Missouri, USA
- Daniel Rubenstein
  - Department of Ecology and Evolutionary Biology, Princeton University, Princeton, New Jersey, USA
- Amos Bouskila
  - Life Science Department, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Shirli Bar-David
  - Mitrani Department of Desert Ecology, Ben-Gurion University of the Negev, The Swiss Institute for Dryland Environmental & Energy Research, Midreshet Ben-Gurion, Israel
2
Aguirre NC, Villalba PV, García MN, Filippi CV, Rivas JG, Martínez MC, Acuña CV, López AJ, López JA, Pathauer P, Palazzini D, Harrand L, Oberschelp J, Marcó MA, Cisneros EF, Carreras R, Martins Alves AM, Rodrigues JC, Hopp HE, Grattapaglia D, Cappa EP, Paniego NB, Marcucci Poltri SN. Comparison of ddRADseq and EUChip60K SNP genotyping systems for population genetics and genomic selection in Eucalyptus dunnii (Maiden). Front Genet 2024; 15:1361418. [PMID: 38606359 PMCID: PMC11008695 DOI: 10.3389/fgene.2024.1361418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2023] [Accepted: 02/19/2024] [Indexed: 04/13/2024]
Abstract
Eucalyptus dunnii is one of the most important Eucalyptus species for short-fiber pulp production in regions where other species of the genus are affected by poor soil and climatic conditions. In this context, E. dunnii holds promise as a resource to address and adapt to the challenges of climate change. Despite its rapid growth and favorable wood properties for solid wood products, its genetic improvement remains in its early stages. In this work, we evaluated the performance of two single nucleotide polymorphism (SNP) genotyping methods for population genetics analysis and genomic selection (GS) in E. dunnii. Double digest restriction-site associated DNA sequencing (ddRADseq) was compared with the EUChip60K array in 308 individuals from a provenance-progeny trial. The compared SNP sets included 8,011 and 19,008 informative SNPs distributed along the 11 chromosomes, respectively. Although the two datasets differed in the percentage of missing data, genome coverage, minor allele frequency and estimated genetic diversity parameters, they revealed a similar genetic structure, showing two subpopulations with little differentiation between them, and low linkage disequilibrium. GS analyses were performed for eleven traits using Genomic Best Linear Unbiased Prediction (GBLUP) and a conventional pedigree-based model (ABLUP). Regardless of the SNP dataset, the predictive ability (PA) of GBLUP was better than that of ABLUP for six traits (cellulose content, total and ethanolic extractives, total and Klason lignin content, and syringyl and guaiacyl lignin monomer ratio). When contrasting the SNP datasets used to estimate PAs, the GBLUP-EUChip60K model gave significantly higher PA values for six traits, whereas ddRADseq gave higher values for three other traits. The PAs correlated positively with narrow-sense heritabilities, with the highest correlations shown by ABLUP and GBLUP-EUChip60K.
The two genotyping methods, ddRADseq and EUChip60K, are generally comparable for population genetics and genomic prediction, demonstrating the utility of the former when subjected to rigorous SNP filtering. The results of this study provide a basis for future whole-genome studies using ddRADseq in non-model forest species for which SNP arrays have not yet been developed.
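GBLUP replaces the pedigree-based relationship matrix of ABLUP with a marker-based one. The abstract does not say how that matrix was built, so the Python sketch below assumes a commonly used construction (VanRaden's first method). The function name and the 0/1/2 allele-count coding are illustrative choices, and missing genotypes are assumed to have been imputed or filtered beforehand.

```python
def genomic_relationship_matrix(genotypes):
    """Build a marker-based relationship matrix G.

    `genotypes` is a list of rows (individuals) of allele counts
    (0, 1 or 2 copies of a reference allele) with no missing values.
    Uses G = Z Z' / (2 * sum(p * (1 - p))), where Z holds the counts
    centred on twice the allele frequency p of each SNP.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Allele frequency of each SNP, estimated from the sample itself.
    freqs = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all SNPs are monomorphic")
    # Centre each genotype on its expected value 2p.
    z = [[row[j] - 2 * freqs[j] for j in range(m)] for row in genotypes]
    return [[sum(a * b for a, b in zip(z[i], z[k])) / scale
             for k in range(n)]
            for i in range(n)]
```

In a full analysis this matrix would be passed to a mixed-model solver in place of the pedigree matrix. That step is omitted here because it depends on the software used.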
Affiliation(s)
- Martín Nahuel García
  - Instituto de Agrobiotecnología y Biología Molecular, UEDD INTA-CONICET, Hurlingham, Argentina
- Carla Valeria Filippi
  - Instituto de Agrobiotecnología y Biología Molecular, UEDD INTA-CONICET, Hurlingham, Argentina
  - Laboratorio de Bioquímica, Departamento de Biología Vegetal, Facultad de Agronomía, Universidad de la República, Montevideo, Uruguay
- Juan Gabriel Rivas
  - Instituto de Agrobiotecnología y Biología Molecular, UEDD INTA-CONICET, Hurlingham, Argentina
- María Carolina Martínez
  - Instituto de Agrobiotecnología y Biología Molecular, UEDD INTA-CONICET, Hurlingham, Argentina
- Cintia Vanesa Acuña
  - Instituto de Agrobiotecnología y Biología Molecular, UEDD INTA-CONICET, Hurlingham, Argentina
- Augusto J. López
  - Estación Experimental Agropecuaria de Bella Vista, Instituto Nacional de Tecnología Agropecuaria, Bella Vista, Argentina
- Juan Adolfo López
  - Estación Experimental Agropecuaria de Bella Vista, Instituto Nacional de Tecnología Agropecuaria, Bella Vista, Argentina
- Pablo Pathauer
  - Instituto de Recursos Biológicos, Instituto Nacional de Tecnología Agropecuaria, Hurlingham, Argentina
- Dino Palazzini
  - Instituto de Recursos Biológicos, Instituto Nacional de Tecnología Agropecuaria, Hurlingham, Argentina
- Leonel Harrand
  - Estación Experimental Agropecuaria de Concordia, Instituto Nacional de Tecnología Agropecuaria, Concordia, Argentina
- Javier Oberschelp
  - Estación Experimental Agropecuaria de Concordia, Instituto Nacional de Tecnología Agropecuaria, Concordia, Argentina
- Martín Alberto Marcó
  - Estación Experimental Agropecuaria de Concordia, Instituto Nacional de Tecnología Agropecuaria, Concordia, Argentina
- Esteban Felipe Cisneros
  - Facultad de Ciencias Forestales, Universidad Nacional de Santiago del Estero (UNSE), Santiago del Estero, Argentina
- Rocío Carreras
  - Facultad de Ciencias Forestales, Universidad Nacional de Santiago del Estero (UNSE), Santiago del Estero, Argentina
- Ana Maria Martins Alves
  - Centro de Estudos Florestais e Laboratório Associado TERRA, Instituto Superior de Agronomia, Universidade de Lisboa, Tapada da Ajuda, Lisboa, Portugal
- José Carlos Rodrigues
  - Centro de Estudos Florestais e Laboratório Associado TERRA, Instituto Superior de Agronomia, Universidade de Lisboa, Tapada da Ajuda, Lisboa, Portugal
- H. Esteban Hopp
  - Instituto de Agrobiotecnología y Biología Molecular, UEDD INTA-CONICET, Hurlingham, Argentina
- Dario Grattapaglia
  - Empresa Brasileira de Pesquisa Agropecuária (EMBRAPA), Recursos Genéticos e Biotecnologia, Brasilia, Brazil
- Eduardo Pablo Cappa
  - Instituto de Recursos Biológicos, Instituto Nacional de Tecnología Agropecuaria, Hurlingham, Argentina
  - Consejo Nacional de Investigaciones Científicas y Técnicas, Buenos Aires, Argentina
- Norma Beatriz Paniego
  - Instituto de Agrobiotecnología y Biología Molecular, UEDD INTA-CONICET, Hurlingham, Argentina
3
Estravis Barcala M, van der Valk T, Chen Z, Funda T, Chaudhary R, Klingberg A, Fundova I, Suontama M, Hallingbäck H, Bernhardsson C, Nystedt B, Ingvarsson PK, Sherwood E, Street N, Gyllensten U, Nilsson O, Wu HX. Whole-genome resequencing facilitates the development of a 50K single nucleotide polymorphism genotyping array for Scots pine (Pinus sylvestris L.) and its transferability to other pine species. The Plant Journal: for Cell and Molecular Biology 2024; 117:944-955. [PMID: 37947292 DOI: 10.1111/tpj.16535] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Revised: 10/25/2023] [Accepted: 10/26/2023] [Indexed: 11/12/2023]
Abstract
Scots pine (Pinus sylvestris L.) is one of the most widespread and economically important conifer species in the world. Applications like genomic selection and association studies, which could help accelerate breeding cycles, are challenging in Scots pine because of its large and repetitive genome. For this reason, genotyping tools for conifer species, and in particular for Scots pine, are commonly based on transcribed regions of the genome. In this article, we present the Axiom Psyl50K array, the first single nucleotide polymorphism (SNP) genotyping array for Scots pine based on whole-genome resequencing, that represents both genic and intergenic regions. This array was designed following a two-step procedure: first, 192 trees were sequenced, and a 430K SNP screening array was constructed. Then, 480 samples, including haploid megagametophytes, full-sib family trios, breeding population, and range-wide individuals from across Eurasia were genotyped with the screening array. The best 50K SNPs were selected based on quality, replicability, distribution across the draft genome assembly, balance between genic and intergenic regions, and genotype-environment and genotype-phenotype associations. Of the final 49 877 probes tiled in the array, 20 372 (40.84%) occur inside gene models, while the rest lie in intergenic regions. We also show that the Psyl50K array can yield enough high-confidence SNPs for genetic studies in pine species from North America and Eurasia. This new genotyping tool will be a valuable resource for high-throughput fundamental and applied research of Scots pine and other pine species.
Affiliation(s)
- Maximiliano Estravis Barcala
  - Department of Forest Genetics and Plant Physiology, Umeå Plant Science Centre (UPSC), Swedish University of Agricultural Sciences, Umeå, Sweden
- Tom van der Valk
  - Department of Cell and Molecular Biology, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Zhiqiang Chen
  - Department of Forest Genetics and Plant Physiology, Umeå Plant Science Centre (UPSC), Swedish University of Agricultural Sciences, Umeå, Sweden
- Tomas Funda
  - Department of Forest Genetics and Plant Physiology, Umeå Plant Science Centre (UPSC), Swedish University of Agricultural Sciences, Umeå, Sweden
- Rajiv Chaudhary
  - Department of Forest Genetics and Plant Physiology, Umeå Plant Science Centre (UPSC), Swedish University of Agricultural Sciences, Umeå, Sweden
- Adam Klingberg
  - Department of Forest Genetics and Plant Physiology, Umeå Plant Science Centre (UPSC), Swedish University of Agricultural Sciences, Umeå, Sweden
  - Skogforsk, Sävar, Uppsala, Sweden
- Irena Fundova
  - Department of Forest Genetics and Plant Physiology, Umeå Plant Science Centre (UPSC), Swedish University of Agricultural Sciences, Umeå, Sweden
- Carolina Bernhardsson
  - Department of Organismal Biology, Human Evolution, Uppsala University, Uppsala, Sweden
  - Department of Plant Biology, Linnean Centre for Plant Biology, Swedish University of Agricultural Sciences, Uppsala, Sweden
- Björn Nystedt
  - Department of Cell and Molecular Biology, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Pär K Ingvarsson
  - Department of Plant Biology, Linnean Centre for Plant Biology, Swedish University of Agricultural Sciences, Uppsala, Sweden
- Ellen Sherwood
  - Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Stockholm, Sweden
  - Department of Gene Technology, Science for Life Laboratory, KTH Royal Institute of Technology, Stockholm, Sweden
- Nathaniel Street
  - Department of Plant Physiology, Umeå Plant Science Centre (UPSC), Umeå University, Umeå, Sweden
- Ulf Gyllensten
  - Department of Immunology, Genetics, and Pathology, Biomedical Center, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Ove Nilsson
  - Department of Forest Genetics and Plant Physiology, Umeå Plant Science Centre (UPSC), Swedish University of Agricultural Sciences, Umeå, Sweden
- Harry X Wu
  - Department of Forest Genetics and Plant Physiology, Umeå Plant Science Centre (UPSC), Swedish University of Agricultural Sciences, Umeå, Sweden
4
Addison S, Armstrong C, Wigley K, Hartley R, Wakelin S. What matters most? Assessment of within-canopy factors influencing the needle microbiome of the model conifer, Pinus radiata. Environmental Microbiome 2023; 18:45. [PMID: 37254222 DOI: 10.1186/s40793-023-00507-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2023] [Accepted: 05/22/2023] [Indexed: 06/01/2023]
Abstract
The assembly and function of the phyllosphere microbiome is important to the overall fitness of plants and, thereby, the ecosystems they inhabit. Presently, model systems for tree phyllosphere microbiome studies are lacking, yet forests resilient to pests, diseases, and climate change are important to support a myriad of ecosystem services with impacts from local to global levels. In this study, we extend the development of model microbiome systems for tree species, particularly coniferous gymnosperms, by undertaking a structured approach to assessing the phyllosphere microbiome of Pinus radiata. Canopy sampling height was the single most important factor influencing both alpha- and beta-diversity of bacterial and fungal communities (p < 0.005). Bacterial and fungal phyllosphere microbiome richness was lowest in samples from the top of the canopy, increasing in the middle and then the bottom canopy samples. These differences may be driven by (1) exchange of microbiomes between the forest floor and soil and the lower foliage, (2) strong ecological filtering in the upper canopy via environmental exposure (e.g., UV), (3) canopy density, or (4) combinations of these factors. Most taxa present in the top canopy were also present lower in the tree; as such, sampling strategies focussing on the lower canopy should provide good overall phyllosphere microbiome coverage for the tree. The dominant phyllosphere bacteria were Alpha-proteobacteria (Rhizobiales and Sphingomonas) along with Acidobacteria Gp1. However, the P. radiata phyllosphere microbiome samples were fungal dominated. In the top canopy samples, Arthoniomycetes and Dothideomycetes were highly represented, with abundances of Arthoniomycetes decreasing in lower canopy samples whilst abundances of Ascomycota increased. The most abundant fungal taxa were Phaeococcomyces (14.4% of total reads) and Phaeotheca spp. (10.38%).
A second-order effect of canopy sampling direction was evident in bacterial community composition (p = 0.01); these directional influences were not evident for fungal communities. However, sterilisation of needles did impact fungal community composition (p = 0.025), indicating potential for community differences in the endosphere versus leaf surface compartments. Needle age was only important in relation to bacterial communities, but was canopy height dependent (interaction p = 0.008). By building an understanding of the primary and secondary factors related to intra-canopy phyllosphere microbiome variation, we provide a sampling framework to either explicitly minimise or capture variation in needle collection to enable ongoing ecological studies targeted at inter-canopy or other experimental levels.
Affiliation(s)
- Steven Wakelin
  - Scion, P.O. Box 29237, Riccarton, Christchurch, 8440, New Zealand.
5
Nantongo JS, Potts BM, Klápště J, Graham NJ, Dungey HS, Fitzgerald H, O'Reilly-Wapstra JM. Genomic selection for resistance to mammalian bark stripping and associated chemical compounds in radiata pine. G3 (Bethesda, Md.) 2022; 12:jkac245. [PMID: 36218439 PMCID: PMC9635650 DOI: 10.1093/g3journal/jkac245] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/22/2022] [Accepted: 08/29/2022] [Indexed: 07/28/2023]
Abstract
The integration of genomic data into genetic evaluations can facilitate the rapid selection of superior genotypes and accelerate the breeding cycle in trees. In this study, 390 trees from 74 control-pollinated families were genotyped using a 36K Axiom SNP array. A total of 15,624 high-quality SNPs were used to develop genomic prediction models for mammalian bark stripping, tree height, and selected primary and secondary chemical compounds in the bark. Genetic parameters from different genomic prediction methods were compared to equivalent single- or multitrait pedigree-based approaches (ABLUP). The genomic methods were: single-trait best linear unbiased prediction based on a marker-based relationship matrix (genomic best linear unbiased prediction); multitrait single-step genomic best linear unbiased prediction, which integrated the marker-based and pedigree-based relationship matrices (single-step genomic best linear unbiased prediction); and single-trait generalized ridge regression. The influence of the statistical distribution of data on the genetic parameters was assessed. Results indicated that the heritability estimates increased nearly 2-fold with genomic models compared to the equivalent pedigree-based models. Predictive accuracy of the single-step genomic best linear unbiased prediction was higher than that of ABLUP for most traits. Allowing for heterogeneity in marker effects through the use of generalized ridge regression did not markedly improve predictive ability over genomic best linear unbiased prediction, suggesting that most of the chemical traits are modulated by many genes with small effects. Overall, the traits with low pedigree-based heritability benefited more from genomic models than the traits with high pedigree-based heritability. There was no evidence that data skewness or the presence of outliers affected the genomic or pedigree-based genetic estimates.
Affiliation(s)
- Judith S Nantongo
  - Corresponding author: National Agricultural Research Organization, P.O Box 1752, Mukono, Uganda.
- Brad M Potts
  - School of Natural Sciences, University of Tasmania, Hobart, TAS 7001, Australia
  - ARC Training Centre for Forest Value, Hobart, TAS 7001, Australia
- Jaroslav Klápště
  - Scion (New Zealand Forest Research Institute Ltd.), Rotorua 3046, New Zealand
- Natalie J Graham
  - Scion (New Zealand Forest Research Institute Ltd.), Rotorua 3046, New Zealand
- Heidi S Dungey
  - Scion (New Zealand Forest Research Institute Ltd.), Rotorua 3046, New Zealand
- Hugh Fitzgerald
  - School of Natural Sciences, University of Tasmania, Hobart, TAS 7001, Australia
- Julianne M O'Reilly-Wapstra
  - School of Natural Sciences, University of Tasmania, Hobart, TAS 7001, Australia
  - ARC Training Centre for Forest Value, Hobart, TAS 7001, Australia
6
Borthakur D, Busov V, Cao XH, Du Q, Gailing O, Isik F, Ko JH, Li C, Li Q, Niu S, Qu G, Vu THG, Wang XR, Wei Z, Zhang L, Wei H. Current status and trends in forest genomics. Forestry Research 2022; 2:11. [PMID: 39525413 PMCID: PMC11524260 DOI: 10.48130/fr-2022-0011] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 06/16/2022] [Accepted: 08/19/2022] [Indexed: 11/16/2024]
Abstract
Forests are not only the most predominant of the Earth's terrestrial ecosystems, but also the core source of essential products for human use. However, global climate change and the ongoing population explosion severely threaten the health of the forest ecosystem and aggravate deforestation and forest degradation. Forest genomics has great potential to increase forest productivity and adaptation to the changing climate. In the last two decades, the field of forest genomics has advanced quickly owing to the advent of multiple high-throughput sequencing technologies, single cell RNA-seq, clustered regularly interspaced short palindromic repeats (CRISPR)-mediated genome editing, and spatial transcriptomes, as well as bioinformatics analysis technologies, which have led to the generation of multidimensional, multilayered, and spatiotemporal gene expression data. These technologies, together with basic technologies routinely used in plant biotechnology, enable us to tackle many important or unique issues in forest biology, and provide a panoramic view and an integrative elucidation of molecular regulatory mechanisms underlying phenotypic changes and variations. In this review, we recapitulate the advancement and current status of 12 research branches of forest genomics, and then provide future research directions and focuses for each area. Evidently, a shift from simple biotechnology-based research to advanced and integrative genomics research, and a framework for investigating and interpreting many spatiotemporal development and differentiation issues in forest genomics, have just begun to emerge.
Affiliation(s)
- Dulal Borthakur
  - Department of Molecular Biosciences and Bioengineering, University of Hawaii at Manoa, 1955 East-West Road, Honolulu, HI 96822, USA
- Victor Busov
  - College of Forest Resources and Environmental Science, Michigan Technological University, Houghton, MI 49931, USA
- Xuan Hieu Cao
  - Forest Genetics and Forest Tree Breeding, Faculty for Forest Sciences and Forest Ecology, University of Göttingen, Büsgenweg 2, 37077 Göttingen, Germany
- Qingzhang Du
  - National Engineering Research Center of Tree Breeding and Ecological Restoration, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, P.R. China
- Oliver Gailing
  - Forest Genetics and Forest Tree Breeding, Faculty for Forest Sciences and Forest Ecology, University of Göttingen, Büsgenweg 2, 37077 Göttingen, Germany
- Fikret Isik
  - Cooperative Tree Improvement Program, North Carolina State University, Raleigh, NC 27695, USA
- Jae-Heung Ko
  - Department of Plant & Environmental New Resources, Kyung Hee University, 1732 Deogyeong-daero, Yongin 17104, Republic of Korea
- Chenghao Li
  - State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, Harbin 150040, P.R. China
- Quanzi Li
  - State Key Laboratory of Tree Genetics and Breeding, Chinese Academy of Forestry, Beijing 100093, P.R. China
- Shihui Niu
  - National Engineering Research Center of Tree Breeding and Ecological Restoration, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, P.R. China
- Guanzheng Qu
  - State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, Harbin 150040, P.R. China
- Thi Ha Giang Vu
  - Forest Genetics and Forest Tree Breeding, Faculty for Forest Sciences and Forest Ecology, University of Göttingen, Büsgenweg 2, 37077 Göttingen, Germany
- Xiao-Ru Wang
  - Department of Ecology and Environmental Science, Umeå Plant Science Centre, Umeå University, Umeå 90187, Sweden
- Zhigang Wei
  - College of Life Sciences, Heilongjiang University, Harbin 150080, P.R. China
- Lin Zhang
  - Key Laboratory of Cultivation and Protection for Non-Wood Forest Trees, Ministry of Education, Central South University of Forestry and Technology, Changsha 410004, Hunan Province, P.R. China
- Hairong Wei
  - College of Forest Resources and Environmental Science, Michigan Technological University, Houghton, MI 49931, USA
7
Abstract
Traditional tree improvement is cumbersome and costly. Our main objective was to assess the extent to which genomic data can currently accelerate and improve decision making in this field. We used diameter at breast height (DBH) and wood density (WD) data for 4430 tree genotypes and single-nucleotide polymorphism (SNP) data for 2446 tree genotypes. Pedigree reconstruction was performed using a combination of maximum likelihood parentage assignment and matching based on identity-by-state (IBS) similarity. In addition, we used best linear unbiased prediction (BLUP) methods to predict phenotypes using SNP markers (GBLUP), recorded pedigree information (ABLUP), and single-step “blended” BLUP (HBLUP) combining SNP and pedigree information. We substantially improved the accuracy of pedigree records, resolving the inconsistent parental information of 506 tree genotypes. This led to substantially increased predictive ability (i.e., by up to 87%) in HBLUP analyses compared to a baseline from ABLUP. Genomic prediction was possible across populations and within previously untested families with moderately large training populations (N = 800–1200 tree genotypes) and using as few as 2000–5000 SNP markers. HBLUP was generally more effective than traditional ABLUP approaches, particularly after dealing appropriately with pedigree uncertainties. Our study provides evidence that genome-wide marker data can significantly enhance tree improvement. The operational implementation of genomic selection has started in radiata pine breeding in New Zealand, but further reductions in DNA extraction and genotyping costs may be required to realise the full potential of this approach.
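The matching based on identity-by-state (IBS) similarity mentioned above can be illustrated with a short Python sketch. It assumes genotypes coded as allele counts (0, 1, 2) and a per-locus score of 1, 0.5 or 0 for two, one or no shared alleles. The function names and the `threshold` value are hypothetical, since the abstract does not describe the matching rule actually used.

```python
def ibs_similarity(g1, g2):
    """Mean identity-by-state similarity between two genotype vectors.

    Genotypes are allele counts (0, 1, 2); None marks a missing call.
    Each locus genotyped in both individuals scores 1 - |a - b| / 2.
    """
    scores = [1 - abs(a - b) / 2
              for a, b in zip(g1, g2)
              if a is not None and b is not None]
    if not scores:
        raise ValueError("no loci genotyped in both individuals")
    return sum(scores) / len(scores)


def best_ibs_match(query, candidates, threshold=0.95):
    """Return (name, similarity) of the closest candidate genotype.

    `candidates` maps a name to a genotype vector. Returns None when
    even the best candidate falls below the (assumed) threshold.
    """
    name, sim = max(
        ((n, ibs_similarity(query, g)) for n, g in candidates.items()),
        key=lambda pair: pair[1],
    )
    return (name, sim) if sim >= threshold else None
```

A tree whose recorded identity conflicts with its genotype could be compared against reference genotypes in this way, with the threshold tuned to the expected genotyping error rate.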