1
Akutsu H, Na’iem M, Widiyatno, Indrioko S, Sawitri, Purnomo S, Uchiyama K, Tsumura Y, Tani N. Comparing modeling methods of genomic prediction for growth traits of a tropical timber species, Shorea macrophylla. FRONTIERS IN PLANT SCIENCE 2023; 14:1241908. [PMID: 38023878 PMCID: PMC10644202 DOI: 10.3389/fpls.2023.1241908] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2023] [Accepted: 09/13/2023] [Indexed: 12/01/2023]
Abstract
Introduction: Shorea macrophylla is a commercially important tropical tree species grown for timber and oil. It is amenable to plantation forestry due to its fast initial growth. Genomic selection (GS) has been used in tree breeding studies to shorten long breeding cycles but has not previously been applied to S. macrophylla. Methods: To build genomic prediction models for GS, leaves and growth trait data were collected from a half-sib progeny population of S. macrophylla in the Sari Bumi Kusuma forest concession, central Kalimantan, Indonesia. A total of 18,037 SNP markers were identified in two ddRAD-seq libraries. Genomic prediction models based on these SNPs were then generated for diameter at breast height and total height in the 7th year from planting (D7 and H7). Results and discussion: These traits were chosen because of their relatively high narrow-sense genomic heritability and because seven years was considered long enough to assess initial growth. Genomic prediction models were built using six methods and their derivatives with the full set of identified SNPs and with subsets of 48, 96, and 192 SNPs selected based on the results of a genome-wide association study (GWAS). The GBLUP and RKHS methods gave the highest predictive ability for D7 and H7 with the sets of selected SNPs and showed that D7 has an additive genetic architecture while H7 has an epistatic genetic architecture. LightGBM and CNN1D also achieved high predictive abilities for D7 with 48 and 96 selected SNPs, and for H7 with 96 and 192 selected SNPs, showing that gradient boosting decision trees and deep learning can be useful in genomic prediction. Predictive abilities for H7 were higher when smaller SNP subsets selected by GWAS p-value were used. However, D7 showed the opposite tendency, which might have originated from the difference in genetic architecture between primary and secondary growth of the species.
This study suggests that GS with GWAS-based SNP selection can be used in breeding for non-cultivated tree species to improve initial growth and reduce genotyping costs for next-generation seedlings.
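The workflow summarized in this abstract (rank SNPs, keep a small subset, fit GBLUP, report predictive ability) can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration and not the authors' pipeline: the genotypes and phenotypes are simulated, the relationship matrix uses the VanRaden formulation, the heritability `h2` is fixed rather than estimated, and ranking by marginal correlation stands in for ranking by GWAS p-value. The function names are hypothetical.

```python
import numpy as np

def vanraden_grm(M):
    """Genomic relationship matrix from a 0/1/2 genotype matrix (individuals x SNPs)."""
    p = M.mean(axis=0) / 2.0                 # allele frequency per SNP
    Z = M - 2.0 * p                          # centre each SNP column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def select_top_snps(M, y, train, k):
    """Indices of the k SNPs with the largest absolute marginal association with y,
    computed on training individuals only (a stand-in for GWAS p-value ranking)."""
    Mc = M[train] - M[train].mean(axis=0)
    yc = y[train] - y[train].mean()
    norm = np.sqrt((Mc ** 2).sum(axis=0))
    norm[norm == 0] = np.inf                 # monomorphic SNPs get a score of zero
    score = np.abs(Mc.T @ yc) / norm         # proportional to |correlation|
    return np.argsort(score)[::-1][:k]

def gblup_predict(G, y, train, test, h2=0.5):
    """GBLUP predictions for test individuals under an assumed heritability h2."""
    lam = (1.0 - h2) / h2                    # ratio of residual to genetic variance
    mu = y[train].mean()
    G_tt = G[np.ix_(train, train)]
    G_st = G[np.ix_(test, train)]
    alpha = np.linalg.solve(G_tt + lam * np.eye(len(train)), y[train] - mu)
    return mu + G_st @ alpha

# Simulated data (purely illustrative): 200 individuals, 500 SNPs, additive trait.
rng = np.random.default_rng(0)
n, m = 200, 500
freqs = rng.uniform(0.1, 0.9, size=m)
M = rng.binomial(2, freqs, size=(n, m)).astype(float)
g = (M - 2.0 * freqs) @ rng.normal(0.0, 1.0, size=m)
y = g / g.std() + rng.normal(0.0, 1.0, size=n)

train, test = np.arange(0, 150), np.arange(150, 200)

# Full marker set.
G = vanraden_grm(M)
y_hat = gblup_predict(G, y, train, test)
predictive_ability = np.corrcoef(y_hat, y[test])[0, 1]

# Subset of 96 SNPs chosen on the training set only.
subset = select_top_snps(M, y, train, k=96)
y_hat_sub = gblup_predict(vanraden_grm(M[:, subset]), y, train, test)
predictive_ability_sub = np.corrcoef(y_hat_sub, y[test])[0, 1]
```

Selecting SNPs on the training individuals only keeps the test phenotypes out of the marker ranking, which would otherwise inflate the reported predictive ability.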
Affiliation(s)
- Haruto Akutsu
  - Graduate School of Science and Technology, University of Tsukuba, Tsukuba, Ibaraki, Japan
- Mohammad Na’iem
  - Faculty of Forestry, Gadjah Mada University, Yogyakarta, Indonesia
- Widiyatno
  - Faculty of Forestry, Gadjah Mada University, Yogyakarta, Indonesia
- Sapto Indrioko
  - Faculty of Forestry, Gadjah Mada University, Yogyakarta, Indonesia
- Sawitri
  - Faculty of Forestry, Gadjah Mada University, Yogyakarta, Indonesia
- Susilo Purnomo
  - PT. Sari Bumi Kusuma, Pontianak, West Kalimantan, Indonesia
- Kentaro Uchiyama
  - Department of Forest Molecular Genetics and Biotechnology, Forestry and Forest Products Research Institute, Tsukuba, Ibaraki, Japan
- Yoshihiko Tsumura
  - Faculty of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Ibaraki, Japan
- Naoki Tani
  - Faculty of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Ibaraki, Japan
  - Forestry Division, Japan International Research Center for Agricultural Sciences, Tsukuba, Ibaraki, Japan
2
Jubair S, Domaratzki M. Crop genomic selection with deep learning and environmental data: A survey. Front Artif Intell 2023; 5:1040295. [PMID: 36703955 PMCID: PMC9871498 DOI: 10.3389/frai.2022.1040295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2022] [Accepted: 12/22/2022] [Indexed: 01/12/2023]
Abstract
Machine learning techniques for crop genomic selections, especially for single-environment plants, are well-developed. These machine learning models, which use dense genome-wide markers to predict phenotype, routinely perform well on single-environment datasets, especially for complex traits affected by multiple markers. On the other hand, machine learning models for predicting crop phenotype, especially deep learning models, using datasets that span different environmental conditions, have only recently emerged. Models that can accept heterogeneous data sources, such as temperature, soil conditions and precipitation, are natural choices for modeling GxE in multi-environment prediction. Here, we review emerging deep learning techniques that incorporate environmental data directly into genomic selection models.
Affiliation(s)
- Sheikh Jubair
  - Department of Computer Science, University of Manitoba, Winnipeg, MB, Canada (correspondence: Sheikh Jubair)
- Mike Domaratzki
  - Department of Computer Science, University of Western Ontario, London, ON, Canada
3
Genomic Prediction of Complex Traits in Perennial Plants: A Case for Forest Trees. METHODS IN MOLECULAR BIOLOGY (CLIFTON, N.J.) 2022; 2467:493-520. [PMID: 35451788 DOI: 10.1007/978-1-0716-2205-6_18] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/15/2022]
Abstract
This chapter provides an overview of the genomic selection progress in long-lived forest tree species. Factors affecting the prediction accuracy in genomic prediction are assessed with examples from empirical studies. Infrastructure and resources required for the implementation of genomic selection are evaluated. Some general guidelines are provided for the successful application of genomic selection in forest tree breeding programs.
4
Mhoswa L, Myburg AA, Slippers B, Külheim C, Naidoo S. Genome-wide association study identifies SNP markers and putative candidate genes for terpene traits important for Leptocybe invasa resistance in Eucalyptus grandis. G3 GENES|GENOMES|GENETICS 2022; 12:6521028. [PMID: 35134191 PMCID: PMC8982386 DOI: 10.1093/g3journal/jkac004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2021] [Accepted: 12/20/2021] [Indexed: 11/17/2022]
Abstract
Terpenes are an important group of plant specialized metabolites influencing, amongst other functions, defence mechanisms against pests. We used a genome-wide association study to identify single nucleotide polymorphism (SNP) markers and putative candidate genes for terpene traits. We tested 15,387 informative SNP markers derived from genotyping 416 Eucalyptus grandis individuals for association with 3 terpene traits, 1,8-cineole, γ-terpinene, and p-cymene. A multilocus mixed model analysis identified 21 SNP markers for 1,8-cineole on chromosomes 2, 4, 6, 7, 8, 9, 10, and 11, that individually explained 3.0%–8.4% and jointly 42.7% of the phenotypic variation. Association analysis of γ-terpinene found 32 significant SNP markers on chromosomes 1, 2, 4, 5, 6, 9, and 11, explaining 3.4–15.5% and jointly 54.5% of phenotypic variation. For p-cymene, 28 significant SNP markers were identified on chromosomes 1, 2, 3, 5, 6, 7, 10, and 11, explaining 3.4–16.1% of the phenotypic variation and jointly 46.9%. Our results show that variation underlying the 3 terpene traits is influenced by a few minor loci in combination with a few major effect loci, suggesting an oligogenic nature of the traits.
Affiliation(s)
- Lorraine Mhoswa
  - Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria 0028, South Africa
- Alexander A Myburg
  - Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria 0028, South Africa
- Bernard Slippers
  - Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria 0028, South Africa
- Carsten Külheim
  - College of Forest Resources and Environmental Science, Michigan Technological University, Houghton, MI 49931-1295, USA
- Sanushka Naidoo
  - Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria 0028, South Africa
5
Genome Wide Association Study Identifies Candidate Genes Related to the Earlywood Tracheid Properties in Picea crassifolia Kom. FORESTS 2022. [DOI: 10.3390/f13020332] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/25/2022]
Abstract
Picea crassifolia Kom. is a conifer of timber and ecological value in China, and its wood tracheid traits directly affect wood formation and adaptability under harsh environments. Molecular studies on P. crassifolia remain inadequate because relatively few genes have been associated with these traits. To identify markers and candidate genes that can potentially be used for genetic improvement of wood tracheid traits, we examined 106 clones of P. crassifolia and collected phenotypic data for 14 wood tracheid traits; specific-locus amplified fragment sequencing (SLAF-seq) was then employed to perform a genome-wide association study (GWAS). Subsequently, the results were used to screen single nucleotide polymorphism (SNP) loci and candidate genes that exhibited a significant correlation with the studied traits. We developed 4,058,883 SLAF-tags and 12,275,765 SNP loci, and our analyses identified a total of 96 SNP loci that showed significant correlations with three earlywood tracheid traits using a mixed linear model (MLM). Next, candidate genes were screened in the 100 kb zone (50 kb upstream, 50 kb downstream) of each of the SNP loci, whereby 67 candidate genes were obtained for earlywood tracheid traits, including 34 genes of known function and 33 genes of unknown function. We provide the most significant SNP for each trait-locus combination and the candidate genes occurring within the GWAS hits. These resources provide a foundation for the development of markers that could be used in wood trait improvement and of candidate genes for the development of earlywood tracheids in P. crassifolia.
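The candidate-gene screen described in this abstract (a 100 kb zone, 50 kb on each side of a significant SNP) amounts to an interval-overlap query. The sketch below is an illustration only: the gene names, coordinates, and the `genes_near_snp` helper are hypothetical and are not taken from the study.

```python
def genes_near_snp(snp, genes, flank=50_000):
    """Return names of genes overlapping the window [pos - flank, pos + flank]
    on the same chromosome as the SNP."""
    chrom, pos = snp
    lo, hi = max(0, pos - flank), pos + flank
    return [
        name
        for name, g_chrom, start, end in genes
        if g_chrom == chrom and start <= hi and end >= lo
    ]

# Hypothetical annotation: (name, chromosome, start, end).
genes = [
    ("geneA", "chr1", 10_000, 12_000),    # too far upstream of the SNP
    ("geneB", "chr1", 140_000, 151_000),  # overlaps the downstream edge of the window
    ("geneC", "chr1", 400_000, 405_000),  # too far downstream
    ("geneD", "chr2", 95_000, 99_000),    # different chromosome
]

hits = genes_near_snp(("chr1", 100_000), genes)
print(hits)  # ['geneB']
```

A gene counts as a hit when any part of it overlaps the window, which is one reasonable reading of "screened in the 100 kb zone"; requiring the whole gene to lie inside the window would be a stricter alternative.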
6
Genomic Selection for Forest Tree Improvement: Methods, Achievements and Perspectives. FORESTS 2020. [DOI: 10.3390/f11111190] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Indexed: 12/24/2022]
Abstract
The breeding of forest trees is only a few decades old, and is a much more complicated, longer, and more expensive endeavor than the breeding of agricultural crops. One breeding cycle for forest trees can take 20–30 years. Recent advances in genomics and molecular biology have revolutionized traditional plant breeding based on visual phenotype assessment: the development of different types of molecular markers has made genotype selection possible. Marker-assisted breeding can significantly accelerate the breeding process, but this method has not been shown to be effective for selection of complex traits in forest trees. The new method of genomic selection is based on the analysis of all effects of quantitative trait loci (QTLs) using a large number of molecular markers distributed throughout the genome, which makes it possible to assess the genomic estimated breeding value (GEBV) of an individual. This approach is expected to be much more efficient for forest tree improvement than traditional breeding. Here, we review the current state of the art in the application of genomic selection in forest tree breeding and discuss different methods of genotyping and phenotyping. We also compare the accuracies of genomic prediction models and highlight the importance of a prior cost-benefit analysis before implementing genomic selection. Perspectives for the further development of this approach in forest breeding are also discussed: expanding the range of species and the list of valuable traits, the application of high-throughput phenotyping methods, and the possibility of using epigenetic variance to improve forest trees.
7
Cortés AJ, Restrepo-Montoya M, Bedoya-Canas LE. Modern Strategies to Assess and Breed Forest Tree Adaptation to Changing Climate. FRONTIERS IN PLANT SCIENCE 2020; 11:583323. [PMID: 33193532 PMCID: PMC7609427 DOI: 10.3389/fpls.2020.583323] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Received: 07/14/2020] [Accepted: 09/29/2020] [Indexed: 05/02/2023]
Abstract
Studying the genetics of adaptation to new environments in ecologically and industrially important tree species is currently a major research line in the fields of plant science and genetic improvement for tolerance to abiotic stress. Specifically, exploring the genomic basis of local adaptation is imperative for assessing the conditions under which trees will successfully adapt in situ to global climate change. However, this knowledge has scarcely been used in conservation and forest tree improvement because woody perennials face major research limitations such as their outcrossing reproductive systems, long juvenile phase, and huge genome sizes. Therefore, in this review we discuss predictive genomic approaches that promise to increase adaptive selection accuracy and shorten generation intervals. They may also assist in the detection of novel allelic variants from tree germplasm, and disclose the genomic potential of adaptation to different environments. For instance, natural populations of tree species invite using tools from the population genomics field to study the signatures of local adaptation. Conventional genetic markers and whole genome sequencing both help identify genes and markers that diverge between local populations more than expected under neutrality, and that exhibit unique signatures of diversity indicative of "selective sweeps." Ultimately, these efforts inform the conservation and breeding status capable of pivoting forest health, ecosystem services, and sustainable production. Key long-term perspectives include understanding how trees' phylogeographic history may affect the adaptively relevant genetic variation available for adaptation to environmental change. Encouraging "big data" approaches (machine learning, ML) capable of comprehensively merging heterogeneous genomic and ecological datasets is becoming imperative, too.
Affiliation(s)
- Andrés J. Cortés
  - Corporación Colombiana de Investigación Agropecuaria AGROSAVIA, Rionegro, Colombia
  - Departamento de Ciencias Forestales, Facultad de Ciencias Agrarias, Universidad Nacional de Colombia – Sede Medellín, Medellín, Colombia
- Manuela Restrepo-Montoya
  - Departamento de Ciencias Forestales, Facultad de Ciencias Agrarias, Universidad Nacional de Colombia – Sede Medellín, Medellín, Colombia
- Larry E. Bedoya-Canas
  - Departamento de Ciencias Forestales, Facultad de Ciencias Agrarias, Universidad Nacional de Colombia – Sede Medellín, Medellín, Colombia