1
Akutsu H, Na’iem M, Widiyatno, Indrioko S, Sawitri, Purnomo S, Uchiyama K, Tsumura Y, Tani N. Comparing modeling methods of genomic prediction for growth traits of a tropical timber species, Shorea macrophylla. Frontiers in Plant Science 2023; 14:1241908. [PMID: 38023878 PMCID: PMC10644202 DOI: 10.3389/fpls.2023.1241908] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2023] [Accepted: 09/13/2023] [Indexed: 12/01/2023]
Abstract
Introduction: Shorea macrophylla is a commercially important tropical tree species grown for timber and oil. It is amenable to plantation forestry due to its fast initial growth. Genomic selection (GS) has been used in tree breeding studies to shorten long breeding cycles but has not previously been applied to S. macrophylla. Methods: To build genomic prediction models for GS, leaves and growth trait data were collected from a half-sib progeny population of S. macrophylla in the Sari Bumi Kusuma forest concession, central Kalimantan, Indonesia. In total, 18,037 SNP markers were identified in two ddRAD-seq libraries. Genomic prediction models based on these SNPs were then generated for diameter at breast height and total height in the 7th year from planting (D7 and H7). Results and discussion: These traits were chosen because of their relatively high narrow-sense genomic heritability and because seven years was considered long enough to assess initial growth. Genomic prediction models were built using six methods and their derivatives with the full set of identified SNPs and with subsets of 48, 96, and 192 SNPs selected based on the results of a genome-wide association study (GWAS). The GBLUP and RKHS methods gave the highest predictive ability for D7 and H7 with the sets of selected SNPs and showed that D7 has an additive genetic architecture while H7 has an epistatic genetic architecture. LightGBM and CNN1D also achieved high predictive abilities for D7 with 48 and 96 selected SNPs, and for H7 with 96 and 192 selected SNPs, showing that gradient boosting decision trees and deep learning can be useful in genomic prediction. Predictive ability for H7 was higher when smaller subsets of SNPs selected by GWAS p-value were used, whereas D7 showed the opposite tendency, which might reflect a difference in genetic architecture between primary and secondary growth in the species.
This study suggests that GS with GWAS-based SNP selection can be used in breeding for non-cultivated tree species to improve initial growth and reduce genotyping costs for next-generation seedlings.
Affiliation(s)
- Haruto Akutsu
  - Graduate School of Science and Technology, University of Tsukuba, Tsukuba, Ibaraki, Japan
- Mohammad Na’iem
  - Faculty of Forestry, Gadjah Mada University, Yogyakarta, Indonesia
- Widiyatno
  - Faculty of Forestry, Gadjah Mada University, Yogyakarta, Indonesia
- Sapto Indrioko
  - Faculty of Forestry, Gadjah Mada University, Yogyakarta, Indonesia
- Sawitri
  - Faculty of Forestry, Gadjah Mada University, Yogyakarta, Indonesia
- Susilo Purnomo
  - PT. Sari Bumi Kusuma, Pontianak, West Kalimantan, Indonesia
- Kentaro Uchiyama
  - Department of Forest Molecular Genetics and Biotechnology, Forestry and Forest Products Research Institute, Tsukuba, Ibaraki, Japan
- Yoshihiko Tsumura
  - Faculty of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Ibaraki, Japan
- Naoki Tani
  - Faculty of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Ibaraki, Japan
  - Forestry Division, Japan International Research Center for Agricultural Sciences, Tsukuba, Ibaraki, Japan
2
Jubair S, Domaratzki M. Crop genomic selection with deep learning and environmental data: A survey. Front Artif Intell 2023; 5:1040295. [PMID: 36703955 PMCID: PMC9871498 DOI: 10.3389/frai.2022.1040295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2022] [Accepted: 12/22/2022] [Indexed: 01/12/2023]
Abstract
Machine learning techniques for crop genomic selection, especially for single-environment plants, are well developed. These machine learning models, which use dense genome-wide markers to predict phenotype, routinely perform well on single-environment datasets, especially for complex traits affected by multiple markers. On the other hand, machine learning models for predicting crop phenotype, especially deep learning models, using datasets that span different environmental conditions, have only recently emerged. Models that can accept heterogeneous data sources, such as temperature, soil conditions and precipitation, are natural choices for modeling GxE in multi-environment prediction. Here, we review emerging deep learning techniques that incorporate environmental data directly into genomic selection models.
Affiliation(s)
- Sheikh Jubair
  - Department of Computer Science, University of Manitoba, Winnipeg, MB, Canada
- Mike Domaratzki
  - Department of Computer Science, University of Western Ontario, London, ON, Canada
3
Borthakur D, Busov V, Cao XH, Du Q, Gailing O, Isik F, Ko JH, Li C, Li Q, Niu S, Qu G, Vu THG, Wang XR, Wei Z, Zhang L, Wei H. Current status and trends in forest genomics. Forestry Research 2022; 2:11. [PMID: 39525413 PMCID: PMC11524260 DOI: 10.48130/fr-2022-0011] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 06/16/2022] [Accepted: 08/19/2022] [Indexed: 11/16/2024]
Abstract
Forests are not only the most predominant of the Earth's terrestrial ecosystems, but are also the core supply of essential products for human use. However, global climate change and the ongoing population explosion severely threaten the health of forest ecosystems and aggravate deforestation and forest degradation. Forest genomics has great potential to increase forest productivity and adaptation to the changing climate. In the last two decades, the field of forest genomics has advanced quickly owing to the advent of multiple high-throughput sequencing technologies, single cell RNA-seq, clustered regularly interspaced short palindromic repeats (CRISPR)-mediated genome editing, and spatial transcriptomes, as well as bioinformatics analysis technologies, which have led to the generation of multidimensional, multilayered, and spatiotemporal gene expression data. These technologies, together with basic technologies routinely used in plant biotechnology, enable us to tackle many important or unique issues in forest biology, and provide a panoramic view and an integrative elucidation of molecular regulatory mechanisms underlying phenotypic changes and variations. In this review, we recapitulate the advancement and current status of 12 research branches of forest genomics, and then provide future research directions and focuses for each area. Evidently, a shift from simple biotechnology-based research to advanced and integrative genomics research, and a setup for investigation and interpretation of many spatiotemporal development and differentiation issues in forest genomics, have just begun to emerge.
Affiliation(s)
- Dulal Borthakur
  - Department of Molecular Biosciences and Bioengineering, University of Hawaii at Manoa, 1955 East-West Road, Honolulu, HI 96822, USA
- Victor Busov
  - College of Forest Resources and Environmental Science, Michigan Technological University, Houghton, MI 49931, USA
- Xuan Hieu Cao
  - Forest Genetics and Forest Tree Breeding, Faculty for Forest Sciences and Forest Ecology, University of Göttingen, Büsgenweg 2, 37077 Göttingen, Germany
- Qingzhang Du
  - National Engineering Research Center of Tree Breeding and Ecological Restoration, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, P.R. China
- Oliver Gailing
  - Forest Genetics and Forest Tree Breeding, Faculty for Forest Sciences and Forest Ecology, University of Göttingen, Büsgenweg 2, 37077 Göttingen, Germany
- Fikret Isik
  - Cooperative Tree Improvement Program, North Carolina State University, Raleigh, NC 27695, USA
- Jae-Heung Ko
  - Department of Plant & Environmental New Resources, Kyung Hee University, 1732 Deogyeong-daero, Yongin 17104, Republic of Korea
- Chenghao Li
  - State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, Harbin 150040, P.R. China
- Quanzi Li
  - State Key Laboratory of Tree Genetics and Breeding, Chinese Academy of Forestry, Beijing 100093, P.R. China
- Shihui Niu
  - National Engineering Research Center of Tree Breeding and Ecological Restoration, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, P.R. China
- Guanzheng Qu
  - State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, Harbin 150040, P.R. China
- Thi Ha Giang Vu
  - Forest Genetics and Forest Tree Breeding, Faculty for Forest Sciences and Forest Ecology, University of Göttingen, Büsgenweg 2, 37077 Göttingen, Germany
- Xiao-Ru Wang
  - Department of Ecology and Environmental Science, Umeå Plant Science Centre, Umeå University, Umeå 90187, Sweden
- Zhigang Wei
  - College of Life Sciences, Heilongjiang University, Harbin 150080, P.R. China
- Lin Zhang
  - Key Laboratory of Cultivation and Protection for Non-Wood Forest Trees, Ministry of Education, Central South University of Forestry and Technology, Changsha 410004, Hunan Province, P.R. China
- Hairong Wei
  - College of Forest Resources and Environmental Science, Michigan Technological University, Houghton, MI 49931, USA
4
Genomic Prediction of Complex Traits in Perennial Plants: A Case for Forest Trees. Methods in Molecular Biology (Clifton, N.J.) 2022; 2467:493-520. [PMID: 35451788 DOI: 10.1007/978-1-0716-2205-6_18] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/15/2022]
Abstract
This chapter provides an overview of the genomic selection progress in long-lived forest tree species. Factors affecting the prediction accuracy in genomic prediction are assessed with examples from empirical studies. Infrastructure and resources required for the implementation of genomic selection are evaluated. Some general guidelines are provided for the successful application of genomic selection in forest tree breeding programs.
5
Mhoswa L, Myburg AA, Slippers B, Külheim C, Naidoo S. Genome-wide association study identifies SNP markers and putative candidate genes for terpene traits important for Leptocybe invasa resistance in Eucalyptus grandis. G3 Genes|Genomes|Genetics 2022; 12:6521028. [PMID: 35134191 PMCID: PMC8982386 DOI: 10.1093/g3journal/jkac004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2021] [Accepted: 12/20/2021] [Indexed: 11/17/2022]
Abstract
Terpenes are an important group of plant specialized metabolites influencing, amongst other functions, defence mechanisms against pests. We used a genome-wide association study to identify single nucleotide polymorphism (SNP) markers and putative candidate genes for terpene traits. We tested 15,387 informative SNP markers derived from genotyping 416 Eucalyptus grandis individuals for association with 3 terpene traits, 1,8-cineole, γ-terpinene, and p-cymene. A multilocus mixed model analysis identified 21 SNP markers for 1,8-cineole on chromosomes 2, 4, 6, 7, 8, 9, 10, and 11, that individually explained 3.0%–8.4% and jointly 42.7% of the phenotypic variation. Association analysis of γ-terpinene found 32 significant SNP markers on chromosomes 1, 2, 4, 5, 6, 9, and 11, explaining 3.4–15.5% and jointly 54.5% of phenotypic variation. For p-cymene, 28 significant SNP markers were identified on chromosomes 1, 2, 3, 5, 6, 7, 10, and 11, explaining 3.4–16.1% of the phenotypic variation and jointly 46.9%. Our results show that variation underlying the 3 terpene traits is influenced by a few minor loci in combination with a few major effect loci, suggesting an oligogenic nature of the traits.
Affiliation(s)
- Lorraine Mhoswa
  - Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria 0028, South Africa
- Alexander A Myburg
  - Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria 0028, South Africa
- Bernard Slippers
  - Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria 0028, South Africa
- Carsten Külheim
  - College of Forest Resources and Environmental Science, Michigan Technological University, Houghton, MI 49931-1295, USA
- Sanushka Naidoo
  - Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria 0028, South Africa
6
Genome Wide Association Study Identifies Candidate Genes Related to the Earlywood Tracheid Properties in Picea crassifolia Kom. Forests 2022. [DOI: 10.3390/f13020332] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/25/2022]
Abstract
Picea crassifolia Kom. is one of the timber and ecological conifers in China, and its wood tracheid traits directly affect wood formation and adaptability under harsh environmental conditions. Molecular studies on P. crassifolia remain inadequate because relatively few genes have been associated with these traits. To identify markers and candidate genes that can potentially be used for genetic improvement of wood tracheid traits, we examined 106 clones of P. crassifolia and collected phenotypic data for 14 wood tracheid traits, then used specific-locus amplified fragment sequencing (SLAF-seq) to perform a genome-wide association study (GWAS). Subsequently, the results were used to screen single nucleotide polymorphism (SNP) loci and candidate genes that exhibited a significant correlation with the studied traits. We developed 4,058,883 SLAF-tags and 12,275,765 SNP loci, and our analyses identified a total of 96 SNP loci that showed significant correlations with three earlywood tracheid traits using a mixed linear model (MLM). Next, candidate genes were screened in the 100 kb zone (50 kb upstream, 50 kb downstream) of each of the SNP loci, whereby 67 candidate genes were obtained for earlywood tracheid traits, including 34 genes of known function and 33 genes of unknown function. We provide the most significant SNP for each trait-locus combination and candidate genes occurring within the GWAS hits. These resources provide a foundation for the development of markers that could be used in wood trait improvement and of candidate genes for the development of earlywood tracheids in P. crassifolia.
7
Genomic Selection for Forest Tree Improvement: Methods, Achievements and Perspectives. Forests 2020. [DOI: 10.3390/f11111190] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Indexed: 12/24/2022]
Abstract
The breeding of forest trees is only a few decades old, and is a much more complicated, longer, and expensive endeavor than the breeding of agricultural crops. One breeding cycle for forest trees can take 20–30 years. Recent advances in genomics and molecular biology have revolutionized traditional plant breeding based on visual phenotype assessment: the development of different types of molecular markers has made genotype selection possible. Marker-assisted breeding can significantly accelerate the breeding process, but this method has not been shown to be effective for selection of complex traits in forest trees. Genomic selection, a newer method, is based on the analysis of all effects of quantitative trait loci (QTLs) using a large number of molecular markers distributed throughout the genome, which makes it possible to assess the genomic estimated breeding value (GEBV) of an individual. This approach is expected to be much more efficient for forest tree improvement than traditional breeding. Here, we review the current state of the art in the application of genomic selection in forest tree breeding and discuss different methods of genotyping and phenotyping. We also compare the accuracies of genomic prediction models and highlight the importance of a prior cost-benefit analysis before implementing genomic selection. Perspectives for the further development of this approach in forest breeding are also discussed: expanding the range of species and the list of valuable traits, the application of high-throughput phenotyping methods, and the possibility of using epigenetic variance to improve forest trees.
8
Cortés AJ, Restrepo-Montoya M, Bedoya-Canas LE. Modern Strategies to Assess and Breed Forest Tree Adaptation to Changing Climate. Frontiers in Plant Science 2020; 11:583323. [PMID: 33193532 PMCID: PMC7609427 DOI: 10.3389/fpls.2020.583323] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Received: 07/14/2020] [Accepted: 09/29/2020] [Indexed: 05/02/2023]
Abstract
Studying the genetics of adaptation to new environments in ecologically and industrially important tree species is currently a major research line in the fields of plant science and genetic improvement for tolerance to abiotic stress. Specifically, exploring the genomic basis of local adaptation is imperative for assessing the conditions under which trees will successfully adapt in situ to global climate change. However, this knowledge has scarcely been used in conservation and forest tree improvement because woody perennials face major research limitations such as their outcrossing reproductive systems, long juvenile phase, and huge genome sizes. Therefore, in this review we discuss predictive genomic approaches that promise to increase adaptive selection accuracy and shorten generation intervals. They may also assist the detection of novel allelic variants from tree germplasm, and disclose the genomic potential of adaptation to different environments. For instance, natural populations of tree species invite the use of tools from the population genomics field to study the signatures of local adaptation. Conventional genetic markers and whole genome sequencing both help identify genes and markers that diverge between local populations more than expected under neutrality, and that exhibit unique signatures of diversity indicative of "selective sweeps." Ultimately, these efforts inform the conservation and breeding status capable of pivoting forest health, ecosystem services, and sustainable production. Key long-term perspectives include understanding how trees' phylogeographic history may affect the adaptively relevant genetic variation available for adaptation to environmental change. Encouraging "big data" approaches (machine learning, ML) capable of comprehensively merging heterogeneous genomic and ecological datasets is becoming imperative, too.
Affiliation(s)
- Andrés J. Cortés
  - Corporación Colombiana de Investigación Agropecuaria AGROSAVIA, Rionegro, Colombia
  - Departamento de Ciencias Forestales, Facultad de Ciencias Agrarias, Universidad Nacional de Colombia – Sede Medellín, Medellín, Colombia
- Manuela Restrepo-Montoya
  - Departamento de Ciencias Forestales, Facultad de Ciencias Agrarias, Universidad Nacional de Colombia – Sede Medellín, Medellín, Colombia
- Larry E. Bedoya-Canas
  - Departamento de Ciencias Forestales, Facultad de Ciencias Agrarias, Universidad Nacional de Colombia – Sede Medellín, Medellín, Colombia