1. Onogi A. Integration of Crop Growth Models and Genomic Prediction. Methods Mol Biol 2022; 2467:359-396. [PMID: 35451783 DOI: 10.1007/978-1-0716-2205-6_13]
Abstract
Crop growth models (CGMs) consist of multiple equations that represent physiological processes of plants and simulate crop growth dynamically given environmental inputs. Because parameters of CGMs are often genotype-specific, gene effects can be related to environmental inputs through CGMs. Thus, CGMs are attractive tools for predicting genotype by environment (G×E) interactions. This chapter reviews CGMs, genetic analyses using these models, and the status of studies that integrate genomic prediction with CGMs. Examples of CGM analyses are also provided.
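As an illustration of how genotype-specific parameters can produce G×E interactions, the following is a minimal sketch of a toy thermal-time phenology model. It is not taken from the chapter: the function `days_to_heading`, the parameters `base_temp` and `thermal_requirement`, and all numeric values are hypothetical.

```python
def days_to_heading(daily_temps, base_temp, thermal_requirement):
    """Return the first day (1-based) on which accumulated thermal time
    reaches the genotype's requirement, or None if it never does."""
    accumulated = 0.0
    for day, temp in enumerate(daily_temps, start=1):
        # Only temperature above the genotype's base temperature counts.
        accumulated += max(0.0, temp - base_temp)
        if accumulated >= thermal_requirement:
            return day
    return None


# Hypothetical genotype-specific parameters (illustrative values only).
genotypes = {
    "G1": {"base_temp": 8.0, "thermal_requirement": 600.0},
    "G2": {"base_temp": 12.0, "thermal_requirement": 450.0},
}

# Two hypothetical environments with constant daily mean temperature (deg C).
environments = {
    "cool": [18.0] * 200,
    "warm": [26.0] * 200,
}

# Simulate every genotype in every environment.
predictions = {
    (g, e): days_to_heading(temps, **params)
    for g, params in genotypes.items()
    for e, temps in environments.items()
}

for key, value in sorted(predictions.items()):
    print(key, value)
```

With these assumed values, G1 heads earlier than G2 in the cool environment but later in the warm one. The ranking of genotypes changes with the environment even though each genotype's parameters are fixed, which is the kind of G×E interaction a CGM can represent.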
Affiliation(s)
- Akio Onogi
  - Department of Plant Life Science, Faculty of Agriculture, Ryukoku University, Otsu, Shiga, Japan

2. Pandey AK, Jiang L, Moshelion M, Gosa SC, Sun T, Lin Q, Wu R, Xu P. Functional physiological phenotyping with functional mapping: A general framework to bridge the phenotype-genotype gap in plant physiology. iScience 2021; 24:102846. [PMID: 34381971 PMCID: PMC8333144 DOI: 10.1016/j.isci.2021.102846] [Received: 12/07/2020] [Revised: 05/27/2021] [Accepted: 07/09/2021]
Abstract
Recent years have witnessed the emergence of high-throughput phenotyping techniques. In particular, these techniques can characterize a comprehensive landscape of physiological traits of plants responding to dynamic changes in the environment. These innovations, along with next-generation genomic technologies, have brought plant science into the big-data era. However, a general framework that links multifaceted physiological traits to DNA variants is still lacking. Here, we developed a general framework that integrates functional physiological phenotyping (FPP) with functional mapping (FM). This integration, implemented with high-dimensional statistical reasoning, can aid in our understanding of how genotype is translated into phenotype. As a demonstration of the method, we applied the FPP-FM framework to transpiration and soil-plant-atmosphere measurements of a tomato introgression line population. This facilitated the identification of quantitative trait loci (QTLs) that mediate the spatiotemporal change of transpiration rate, and allowed us to test how these QTLs control, through their interaction networks, phenotypic plasticity under drought stress.
Affiliation(s)
- Arun K. Pandey
  - College of Life Sciences, China Jiliang University, Hangzhou 310018, China
- Libo Jiang
  - Center for Computational Biology, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100080, China
- Menachem Moshelion
  - The Robert H. Smith Institute of Plant Sciences and Genetics in Agriculture, The Robert H. Smith Faculty of Agriculture, Food and Environment, The Hebrew University of Jerusalem, Rehovot 76100, Israel
  - Corresponding author
- Sanbon Chaka Gosa
  - The Robert H. Smith Institute of Plant Sciences and Genetics in Agriculture, The Robert H. Smith Faculty of Agriculture, Food and Environment, The Hebrew University of Jerusalem, Rehovot 76100, Israel
- Ting Sun
  - College of Life Sciences, China Jiliang University, Hangzhou 310018, China
- Qin Lin
  - Biozeron Biotechnology Co., Ltd, Shanghai 201800, China
- Rongling Wu
  - Center for Statistical Genetics, Departments of Public Health Sciences and Statistics, The Pennsylvania State University, Hershey, PA 17033, USA
  - Corresponding author
- Pei Xu
  - College of Life Sciences, China Jiliang University, Hangzhou 310018, China
  - Corresponding author

3. Arjas A, Hauptmann A, Sillanpää MJ. Estimation of dynamic SNP-heritability with Bayesian Gaussian process models. Bioinformatics 2020; 36:3795-3802. [PMID: 32186692 PMCID: PMC7672693 DOI: 10.1093/bioinformatics/btaa199] [Received: 10/16/2019] [Revised: 03/10/2020] [Accepted: 03/17/2020]
Abstract
Motivation: Improved DNA technology has made it practical to estimate single-nucleotide polymorphism (SNP)-heritability among distantly related individuals with unknown relationships. For growth- and development-related traits, it is meaningful to base SNP-heritability estimation on longitudinal data due to the time-dependency of the process. However, only a few statistical methods have been developed so far for estimating dynamic SNP-heritability and quantifying its full uncertainty.
Results: We introduce a completely tuning-free Bayesian Gaussian process (GP)-based approach for estimating dynamic variance components and heritability as their function. For parameter estimation, we use a modern Markov Chain Monte Carlo method which allows full uncertainty quantification. Several datasets are analysed and our results clearly illustrate that the 95% credible intervals of the proposed joint estimation method (which ‘borrows strength’ from adjacent time points) are significantly narrower than those of a two-stage baseline method that first estimates the variance components at each time point independently and then performs smoothing. We compare the method with a random regression model using MTG2 and BLUPF90 software, and quantitative measures indicate superior performance of our method. Results are presented for simulated and real data with up to 1000 time points. Finally, we demonstrate scalability of the proposed method for simulated data with tens of thousands of individuals.
Availability and implementation: The C++ implementation dynBGP and simulated data are available in GitHub: https://github.com/aarjas/dynBGP. The programmes can be run in R. Real datasets are available in QTL archive: https://phenome.jax.org/centers/QTLA.
Supplementary information: Supplementary data are available at Bioinformatics online.
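The phrase "heritability as their function" can be written out as follows. The notation is assumed here for illustration and may differ from the paper's: \sigma_g^2(t) denotes the genetic variance component and \sigma_e^2(t) the residual variance component at time t.

```latex
h^2(t) = \frac{\sigma_g^2(t)}{\sigma_g^2(t) + \sigma_e^2(t)}
```

Under this reading, uncertainty in h^2(t) follows from the joint posterior of the two time-varying variance components, which is why estimating them jointly across time points can narrow the credible intervals.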
Affiliation(s)
- Arttu Arjas
  - Research Unit of Mathematical Sciences, University of Oulu, Oulu FI-90014, Finland
- Andreas Hauptmann
  - Research Unit of Mathematical Sciences, University of Oulu, Oulu FI-90014, Finland
  - Department of Computer Science, University College London, London WC1E 6BT, UK
- Mikko J Sillanpää
  - Research Unit of Mathematical Sciences, University of Oulu, Oulu FI-90014, Finland
  - Infotech Oulu, University of Oulu, Oulu FI-90014, Finland

4. Onogi A. Connecting mathematical models to genomes: joint estimation of model parameters and genome-wide marker effects on these parameters. Bioinformatics 2020; 36:3169-3176. [PMID: 32101279 DOI: 10.1093/bioinformatics/btaa129] [Received: 09/30/2019] [Revised: 01/17/2020] [Accepted: 02/21/2020]
Abstract
Motivation: Parameters of mathematical models used in biology may be genotype-specific and regarded as new traits. Therefore, an accurate estimation of these parameters and the association mapping on the estimated parameters can lead to important findings regarding the genetic architecture of biological processes. In this study, a statistical framework for a joint analysis (JA) of model parameters and genome-wide marker effects on these parameters was proposed and evaluated.
Results: In the simulation analyses based on different types of mathematical models, the JA inferred the model parameters and identified the responsible genomic regions more accurately than the independent analysis (IA). The JA of real plant data provided interesting insights into photosensitivity, which were uncovered by the IA.
Availability and implementation: The statistical framework is provided by the R package GenomeBasedModel available at https://github.com/Onogi/GenomeBasedModel. All R and C++ scripts used in this study are also available at the site.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Akio Onogi
  - Japan Institute of Crop Science, National Agriculture and Food Research Organization, Tsukuba, Ibaraki 305-8518, Japan
  - Research Center for Agricultural Information Technology, National Agriculture and Food Research Organization, Tsukuba, Ibaraki 305-0856, Japan

5. Vanhatalo J, Li Z, Sillanpää MJ. A Gaussian process model and Bayesian variable selection for mapping function-valued quantitative traits with incomplete phenotypic data. Bioinformatics 2019; 35:3684-3692. [PMID: 30850830 PMCID: PMC6761969 DOI: 10.1093/bioinformatics/btz164] [Received: 06/15/2018] [Revised: 12/05/2018] [Accepted: 03/06/2019]
Abstract
Motivation: Recent advances in high-dimensional phenotyping bring time as an extra dimension into the phenotypes. This promotes quantitative trait locus (QTL) studies of function-valued traits such as those related to growth and development. Existing approaches for analyzing functional traits utilize either parametric methods or semi-parametric approaches based on splines and wavelets. However, very few software tools are currently available for practical implementation of functional QTL mapping and variable selection.
Results: We propose a Bayesian Gaussian process (GP) approach for functional QTL mapping. We use GPs to model the continuously varying coefficients which describe how the effects of molecular markers on the quantitative trait change over time. We use an efficient gradient-based algorithm to estimate the tuning parameters of GPs. Notably, the GP approach is directly applicable to incomplete datasets, even those with a missing data rate above 50% (among phenotypes). We further develop a stepwise algorithm to search through the model space in terms of genetic variants, and use a minimal increase of Bayesian posterior probability as a stopping rule to focus on only a small set of putative QTL. We also discuss the connection between GP and penalized B-splines and wavelets. On two simulated and three real datasets, our GP approach demonstrates great flexibility for modeling different types of phenotypic trajectories with low computational cost. The proposed model selection approach finds the most likely QTL reliably in tested datasets.
Availability and implementation: Software and simulated data are available as a MATLAB package ‘GPQTLmapping’, and they can be downloaded from GitHub (https://github.com/jpvanhat/GPQTLmapping). Real datasets used in case studies are publicly available at QTL Archive.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jarno Vanhatalo
  - Department of Mathematics and Statistics and Organismal and Evolutionary Biology Research Programme, University of Helsinki, Helsinki, Finland
- Zitong Li
  - CSIRO Agriculture & Food, GPO Box 1600, Canberra, ACT 2601, Australia
- Mikko J Sillanpää
  - Department of Mathematical Sciences, Biocenter Oulu and Infotech Oulu, University of Oulu, Oulu FI-90014, Finland

6. Wang N, Chu T, Luo J, Wu R, Wang Z. Funmap2: an R package for QTL mapping using longitudinal phenotypes. PeerJ 2019; 7:e7008. [PMID: 31183256 PMCID: PMC6546077 DOI: 10.7717/peerj.7008] [Received: 10/26/2018] [Accepted: 04/23/2019]
Abstract
Quantitative trait locus (QTL) mapping has been used as a powerful tool for inferring the complexity of the genetic architecture that underlies phenotypic traits. This approach has shown its unique power to map the developmental genetic architecture of complex traits by implementing longitudinal data analysis. Here, we introduce the R package Funmap2 based on the functional mapping framework, which integrates prior biological knowledge into the statistical model. Specifically, the functional mapping framework is engineered to include longitudinal curves that describe the genetic effects and the covariance matrix of the trait of interest. Funmap2 chooses the type of longitudinal curve and covariance matrix automatically using information criteria. Funmap2 is available for download at https://github.com/wzhy2000/Funmap2.
Affiliation(s)
- Nating Wang
  - College of Biological Sciences and Technology, Beijing Forestry University, Beijing, China
- Tinyi Chu
  - Graduate field of Computational Biology, Cornell University, Ithaca, NY, United States of America
- Jiangtao Luo
  - Department of Biostatistics, College of Public Health, University of Nebraska Medical Center, Omaha, NE, United States of America
- Rongling Wu
  - College of Biological Sciences and Technology, Beijing Forestry University, Beijing, China
- Zhong Wang
  - College of Biological Sciences and Technology, Beijing Forestry University, Beijing, China
  - Baker Institute for Animal Health, College of Veterinary Medicine, Cornell College, Ithaca, NY, United States of America

7. Sang M, Shi H, Wei K, Ye M, Jiang L, Sun L, Wu R. A dissection model for mapping complex traits. Plant J 2019; 97:1168-1182. [PMID: 30536697 DOI: 10.1111/tpj.14185] [Received: 07/20/2018] [Revised: 10/26/2018] [Accepted: 11/26/2018]
Abstract
Many quantitative traits are composites of other traits that contribute differentially to genetic variation. Quantitative trait locus (QTL) mapping of these composite traits can benefit from incorporating the mechanistic process of how their formation is mediated by the underlying components. We propose a dissection model by which to map these interconnected component traits under a joint likelihood setting. The model can test how a composite trait is determined by pleiotropic QTLs for its component traits or jointly by different sets of QTLs, each responsible for a different component. The model can visualize the pattern of time-varying genetic effects for individual components and their impacts on composite traits. The dissection model was used to map two composite traits: stemwood volume growth, decomposed into its stem height, stem diameter and stem form components, for Euramerican poplar adult trees; and total lateral root length, constituted by its average lateral root length and lateral root number components, for Euphrates poplar seedlings. We found the pattern of how QTLs for different components contribute to phenotypic variation in composite traits. The detailed understanding of the genetic machineries of composite traits will not only help in the design of molecular breeding in plants and animals, but also shed light on the evolutionary processes of quantitative traits under natural selection.
Affiliation(s)
- Mengmeng Sang
  - Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, Beijing Forestry University, Beijing, 100083, China
  - Center for Computational Biology, College of Biological Sciences and Technology, Beijing Forestry University, Beijing, 100083, China
- Hexin Shi
  - Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, Beijing Forestry University, Beijing, 100083, China
  - Center for Computational Biology, College of Biological Sciences and Technology, Beijing Forestry University, Beijing, 100083, China
- Kun Wei
  - Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, Beijing Forestry University, Beijing, 100083, China
  - Center for Computational Biology, College of Biological Sciences and Technology, Beijing Forestry University, Beijing, 100083, China
- Meixia Ye
  - Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, Beijing Forestry University, Beijing, 100083, China
  - Center for Computational Biology, College of Biological Sciences and Technology, Beijing Forestry University, Beijing, 100083, China
- Libo Jiang
  - Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, Beijing Forestry University, Beijing, 100083, China
  - Center for Computational Biology, College of Biological Sciences and Technology, Beijing Forestry University, Beijing, 100083, China
- Lidan Sun
  - Beijing Key Laboratory of Ornamental Plants Germplasm Innovation and Molecular Breeding, National Engineering Research Center for Floriculture, College of Landscape Architecture, Beijing Forestry University, Beijing, 100083, China
- Rongling Wu
  - Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, Beijing Forestry University, Beijing, 100083, China
  - Center for Computational Biology, College of Biological Sciences and Technology, Beijing Forestry University, Beijing, 100083, China
  - State Key Laboratory of Tree Genetics and Breeding, Chinese Academy of Forestry, Beijing, 100091, China
  - Center for Statistical Genetics, Departments of Public Health Sciences and Statistics, Pennsylvania State University, Hershey, PA, 17033, USA

8. Baker RL, Leong WF, An N, Brock MT, Rubin MJ, Welch S, Weinig C. Bayesian estimation and use of high-throughput remote sensing indices for quantitative genetic analyses of leaf growth. Theor Appl Genet 2018; 131:283-298. [PMID: 29058049 DOI: 10.1007/s00122-017-3001-6] [Received: 12/16/2016] [Accepted: 10/09/2017]
Abstract
We develop Bayesian function-valued trait models that mathematically isolate genetic mechanisms underlying leaf growth trajectories by factoring out genotype-specific differences in photosynthesis. Remote sensing data can be used instead of leaf-level physiological measurements. Characterizing the genetic basis of traits that vary during ontogeny and affect plant performance is a major goal in evolutionary biology and agronomy. Describing genetic programs that specifically regulate morphological traits can be complicated by genotypic differences in physiological traits. We describe the growth trajectories of leaves using novel Bayesian function-valued trait (FVT) modeling approaches in Brassica rapa recombinant inbred lines raised in heterogeneous field settings. While frequentist approaches estimate parameter values by treating each experimental replicate discretely, Bayesian models can utilize information in the global dataset, potentially leading to more robust trait estimation. We illustrate this principle by estimating growth asymptotes in the face of missing data and comparing heritabilities of growth trajectory parameters estimated by Bayesian and frequentist approaches. Using pseudo-Bayes factors, we compare the performance of an initial Bayesian logistic growth model and a model that incorporates carbon assimilation (Amax) as a cofactor, thus statistically accounting for genotypic differences in carbon resources. We further evaluate two remotely sensed spectroradiometric indices, photochemical reflectance (pri2) and MERIS Terrestrial Chlorophyll Index (mtci), as covariates in lieu of Amax, because these two indices were genetically correlated with Amax across years and treatments yet allow much higher throughput compared to direct leaf-level gas-exchange measurements. For leaf lengths in uncrowded settings, including Amax improves model fit over the initial model. The mtci and pri2 indices also outperform direct Amax measurements. Of particular importance for evolutionary biologists and plant breeders, hierarchical Bayesian models estimating FVT parameters improve heritabilities compared to frequentist approaches.
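For readers unfamiliar with function-valued traits, the following minimal sketch shows a three-parameter logistic growth curve and one hypothetical way a cofactor could enter it. The parameterization, the linear link between the asymptote and the cofactor, and all names (`logistic_length`, `asymptote_with_cofactor`) are assumptions made for illustration; the paper's actual model specification is not given in the abstract.

```python
import math


def logistic_length(t, asymptote, rate, midpoint):
    """Leaf length at time t under a three-parameter logistic curve."""
    return asymptote / (1.0 + math.exp(-rate * (t - midpoint)))


def asymptote_with_cofactor(intercept, slope, cofactor):
    """Hypothetical link: the growth asymptote depends linearly on a
    cofactor such as a carbon-assimilation measurement or a spectral index."""
    return intercept + slope * cofactor


# Illustrative values only: one genotype measured on days 0, 5, ..., 30.
asymptote = asymptote_with_cofactor(intercept=80.0, slope=2.0, cofactor=10.0)
trajectory = [
    logistic_length(day, asymptote, rate=0.3, midpoint=10.0)
    for day in range(0, 31, 5)
]
print(asymptote, trajectory)
```

In a function-valued trait analysis, the curve parameters (here the asymptote, rate and midpoint) are the traits of interest, so heritabilities are computed on them and not on the raw measurements.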
Affiliation(s)
- Robert L Baker
  - Department of Botany, University of Wyoming, Laramie, WY, 82071, USA
  - Biology Department, Miami University, Oxford, OH, 45056, USA
- Wen Fung Leong
  - Department of Agronomy, Kansas State University, Manhattan, KS, 66506, USA
- Nan An
  - Department of Agronomy, Kansas State University, Manhattan, KS, 66506, USA
- Marcus T Brock
  - Department of Botany, University of Wyoming, Laramie, WY, 82071, USA
- Matthew J Rubin
  - Department of Botany, University of Wyoming, Laramie, WY, 82071, USA
- Stephen Welch
  - Department of Agronomy, Kansas State University, Manhattan, KS, 66506, USA
- Cynthia Weinig
  - Department of Botany, University of Wyoming, Laramie, WY, 82071, USA
  - Department of Molecular Biology, University of Wyoming, Laramie, WY, 82071, USA

9. Liu J, Ye M, Zhu S, Jiang L, Sang M, Gan J, Wang Q, Huang M, Wu R. Two-stage identification of SNP effects on dynamic poplar growth. Plant J 2018; 93:286-296. [PMID: 29168265 DOI: 10.1111/tpj.13777] [Received: 07/17/2017] [Revised: 10/16/2017] [Accepted: 10/23/2017]
Abstract
This project proposes an approach to identify significant single nucleotide polymorphism (SNP) effects, both additive and dominant, on the dynamic growth of poplar in diameter and height. The annual changes in yearly phenotypes based on regular observation periods are considered to represent multiple responses. In total 156,362 candidate SNPs are studied, and the phenotypes of 64 poplar trees are recorded. To address this ultrahigh dimensionality issue, this paper adopts a two-stage approach. First, the conventional genome-wide association studies (GWAS) and the distance correlation sure independence screening (DC-SIS) methods (Li et al., 2012) were combined to reduce the model dimensions at the sample size; second, a grouped penalized regression was applied to further refine the model and choose the final sparse SNPs. The multiple response issue was also carefully addressed. The SNP effects on the dynamic diameter and height growth patterns of poplar were systematically analyzed. In addition, a series of intensive simulation studies was performed to validate the proposed approach.
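The first (screening) stage can be sketched as follows. This is a simplified illustration and not the authors' procedure: absolute Pearson correlation is swapped in for the GWAS and distance-correlation (DC-SIS) statistics, a single response is used in place of multiple responses, and the second-stage grouped penalized regression is not shown. The function names and the toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def screen_snps(genotypes, phenotype, keep):
    """Rank SNPs by absolute marginal correlation with the phenotype
    and return the identifiers of the `keep` highest-ranked SNPs."""
    scores = {
        snp: abs(pearson(codes, phenotype)) for snp, codes in genotypes.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:keep]


# Toy data: six individuals, three SNPs coded as 0/1/2 allele counts.
phenotype = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
genotypes = {
    "snpA": [0, 0, 1, 1, 2, 2],
    "snpB": [1, 0, 2, 1, 0, 2],
    "snpC": [1, 1, 1, 1, 1, 1],  # monomorphic, carries no signal
}
selected = screen_snps(genotypes, phenotype, keep=2)
print(selected)
```

Screening of this kind reduces the number of candidate SNPs to roughly the sample size, so that a penalized regression in the second stage becomes tractable.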
Affiliation(s)
- Jingyuan Liu
  - Department of Statistics in School of Economics, Wang Yanan Institute for Studies in Economics, Fujian Key Laboratory of Statistical Science, Xiamen University, China
- Meixia Ye
  - Center for Computational Biology, College of Biological Sciences and Technology, Beijing Forestry University, Beijing, 100081, China
- Sheng Zhu
  - Jiangsu Key Laboratory for Poplar Germplasm Enhancement and Variety Improvement, Nanjing Forestry University, Nanjing, 210037, China
- Libo Jiang
  - Center for Computational Biology, College of Biological Sciences and Technology, Beijing Forestry University, Beijing, 100081, China
- Mengmeng Sang
  - Center for Computational Biology, College of Biological Sciences and Technology, Beijing Forestry University, Beijing, 100081, China
- Jingwen Gan
  - Center for Computational Biology, College of Biological Sciences and Technology, Beijing Forestry University, Beijing, 100081, China
- Qian Wang
  - Center for Computational Biology, College of Biological Sciences and Technology, Beijing Forestry University, Beijing, 100081, China
- Minren Huang
  - Jiangsu Key Laboratory for Poplar Germplasm Enhancement and Variety Improvement, Nanjing Forestry University, Nanjing, 210037, China
- Rongling Wu
  - Center for Computational Biology, College of Biological Sciences and Technology, Beijing Forestry University, Beijing, 100081, China
  - Department of Public Health Sciences, Penn State Hershey College of Medicine, Hershey, PA 17033, USA

10. Jiang L, Zhang M, Sang M, Ye M, Wu R. Evo-Devo-EpiR: a genome-wide search platform for epistatic control on the evolution of development. Brief Bioinform 2017; 18:754-760. [PMID: 27473062 DOI: 10.1093/bib/bbw062] [Received: 03/18/2016]
Abstract
Evo-devo is a theory proposed to study how phenotypes evolve by comparing the developmental processes of different organisms or the same organism experiencing changing environments. It has been recognized that nonallelic interactions at different genes or quantitative trait loci, known as epistasis, may play a pivotal role in the evolution of development, but it has proven difficult to quantify this role and assemble it into a coherent picture. We implement a high-dimensional genome-wide association study model within the evo-devo paradigm and package it as the R-based Evo-Devo-EpiR, aimed at facilitating the genome-wide landscaping of epistasis for the diversification of phenotypic development. By analyzing a high-throughput assay of DNA markers and their pairs simultaneously, Evo-Devo-EpiR can systematically characterize various epistatic interactions that affect the pattern and timing of development and its evolution. Enabling a global search for all possible genetic interactions for developmental processes throughout the whole genome, Evo-Devo-EpiR provides a computational tool to illustrate a precise genotype-phenotype map at the interface between epistasis, development and evolution.

11. Muraya MM, Chu J, Zhao Y, Junker A, Klukas C, Reif JC, Altmann T. Genetic variation of growth dynamics in maize (Zea mays L.) revealed through automated non-invasive phenotyping. Plant J 2017; 89:366-380. [PMID: 27714888 DOI: 10.1111/tpj.13390] [Received: 03/01/2016] [Revised: 09/14/2016] [Accepted: 09/19/2016]
Abstract
Hitherto, most quantitative trait loci of maize growth and biomass yield have been identified for a single time point, usually the final harvest stage. Through this approach cumulative effects are detected, without considering genetic factors causing phase-specific differences in growth rates. To assess the genetics of growth dynamics, we employed automated non-invasive phenotyping to monitor the plant sizes of 252 diverse maize inbred lines at 11 different developmental time points; 50k SNP array genotype data were used for genome-wide association mapping and genomic selection. The heritability of biomass was estimated to be over 71%, and the average prediction accuracy amounted to 0.39. Using the individual time point data, 12 main effect marker-trait associations (MTAs) and six pairs of epistatic interactions were detected that displayed different patterns of expression at various developmental time points. A subset of them also showed significant effects on relative growth rates in different intervals. The detected MTAs jointly explained up to 12% of the total phenotypic variation, decreasing with developmental progression. Using non-parametric functional mapping and multivariate mapping approaches, four additional marker loci affecting growth dynamics were detected. Our results demonstrate that plant biomass accumulation is a complex trait governed by many small effect loci, most of which act at certain restricted developmental phases. This highlights the need for investigation of stage-specific growth affecting genes to elucidate important processes operating at different developmental phases.
Affiliation(s)
- Moses M Muraya
  - Department of Molecular Genetics, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, D-06466, Seeland, Germany
  - Department of Plant Sciences, Chuka University, P.O. Box 109 - 60400, Chuka, Kenya
- Jianting Chu
  - Department of Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, D-06466, Seeland, Germany
- Yusheng Zhao
  - Department of Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, D-06466, Seeland, Germany
- Astrid Junker
  - Department of Molecular Genetics, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, D-06466, Seeland, Germany
- Christian Klukas
  - Department of Molecular Genetics, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, D-06466, Seeland, Germany
- Jochen C Reif
  - Department of Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, D-06466, Seeland, Germany
- Thomas Altmann
  - Department of Molecular Genetics, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, D-06466, Seeland, Germany

12. Onogi A, Watanabe M, Mochizuki T, Hayashi T, Nakagawa H, Hasegawa T, Iwata H. Toward integration of genomic selection with crop modelling: the development of an integrated approach to predicting rice heading dates. Theor Appl Genet 2016; 129:805-817. [PMID: 26791836 DOI: 10.1007/s00122-016-2667-5] [Received: 08/19/2015] [Accepted: 01/09/2016]
Abstract
It is suggested that accuracy in predicting plant phenotypes can be improved by integrating genomic prediction with crop modelling in a single hierarchical model. Accurate prediction of phenotypes is important for plant breeding and management. Although genomic prediction/selection aims to predict phenotypes on the basis of whole-genome marker information, it is often difficult to predict phenotypes of complex traits in diverse environments, because plant phenotypes are often influenced by genotype-environment interaction. A possible remedy is to integrate genomic prediction with crop/ecophysiological modelling, which enables us to predict plant phenotypes using environmental and management information. To this end, in the present study, we developed a novel method for integrating genomic prediction with phenological modelling of Asian rice (Oryza sativa L.), allowing the heading date of untested genotypes in untested environments to be predicted. The method simultaneously infers the phenological model parameters and whole-genome marker effects on the parameters in a Bayesian framework. By cultivating backcross inbred lines of Koshihikari × Kasalath in nine environments, we evaluated the potential of the proposed method in comparison with conventional genomic prediction, phenological modelling, and two-step methods that applied genomic prediction to phenological model parameters inferred from Nelder-Mead or Markov chain Monte Carlo algorithms. In predicting heading dates of untested lines in untested environments, the proposed and two-step methods tended to provide more accurate predictions than the conventional genomic prediction methods, particularly in environments where phenotypes from environments similar to the target environment were unavailable for training genomic prediction. The proposed method showed greater accuracy in prediction than the two-step methods in all cross-validation schemes tested, suggesting the potential of the integrated approach in the prediction of phenotypes of plants.
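The two-step comparator mentioned in the abstract (estimate a phenological parameter per line, then apply genomic prediction to that parameter) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the model used in the study: a simple degree-day model in which heading occurs once accumulated thermal time above `T_BASE` reaches a line-specific requirement, a direct averaging estimator standing in for Nelder-Mead or MCMC, and ridge regression standing in for the genomic prediction step. All function names are hypothetical.

```python
import numpy as np

T_BASE = 10.0  # assumed base temperature (deg C); illustrative value only


def thermal_time(temps):
    """Cumulative degree-days above T_BASE for one environment's daily temperatures."""
    return np.cumsum(np.maximum(temps - T_BASE, 0.0))


def heading_day(temps, g_req):
    """First day (1-based) on which accumulated thermal time reaches g_req."""
    return np.searchsorted(thermal_time(temps), g_req, side="left") + 1


def estimate_requirement(days, env_temps):
    """Step 1: per-line thermal requirement from observed heading days.

    days is a (lines x environments) integer array. The estimate is the thermal
    time accumulated at the observed heading day, averaged over environments.
    """
    cum = [thermal_time(t) for t in env_temps]
    return np.mean([cum[e][days[:, e] - 1] for e in range(len(env_temps))], axis=0)


def ridge_fit(Z, y, lam=1.0):
    """Step 2: ridge regression of the estimated parameter on marker genotypes."""
    mu_z, mu_y = Z.mean(axis=0), y.mean()
    Zc = Z - mu_z
    beta = np.linalg.solve(Zc.T @ Zc + lam * np.eye(Z.shape[1]), Zc.T @ (y - mu_y))
    return mu_y - mu_z @ beta, beta


def predict_heading(Z_new, temps_new, intercept, beta):
    """Step 3: heading days of untested lines in an untested environment."""
    return heading_day(temps_new, intercept + Z_new @ beta)
```

Because marker effects act on the model parameter rather than on the phenotype, a prediction for a new environment needs only that environment's temperature series. The integrated approach in the abstract differs in that it infers the parameters and the marker effects jointly instead of in sequence.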
Affiliation(s)
- Akio Onogi: Department of Agricultural and Environmental Biology, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-ku, Tokyo, 113-8657, Japan
- Maya Watanabe: Department of Agricultural and Environmental Biology, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-ku, Tokyo, 113-8657, Japan
- Takeshi Hayashi: National Agriculture and Food Research Organization Agricultural Research Center, Tsukuba, Ibaraki, 305-8666, Japan
- Hiroshi Nakagawa: National Agriculture and Food Research Organization Agricultural Research Center, Tsukuba, Ibaraki, 305-8666, Japan
- Toshihiro Hasegawa: National Institute for Agro-Environmental Sciences, Tsukuba, Ibaraki, 305-8604, Japan
- Hiroyoshi Iwata: Department of Agricultural and Environmental Biology, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-ku, Tokyo, 113-8657, Japan
|
13
|
Li Z, Sillanpää MJ. Dynamic Quantitative Trait Locus Analysis of Plant Phenomic Data. Trends in Plant Science 2015; 20:822-833. [PMID: 26482958 DOI: 10.1016/j.tplants.2015.08.012] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 03/18/2015] [Revised: 08/12/2015] [Accepted: 08/26/2015] [Indexed: 05/27/2023]
Abstract
Advanced platforms have recently become available for automatic and systematic quantification of plant growth and development. These new techniques can efficiently produce multiple measurements of phenotypes over time, and introduce time as an extra dimension to quantitative trait locus (QTL) studies. Functional mapping utilizes a class of statistical models for identifying QTLs associated with the growth characteristics of interest. A major benefit of functional mapping is that it integrates information over multiple timepoints, and therefore could increase the statistical power for QTL detection. We review the current development of computationally efficient functional mapping methods which provide invaluable tools for analyzing large-scale timecourse data that are readily available in our post-genome era.
Affiliation(s)
- Zitong Li: Biocenter Oulu, Oulu, Finland; Department of Mathematical Sciences and Department of Biology, University of Oulu, 90014 Oulu, Finland
- Mikko J Sillanpää: Biocenter Oulu, Oulu, Finland; Department of Mathematical Sciences and Department of Biology, University of Oulu, 90014 Oulu, Finland
|
14
|
Mapping Quantitative Trait Loci Underlying Function-Valued Traits Using Functional Principal Component Analysis and Multi-Trait Mapping. G3: Genes, Genomes, Genetics 2015; 6:79-86. [PMID: 26530421 PMCID: PMC4704727 DOI: 10.1534/g3.115.024133] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Indexed: 01/03/2023]
Abstract
We previously proposed a simple regression-based method to map quantitative trait loci underlying function-valued phenotypes. In order to better handle the case of noisy phenotype measurements and accommodate the correlation structure among time points, we propose an alternative approach that maintains much of the simplicity and speed of the regression-based method. We overcome noisy measurements by replacing the observed data with a smooth approximation. We then apply functional principal component analysis, replacing the smoothed phenotype data with a small number of principal components. Quantitative trait locus mapping is applied to these dimension-reduced data, either with a multi-trait method or by considering the traits individually and then taking the average or maximum LOD score across traits. We apply these approaches to root gravitropism data on Arabidopsis recombinant inbred lines and further investigate their performance in computer simulations. Our methods have been implemented in the R package, funqtl.
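The pipeline in this abstract (smooth, reduce with functional PCA, then map each component and combine LOD scores) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the funqtl implementation: a moving average stands in for the smoothing step, functional PCA is approximated by an SVD of the centred curves on a common time grid, and the LOD score is the standard single-marker regression form, (n/2)·log10(RSS0/RSS1). All function names are hypothetical.

```python
import numpy as np


def smooth_curves(Y, window=3):
    """Moving-average smoothing along time (window should be odd); rows are individuals."""
    kernel = np.ones(window) / window
    pad = window // 2
    Ypad = np.pad(Y, ((0, 0), (pad, pad)), mode="edge")
    return np.array([np.convolve(row, kernel, mode="valid") for row in Ypad])


def fpca_scores(Y, n_components=2):
    """Principal component scores of the centred curves (SVD-based approximation)."""
    Yc = Y - Y.mean(axis=0)
    U, s, _ = np.linalg.svd(Yc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]


def lod_single_marker(y, g):
    """LOD score from regressing one trait on one marker: (n/2) * log10(RSS0 / RSS1)."""
    n = len(y)
    rss0 = np.sum((y - y.mean()) ** 2)
    X = np.column_stack([np.ones(n), g])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss1 = np.sum((y - X @ beta) ** 2)
    return (n / 2) * np.log10(rss0 / rss1)


def scan_markers(Y, G, n_components=2):
    """Map each principal component separately, then combine by average and maximum LOD."""
    scores = fpca_scores(smooth_curves(Y), n_components)
    lod = np.array([[lod_single_marker(scores[:, k], G[:, j])
                     for j in range(G.shape[1])]
                    for k in range(n_components)])
    return lod.mean(axis=0), lod.max(axis=0)
```

Here `Y` is an individuals-by-timepoints matrix and `G` an individuals-by-markers genotype matrix. The sketch covers only the "traits individually, then average or maximum LOD" variant; the multi-trait method mentioned in the abstract is not shown.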
|
15
|
Abstract
Despite increasing emphasis on the genetic study of quantitative traits, we are still far from being able to chart a clear picture of their genetic architecture, given an inherent complexity involved in trait formation. A competing theory for studying such complex traits has emerged by viewing their phenotypic formation as a "system" in which a high-dimensional group of interconnected components act and interact across different levels of biological organization from molecules through cells to whole organisms. This system is initiated by a machinery of DNA sequences that regulate a cascade of biochemical pathways to synthesize endophenotypes and further assemble these endophenotypes toward the end-point phenotype in virtue of various developmental changes. This review focuses on a conceptual framework for genetic mapping of complex traits by which to delineate the underlying components, interactions and mechanisms that govern the system according to biological principles and understand how these components function synergistically under the control of quantitative trait loci (QTLs) to comprise a unified whole. This framework is built by a system of differential equations that quantifies how alterations of different components lead to the global change of trait development and function, and provides a quantitative and testable platform for assessing the multiscale interplay between QTLs and development. The method will enable geneticists to shed light on the genetic complexity of any biological system and predict, alter or engineer its physiological and pathological states.
Affiliation(s)
- Lidan Sun: National Engineering Research Center for Floriculture, College of Landscape Architecture, Beijing Forestry University, Beijing 100083, China; Center for Statistical Genetics, Departments of Public Health Sciences and Statistics, The Pennsylvania State University, Hershey, PA 17033, USA
- Rongling Wu: Center for Computational Biology, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China; Center for Statistical Genetics, Departments of Public Health Sciences and Statistics, The Pennsylvania State University, Hershey, PA 17033, USA
|
16
|
Sun L, Wu R. Toward the practical utility of systems mapping: Reply to comments on "Mapping complex traits as a dynamic system". Phys Life Rev 2015; 13:198-201. [PMID: 26009264 DOI: 10.1016/j.plrev.2015.04.038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2015] [Accepted: 04/29/2015] [Indexed: 11/19/2022]
Affiliation(s)
- Lidan Sun: Beijing Key Laboratory of Ornamental Plants Germplasm Innovation and Molecular Breeding, National Engineering Research Center for Floriculture, College of Landscape Architecture, Beijing Forestry University, Beijing 100083, China; Center for Statistical Genetics, Departments of Public Health Sciences and Statistics, The Pennsylvania State University, Hershey, PA 17033, USA
- Rongling Wu: Center for Computational Biology, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China; Center for Statistical Genetics, Departments of Public Health Sciences and Statistics, The Pennsylvania State University, Hershey, PA 17033, USA
|
17
|
Pasanen L, Holmström L, Sillanpää MJ. Bayesian LASSO, scale space and decision making in association genetics. PLoS One 2015; 10:e0120017. [PMID: 25856391 PMCID: PMC4391919 DOI: 10.1371/journal.pone.0120017] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 05/26/2014] [Accepted: 12/25/2014] [Indexed: 12/22/2022] Open
Abstract
BACKGROUND: LASSO is a penalized regression method that facilitates model fitting in situations where there are as many, or even more explanatory variables than observations, and only a few variables are relevant in explaining the data. We focus on the Bayesian version of LASSO and consider four problems that need special attention: (i) controlling false positives, (ii) multiple comparisons, (iii) collinearity among explanatory variables, and (iv) the choice of the tuning parameter that controls the amount of shrinkage and the sparsity of the estimates. The particular application considered is association genetics, where LASSO regression can be used to find links between chromosome locations and phenotypic traits in a biological organism. However, the proposed techniques are relevant also in other contexts where LASSO is used for variable selection. RESULTS: We separate the true associations from false positives using the posterior distribution of the effects (regression coefficients) provided by Bayesian LASSO. We propose to solve the multiple comparisons problem by using simultaneous inference based on the joint posterior distribution of the effects. Bayesian LASSO also tends to distribute an effect among collinear variables, making detection of an association difficult. We propose to solve this problem by considering not only individual effects but also their functionals (i.e. sums and differences). Finally, whereas in Bayesian LASSO the tuning parameter is often regarded as a random variable, we adopt a scale space view and consider a whole range of fixed tuning parameters, instead. The effect estimates and the associated inference are considered for all tuning parameters in the selected range and the results are visualized with color maps that provide useful insights into data and the association problem considered. The methods are illustrated using two sets of artificial data and one real data set, all representing typical settings in association genetics.
Affiliation(s)
- Leena Pasanen: Department of Mathematical Sciences, University of Oulu, Oulu, Finland
- Lasse Holmström: Department of Mathematical Sciences, University of Oulu, Oulu, Finland
- Mikko J. Sillanpää: Department of Mathematical Sciences, University of Oulu, Oulu, Finland; Department of Biology, University of Oulu, Oulu, Finland; Biocenter Oulu, Oulu, Finland
|
18
|
Functional multi-locus QTL mapping of temporal trends in Scots pine wood traits. G3: Genes, Genomes, Genetics 2014; 4:2365-79. [PMID: 25305041 PMCID: PMC4267932 DOI: 10.1534/g3.114.014068] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Indexed: 11/18/2022]
Abstract
Quantitative trait loci (QTL) mapping of wood properties in conifer species has focused on single time point measurements or on trait means based on heterogeneous wood samples (e.g., increment cores), thus ignoring systematic within-tree trends. In this study, functional QTL mapping was performed for a set of important wood properties in increment cores from a 17-yr-old Scots pine (Pinus sylvestris L.) full-sib family with the aim of detecting wood trait QTL for general intercepts (means) and for linear slopes by increasing cambial age. Two multi-locus functional QTL analysis approaches were proposed and their performances were compared on trait datasets comprising 2 to 9 time points, 91 to 455 individual tree measurements and genotype datasets of amplified length polymorphisms (AFLP), and single nucleotide polymorphism (SNP) markers. The first method was a multilevel LASSO analysis whereby trend parameter estimation and QTL mapping were conducted consecutively; the second method was our Bayesian linear mixed model whereby trends and underlying genetic effects were estimated simultaneously. We also compared several different hypothesis testing methods under either the LASSO or the Bayesian framework to perform QTL inference. In total, five and four significant QTL were observed for the intercepts and slopes, respectively, across wood traits such as earlywood percentage, wood density, radial fiberwidth, and spiral grain angle. Four of these QTL were represented by candidate gene SNPs, thus providing promising targets for future research in QTL mapping and molecular function. Bayesian and LASSO methods both detected similar sets of QTL given datasets that comprised large numbers of individuals.
|
19
|
Zeng Y, Ye S, Yu W, Wu S, Hou W, Wu R, Dai W, Chang J. Genetic linkage map construction and QTL identification of juvenile growth traits in Torreya grandis. BMC Genet 2014; 15 Suppl 1:S2. [PMID: 25079139 PMCID: PMC4118616 DOI: 10.1186/1471-2156-15-s1-s2] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 11/10/2022] Open
Abstract
Torreya grandis Fort. ex Lindl, a conifer species widely distributed in Southeastern China, is of high economic value by producing edible, nutrient seeds. However, knowledge about the genome structure and organization of this species is poorly understood, thereby limiting the effective use of its gene resources. Here, we report on a first genetic linkage map for Torreya grandis using 96 progeny randomly chosen from a half-sib family of a commercially cultivated variety of this species, Torreya grandis Fort. ex Lindl cv. Merrillii. The map contains 262 molecular markers, i.e., 75 random amplified polymorphic DNAs (RAPD), 119 inter-simple sequence repeats (ISSR) and 62 amplified fragments length polymorphisms (AFLP), and spans a total of 7,139.9 cM, separated by 10 linkage groups. The linkage map was used to map quantitative trait loci (QTLs) associated with juvenile growth traits by functional mapping. We identified four basal diameter-related QTLs on linkage groups 1, 5 and 9; four height-related QTLs on linkage groups 1, 2, 5 and 8. It was observed that the genetic effects of QTLs on growth traits vary with age, suggesting the dynamic behavior of growth QTLs. Part of the QTLs was found to display a pleiotropic effect on basal diameter growth and height growth.
|
20
|
Abstract
Most statistical methods for quantitative trait loci (QTL) mapping focus on a single phenotype. However, multiple phenotypes are commonly measured, and recent technological advances have greatly simplified the automated acquisition of numerous phenotypes, including function-valued phenotypes, such as growth measured over time. While methods exist for QTL mapping with function-valued phenotypes, they are generally computationally intensive and focus on single-QTL models. We propose two simple, fast methods that maintain high power and precision and are amenable to extensions with multiple-QTL models using a penalized likelihood approach. After identifying multiple QTL by these approaches, we can view the function-valued QTL effects to provide a deeper understanding of the underlying processes. Our methods have been implemented as a package for R, funqtl.
|
21
|
Pikkuhookana P, Sillanpää MJ. Combined linkage disequilibrium and linkage mapping: Bayesian multilocus approach. Heredity (Edinb) 2013; 112:351-60. [PMID: 24253936 DOI: 10.1038/hdy.2013.111] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 04/22/2013] [Revised: 09/02/2013] [Accepted: 09/27/2013] [Indexed: 01/24/2023] Open
Abstract
Quantitative trait loci (QTL) affecting the phenotype of interest can be detected using linkage analysis (LA), linkage disequilibrium (LD) mapping or a combination of both (LDLA). The LA approach uses information from recombination events within the observed pedigree and LD mapping from the historical recombinations within the unobserved pedigree. We propose the Bayesian variable selection approach for combined LDLA analysis for single-nucleotide polymorphism (SNP) data. The novel approach uses both sources of information simultaneously as is commonly done in plant and animal genetics, but it makes fewer assumptions about population demography than previous LDLA methods. This differs from approaches in human genetics, where LDLA methods use LA information conditional on LD information or the other way round. We argue that the multilocus LDLA model is more powerful for the detection of phenotype-genotype associations than single-locus LDLA analysis. To illustrate the performance of the Bayesian multilocus LDLA method, we analyzed simulation replicates based on real SNP genotype data from small three-generational CEPH families and compared the results with commonly used quantitative transmission disequilibrium test (QTDT). This paper is intended to be conceptual in the sense that it is not meant to be a practical method for analyzing high-density SNP data, which is more common. Our aim was to test whether this approach can function in principle.
Affiliation(s)
- P Pikkuhookana: [1] Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland [2] Department of Biology, University of Oulu, Oulu, Finland [3] Department of Mathematical Sciences, University of Oulu, Oulu, Finland [4] Biocenter Oulu, University of Oulu, Oulu, Finland
- M J Sillanpää: [1] Department of Biology, University of Oulu, Oulu, Finland [2] Department of Mathematical Sciences, University of Oulu, Oulu, Finland [3] Biocenter Oulu, University of Oulu, Oulu, Finland
|
22
|
Abstract
In biology, many quantitative traits are dynamic in nature. They can often be described by some smooth functions or curves. A joint analysis of all the repeated measurements of the dynamic traits by functional quantitative trait loci (QTL) mapping methods has the benefits to (1) understand the genetic control of the whole dynamic process of the quantitative traits and (2) improve the statistical power to detect QTL. One crucial issue in functional QTL mapping is how to correctly describe the smoothness of trajectories of functional valued traits. We develop an efficient Bayesian nonparametric multiple-loci procedure for mapping dynamic traits. The method uses the Bayesian P-splines with (nonparametric) B-spline bases to specify the functional form of a QTL trajectory and a random walk prior to automatically determine its degree of smoothness. An efficient deterministic variational Bayes algorithm is used to implement both (1) the search of an optimal subset of QTL among large marker panels and (2) estimation of the genetic effects of the selected QTL changing over time. Our method can be fast even on some large-scale data sets. The advantages of our method are illustrated on both simulated and real data sets.
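The smoothing idea in this abstract can be illustrated with a minimal P-spline sketch. It rests on assumptions not stated in the abstract: cubic B-splines on equally spaced knots, a second-order random walk on the coefficients, and a fixed smoothing parameter `lam`. With the prior variance held fixed, the posterior mode under a Gaussian random walk prior coincides with a difference-penalised least-squares fit, which is what the code computes. The variational Bayes algorithm and the QTL search of the paper are not reproduced, and the function names are hypothetical.

```python
import numpy as np


def bspline_basis(x, n_basis=10, degree=3):
    """B-spline design matrix on equally spaced knots (Cox-de Boor recursion)."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    n_seg = n_basis - degree                  # number of intervals inside [lo, hi]
    h = (hi - lo) / n_seg
    knots = lo + h * np.arange(-degree, n_seg + degree + 1)
    # Degree 0: indicator of the knot interval that contains each x.
    B = ((x[:, None] >= knots[None, :-1]) & (x[:, None] < knots[None, 1:])).astype(float)
    for d in range(1, degree + 1):
        left = (x[:, None] - knots[None, :-d - 1]) / (knots[d:-1] - knots[:-d - 1])
        right = (knots[None, d + 1:] - x[:, None]) / (knots[d + 1:] - knots[1:-d])
        B = left * B[:, :-1] + right * B[:, 1:]
    return B


def pspline_fit(x, y, n_basis=10, degree=3, lam=1.0, diff_order=2):
    """Penalised least squares: minimise ||y - B c||^2 + lam * ||D c||^2."""
    B = bspline_basis(x, n_basis, degree)
    D = np.diff(np.eye(n_basis), n=diff_order, axis=0)   # difference (random walk) operator
    coef = np.linalg.solve(B.T @ B + lam * (D.T @ D), B.T @ y)
    return B @ coef, coef
```

A small `lam` lets the curve follow the data closely, while a large `lam` with `diff_order=2` pulls the fit toward a straight line. In the Bayesian formulation the smoothness is learned from the data rather than fixed in advance.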
|
23
|
McPherson S, Barbosa-Leiker C. An example of a two-part latent growth curve model for semicontinuous outcomes in the health sciences. J Appl Stat 2012. [DOI: 10.1080/02664763.2012.702205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2022]
|
24
|
A decision rule for quantitative trait locus detection under the extended Bayesian LASSO model. Genetics 2012; 192:1483-91. [PMID: 22982577 DOI: 10.1534/genetics.111.130278] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022] Open
Abstract
Bayesian shrinkage analysis is arguably the state-of-the-art technique for large-scale multiple quantitative trait locus (QTL) mapping. However, when the shrinkage model does not involve indicator variables for marker inclusion, QTL detection remains heavily dependent on significance thresholds derived from phenotype permutation under the null hypothesis of no phenotype-to-genotype association. This approach is computationally intensive and more importantly, the hypothetical data generation at the heart of the permutation-based method violates the Bayesian philosophy. Here we propose a fully Bayesian decision rule for QTL detection under the recently introduced extended Bayesian LASSO for QTL mapping. Our new decision rule is free of any hypothetical data generation and relies on the well-established Bayes factors for evaluating the evidence for QTL presence at any locus. Simulation results demonstrate the remarkable performance of our decision rule. An application to real-world data is considered as well.
|