1
Liang Z, Prakapenka D, Da Y. Comparison of the Accuracy of Epistasis and Haplotype Models for Genomic Prediction of Seven Human Phenotypes. Biomolecules 2023; 13:1478. [PMID: 37892160 PMCID: PMC10604971 DOI: 10.3390/biom13101478] [Received: 08/07/2023] [Revised: 09/26/2023] [Accepted: 09/28/2023] [Indexed: 10/29/2023] Open
Abstract
The accuracy of predicting seven human phenotypes of 3657-7564 individuals using global epistasis effects was evaluated and compared to the accuracy of haplotype genomic prediction using 380,705 SNPs and 10-fold cross-validation studies. The seven human phenotypes were the normality transformed high density lipoproteins (HDL), low density lipoproteins (LDL), total cholesterol (TC), triglycerides (TG), weight (WT), and the original phenotypic observations of height (HTo) and body mass index (BMIo). Fourth-order epistasis effects virtually had no contribution to the phenotypic variances, and third-order epistasis effects did not affect the prediction accuracy. Without haplotype effects in the prediction model, pairwise epistasis effects improved the prediction accuracy over the SNP models for six traits, with accuracy increases of 2.41%, 3.85%, 0.70%, 0.97%, 0.62% and 0.93% for HDL, LDL, TC, HTo, WT and BMIo respectively. However, none of the epistasis models had higher prediction accuracy than the haplotype models we previously reported. The epistasis model for TG decreased the prediction accuracy by 2.35% relative to the accuracy of the SNP model. The integrated models with epistasis and haplotype effects had slightly higher prediction accuracy than the haplotype models for two traits, HDL and BMIo. These two traits were the only traits where additive × dominance effects increased the prediction accuracy. These results indicated that haplotype effects containing local high-order epistasis effects had a tendency to be more important than global pairwise epistasis effects for the seven human phenotypes, and that the genetic mechanism of HDL and BMIo was more complex than that of the other traits.
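As a rough illustration of how a 10-fold cross-validation accuracy of this kind can be computed, the Python sketch below uses simulated genotypes and a plain ridge regression on SNP codes. The ridge model, the simulated data, and the names `kfold_accuracy`, `ridge`, and `seed` are assumptions made for this sketch; it does not reproduce the epistasis or haplotype models evaluated in the paper.

```python
import numpy as np

def kfold_accuracy(X, y, k=10, ridge=1.0, seed=0):
    """Mean correlation between predicted and observed phenotypes
    across k validation folds (hypothetical helper, ridge on SNP codes)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = np.array_split(rng.permutation(n), k)
    accs = []
    for val in folds:
        train = np.setdiff1d(np.arange(n), val)
        mu = X[train].mean(axis=0)          # center markers on the training set
        Xt = X[train] - mu
        y_mean = y[train].mean()
        # Ridge solution in dual form: beta = Xt' (Xt Xt' + ridge*I)^-1 (y - mean)
        K = Xt @ Xt.T + ridge * np.eye(len(train))
        beta = Xt.T @ np.linalg.solve(K, y[train] - y_mean)
        pred = (X[val] - mu) @ beta + y_mean
        accs.append(np.corrcoef(pred, y[val])[0, 1])
    return float(np.mean(accs))

# Simulated example: 300 individuals, 50 SNPs coded 0/1/2, additive effects only.
rng = np.random.default_rng(1)
X = rng.binomial(2, 0.5, size=(300, 50)).astype(float)
y = X @ rng.normal(size=50) + rng.normal(size=300)
acc = kfold_accuracy(X, y)
print(round(acc, 3))
```

Accuracy is taken here as the correlation between predicted and observed values in each validation fold, averaged over folds. Comparing two models then amounts to comparing these averages on the same fold assignment.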
Affiliation(s)
- Yang Da
  - Department of Animal Science, University of Minnesota, Saint Paul, MN 55108, USA; (Z.L.); (D.P.)
2
Yuan W, Beitel F, Srikant T, Bezrukov I, Schäfer S, Kraft R, Weigel D. Pervasive under-dominance in gene expression underlying emergent growth trajectories in Arabidopsis thaliana hybrids. Genome Biol 2023; 24:200. [PMID: 37667232 PMCID: PMC10478501 DOI: 10.1186/s13059-023-03043-3] [Received: 04/20/2023] [Accepted: 08/21/2023] [Indexed: 09/06/2023] Open
Abstract
BACKGROUND Complex traits, such as growth and fitness, are typically controlled by a very large number of variants, which can interact in both additive and non-additive fashion. In an attempt to gauge the relative importance of both types of genetic interactions, we turn to hybrids, which provide a facile means for creating many novel allele combinations. RESULTS We focus on the interaction between alleles of the same locus, i.e., dominance, and perform a transcriptomic study involving 141 random crosses between different accessions of the plant model species Arabidopsis thaliana. Additivity is rare, consistently observed for only about 300 genes enriched for roles in stress response and cell death. Regulatory rare-allele burden affects the expression level of these genes but does not correlate with F1 rosette size. Non-additive, dominant gene expression in F1 hybrids is much more common, with the vast majority of genes (over 90%) being expressed below the parental average. Unlike in the additive genes, regulatory rare-allele burden in the dominant gene set is strongly correlated with F1 rosette size, even though it only mildly covaries with the expression level of these genes. CONCLUSIONS Our study underscores under-dominance as the predominant gene action associated with emergence of rosette growth trajectories in the A. thaliana hybrid model. Our work lays the foundation for understanding molecular mechanisms and evolutionary forces that lead to dominance complementation of rare regulatory alleles.
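To make the idea of expression "below the parental average" concrete, the Python sketch below compares F1 expression with the mid-parent value for each gene. The function name `classify_vs_midparent`, the relative tolerance `tol`, and the toy numbers are assumptions for illustration only; the study's actual statistical procedure is not described in the abstract and is not reproduced here.

```python
import numpy as np

def classify_vs_midparent(p1, p2, f1, tol=0.05):
    """Label each gene by how F1 expression deviates from the mid-parent value.

    p1, p2, f1: per-gene expression of the two parents and the F1 hybrid
    (assumed strictly positive). tol is a hypothetical relative tolerance
    within which expression is treated as additive.
    """
    p1, p2, f1 = (np.asarray(a, dtype=float) for a in (p1, p2, f1))
    mid = (p1 + p2) / 2.0                 # mid-parent value (additive expectation)
    rel = (f1 - mid) / mid                # relative deviation from additivity
    return np.where(
        np.abs(rel) <= tol,
        "additive",
        np.where(rel < 0, "below mid-parent", "above mid-parent"),
    )

# Toy example with three genes.
labels = classify_vs_midparent([10, 10, 10], [20, 20, 20], [15, 12, 18])
print(labels)
```

A real analysis would test the deviation against replicate-level noise rather than use a fixed tolerance.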
Affiliation(s)
- Wei Yuan
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076, Tübingen, Germany
- Fiona Beitel
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076, Tübingen, Germany
- Thanvi Srikant
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076, Tübingen, Germany
- Ilja Bezrukov
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076, Tübingen, Germany
- Sabine Schäfer
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076, Tübingen, Germany
- Robin Kraft
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076, Tübingen, Germany
- Detlef Weigel
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076, Tübingen, Germany
3
Zhao W, Qadri QR, Zhang Z, Wang Z, Pan Y, Wang Q, Zhang Z. PyAGH: a python package to fast construct kinship matrices based on different levels of omic data. BMC Bioinformatics 2023; 24:153. [PMID: 37072709 PMCID: PMC10111838 DOI: 10.1186/s12859-023-05280-6] [Received: 12/03/2022] [Accepted: 04/10/2023] [Indexed: 04/20/2023] Open
Abstract
BACKGROUND Construction of kinship matrices among individuals is an important step for both association studies and prediction studies based on different levels of omic data. Methods for constructing kinship matrices are becoming diverse, and different methods suit different scenarios. However, software that can comprehensively calculate kinship matrices for a variety of scenarios is still urgently needed. RESULTS In this study, we developed an efficient and user-friendly Python module, PyAGH, that can accomplish (1) conventional additive kinship matrix construction based on pedigree, genotypes, and abundance data from transcriptome or microbiome; (2) genomic kinship matrix construction in combined populations; (3) dominant and epistatic effects kinship matrix construction; (4) pedigree selection, tracing, detection and visualization; (5) visualization of cluster, heatmap and PCA analysis based on kinship matrices. The output from PyAGH can be easily integrated into other mainstream software based on users' purposes. Compared with other software, PyAGH integrates multiple methods for calculating the kinship matrix and has advantages in terms of speed and data size. PyAGH is developed in Python and C++ and can be easily installed with the pip tool. Installation instructions and a manual are freely available from https://github.com/zhaow-01/PyAGH. CONCLUSION PyAGH is a fast and user-friendly Python package for calculating kinship matrices using pedigree, genotype, microbiome and transcriptome data as well as processing, analyzing and visualizing data and results. This package makes it easier to perform prediction and association studies based on different levels of omic data.
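For readers unfamiliar with genomic kinship matrices, the NumPy sketch below builds an additive matrix with the widely used VanRaden-style centering and a dominance matrix with a genotypic coding of the Vitezica type. It is not PyAGH's API: the function names `additive_kinship` and `dominance_kinship`, the 0/1/2 genotype coding, and the random test data are assumptions made for this illustration, and PyAGH may use different formulas or options.

```python
import numpy as np

def _freq_and_filter(M):
    """Return genotypes and allele frequencies with monomorphic markers removed."""
    M = np.asarray(M, dtype=float)
    p = M.mean(axis=0) / 2.0              # frequency of the counted allele
    keep = (p > 0) & (p < 1)
    return M[:, keep], p[keep]

def additive_kinship(M):
    """Additive genomic relationship matrix: G = Z Z' / (2 * sum(p * q))."""
    M, p = _freq_and_filter(M)
    Z = M - 2.0 * p                       # center each marker by twice its frequency
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def dominance_kinship(M):
    """Dominance relationship matrix: D = W W' / sum((2 * p * q)^2)."""
    M, p = _freq_and_filter(M)
    q = 1.0 - p
    # Genotypic dominance coding for 0, 1 and 2 copies of the counted allele.
    W = np.where(M == 0, -2.0 * p**2,
                 np.where(M == 1, 2.0 * p * q, -2.0 * q**2))
    return W @ W.T / np.sum((2.0 * p * q) ** 2)

# Toy genotypes: 8 individuals, 40 markers coded as allele counts 0/1/2.
rng = np.random.default_rng(0)
M = rng.integers(0, 3, size=(8, 40))
G = additive_kinship(M)
D = dominance_kinship(M)
print(G.shape, D.shape)
```

Both matrices are symmetric with one row per individual. Because allele frequencies are estimated from the same sample, each row of `G` sums to zero up to rounding error.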
Affiliation(s)
- Wei Zhao
  - Department of Animal Science, School of Agriculture and Biology, Shanghai Jiao Tong University, 800# Dongchuan Road, Shanghai, China
- Qamar Raza Qadri
  - Department of Animal Science, School of Agriculture and Biology, Shanghai Jiao Tong University, 800# Dongchuan Road, Shanghai, China
- Zhenyang Zhang
  - Department of Animal Science, College of Animal Sciences, Zhejiang University, 866# Yuhangtang Road, Hangzhou, 310058, China
- Zhen Wang
  - Department of Animal Science, College of Animal Sciences, Zhejiang University, 866# Yuhangtang Road, Hangzhou, 310058, China
- Yuchun Pan
  - Department of Animal Science, College of Animal Sciences, Zhejiang University, 866# Yuhangtang Road, Hangzhou, 310058, China
  - Hainan Research Institute, Zhejiang University, 11# Yonyou Industrial Park, Yazhou Bay Science and Technology City, Sanya, 572025, China
- Qishan Wang
  - Department of Animal Science, College of Animal Sciences, Zhejiang University, 866# Yuhangtang Road, Hangzhou, 310058, China
- Zhe Zhang
  - Department of Animal Science, College of Animal Sciences, Zhejiang University, 866# Yuhangtang Road, Hangzhou, 310058, China
4
Nadeau S, Beaulieu J, Gezan SA, Perron M, Bousquet J, Lenz PRN. Increasing genomic prediction accuracy for unphenotyped full-sib families by modeling additive and dominance effects with large datasets in white spruce. Frontiers in Plant Science 2023; 14:1137834. [PMID: 37035077 PMCID: PMC10073444 DOI: 10.3389/fpls.2023.1137834] [Received: 01/04/2023] [Accepted: 02/14/2023] [Indexed: 06/19/2023]
Abstract
INTRODUCTION Genomic selection is becoming a standard technique in plant breeding and is now being introduced into forest tree breeding. Despite promising results to predict the genetic merit of superior material based on their additive breeding values, many studies and operational programs still neglect non-additive effects and their potential for enhancing genetic gains. METHODS Using two large comprehensive datasets totaling 4,066 trees from 146 full-sib families of white spruce (Picea glauca (Moench) Voss), we evaluated the effect of the inclusion of dominance on the precision of genetic parameter estimates and on the accuracy of conventional pedigree-based (ABLUP-AD) and genomic-based (GBLUP-AD) models. RESULTS While wood quality traits were mostly additively inherited, considerable non-additive effects and lower heritabilities were detected for growth traits. For growth, GBLUP-AD better partitioned the additive and dominance effects into roughly equal variances, while ABLUP-AD strongly overestimated dominance. The predictive abilities of breeding and total genetic value estimates were similar between ABLUP-AD and GBLUP-AD when predicting individuals from the same families as those included in the training dataset. However, GBLUP-AD outperformed ABLUP-AD when predicting for new unphenotyped families that were not represented in the training dataset, with, on average, 22% and 53% higher predictive ability of breeding and genetic values, respectively. Resampling simulations showed that GBLUP-AD required smaller sample sizes than ABLUP-AD to produce precise estimates of genetic variances and accurate predictions of genetic values. Still, regardless of the method used, large training datasets were needed to estimate additive and non-additive genetic variances precisely. DISCUSSION This study highlights the different quantitative genetic architectures between growth and wood traits. Furthermore, the usefulness of genomic additive-dominance models for predicting new families should allow practicing mating allocation to maximize the total genetic values for the propagation of elite material.
Affiliation(s)
- Simon Nadeau
  - Natural Resources Canada, Canadian Forest Service, Canadian Wood Fibre Centre, Québec, QC, Canada
- Jean Beaulieu
  - Canada Research Chair in Forest Genomics, Institute for Systems and Integrative Biology, Université Laval, Québec, QC, Canada
- Martin Perron
  - Canada Research Chair in Forest Genomics, Institute for Systems and Integrative Biology, Université Laval, Québec, QC, Canada
  - Direction de la Recherche Forestière, Ministère des Ressources Naturelles et des Forêts, Québec, QC, Canada
- Jean Bousquet
  - Canada Research Chair in Forest Genomics, Institute for Systems and Integrative Biology, Université Laval, Québec, QC, Canada
- Patrick R. N. Lenz
  - Natural Resources Canada, Canadian Forest Service, Canadian Wood Fibre Centre, Québec, QC, Canada
  - Canada Research Chair in Forest Genomics, Institute for Systems and Integrative Biology, Université Laval, Québec, QC, Canada
5
Lombardi E, Shestakova TA, Santini F, Resco de Dios V, Voltas J. Harnessing tree-ring phenotypes to disentangle gene by environment interactions and their climate dependencies in a circum-Mediterranean pine. Annals of Botany 2022; 130:509-523. [PMID: 35797146 PMCID: PMC9510947 DOI: 10.1093/aob/mcac092] [Received: 05/19/2022] [Accepted: 07/06/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND AND AIMS Understanding the genetic basis of adaptation and plasticity in trees constitutes a knowledge gap. We linked dendrochronology and genomics [single nucleotide polymorphisms (SNPs)] for a widespread conifer (Pinus halepensis Mill.) to characterize intraspecific growth differences elicited by climate. METHODS The analysis comprised 20-year tree-ring series of 130 trees structured in 23 populations evaluated in a common garden. We tested for genotype by environment interactions (G × E) of indexed ring width (RWI) and early- to latewood ratios (ELI) using factorial regression, which describes G × E as differential gene sensitivity to climate. KEY RESULTS The species' annual growth was positively influenced by winter temperature and spring moisture and negatively influenced by previous autumn precipitation and warm springs. Four and five climate factors explained 10 % (RWI) and 16 % (ELI) of population-specific interannual variability, respectively, with populations from drought-prone areas and with uneven precipitation experiencing larger growth reductions during dry vegetative periods. Furthermore, four and two SNPs explained 14 % (RWI) and 10 % (ELI) of interannual variability among trees, respectively. Two SNPs played a putative role in adaptation to climate: one identified from transcriptome sequencing of P. halepensis and another involved in response regulation to environmental stressors. CONCLUSIONS We highlight how tree-ring phenotypes, obtained from a common garden experiment, combined with a candidate-gene approach allow the quantification of genetic and environmental effects determining adaptation for a conifer with a large and complex genome.
Affiliation(s)
- Filippo Santini
  - Joint Research Unit CTFC – AGROTECNIO – CERCA, Av. Alcalde Rovira Roure 191, Lleida E-25198, Spain
  - Department of Crop and Forest Sciences, University of Lleida, Av. Alcalde Rovira Roure 191, Lleida E-25198, Spain
- Víctor Resco de Dios
  - Joint Research Unit CTFC – AGROTECNIO – CERCA, Av. Alcalde Rovira Roure 191, Lleida E-25198, Spain
  - Department of Crop and Forest Sciences, University of Lleida, Av. Alcalde Rovira Roure 191, Lleida E-25198, Spain
- Jordi Voltas
  - Joint Research Unit CTFC – AGROTECNIO – CERCA, Av. Alcalde Rovira Roure 191, Lleida E-25198, Spain
  - Department of Crop and Forest Sciences, University of Lleida, Av. Alcalde Rovira Roure 191, Lleida E-25198, Spain