1. The impact of species-wide gene expression variation on Caenorhabditis elegans complex traits. Nat Commun 2022; 13:3462. [PMID: 35710766 PMCID: PMC9203580 DOI: 10.1038/s41467-022-31208-4] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 02/09/2022] [Accepted: 06/08/2022] [Indexed: 12/15/2022]
Abstract
Phenotypic variation in organism-level traits has been studied in Caenorhabditis elegans wild strains, but the impacts of differences in gene expression and the underlying regulatory mechanisms are largely unknown. Here, we use natural variation in gene expression to connect genetic variants to differences in organismal-level traits, including drug and toxicant responses. We perform transcriptomic analyses on 207 genetically distinct C. elegans wild strains to study natural regulatory variation of gene expression. Using this massive dataset, we perform genome-wide association mappings to investigate the genetic basis underlying gene expression variation and reveal complex genetic architectures. We find a large collection of hotspots enriched for expression quantitative trait loci across the genome. We further use mediation analysis to understand how gene expression variation could underlie organism-level phenotypic variation for a variety of complex traits. These results reveal the natural diversity in gene expression and possible regulatory mechanisms in this keystone model organism, highlighting the promise of using gene expression variation to understand how phenotypic diversity is generated.
2. Perez BC, Bink MCAM, Svenson KL, Churchill GA, Calus MPL. Prediction performance of linear models and gradient boosting machine on complex phenotypes in outbred mice. G3 (Bethesda, Md.) 2022; 12:6528848. [PMID: 35166767 PMCID: PMC8982369 DOI: 10.1093/g3journal/jkac039] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/15/2021] [Accepted: 01/29/2022] [Indexed: 12/14/2022]
Abstract
We compared the performance of linear (GBLUP, BayesB, and elastic net) methods to a nonparametric tree-based ensemble (gradient boosting machine) method for genomic prediction of complex traits in mice. The dataset used contained genotypes for 50,112 SNP markers and phenotypes for 835 animals from 6 generations. Traits analyzed were bone mineral density, body weight at 10, 15, and 20 weeks, fat percentage, circulating cholesterol, glucose, insulin, triglycerides, and urine creatinine. The youngest generation was used as a validation subset, and predictions were based on all older generations. Model performance was evaluated by comparing predictions for animals in the validation subset against their adjusted phenotypes. Linear models outperformed gradient boosting machine for 7 out of 10 traits. For bone mineral density, cholesterol, and glucose, the gradient boosting machine model showed better prediction accuracy and lower relative root mean squared error than the linear models. Interestingly, for these 3 traits, there is evidence of a relevant portion of phenotypic variance being explained by epistatic effects. Using a subset of top markers selected from a gradient boosting machine model helped for some of the traits to improve the accuracy of prediction when these were fitted into linear and gradient boosting machine models. Our results indicate that gradient boosting machine is more strongly affected by data size and decreased connectedness between reference and validation sets than the linear models. Although the linear models outperformed gradient boosting machine for the polygenic traits, our results suggest that gradient boosting machine is a competitive method to predict complex traits with assumed epistatic effects.
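The validation scheme described in this abstract can be sketched in a few lines. The Python below is a minimal illustration, not the authors' pipeline: the record field `generation`, the function names, and the choice to scale RMSE by the mean observed phenotype are all assumptions, since the abstract does not define relative RMSE. Prediction accuracy is taken here to be the Pearson correlation between predictions and adjusted phenotypes, which is also an assumption.

```python
from math import sqrt
from statistics import mean


def split_by_generation(records):
    """Youngest generation becomes the validation subset; all older ones the reference set.

    Each record is a dict with a hypothetical integer key "generation".
    """
    youngest = max(r["generation"] for r in records)
    reference = [r for r in records if r["generation"] < youngest]
    validation = [r for r in records if r["generation"] == youngest]
    return reference, validation


def pearson(x, y):
    """Pearson correlation, used here as an assumed measure of prediction accuracy."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def relative_rmse(observed, predicted):
    """RMSE divided by the mean observed value (an assumed definition of 'relative')."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(mean(squared_errors)) / mean(observed)
```

Any model (linear or tree-based) would be fitted on the reference set only, and both metrics would then be computed on the validation animals' adjusted phenotypes.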
Affiliation(s)
- Bruno C Perez
  - Hendrix Genetics B.V., Research and Technology Center (RTC), 5830 AC Boxmeer, The Netherlands
- Marco C A M Bink
  - Hendrix Genetics B.V., Research and Technology Center (RTC), 5830 AC Boxmeer, The Netherlands
- Mario P L Calus
  - Wageningen University & Research, Animal Breeding and Genomics, 6700 AH Wageningen, The Netherlands
3. The Genetic Architecture of a Congenital Heart Defect Is Related to Its Fitness Cost. Genes (Basel) 2021; 12:genes12091368. [PMID: 34573350 PMCID: PMC8467714 DOI: 10.3390/genes12091368] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/27/2021] [Accepted: 08/29/2021] [Indexed: 11/24/2022]
Abstract
In newborns, severe congenital heart defects are rarer than mild ones. This epidemiological relationship between heart defect severity and incidence lacks explanation. Here, an analysis of ~10,000 Nkx2-5+/− mice from two inbred strain crosses illustrates the fundamental role of epistasis. Modifier genes raise or lower the risk of specific defects via pairwise (G×GNkx) and higher-order (G×G×GNkx) interactions with Nkx2-5. Their effect sizes correlate with the severity of a defect. The risk loci for mild, atrial septal defects exert predominantly small G×GNkx effects, while the loci for severe, atrioventricular septal defects exert large G×GNkx and G×G×GNkx effects. The loci for moderately severe ventricular septal defects have intermediate effects. Interestingly, G×G×GNkx effects are three times more likely to suppress risk when the genotypes at the first two loci are from the same rather than different parental inbred strains. This suggests the genetic coadaptation of interacting G×G×GNkx loci, a phenomenon that Dobzhansky first described in Drosophila. Thus, epistasis plays dual roles in the pathogenesis of congenital heart disease and the robustness of cardiac development. The empirical results suggest a relationship between the fitness cost and genetic architecture of a disease phenotype and a means for phenotypic robustness to have evolved.
4. Sheppard B, Rappoport N, Loh PR, Sanders SJ, Zaitlen N, Dahl A. A model and test for coordinated polygenic epistasis in complex traits. Proc Natl Acad Sci U S A 2021; 118:e1922305118. [PMID: 33833052 PMCID: PMC8053945 DOI: 10.1073/pnas.1922305118] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/11/2022]
Abstract
Interactions between genetic variants (epistasis) are pervasive in model systems and can profoundly impact evolutionary adaptation, population disease dynamics, genetic mapping, and precision medicine efforts. In this work, we develop a model for structured polygenic epistasis, called coordinated epistasis (CE), and prove that several recent theories of genetic architecture fall under the formal umbrella of CE. Unlike standard epistasis models that assume epistasis and main effects are independent, CE captures systematic correlations between epistasis and main effects that result from pathway-level epistasis, on balance skewing the penetrance of genetic effects. To test for the existence of CE, we propose the even-odd (EO) test and prove it is calibrated in a range of realistic biological models. Applying the EO test in the UK Biobank, we find evidence of CE in 18 of 26 traits spanning disease, anthropometric, and blood categories. Finally, we extend the EO test to tissue-specific enrichment and identify several plausible tissue-trait pairs. Overall, CE is a dimension of genetic architecture that can capture structured, systemic forms of epistasis in complex human traits.
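The abstract names the even-odd test without spelling out its mechanics. The sketch below is one plausible reading, offered as an assumption and not as the authors' implementation: a polygenic score is built separately from even- and odd-numbered chromosomes, and the phenotype is regressed on both scores plus their product, so that a nonzero product coefficient points to an interaction between the two genome halves. The function names and the plain least-squares fit are hypothetical, and a real test would also need a standard error for the product term.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(design, y):
    """Least-squares coefficients via the normal equations (X'X) beta = X'y."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p)]
    return solve_linear(xtx, xty)


def even_odd_interaction(score_even, score_odd, y):
    """Coefficient on the product of the even- and odd-chromosome scores (assumed model).

    Model: y = b0 + b1 * even + b2 * odd + b3 * even * odd.
    """
    design = [[1.0, e, o, e * o] for e, o in zip(score_even, score_odd)]
    return ols(design, y)[3]
```

Under a purely additive architecture the product coefficient would be expected to sit near zero, which is why an interaction between scores from disjoint chromosome sets is informative.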
Affiliation(s)
- Brooke Sheppard
  - Department of Psychiatry and Behavioral Sciences, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA 94143
- Nadav Rappoport
  - Department of Psychiatry and Behavioral Sciences, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA 94143
  - Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA 94143
- Po-Ru Loh
  - Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115
- Stephan J Sanders
  - Department of Psychiatry and Behavioral Sciences, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA 94143
- Noah Zaitlen
  - Department of Psychiatry and Behavioral Sciences, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA 94143
  - Department of Neurology, University of California Los Angeles, Los Angeles, CA 90095
  - Department of Computational Medicine, University of California Los Angeles, Los Angeles, CA 90095
- Andy Dahl
  - Department of Neurology, University of California Los Angeles, Los Angeles, CA 90095
  - Department of Computational Medicine, University of California Los Angeles, Los Angeles, CA 90095
  - Section of Genetic Medicine, University of Chicago, Chicago, IL 60637
5. A Novel Mapping Strategy Utilizing Mouse Chromosome Substitution Strains Identifies Multiple Epistatic Interactions That Regulate Complex Traits. G3-Genes Genomes Genetics 2020; 10:4553-4563. [PMID: 33023974 PMCID: PMC7718749 DOI: 10.1534/g3.120.401824] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/12/2022]
Abstract
The genetic contribution of additive vs. non-additive (epistatic) effects in the regulation of complex traits is unclear. While genome-wide association studies typically ignore gene-gene interactions, in part because of the lack of statistical power for detecting them, mouse chromosome substitution strains (CSSs) represent an alternate approach for detecting epistasis given their limited allelic variation. Therefore, we utilized CSSs to identify and map both additive and epistatic loci that regulate a range of hematologic- and metabolism-related traits, as well as hepatic gene expression. Quantitative trait loci (QTL) were identified using a CSS-based backcross strategy involving the segregation of variants on the A/J-derived substituted chromosomes 4 and 6 on an otherwise C57BL/6J genetic background. In the liver transcriptomes of offspring from this cross, we identified and mapped additive QTL regulating the hepatic expression of 768 genes, and epistatic QTL pairs for 519 genes. Similarly, we identified additive QTL for fat pad weight, platelets, and the percentage of granulocytes in blood, as well as epistatic QTL pairs controlling the percentage of lymphocytes in blood and red cell distribution width. The variance attributed to the epistatic QTL pairs was approximately equal to that of the additive QTL; however, the SNPs in the epistatic QTL pairs that accounted for the largest variances were undetected in our single locus association analyses. These findings highlight the need to account for epistasis in association studies, and more broadly demonstrate the importance of identifying genetic interactions to understand the complete genetic architecture of complex traits.
6. Lee KY, Leung KS, Tang NLS, Wong MH. Discovering Genetic Factors for psoriasis through exhaustively searching for significant second order SNP-SNP interactions. Sci Rep 2018; 8:15186. [PMID: 30315195 PMCID: PMC6185942 DOI: 10.1038/s41598-018-33493-w] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 01/12/2018] [Accepted: 09/28/2018] [Indexed: 12/24/2022]
Abstract
In this paper, we aim to discover genetic factors of psoriasis by exhaustively searching for statistically significant SNP-SNP interactions in two real psoriasis genome-wide association study datasets (phs000019.v1.p1 and phs000982.v1.p1) downloaded from the database of Genotypes and Phenotypes. To deal with the enormous search space, our search algorithm is accelerated with eight biologically plausible interaction patterns and a pre-computed look-up table. Our search discovered several SNPs that have a stronger association with psoriasis when combined with another SNP, and these combinations may be non-linear interactions. Among the top 20 SNP-SNP interactions found in terms of pairwise p-value and improvement metric value, we have discovered 27 novel potential psoriasis-associated SNPs, most of which are reported to be eQTLs of a number of known psoriasis-associated genes. In addition, we have inferred a gene network after selecting the top 10,000 SNP-SNP interactions in terms of improvement metric value, and we have discovered a novel long-distance interaction between XXbac-BPG154L12.4 and RNU6-283P, which is not a long-distance haplotype and may be a new discovery. Finally, our experiments with synthetic datasets have shown that our pre-computed look-up table technique can significantly speed up the search process.
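To make the idea of an exhaustive second-order search concrete, here is a minimal Python sketch under stated assumptions. It scores every unordered SNP pair with a plain chi-square statistic on the joint-genotype by case/control table. The statistic, the function names, and the 0/1/2 genotype coding are illustrative choices only: the paper's eight interaction patterns, its improvement metric, and its look-up table acceleration are not reproduced here.

```python
from itertools import combinations


def pair_chi_square(g1, g2, status):
    """Chi-square statistic for joint genotype (g1, g2) versus case/control status.

    g1, g2: per-individual genotype codes (assumed 0/1/2); status: 0 = control, 1 = case.
    """
    counts = {}
    for a, b, s in zip(g1, g2, status):
        cell = counts.setdefault((a, b), [0, 0])
        cell[s] += 1
    n = len(status)
    n_case = sum(status)
    col_totals = [n - n_case, n_case]
    stat = 0.0
    for cell in counts.values():
        row_total = cell[0] + cell[1]
        for s in (0, 1):
            expected = row_total * col_totals[s] / n
            if expected > 0:
                stat += (cell[s] - expected) ** 2 / expected
    return stat


def exhaustive_pair_scan(genotypes, status):
    """Score all unordered SNP pairs; return (statistic, i, j) sorted from strongest."""
    scores = [
        (pair_chi_square(genotypes[i], genotypes[j], status), i, j)
        for i, j in combinations(range(len(genotypes)), 2)
    ]
    return sorted(scores, reverse=True)
```

The number of pairs grows quadratically with the number of SNPs, which is the search-space problem that motivates acceleration techniques such as the one the abstract mentions.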
Affiliation(s)
- Kwan-Yeung Lee
  - Department of Computer Science and Engineering, the Chinese University of Hong Kong, Hong Kong, China
- Kwong-Sak Leung
  - Department of Computer Science and Engineering, the Chinese University of Hong Kong, Hong Kong, China
- Nelson L S Tang
  - Department of Chemical Pathology, the Chinese University of Hong Kong, Hong Kong, China
- Man-Hon Wong
  - Department of Computer Science and Engineering, the Chinese University of Hong Kong, Hong Kong, China
7. Panzeri I, Pospisilik JA. Epigenetic control of variation and stochasticity in metabolic disease. Mol Metab 2018; 14:26-38. [PMID: 29909200 PMCID: PMC6034039 DOI: 10.1016/j.molmet.2018.05.010] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 02/22/2018] [Revised: 05/11/2018] [Accepted: 05/14/2018] [Indexed: 02/06/2023]
Abstract
BACKGROUND: The alarming rise of obesity and its associated comorbidities represents a medical burden and a major global health and economic issue. Understanding etiological mechanisms underpinning susceptibility and therapeutic response is of primary importance. Obesity, diabetes, and metabolic diseases are complex trait disorders with only partial genetic heritability, indicating important roles for environmental programing and epigenetic effects.
SCOPE OF THE REVIEW: We will highlight some of the reasons for the scarce predictability of metabolic diseases. We will outline how genetic variants generate phenotypic variation in disease susceptibility across populations. We will then focus on recent conclusions about epigenetic mechanisms playing a fundamental role in increasing variability and subsequently disease triggering.
MAJOR CONCLUSIONS: Currently, we are unable to predict or mechanistically define how "missing heritability" drives disease. Unravelling this black box of regulatory processes will allow us to move towards a truly personalized and precision medicine.
Affiliation(s)
- Ilaria Panzeri
  - Max Planck Institute of Immunobiology and Epigenetics, Stuebeweg 51, 79108, Freiburg, Germany
- John Andrew Pospisilik
  - Max Planck Institute of Immunobiology and Epigenetics, Stuebeweg 51, 79108, Freiburg, Germany