76
Fang Z, Li S. Alternative polyadenylation-associated loci interpret human traits and diseases. Trends Genet 2021; 37:773-775. [PMID: 34148698 DOI: 10.1016/j.tig.2021.06.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2021] [Accepted: 06/07/2021] [Indexed: 10/21/2022]
Abstract
Alternative polyadenylation (APA)-associated genetic variants have been proposed to impact diverse human phenotypes and disorders. In a recent study, Li et al. established a landscape of 3'-untranslated region (UTR) APA quantitative trait loci (3'aQTLs) across multiple human tissues, revealing substantial 3'aQTLs that contribute to complex human traits and diseases.
77
Albiñana C, Grove J, McGrath JJ, Agerbo E, Wray NR, Bulik CM, Nordentoft M, Hougaard DM, Werge T, Børglum AD, Mortensen PB, Privé F, Vilhjálmsson BJ. Leveraging both individual-level genetic data and GWAS summary statistics increases polygenic prediction. Am J Hum Genet 2021; 108:1001-1011. [PMID: 33964208 PMCID: PMC8206385 DOI: 10.1016/j.ajhg.2021.04.014] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 01/04/2021] [Accepted: 04/20/2021] [Indexed: 12/12/2022] Open
Abstract
The accuracy of polygenic risk scores (PRSs) to predict complex diseases increases with the training sample size. PRSs are generally derived based on summary statistics from large meta-analyses of multiple genome-wide association studies (GWASs). However, it is now common for researchers to have access to large individual-level data as well, such as the UK Biobank data. To the best of our knowledge, it has not yet been explored how best to combine both types of data (summary statistics and individual-level data) to optimize polygenic prediction. The most widely used approach to combine data is the meta-analysis of GWAS summary statistics (meta-GWAS), but we show that it does not always provide the most accurate PRS. Through simulations and using 12 real case-control and quantitative traits from both iPSYCH and UK Biobank along with external GWAS summary statistics, we compare meta-GWAS with two alternative data-combining approaches, stacked clumping and thresholding (SCT) and meta-PRS. We find that, when large individual-level data are available, the linear combination of PRSs (meta-PRS) is both a simple alternative to meta-GWAS and often more accurate.
78
Di Giovannantonio M, Harris BH, Zhang P, Kitchen-Smith I, Xiong L, Sahgal N, Stracquadanio G, Wallace M, Blagden S, Lord S, Harris D, Harris AHL, Buffa FM, Bond GL. Heritable genetic variants in key cancer genes link cancer risk with anthropometric traits. J Med Genet 2021; 58:392-399. [PMID: 32591342 PMCID: PMC8142426 DOI: 10.1136/jmedgenet-2019-106799] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/02/2020] [Revised: 05/19/2020] [Accepted: 05/21/2020] [Indexed: 12/12/2022]
Abstract
BACKGROUND Height and other anthropometric measures are consistently found to associate with differential cancer risk. However, both genetic and mechanistic insights into these epidemiological associations are notably lacking. Conversely, inherited genetic variants in tumour suppressors and oncogenes increase cancer risk, but little is known about their influence on anthropometric traits. METHODS By integrating inherited and somatic cancer genetic data from the Genome-Wide Association Study Catalog, expression Quantitative Trait Loci databases and the Cancer Gene Census, we identify SNPs that associate with different cancer types and differential gene expression in at least one tissue type, and explore the potential pleiotropic associations of these SNPs with anthropometric traits through SNP-wise association in a cohort of 500,000 individuals. RESULTS We identify three regulatory SNPs for three important cancer genes, FANCA, MAP3K1 and TP53, that associate with both anthropometric traits and cancer risk. Of particular interest, we identify a previously unrecognised strong association between the rs78378222[C] SNP in the 3' untranslated region (3'-UTR) of TP53 and increased risk of developing non-melanomatous skin cancer (OR=1.36 (95% CI 1.31 to 1.41), adjusted p=7.62E-63) and brain malignancy (OR=3.12 (95% CI 2.22 to 4.37), adjusted p=1.43E-12), as well as increased standing height (adjusted p=2.18E-24, beta=0.073±0.007), lean body mass (adjusted p=8.34E-37, beta=0.073±0.005) and basal metabolic rate (adjusted p=1.13E-31, beta=0.076±0.006), thus offering a novel genetic link between these anthropometric traits and cancer risk. CONCLUSION Our results clearly demonstrate that heritable variants in key cancer genes can associate with both differential cancer risk and anthropometric traits in the general population, thereby lending support for a genetic basis for linking these human phenotypes.
79
Abstract
Some of the genes responsible for the evolution of light skin pigmentation in Europeans show signals of positive selection in present-day populations. Recently, genome-wide association studies have highlighted the highly polygenic nature of skin pigmentation. It is unclear whether selection has operated on all of these genetic variants or just a subset. By studying variation in over a thousand ancient genomes from West Eurasia covering 40,000 y, we are able to study both the aggregate behavior of pigmentation-associated variants and the evolutionary history of individual variants. We find that the evolution of light skin pigmentation in Europeans was driven by frequency changes in a relatively small fraction of the genetic variants that are associated with variation in the trait today.
Skin pigmentation is a classic example of a polygenic trait that has experienced directional selection in humans. Genome-wide association studies have identified well over a hundred pigmentation-associated loci, and genomic scans in present-day and ancient populations have identified selective sweeps for a small number of light pigmentation-associated alleles in Europeans. It is unclear whether selection has operated on all of the genetic variation associated with skin pigmentation as opposed to just a small number of large-effect variants. Here, we address this question using ancient DNA from 1,158 individuals from West Eurasia covering a period of 40,000 y combined with genome-wide association summary statistics from the UK Biobank. We find a robust signal of directional selection in ancient West Eurasians on 170 skin pigmentation-associated variants ascertained in the UK Biobank. However, we also show that this signal is driven by a limited number of large-effect variants. Consistent with this observation, we find that a polygenic selection test in present-day populations fails to detect selection with the full set of variants. Our data allow us to disentangle the effects of admixture and selection. Most notably, a large-effect variant at SLC24A5 was introduced to Western Europe by migrations of Neolithic farming populations but continued to be under selection post-admixture. This study shows that the response to selection for light skin pigmentation in West Eurasia was driven by a relatively small proportion of the variants that are associated with present-day phenotypic variation.
80
Pazokitoroudi A, Chiu AM, Burch KS, Pasaniuc B, Sankararaman S. Quantifying the contribution of dominance deviation effects to complex trait variation in biobank-scale data. Am J Hum Genet 2021; 108:799-808. [PMID: 33811807 PMCID: PMC8206203 DOI: 10.1016/j.ajhg.2021.03.018] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/11/2020] [Accepted: 03/18/2021] [Indexed: 11/25/2022] Open
Abstract
The proportion of variation in complex traits that can be attributed to non-additive genetic effects has been a topic of intense debate. The availability of biobank-scale datasets of genotype and trait data from unrelated individuals opens up the possibility of obtaining precise estimates of the contribution of non-additive genetic effects. We present an efficient method to estimate the variation in a complex trait that can be attributed to additive (additive heritability) and dominance deviation (dominance heritability) effects across all genotyped SNPs in a large collection of unrelated individuals. Over a wide range of genetic architectures, our method yields unbiased estimates of additive and dominance heritability. We applied our method, in turn, to array genotypes as well as imputed genotypes (at common SNPs with minor allele frequency [MAF] > 1%) and 50 quantitative traits measured in 291,273 unrelated white British individuals in the UK Biobank. Averaged across these 50 traits, we find that additive heritability on array SNPs is 21.86% while dominance heritability is 0.13% (about 0.48% of the additive heritability) with qualitatively similar results for imputed genotypes. We find no statistically significant evidence for dominance heritability (p<0.05/50 accounting for the number of traits tested) and estimate that dominance heritability is unlikely to exceed 1% for the traits analyzed. Our analyses indicate a limited contribution of dominance heritability to complex trait variation.
81
O’Connor CH, Sikkink KL, Nelson TC, Fierst JL, Cresko WA, Phillips PC. Complex pleiotropic genetic architecture of evolved heat stress and oxidative stress resistance in the nematode Caenorhabditis remanei. G3 (Bethesda) 2021; 11:jkab045. [PMID: 33605401 PMCID: PMC8049431 DOI: 10.1093/g3journal/jkab045] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/07/2019] [Accepted: 02/01/2021] [Indexed: 12/04/2022]
Abstract
The adaptation of complex organisms to changing environments has been a central question in evolutionary quantitative genetics since its inception. The structure of the genotype-phenotype maps is critical because pleiotropic effects can generate widespread correlated responses to selection and potentially restrict the extent of evolutionary change. In this study, we use experimental evolution to dissect the genetic architecture of natural variation for acute heat stress and oxidative stress response in the nematode Caenorhabditis remanei. Previous work in the classic model nematode Caenorhabditis elegans has found that abiotic stress response is controlled by a handful of genes of major effect and that mutations in any one of these genes can have widespread pleiotropic effects on multiple stress response traits. Here, we find that acute heat stress response and acute oxidative response in C. remanei are polygenic, complex traits, with hundreds of genomic regions responding to selection. In contrast to expectation from mutation studies, we find that evolved acute heat stress and acute oxidative stress response for the most part display independent genetic bases. This lack of correlation is reflected at the levels of phenotype, gene expression, and in the genomic response to selection. Thus, while these findings support the general view that rapid adaptation can be generated by changes at hundreds to thousands of sites in the genome, the architecture of segregating variation is likely to be determined by the pleiotropic structure of the underlying genetic networks.
82
Abstract
Dogs and humans have coexisted for thousands of years, but it was not until the Victorian Era that humans practiced selective breeding to produce the modern standards we see today. Strong artificial selection during the breed formation period has simplified the genetic architecture of complex traits and caused an enrichment of identity-by-descent (IBD) segments in the dog genome. This study demonstrates the value of IBD segments and utilizes them to infer the recent demography of canids, predict case-control status for complex traits, locate regions of the genome potentially linked to inbreeding depression, and identify understudied breeds where there is potential to discover new disease-associated variants.
Domestic dogs have experienced population bottlenecks, recent inbreeding, and strong artificial selection. These processes have simplified the genetic architecture of complex traits, allowed deleterious variation to persist, and increased both identity-by-descent (IBD) segments and runs of homozygosity (ROH). As such, dogs provide an excellent model for examining how these evolutionary processes influence disease. We assembled a dataset containing 4,414 breed dogs, 327 village dogs, and 380 wolves genotyped at 117,288 markers and data for clinical and morphological phenotypes. Breed dogs have an enrichment of IBD and ROH, relative to both village dogs and wolves, and we use these patterns to show that breed dogs have experienced differing severities of bottlenecks in their recent past. We then found that ROH burden is associated with phenotypes in breed dogs, such as lymphoma. We next test the prediction that breeds with greater ROH have more disease alleles reported in the Online Mendelian Inheritance in Animals (OMIA). Surprisingly, the number of causal variants identified correlates with the popularity of that breed rather than the ROH or IBD burden, suggesting an ascertainment bias in OMIA. Lastly, we use the distribution of ROH across the genome to identify genes with depletions of ROH as potential hotspots for inbreeding depression and find multiple exons where ROH are never observed. Our results suggest that inbreeding has played a large role in shaping genetic and phenotypic variation in dogs and that future work on understudied breeds may reveal new disease-causing variation.
83
The long-term genetic stability and individual specificity of the human gut microbiome. Cell 2021; 184:2302-2315.e12. [PMID: 33838112 DOI: 10.1016/j.cell.2021.03.024] [Citation(s) in RCA: 143] [Impact Index Per Article: 47.7] [Received: 07/09/2020] [Revised: 12/02/2020] [Accepted: 03/11/2021] [Indexed: 12/11/2022]
Abstract
By following up the gut microbiome, 51 human phenotypes and plasma levels of 1,183 metabolites in 338 individuals after 4 years, we characterize microbial stability and variation in relation to host physiology. Using these individual-specific and temporally stable microbial profiles, including bacterial SNPs and structural variations, we develop a microbial fingerprinting method that shows up to 85% accuracy in classifying metagenomic samples taken 4 years apart. Application of our fingerprinting method to the independent HMP cohort results in 95% accuracy for samples taken 1 year apart. We further observe temporal changes in the abundance of multiple bacterial species, metabolic pathways, and structural variation, as well as strain replacement. We report 190 longitudinal microbial associations with host phenotypes and 519 associations with plasma metabolites. These associations are enriched for cardiometabolic traits, vitamin B, and uremic toxins. Finally, mediation analysis suggests that the gut microbiome may influence cardiometabolic health through its metabolites.
84
Abstract
Although oral venom systems are ecologically important characters, how they originated is still unclear. In this study, we show that oral venom systems likely originated from a gene regulatory network conserved across amniotes. This network, which we term the “metavenom network,” comprises over 3,000 housekeeping genes that are coexpressed with venom and play a role in protein folding and modification. Comparative transcriptomics revealed that the network is conserved between venom glands of snakes and salivary glands of mammals. This suggests that while these tissues have evolved different functions, they share a common regulatory core that has persisted since their common ancestor. We propose several evolutionary mechanisms that can utilize this common regulatory core to give rise to venomous animals from their nonvenomous ancestors.
Oral venom systems evolved multiple times in numerous vertebrates, enabling the exploitation of unique predatory niches. Yet how and when they evolved remains poorly understood. Up to now, most research on venom evolution has focused strictly on the toxins. However, using toxins present in modern day animals to trace the origin of the venom system is difficult, since they tend to evolve rapidly, show complex patterns of expression, and were incorporated into the venom arsenal relatively recently. Here we focus on gene regulatory networks associated with the production of toxins in snakes, rather than the toxins themselves. We found that overall venom gland gene expression was surprisingly well conserved when compared to salivary glands of other amniotes. We characterized the “metavenom network,” a network of ∼3,000 nonsecreted housekeeping genes that are strongly coexpressed with the toxins and are primarily involved in protein folding and modification. Conserved across amniotes, this network was coopted for venom evolution by exaptation of existing members and the recruitment of new toxin genes. For instance, starting from this common molecular foundation, Heloderma lizards, shrews, and solenodon evolved venoms in parallel by overexpression of kallikreins, which were common in ancestral saliva and induce vasodilation when injected, causing circulatory shock. Derived venoms, such as those of snakes, incorporated novel toxins, though they still rely on hypotension for prey immobilization. These similarities suggest repeated cooption of shared molecular machinery for the evolution of oral venom in mammals and reptiles, blurring the line between truly venomous animals and their ancestors.
85
Grinberg NF, Wallace C. Multi-tissue transcriptome-wide association studies. Genet Epidemiol 2021; 45:324-337. [PMID: 33369784 PMCID: PMC8048510 DOI: 10.1002/gepi.22374] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/13/2020] [Revised: 11/04/2020] [Accepted: 11/18/2020] [Indexed: 12/20/2022]
Abstract
A transcriptome-wide association study (TWAS) attempts to identify disease-associated genes by imputing gene expression into a genome-wide association study (GWAS) using an expression quantitative trait loci (eQTL) data set and then testing for associations with a trait of interest. Regulatory processes may be shared across related tissues and one natural extension of TWAS is harnessing cross-tissue correlation in gene expression to improve prediction accuracy. Here, we studied multi-tissue extensions of lasso regression and random forests (RF), joint lasso and RF-MTL (multi-task learning RF), respectively. We found that, on our chosen eQTL data set, multi-tissue methods were generally more accurate than their single-tissue counterparts, with RF-MTL performing the best. Simulations showed that these benefits generally translated into more associated genes identified, although they also highlighted that joint lasso had a tendency to erroneously identify genes in one tissue if there existed an eQTL signal for that gene in another. Applying the four methods to a type 1 diabetes GWAS, we found that multi-tissue methods found more unique associated genes for most of the tissues considered. We conclude that multi-tissue methods are competitive with and, for some cell types, superior to single-tissue approaches and hold much promise for TWAS.
86
Durvasula A, Lohmueller KE. Negative selection on complex traits limits phenotype prediction accuracy between populations. Am J Hum Genet 2021; 108:620-631. [PMID: 33691092 DOI: 10.1016/j.ajhg.2021.02.013] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 09/15/2020] [Accepted: 02/17/2021] [Indexed: 12/22/2022] Open
Abstract
Phenotype prediction is a key goal for medical genetics. Unfortunately, most genome-wide association studies are done in European populations, which reduces the accuracy of predictions via polygenic scores in non-European populations. Here, we use population genetic models to show that human demographic history and negative selection on complex traits can result in population-specific genetic architectures. For traits where alleles with the largest effect on the trait are under the strongest negative selection, approximately half of the heritability can be accounted for by variants in Europe that are absent from Africa, leading to poor performance in phenotype prediction across these populations. Further, under such a model, individuals in the tails of the genetic risk distribution may not be identified via polygenic scores generated in another population. We empirically test these predictions by building a model to stratify heritability between European-specific and shared variants and applying it to 37 traits and diseases in the UK Biobank. Across these phenotypes, ∼30% of the heritability comes from European-specific variants. We conclude that genetic association studies need to include more diverse populations to enable the utility of phenotype prediction in all populations.
87
Evidence of Genetic Overlap Between Circadian Preference and Brain White Matter Microstructure. Twin Res Hum Genet 2021; 24:1-6. [PMID: 33663638 DOI: 10.1017/thg.2021.4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/20/2023]
Abstract
Several neuroimaging studies have reported associations between brain white matter microstructure and chronotype. However, it is unclear whether those phenotypic relationships are causal or underlain by genetic factors. In the present study, we use genetic data to examine the genetic overlap and infer causal relationships between chronotype and diffusion tensor imaging (DTI) measures. We identify 29 significant pairwise genetic correlations, of which 13 also show evidence for a causal association. Genetic correlations were identified between chronotype and brain-wide mean, axial and radial diffusivities. When exploring individual tracts, 10 genetic correlations were observed with mean diffusivity, 10 with axial diffusivity, 4 with radial diffusivity and 2 with mode of anisotropy. We found evidence for a possible causal association of eveningness with white matter microstructure measures in individual tracts including the posterior limb and the retrolenticular part of the internal capsule; the genu and splenium of the corpus callosum; and the posterior, superior and anterior regions of the corona radiata. Our findings contribute to the understanding of how genes influence circadian preference and brain white matter and provide a new avenue for investigating the role of chronotype in health and disease.
88
Sinnott-Armstrong N, Naqvi S, Rivas M, Pritchard JK. GWAS of three molecular traits highlights core genes and pathways alongside a highly polygenic background. eLife 2021; 10:e58615. [PMID: 33587031 PMCID: PMC7884075 DOI: 10.7554/elife.58615] [Citation(s) in RCA: 49] [Impact Index Per Article: 16.3] [Received: 05/05/2020] [Accepted: 01/18/2021] [Indexed: 12/30/2022] Open
Abstract
Genome-wide association studies (GWAS) have been used to study the genetic basis of a wide variety of complex diseases and other traits. We describe UK Biobank GWAS results for three molecular traits (urate, IGF-1, and testosterone) with better-understood biology than most other complex traits. We find that many of the most significant hits are readily interpretable. We observe huge enrichment of associations near genes involved in the relevant biosynthesis, transport, or signaling pathways. We show how GWAS data illuminate the biology of each trait, including differences in testosterone regulation between females and males. At the same time, even these molecular traits are highly polygenic, with many thousands of variants spread across the genome contributing to trait variance. In summary, for these three molecular traits we identify strong enrichment of signal in putative core gene sets, even while most of the SNP-based heritability is driven by a massively polygenic background.
89
Umans BD, Battle A, Gilad Y. Where Are the Disease-Associated eQTLs? Trends Genet 2021; 37:109-124. [PMID: 32912663 PMCID: PMC8162831 DOI: 10.1016/j.tig.2020.08.009] [Citation(s) in RCA: 128] [Impact Index Per Article: 42.7] [Received: 05/28/2020] [Revised: 08/07/2020] [Accepted: 08/14/2020] [Indexed: 02/07/2023]
Abstract
Most disease-associated variants, although located in putatively regulatory regions, do not have detectable effects on gene expression. One explanation could be that we have not examined gene expression in the cell types or conditions that are most relevant for disease. Even large-scale efforts to study gene expression across tissues are limited to human samples obtained opportunistically or postmortem, mostly from adults. In this review we evaluate recent findings and suggest an alternative strategy, drawing on the dynamic and highly context-specific nature of gene regulation. We discuss new technologies that can extend the standard regulatory mapping framework to more diverse, disease-relevant cell types and states.
90
Kopriva S, Weber APM. Genetic encoding of complex traits. J Exp Bot 2021; 72:1-3. [PMID: 33471904 PMCID: PMC7816844 DOI: 10.1093/jxb/eraa498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/15/2023]
91
Marderstein AR, Davenport ER, Kulm S, Van Hout CV, Elemento O, Clark AG. Leveraging phenotypic variability to identify genetic interactions in human phenotypes. Am J Hum Genet 2021; 108:49-67. [PMID: 33326753 DOI: 10.1016/j.ajhg.2020.11.016] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 08/07/2020] [Accepted: 11/23/2020] [Indexed: 12/13/2022] Open
Abstract
Although thousands of loci have been associated with human phenotypes, the role of gene-environment (GxE) interactions in determining individual risk of human diseases remains unclear. This is partly because of the severe erosion of statistical power resulting from the massive number of statistical tests required to detect such interactions. Here, we focus on improving the power of GxE tests by developing a statistical framework for assessing quantitative trait loci (QTLs) associated with the trait means and/or trait variances. When applying this framework to body mass index (BMI), we find that GxE discovery and replication rates are significantly higher when prioritizing genetic variants associated with the variance of the phenotype (vQTLs) compared to when assessing all genetic variants. Moreover, we find that vQTLs are enriched for associations with other non-BMI phenotypes having strong environmental influences, such as diabetes or ulcerative colitis. We show that GxE effects first identified in quantitative traits such as BMI can be used for GxE discovery in disease phenotypes such as diabetes. A clear conclusion is that strong GxE interactions mediate the genetic contribution to body weight and diabetes risk.
92
Spear ML, Diaz-Papkovich A, Ziv E, Yracheta JM, Gravel S, Torgerson DG, Hernandez RD. Recent shifts in the genomic ancestry of Mexican Americans may alter the genetic architecture of biomedical traits. eLife 2020; 9:e56029. [PMID: 33372659 PMCID: PMC7771964 DOI: 10.7554/elife.56029] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 02/13/2020] [Accepted: 12/13/2020] [Indexed: 11/13/2022] Open
Abstract
People in the Americas represent a diverse continuum of populations with varying degrees of admixture among African, European, and Amerindigenous ancestries. In the United States, populations with non-European ancestry remain understudied, and thus little is known about the genetic architecture of phenotypic variation in these populations. Using genotype data from the Hispanic Community Health Study/Study of Latinos, we find that Amerindigenous ancestry increased by an average of ~20% between the 1940s and the 1990s in Mexican Americans. These patterns result from complex interactions between several population and cultural factors which shaped patterns of genetic variation and influenced the genetic architecture of complex traits in Mexican Americans. We show for height how polygenic risk scores based on summary statistics from a European-based genome-wide association study perform poorly in Mexican Americans. Our findings reveal temporal changes in population structure within Hispanics/Latinos that may influence biomedical traits, demonstrating a need to improve our understanding of admixed populations.
93
Torres JM, Abdalla M, Payne A, Fernandez-Tajes J, Thurner M, Nylander V, Gloyn AL, Mahajan A, McCarthy MI. A Multi-omic Integrative Scheme Characterizes Tissues of Action at Loci Associated with Type 2 Diabetes. Am J Hum Genet 2020; 107:1011-1028. [PMID: 33186544 PMCID: PMC7820628 DOI: 10.1016/j.ajhg.2020.10.009] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 06/25/2020] [Accepted: 10/20/2020] [Indexed: 12/30/2022] Open
Abstract
Resolving the molecular processes that mediate genetic risk remains a challenge because most disease-associated variants are non-coding and functional characterization of these signals requires knowledge of the specific tissues and cell-types in which they operate. To address this challenge, we developed a framework for integrating tissue-specific gene expression and epigenomic maps to obtain "tissue-of-action" (TOA) scores for each association signal by systematically partitioning posterior probabilities from Bayesian fine-mapping. We applied this scheme to credible set variants for 380 association signals from a recent GWAS meta-analysis of type 2 diabetes (T2D) in Europeans. The resulting tissue profiles underscored a predominant role for pancreatic islets and, to a lesser extent, adipose and liver, particularly among signals with greater fine-mapping resolution. We incorporated resulting TOA scores into a rule-based classifier and validated the tissue assignments through comparison with data from cis-eQTL enrichment, functional fine-mapping, RNA co-expression, and patterns of physiological association. In addition to implicating signals with a single TOA, we found evidence for signals with shared effects in multiple tissues as well as distinct tissue profiles between independent signals within heterogeneous loci. Lastly, we demonstrated that TOA scores can be directly coupled with eQTL colocalization to further resolve effector transcripts at T2D signals. This framework guides mechanistic inference by directing functional validation studies to the most relevant tissues and can gain power as fine-mapping resolution and cell-specific annotations become richer. This method is generalizable to all complex traits with relevant annotation data and is made available as an R package.
|
94
|
Brion C, Lutz SM, Albert FW. Simultaneous quantification of mRNA and protein in single cells reveals post-transcriptional effects of genetic variation. eLife 2020; 9:60645. [PMID: 33191917 PMCID: PMC7707838 DOI: 10.7554/elife.60645] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 07/02/2020] [Accepted: 11/14/2020] [Indexed: 01/27/2023] Open
Abstract
Trans-acting DNA variants may specifically affect mRNA or protein levels of genes located throughout the genome. However, prior work compared trans-acting loci mapped in separate studies, many of which had limited statistical power. Here, we developed a CRISPR-based system for simultaneous quantification of mRNA and protein of a given gene via dual fluorescent reporters in single, live cells of the yeast Saccharomyces cerevisiae. In large populations of recombinant cells from a cross between two genetically divergent strains, we mapped 86 trans-acting loci affecting the expression of ten genes. Less than 20% of these loci had concordant effects on mRNA and protein of the same gene. Most loci influenced protein but not mRNA of a given gene. One locus harbored a premature stop variant in the YAK1 kinase gene that had specific effects on protein or mRNA of dozens of genes. These results demonstrate complex, post-transcriptional genetic effects on gene expression.
|
95
|
Foulkes AS, Selvaggi C, Cao T, O'Reilly ME, Cynn E, Ma P, Lumish H, Xue C, Reilly MP. Nonconserved Long Intergenic Noncoding RNAs Associate With Complex Cardiometabolic Disease Traits. Arterioscler Thromb Vasc Biol 2020; 41:501-511. [PMID: 33176448 DOI: 10.1161/atvbaha.120.315045] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/13/2022]
Abstract
OBJECTIVE: Transcriptome profiling of human tissues has revealed thousands of long intergenic noncoding RNAs (lincRNAs) at loci identified through large-scale genome-wide studies for complex cardiometabolic traits. This raises the question of whether genetic variation at nonconserved lincRNAs has any systematic association with complex disease, and if so, how different this pattern is from that of conserved lincRNAs. We evaluated whether the associations between nonconserved lincRNAs and 8 complex cardiometabolic traits resemble or differ from the pattern of association for conserved lincRNAs.
APPROACH AND RESULTS: Our investigation of over 7000 lincRNA annotations from GENCODE Release 33-GRCh38.p13 for complex trait genetic associations leveraged several large, established genome-wide association study meta-analysis summary data resources, including GIANT (Genetic Investigation of Anthropometric Traits), UK Biobank, GLGC (Global Lipids Genetics Consortium), Cardiogram (Coronary Artery Disease Genome Wide Replication and Meta-Analysis), and DIAGRAM (Diabetes Genetics Replication and Meta-Analysis)/DIAMANTE (Diabetes Meta-Analysis of Trans-Ethnic Association Studies). These analyses revealed that (1) nonconserved lincRNAs associate with a range of cardiometabolic traits at a rate that is generally consistent with conserved lincRNAs; (2) these findings persist across different definitions of conservation; and (3) overall across all cardiometabolic traits, approximately one-third of genome-wide association study-associated lincRNAs are nonconserved, and this increases to about two-thirds using a more stringent definition of conservation.
CONCLUSIONS: These findings suggest that the traditional notion of conservation driving prioritization for functional and translational follow-up of complex cardiometabolic genomic discoveries may need to be revised in the context of the abundance of nonconserved long noncoding RNAs in the human genome and their apparent predilection to associate with complex cardiometabolic traits.
|
96
|
Miles AM, Huson HJ. Graduate Student Literature Review: Understanding the genetic mechanisms underlying mastitis. J Dairy Sci 2020; 104:1183-1191. [PMID: 33162090 DOI: 10.3168/jds.2020-18297] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 01/31/2020] [Accepted: 08/16/2020] [Indexed: 01/24/2023]
Abstract
Mastitis is the costliest disease facing dairy producers today; consequently, it has been the subject of substantial research focus. Efforts have evolved from an initial focus on understanding the etiology of intramammary infections to the application of preventative measures, including attempts to breed cows that are resistant to infection. However, breeding for resistance to infection has proven difficult, given the complexity of the disease and the high expense associated with assembling high-quality genotypes and phenotypes. This review provides a brief background on mastitis; illustrates current understanding of the genetics influencing mastitis and the application of this knowledge; and discusses challenges and limitations in understanding these mechanisms and applying these findings to genetic improvement strategies.
|
97
|
Genomic Prediction Informed by Biological Processes Expands Our Understanding of the Genetic Architecture Underlying Free Amino Acid Traits in Dry Arabidopsis Seeds. G3-GENES GENOMES GENETICS 2020; 10:4227-4239. [PMID: 32978264 PMCID: PMC7642941 DOI: 10.1534/g3.120.401240] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 01/29/2023]
Abstract
Plant growth, development, and nutritional quality depend on amino acid homeostasis, especially in seeds. However, our understanding of the underlying genetics influencing amino acid content and composition remains limited, with only a few candidate genes and quantitative trait loci identified to date. Improved knowledge of the genetics and biological processes that determine amino acid levels will enable researchers to use this information for plant breeding and biological discovery. Toward this goal, we used genomic prediction to identify biological processes that are associated with, and therefore potentially influence, free amino acid (FAA) composition in seeds of the model plant Arabidopsis thaliana. Markers were split into categories based on metabolic pathway annotations and fit using a genomic partitioning model to evaluate the influence of each pathway on heritability explained, model fit, and predictive ability. Selected pathways included processes known to influence FAA composition, albeit to an unknown degree, and spanned four categories: amino acid, core, specialized, and protein metabolism. Using this approach, we identified associations for pathways containing known variants for FAA traits, in addition to finding new trait-pathway associations. Markers related to amino acid metabolism, which are directly involved in FAA regulation, improved predictive ability for branched-chain amino acids and histidine. The use of genomic partitioning also revealed patterns across biochemical families, in which serine-derived FAAs were associated with protein-related annotations and aromatic FAAs were associated with specialized metabolic pathways. Taken together, these findings provide evidence that genomic partitioning is a viable strategy to uncover the relative contributions of biological processes to FAA traits in seeds, offering a promising framework to guide hypothesis testing and narrow the search space for candidate genes.
|
98
|
Fritsche LG, Patil S, Beesley LJ, VandeHaar P, Salvatore M, Ma Y, Peng RB, Taliun D, Zhou X, Mukherjee B. Cancer PRSweb: An Online Repository with Polygenic Risk Scores for Major Cancer Traits and Their Evaluation in Two Independent Biobanks. Am J Hum Genet 2020; 107:815-836. [PMID: 32991828 PMCID: PMC7675001 DOI: 10.1016/j.ajhg.2020.08.025] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 02/23/2020] [Accepted: 08/28/2020] [Indexed: 02/06/2023] Open
Abstract
To facilitate scientific collaboration on polygenic risk score (PRS) research, we created an extensive PRS online repository for 35 common cancer traits integrating freely available genome-wide association study (GWAS) summary statistics from three sources: published GWASs, the NHGRI-EBI GWAS Catalog, and UK Biobank-based GWASs. Our framework condenses these summary statistics into PRSs using various approaches such as linkage disequilibrium pruning/p value thresholding (fixed or data-adaptively optimized thresholds) and penalized, genome-wide effect size weighting. We evaluated the PRSs in two biobanks: the Michigan Genomics Initiative (MGI), a longitudinal biorepository effort at Michigan Medicine, and the population-based UK Biobank (UKB). For each PRS construct, we provide measures on predictive performance and discrimination. Besides PRS evaluation, the Cancer-PRSweb platform features construct downloads and phenome-wide PRS association study results (PRS-PheWAS) for predictive PRSs. We expect this integrated platform to accelerate PRS-related cancer research.
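As a rough illustration of the p value thresholding idea mentioned in this abstract, the following minimal Python sketch sums effect-size-weighted allele dosages over variants whose GWAS p value passes a threshold. It is not the Cancer PRSweb pipeline: the function name `thresholded_prs`, the variant IDs, and all numbers are hypothetical, and the sketch assumes the variants have already been pruned for linkage disequilibrium.

```python
from typing import Dict


def thresholded_prs(
    dosages: Dict[str, float],
    effects: Dict[str, float],
    pvalues: Dict[str, float],
    p_threshold: float,
) -> float:
    """Sum beta * dosage over variants with GWAS p value <= p_threshold.

    Assumes the variant set was LD-pruned beforehand (not done here).
    Variants missing from the individual's dosages are skipped.
    """
    score = 0.0
    for variant, beta in effects.items():
        if pvalues[variant] <= p_threshold and variant in dosages:
            score += beta * dosages[variant]
    return score


# Hypothetical toy summary statistics and one individual's allele dosages.
effects = {"rs1": 0.5, "rs2": -0.25, "rs3": 0.125}
pvalues = {"rs1": 1e-9, "rs2": 1e-3, "rs3": 0.5}
dosages = {"rs1": 2.0, "rs2": 1.0, "rs3": 0.0}

strict = thresholded_prs(dosages, effects, pvalues, 5e-8)   # only rs1 passes
lenient = thresholded_prs(dosages, effects, pvalues, 0.01)  # rs1 and rs2 pass
print(strict, lenient)
```

A data-adaptive threshold, as the abstract describes, would amount to evaluating several values of `p_threshold` against a held-out phenotype and keeping the best-performing one.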
|
99
|
Hillis DA, Yadgary L, Weinstock GM, Pardo-Manuel de Villena F, Pomp D, Fowler AS, Xu S, Chan F, Garland T. Genetic Basis of Aerobically Supported Voluntary Exercise: Results from a Selection Experiment with House Mice. Genetics 2020; 216:781-804. [PMID: 32978270 PMCID: PMC7648575 DOI: 10.1534/genetics.120.303668] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 05/08/2020] [Accepted: 09/18/2020] [Indexed: 12/14/2022] Open
Abstract
The biological basis of exercise behavior is increasingly relevant for maintaining healthy lifestyles. Various quantitative genetic studies and selection experiments have conclusively demonstrated substantial heritability for exercise behavior in both humans and laboratory rodents. In the "High Runner" selection experiment, four replicate lines of Mus domesticus were bred for high voluntary wheel running (HR), along with four nonselected control (C) lines. After 61 generations, the genomes of 79 mice (9-10 from each line) were fully sequenced and single nucleotide polymorphisms (SNPs) were identified. We used nested ANOVA with MIVQUE estimation and other approaches to compare allele frequencies between the HR and C lines for both SNPs and haplotypes. Approximately 61 genomic regions, across all somatic chromosomes, showed evidence of differentiation; 12 of these regions were differentiated by all methods of analysis. Gene function was inferred largely using Panther gene ontology terms and KO phenotypes associated with genes of interest. Some of the differentiated genes are known to be associated with behavior/motivational systems and/or athletic ability, including Sorl1, Dach1, and Cdh10. Sorl1 is a sorting protein associated with cholinergic neuron morphology, vascular wound healing, and metabolism. Dach1 is associated with limb bud development and neural differentiation. Cdh10 is a calcium ion binding protein associated with phrenic neurons. Overall, these results indicate that selective breeding for high voluntary exercise has resulted in changes in allele frequencies for multiple genes associated with both motivation and ability for endurance exercise, providing candidate genes that may explain phenotypic changes observed in previous studies.
|
100
|
Jakobson CM, Jarosz DF. What Has a Century of Quantitative Genetics Taught Us About Nature's Genetic Tool Kit? Annu Rev Genet 2020; 54:439-464. [PMID: 32897739 DOI: 10.1146/annurev-genet-021920-102037] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 11/09/2022]
Abstract
The complexity of heredity has been appreciated for decades: Many traits are controlled not by a single genetic locus but instead by polymorphisms throughout the genome. The importance of complex traits in biology and medicine has motivated diverse approaches to understanding their detailed genetic bases. Here, we focus on recent systematic studies, many in budding yeast, which have revealed that large numbers of all kinds of molecular variation, from noncoding to synonymous variants, can make significant contributions to phenotype. Variants can affect different traits in opposing directions, and their contributions can be modified by both the environment and the epigenetic state of the cell. The integration of prospective (synthesizing and analyzing variants) and retrospective (examining standing variation) approaches promises to reveal how natural selection shapes quantitative traits. Only by comprehensively understanding nature's genetic tool kit can we predict how phenotypes arise from the complex ensembles of genetic variants in living organisms.
|