51
Speed D, Kaphle A, Balding DJ. SNP-based heritability and selection analyses: Improved models and new results. Bioessays 2022; 44:e2100170. [PMID: 35279859 DOI: 10.1002/bies.202100170] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/07/2021] [Revised: 03/02/2022] [Accepted: 03/03/2022] [Indexed: 01/15/2023]
Abstract
Complex-trait genetics has advanced dramatically through methods to estimate the heritability tagged by SNPs, both genome-wide and in genomic regions of interest such as those defined by functional annotations. The models underlying many of these analyses are inadequate, and consequently many SNP-heritability results published to date are inaccurate. Here, we review the modelling issues, both for analyses based on individual genotype data and association test statistics, highlighting the role of a low-dimensional model for the heritability of each SNP. We use state-of-art models to present updated results about how heritability is distributed with respect to functional annotations in the human genome, and how it varies with allele frequency, which can reflect purifying selection. Our results give finer detail to the picture that has emerged in recent years of complex trait heritability widely dispersed across the genome. Confounding due to population structure remains a problem that summary statistic analyses cannot reliably overcome. Also see the video abstract here: https://youtu.be/WC2u03V65MQ.
52
Kuijpers Y, Domínguez-Andrés J, Bakker OB, Gupta MK, Grasshoff M, Xu CJ, Joosten LAB, Bertranpetit J, Netea MG, Li Y. Evolutionary Trajectories of Complex Traits in European Populations of Modern Humans. Front Genet 2022; 13:833190. [PMID: 35419030 PMCID: PMC8995853 DOI: 10.3389/fgene.2022.833190] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2021] [Accepted: 03/11/2022] [Indexed: 11/14/2022]
Abstract
Humans have a great diversity in phenotypes, influenced by genetic, environmental, nutritional, cultural, and social factors. Understanding the historical trends of physiological traits can shed light on human physiology, as well as elucidate the factors that influence human diseases. Here we built genome-wide polygenic scores for heritable traits, including height, body mass index, lipoprotein concentrations, cardiovascular disease, and intelligence, using summary statistics of genome-wide association studies in Europeans. Subsequently, we applied these scores to the genomes of ancient European populations. Our results revealed that after the Neolithic, European populations experienced an increase in height and intelligence scores, decreased their skin pigmentation, while the risk for coronary artery disease increased through a genetic trajectory favoring low HDL concentrations. These results are a reflection of the continuous evolutionary processes in humans and highlight the impact that the Neolithic revolution had on our lifestyle and health.
53
Foster CSP, Van Dyke JU, Thompson MB, Smith NMA, Simpfendorfer CA, Murphy CR, Whittington CM. Different genes are recruited during convergent evolution of pregnancy and the placenta. Mol Biol Evol 2022; 39:6564414. [PMID: 35388432 PMCID: PMC9048886 DOI: 10.1093/molbev/msac077] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/12/2022]
Abstract
The repeated evolution of the same traits in distantly related groups (convergent evolution) raises a key question in evolutionary biology: do the same genes underpin convergent phenotypes? Here we explore one such trait, viviparity (live birth), which qualitative studies suggest may indeed have evolved via genetic convergence. There are >150 independent origins of live birth in vertebrates, providing a uniquely powerful system to test the mechanisms underpinning convergence in morphology, physiology, and/or gene recruitment during pregnancy. We compared transcriptomic data from eight vertebrates (lizards, mammals, sharks) that gestate embryos within the uterus. Since many previous studies detected qualitative similarities in gene use during independent origins of pregnancy, we expected to find significant overlap in gene use in viviparous taxa. However, we found no more overlap in uterine gene expression associated with viviparity than we would expect by chance alone. Each viviparous lineage exhibits the same core set of uterine physiological functions, yet, contrary to prevailing assumptions about this trait, we find that none of the same genes are differentially expressed in all viviparous lineages, or even in all viviparous amniote lineages. Therefore, across distantly related vertebrates, different genes have been recruited to support the morphological and physiological changes required for successful pregnancy. We conclude that redundancies in gene function have enabled the repeated evolution of viviparity through recruitment of different genes from genomic 'toolboxes', which are uniquely constrained by the ancestries of each lineage.
54
Macias-Velasco JF, St Pierre CL, Wayhart JP, Yin L, Spears L, Miranda MA, Carson C, Funai K, Cheverud JM, Semenkovich CF, Lawson HA. Parent-of-origin effects propagate through networks to shape metabolic traits. eLife 2022; 11:e72989. [PMID: 35356864 PMCID: PMC9075957 DOI: 10.7554/elife.72989] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/12/2021] [Accepted: 03/25/2022] [Indexed: 11/16/2022]
Abstract
Parent-of-origin effects are unexpectedly common in complex traits, including metabolic and neurological traits. Parent-of-origin effects can be modified by the environment, but the architecture of these gene-by-environmental effects on phenotypes remains to be unraveled. Previously, quantitative trait loci (QTL) showing context-specific parent-of-origin effects on metabolic traits were mapped in the F16 generation of an advanced intercross between LG/J and SM/J inbred mice. However, these QTL were not enriched for known imprinted genes, suggesting another mechanism is needed to explain these parent-of-origin effects phenomena. We propose that non-imprinted genes can generate complex parent-of-origin effects on metabolic traits through interactions with imprinted genes. Here, we employ data from mouse populations at different levels of intercrossing (F0, F1, F2, F16) of the LG/J and SM/J inbred mouse lines to test this hypothesis. Using multiple populations and incorporating genetic, genomic, and physiological data, we leverage orthogonal evidence to identify networks of genes through which parent-of-origin effects propagate. We identify a network comprised of three imprinted and six non-imprinted genes that show parent-of-origin effects. This epistatic network forms a nutritional responsive pathway and the genes comprising it jointly serve cellular functions associated with growth. We focus on two genes, Nnat and F2r, whose interaction associates with serum glucose levels across generations in high-fat-fed females. Single-cell RNAseq reveals that Nnat expression increases and F2r expression decreases in pre-adipocytes along an adipogenic trajectory, a result that is consistent with our observations in bulk white adipose tissue.
55
Abstract
Significance: To adapt to arboreal lifestyles, treefrogs have evolved a suite of complex traits that support vertical movement and gliding, thus presenting a unique case for studying the genetic basis for traits causally linked to vertical niche expansion. Here, based on two de novo-assembled Asian treefrog genomes, we determined that genes involved in limb development and keratin cytoskeleton likely played a role in the evolution of their climbing systems. Behavioral and morphological evaluation and time-ordered gene coexpression network analysis revealed the developmental patterns and regulatory pathways of the webbed feet used for gliding in Rhacophorus kio.
56
Macdonald SJ, Cloud-Richardson KM, Sims-West DJ, Long AD. Powerful, efficient QTL mapping in Drosophila melanogaster using bulked phenotyping and pooled sequencing. Genetics 2022; 220:iyab238. [PMID: 35100395 PMCID: PMC8893256 DOI: 10.1093/genetics/iyab238] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/11/2021] [Accepted: 12/19/2021] [Indexed: 01/22/2024]
Abstract
Despite the value of recombinant inbred lines for the dissection of complex traits, large panels can be difficult to maintain, distribute, and phenotype. An attractive alternative to recombinant inbred lines for many traits leverages selecting phenotypically extreme individuals from a segregating population, and subjecting pools of selected and control individuals to sequencing. Under a bulked or extreme segregant analysis paradigm, genomic regions contributing to trait variation are revealed as frequency differences between pools. Here, we describe such an extreme quantitative trait locus, or extreme quantitative trait loci, mapping strategy that builds on an existing multiparental population, the Drosophila Synthetic Population Resource, and involves phenotyping and genotyping a population derived by mixing hundreds of Drosophila Synthetic Population Resource recombinant inbred lines. Simulations demonstrate that challenging, yet experimentally tractable extreme quantitative trait loci designs (≥4 replicates, ≥5,000 individuals/replicate, and selecting the 5-10% most extreme animals) yield at least the same power as traditional recombinant inbred line-based quantitative trait loci mapping and can localize variants with sub-centimorgan resolution. We empirically demonstrate the effectiveness of the approach using a 4-fold replicated extreme quantitative trait loci experiment that identifies 7 quantitative trait loci for caffeine resistance. Two mapped extreme quantitative trait loci factors replicate loci previously identified in recombinant inbred lines, 6/7 are associated with excellent candidate genes, and RNAi knock-downs support the involvement of 4 genes in the genetic control of trait variation. For many traits of interest to drosophilists, a bulked phenotyping/genotyping extreme quantitative trait loci design has considerable advantages.
57
Mi S, Shi Y, Dari G, Yu Y. Function of m6A and its regulation of domesticated animals' complex traits. J Anim Sci 2022; 100:6524534. [PMID: 35137116 PMCID: PMC8942107 DOI: 10.1093/jas/skac034] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/04/2021] [Accepted: 02/06/2022] [Indexed: 11/14/2022]
Abstract
N6-methyladenosine (m6A) is the most functionally important epigenetic modification in RNA. The m6A modification is widespread in mRNA and noncoding RNA, influences mRNA processing, and regulates the secondary structure and maturation of noncoding RNA. Studies have shown the important regulatory roles of m6A modification in animals' complex traits, such as development, immunity, and reproduction-related traits. As an important intermediate stage from animal genome to phenotype, the function of m6A in the complex trait formation of domestic animals cannot be neglected. This review discusses recent research advances on m6A modification in well-studied organisms, such as human and model organisms, and introduces m6A detection technologies, small-molecule inhibitors of m6A-related enzymes, interactions between m6A and other biological processes, and the regulation mechanisms of m6A in domesticated animals' complex traits.
58
Wu Y, Burch KS, Ganna A, Pajukanta P, Pasaniuc B, Sankararaman S. Fast estimation of genetic correlation for biobank-scale data. Am J Hum Genet 2022; 109:24-32. [PMID: 34861179 PMCID: PMC8764132 DOI: 10.1016/j.ajhg.2021.11.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2021] [Accepted: 11/09/2021] [Indexed: 11/24/2022]
Abstract
Genetic correlation is an important parameter in efforts to understand the relationships among complex traits. Current methods that analyze individual genotype data for estimating genetic correlation are challenging to scale to large datasets. Methods that analyze summary data, while being computationally efficient, tend to yield estimates of genetic correlation with reduced precision. We propose SCORE (scalable genetic correlation estimator), a randomized method of moments estimator of genetic correlation that is both scalable and accurate. SCORE obtains more precise estimates of genetic correlations relative to summary-statistic methods that can be applied at scale; it achieves a 44% reduction in standard error relative to LD-score regression (LDSC) and a 20% reduction relative to high-definition likelihood (HDL) (averaged over all simulations). The efficiency of SCORE enables computation of genetic correlations on the UK Biobank dataset, consisting of ≈300 K individuals and ≈500 K SNPs, in a few hours (orders of magnitude faster than methods that analyze individual data, such as GCTA). Across 780 pairs of traits in 291,273 unrelated white British individuals in the UK Biobank, SCORE identifies significant genetic correlation between 200 additional pairs of traits over LDSC (beyond the 245 pairs identified by both).
59
Hayward LK, Sella G. Polygenic adaptation after a sudden change in environment. eLife 2022; 11:66697. [PMID: 36155653 PMCID: PMC9683794 DOI: 10.7554/elife.66697] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 01/19/2021] [Accepted: 07/18/2022] [Indexed: 11/13/2022]
Abstract
Polygenic adaptation is thought to be ubiquitous, yet remains poorly understood. Here, we model this process analytically, in the plausible setting of a highly polygenic, quantitative trait that experiences a sudden shift in the fitness optimum. We show how the mean phenotype changes over time, depending on the effect sizes of loci that contribute to variance in the trait, and characterize the allele dynamics at these loci. Notably, we describe the two phases of the allele dynamics: The first is a rapid phase, in which directional selection introduces small frequency differences between alleles whose effects are aligned with or opposed to the shift, ultimately leading to small differences in their probability of fixation during a second, longer phase, governed by stabilizing selection. As we discuss, key results should hold in more general settings and have important implications for efforts to identify the genetic basis of adaptation in humans and other species.
60
Sohail M, Izarraras-Gomez A, Ortega-Del Vecchyo D. Populations, Traits, and Their Spatial Structure in Humans. Genome Biol Evol 2021; 13:evab272. [PMID: 34894236 PMCID: PMC8715524 DOI: 10.1093/gbe/evab272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/06/2021] [Indexed: 11/16/2022]
Abstract
The spatial distribution of genetic variants is jointly determined by geography, past demographic processes, natural selection, and its interplay with environmental variation. A fraction of these genetic variants are "causal alleles" that affect the manifestation of a complex trait. The effect exerted by these causal alleles on complex traits can be independent of or dependent on the environment. Understanding the evolutionary processes that shape the spatial structure of causal alleles is key to comprehending the spatial distribution of complex traits. Natural selection, past population size changes, range expansions, consanguinity, assortative mating, archaic introgression, admixture, and the environment can alter the frequencies, effect sizes, and heterozygosities of causal alleles. This provides a genetic axis along which complex traits can vary. However, complex traits also vary along biogeographical and sociocultural axes which are often correlated with genetic axes in complex ways. The purpose of this review is to consider these genetic and environmental axes in concert and examine the ways they can help us decipher the variation in complex traits that is visible in humans today. This initiative necessarily implies a discussion of populations, traits, the ability to infer and interpret "genetic" components of complex traits, and how these have been impacted by adaptive events. In this review, we provide a history-aware discussion on these topics using both the recent and more distant past of our academic discipline and its relevant contexts.
61
Wang B, Yang J, Qiu S, Bai Y, Qin ZS. Systematic Exploration in Tissue-Pathway Associations of Complex Traits Using Comprehensive eQTLs Catalog. Front Big Data 2021; 4:719737. [PMID: 34805976 PMCID: PMC8595594 DOI: 10.3389/fdata.2021.719737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2021] [Accepted: 10/13/2021] [Indexed: 11/13/2022]
Abstract
The collection of expression quantitative trait loci (eQTLs) is an important resource to study complex traits through understanding where and how transcriptional regulations are controlled by genetic variations in the non-coding regions of the genome. Previous studies have focused on associating eQTLs with traits to identify the roles of trait-related eQTLs and their corresponding target genes involved in trait determination. Since most genes function as a part of pathways in a systematic manner, it is crucial to explore the pathways’ involvements in complex traits to test potentially novel hypotheses and to reveal underlying mechanisms of disease pathogenesis. In this study, we expanded and applied loci2path software to perform large-scale eQTLs enrichment [i.e., eQTLs’ target genes (eGenes) enrichment] analysis at pathway level to identify the tissue-specific enriched pathways within trait-related genomic intervals. By utilizing 13,791,909 eQTLs cataloged in the Genotype-Tissue Expression (GTEx) V8 data for 49 tissue types, 2,893 pathway sets reported from MSigDB, and query regions derived from the Phenotype-Genotype Integrator (PheGenI) catalog, we identified intriguing biological pathways that are likely to be involved in ten traits [Alzheimer’s disease (AD), body mass index, Parkinson’s disease (PD), schizophrenia, amyotrophic lateral sclerosis, non-small cell lung cancer (NSCLC), stroke, blood pressure, autism spectrum disorder, and myocardial infarction]. Furthermore, we extracted the most significant pathways for AD, such as BioCarta D4-GDI pathway and WikiPathways sulfation biotransformation reaction and viral acute myocarditis pathways, to study specific genes within pathways. Our data presented new hypotheses in AD pathogenesis supported by previous studies, like the increased level of caspase-3 in the amygdala that cleaves GDP dissociation inhibitor and binds to beta-amyloid, leading to increased apoptosis and neuronal loss. 
Our findings also revealed potential pathogenesis mechanisms for PD, schizophrenia, NSCLC, blood pressure, autism spectrum disorder, and myocardial infarction, which were consistent with past studies. Our results indicated that loci2path's eQTLs enrichment test was valuable in unveiling novel biological mechanisms of complex traits. The discovered mechanisms of disease pathogenesis and traits require further in-depth analysis and experimental validation.
62
Aponte JD, Katz DC, Roth DM, Vidal-García M, Liu W, Andrade F, Roseman CC, Murray SA, Cheverud J, Graf D, Marcucio RS, Hallgrímsson B. Relating multivariate shapes to genescapes using phenotype-biological process associations for craniofacial shape. eLife 2021; 10:68623. [PMID: 34779766 PMCID: PMC8631940 DOI: 10.7554/elife.68623] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/21/2021] [Accepted: 11/12/2021] [Indexed: 12/20/2022]
Abstract
Realistic mappings of genes to morphology are inherently multivariate on both sides of the equation. The importance of coordinated gene effects on morphological phenotypes is clear from the intertwining of gene actions in signaling pathways, gene regulatory networks, and developmental processes underlying the development of shape and size. Yet, current approaches tend to focus on identifying and localizing the effects of individual genes and rarely leverage the information content of high-dimensional phenotypes. Here, we explicitly model the joint effects of biologically coherent collections of genes on a multivariate trait – craniofacial shape – in a sample of n = 1145 mice from the Diversity Outbred (DO) experimental line. We use biological process Gene Ontology (GO) annotations to select skeletal and facial development gene sets and solve for the axis of shape variation that maximally covaries with gene set marker variation. We use our process-centered, multivariate genotype-phenotype (process MGP) approach to determine the overall contributions to craniofacial variation of genes involved in relevant processes and how variation in different processes corresponds to multivariate axes of shape variation. Further, we compare the directions of effect in phenotype space of mutations to the primary axis of shape variation associated with broader pathways within which they are thought to function. Finally, we leverage the relationship between mutational and pathway-level effects to predict phenotypic effects beyond craniofacial shape in specific mutants. We also introduce an online application that provides users the means to customize their own process-centered craniofacial shape analyses in the DO. The process-centered approach is generally applicable to any continuously varying phenotype and thus has wide-reaching implications for complex trait genetics.
63
Kondratyev NV, Alfimova MV, Golov AK, Golimbet VE. Bench Research Informed by GWAS Results. Cells 2021; 10:3184. [PMID: 34831407 PMCID: PMC8623533 DOI: 10.3390/cells10113184] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/23/2021] [Revised: 11/11/2021] [Accepted: 11/11/2021] [Indexed: 12/15/2022]
Abstract
Scientifically interesting as well as practically important phenotypes often belong to the realm of complex traits. To the extent that these traits are hereditary, they are usually 'highly polygenic'. The study of such traits presents a challenge for researchers, as the complex genetic architecture of such traits makes it nearly impossible to utilise many of the usual methods of reverse genetics, which often focus on specific genes. In recent years, thousands of genome-wide association studies (GWAS) were undertaken to explore the relationships between complex traits and a large number of genetic factors, most of which are characterised by tiny effects. In this review, we aim to familiarise 'wet biologists' with approaches for the interpretation of GWAS results, to clarify some issues that may seem counterintuitive and to assess the possibility of using GWAS results in experiments on various complex traits.
64
Genetic overlap and causality between blood metabolites and migraine. Am J Hum Genet 2021; 108:2086-2098. [PMID: 34644541 DOI: 10.1016/j.ajhg.2021.09.011] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/16/2021] [Accepted: 09/17/2021] [Indexed: 12/11/2022]
Abstract
The availability of genome-wide association studies (GWASs) for human blood metabolome provides an excellent opportunity for studying metabolism in a heritable disease such as migraine. Utilizing GWAS summary statistics, we conduct comprehensive pairwise genetic analyses to estimate polygenic genetic overlap and causality between 316 unique blood metabolite levels and migraine risk. We find significant genome-wide genetic overlap between migraine and 44 metabolites, mostly lipid and organic acid metabolic traits (FDR < 0.05). We also identify 36 metabolites, mostly related to lipoproteins, that have shared genetic influences with migraine at eight independent genomic loci (posterior probability > 0.9) across chromosomes 3, 5, 6, 9, and 16. The observed relationships between genetic factors influencing blood metabolite levels and genetic risk for migraine suggest an alteration of metabolite levels in individuals with migraine. Our analyses suggest higher levels of fatty acids, except docosahexaenoic acid (DHA), a very long-chain omega-3, in individuals with migraine. Consistently, we found a causally protective role for a longer length of fatty acids against migraine. We also identified a causal effect for a higher level of a lysophosphatidylethanolamine, LPE(20:4), on migraine, thus introducing LPE(20:4) as a potential therapeutic target for migraine.
65
Domarkienė I, Ambrozaitytė L, Bukauskas L, Rančelis T, Sütterlin S, Knox BJ, Maennel K, Maennel O, Parish K, Lugo RG, Brilingaitė A. CyberGenomics: Application of Behavioral Genetics in Cybersecurity. Behav Sci (Basel) 2021; 11:bs11110152. [PMID: 34821613 PMCID: PMC8614761 DOI: 10.3390/bs11110152] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2021] [Revised: 10/21/2021] [Accepted: 10/29/2021] [Indexed: 11/19/2022]
Abstract
Cybersecurity (CS) is a contemporary field for research and applied study of a range of aspects from across multiple disciplines. A cybersecurity expert has an in-depth knowledge of technology but is often also recognized for the ability to view technology in a non-standard way. This paper explores how CS specialists combine professional computing-based skills and genetically encoded traits. Almost every human behavioral trait is a result of many genome variants acting together with environmental factors. The review focuses on contextualizing the behavior genetics aspects in the application of cybersecurity. It reconsiders methods that help to identify aspects of human behavior from the genetic information. Stress serves as an illustrative factor to start the discussion within the community on what methodology should be used in an ethical way to approach those questions. CS positions are considered stressful due to the complexity of the domain and the social impact it can have in cases of failure. An individual risk profile could be created combining known genome variants linked to a trait of particular behavior using a special biostatistical approach such as a polygenic score. These revised advancements bring challenging possibilities in the applications of human behavior genetics and CS.
66
Ma Y, Zhou X. Genetic prediction of complex traits with polygenic scores: a statistical review. Trends Genet 2021; 37:995-1011. [PMID: 34243982 PMCID: PMC8511058 DOI: 10.1016/j.tig.2021.06.004] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 03/04/2021] [Revised: 05/31/2021] [Accepted: 06/03/2021] [Indexed: 01/03/2023]
Abstract
Accurate genetic prediction of complex traits can facilitate disease screening, improve early intervention, and aid in the development of personalized medicine. Genetic prediction of complex traits requires the development of statistical methods that can properly model polygenic architecture and construct a polygenic score (PGS). We present a comprehensive review of 46 methods for PGS construction. We connect the majority of these methods through a multiple linear regression framework which can be instrumental for understanding their prediction performance for traits with distinct genetic architectures. We discuss the practical considerations of PGS analysis as well as challenges and future directions of PGS method development. We hope our review serves as a useful reference both for statistical geneticists who develop PGS methods and for data analysts who perform PGS analysis.
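For orientation, a polygenic score in its simplest additive form is a weighted sum of an individual's allele dosages, with per-SNP weights taken from some fitted model. The sketch below is illustrative only and is not drawn from the review: the function name `polygenic_score`, the dosage coding (0, 1 or 2 copies of the effect allele) and the toy weights are all assumptions.

```python
from typing import Sequence


def polygenic_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Return the additive score sum_j weights[j] * dosages[j].

    dosages: effect-allele counts per SNP (assumed coding: 0, 1 or 2).
    weights: per-SNP effect-size estimates (hypothetical values here).
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Toy example with three SNPs; the numbers are made up for illustration.
example_dosages = [0, 1, 2]
example_weights = [0.10, -0.20, 0.05]
example_score = polygenic_score(example_dosages, example_weights)
```

Under this reading, PGS methods would differ mainly in how the weights are estimated, not in how the final sum is formed.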
67
Zhang Y, Cheng Y, Jiang W, Ye Y, Lu Q, Zhao H. Comparison of methods for estimating genetic correlation between complex traits using GWAS summary statistics. Brief Bioinform 2021; 22:bbaa442. [PMID: 33497438 PMCID: PMC8425307 DOI: 10.1093/bib/bbaa442] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 10/13/2020] [Revised: 12/12/2020] [Accepted: 12/30/2020] [Indexed: 01/03/2023]
Abstract
Genetic correlation is the correlation of phenotypic effects by genetic variants across the genome on two phenotypes. It is an informative metric to quantify the overall genetic similarity between complex traits, which provides insights into their polygenic genetic architecture. Several methods have been proposed to estimate genetic correlation based on data collected from genome-wide association studies (GWAS). Due to the easy access of GWAS summary statistics and computational efficiency, methods only requiring GWAS summary statistics as input have become more popular than methods utilizing individual-level genotype data. Here, we present a benchmark study for different summary-statistics-based genetic correlation estimation methods through simulation and real data applications. We focus on two major technical challenges in estimating genetic correlation: marker dependency caused by linkage disequilibrium (LD) and sample overlap between different studies. To assess the performance of different methods in the presence of these two challenges, we first conducted comprehensive simulations with diverse LD patterns and sample overlaps. Then we applied these methods to real GWAS summary statistics for a wide spectrum of complex traits. Based on these experiments, we conclude that methods relying on accurate LD estimation are less robust in real data applications due to the imprecision of LD obtained from reference panels. Our findings offer guidance on how to choose appropriate methods for genetic correlation estimation in post-GWAS analysis.
68
Khan AH, Smith DJ. Cost-Effective Mapping of Genetic Interactions in Mammalian Cells. Front Genet 2021; 12:703738. [PMID: 34434222 PMCID: PMC8381747 DOI: 10.3389/fgene.2021.703738] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2021] [Accepted: 07/13/2021] [Indexed: 11/23/2022]
Abstract
Comprehensive maps of genetic interactions in mammalian cells are daunting to construct because of the large number of potential interactions, ~2 × 10^8 for protein-coding genes. We previously used co-inheritance of distant genes from published radiation hybrid (RH) datasets to identify genetic interactions. However, it was necessary to combine six legacy datasets from four species to obtain adequate statistical power. Mapping resolution was also limited by the low-density PCR genotyping. Here, we employ shallow sequencing of nascent human RH clones as an economical approach to constructing interaction maps. In this initial study, 15 clones were analyzed, enabling construction of a network with 225 genes and 2,359 interactions (FDR < 0.05). Despite its small size, the network showed significant overlap with the previous RH network and with a protein-protein interaction network. Consumables were ≲$50 per clone, showing that affordable, high-quality genetic interaction maps are feasible in mammalian cells.
|
69
|
David S. A current guide to candidate gene association studies. Trends Genet 2021; 37:1056-1059. [PMID: 34400010 DOI: 10.1016/j.tig.2021.07.009] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/13/2021] [Revised: 07/15/2021] [Accepted: 07/16/2021] [Indexed: 11/30/2022]
Abstract
Important factors contribute to a gained momentum in candidate gene association studies (CGASs), including the generalized use of next-generation sequencing (NGS), growing opportunities for hospital-based research, and the availability of open-source databases and bioinformatics tools. This article summarizes the general principles and analytical methods as a guide to CGASs in today's favorable context.
|
70
|
Wang L, Gao B, Fan Y, Xue F, Zhou X. Mendelian randomization under the omnigenic architecture. Brief Bioinform 2021; 22:6347949. [PMID: 34379090 DOI: 10.1093/bib/bbab322] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/20/2021] [Revised: 07/22/2021] [Accepted: 07/24/2021] [Indexed: 11/15/2022] Open
Abstract
Mendelian randomization (MR) is a common analytic tool for exploring the causal relationship among complex traits. Existing MR methods require selecting a small set of single nucleotide polymorphisms (SNPs) to serve as instrument variables. However, selecting a small set of SNPs may not be ideal, as most complex traits have a polygenic or omnigenic architecture and are each influenced by thousands of SNPs. Here, motivated by the recent omnigenic hypothesis, we present an MR method that uses all genome-wide SNPs for causal inference. Our method uses summary statistics from genome-wide association studies as input, accommodates the commonly encountered horizontal pleiotropy effects and relies on a composite likelihood framework for scalable computation. We refer to our method as the omnigenic Mendelian randomization, or OMR. We examine the power and robustness of OMR through extensive simulations including those under various modeling misspecifications. We apply OMR to several real data applications, where we identify multiple complex traits that potentially causally influence coronary artery disease (CAD) and asthma. The identified new associations reveal important roles of blood lipids, blood pressure and immunity underlying CAD as well as important roles of immunity and obesity underlying asthma.
|
71
|
Irving-Pease EK, Muktupavela R, Dannemann M, Racimo F. Quantitative Human Paleogenetics: What can Ancient DNA Tell us About Complex Trait Evolution? Front Genet 2021; 12:703541. [PMID: 34422004 PMCID: PMC8371751 DOI: 10.3389/fgene.2021.703541] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 04/30/2021] [Accepted: 07/08/2021] [Indexed: 12/13/2022] Open
Abstract
Genetic association data from national biobanks and large-scale association studies have provided new prospects for understanding the genetic evolution of complex traits and diseases in humans. In turn, genomes from ancient human archaeological remains are now easier than ever to obtain, and provide a direct window into changes in frequencies of trait-associated alleles in the past. This has generated a new wave of studies aiming to analyse the genetic component of traits in historic and prehistoric times using ancient DNA, and to determine whether any such traits were subject to natural selection. In humans, however, issues about the portability and robustness of complex trait inference across different populations are particularly concerning when predictions are extended to individuals that died thousands of years ago, and for which little, if any, phenotypic validation is possible. In this review, we discuss the advantages of incorporating ancient genomes into studies of trait-associated variants, the need for models that can better accommodate ancient genomes into quantitative genetic frameworks, and the existing limits to inferences about complex trait evolution, particularly with respect to past populations.
|
72
|
Genome-Wide Association Study Using Whole-Genome Sequence Data for Fertility, Health Indicator, and Endoparasite Infection Traits in German Black Pied Cattle. Genes (Basel) 2021; 12:genes12081163. [PMID: 34440337 PMCID: PMC8391191 DOI: 10.3390/genes12081163] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/07/2021] [Revised: 07/24/2021] [Accepted: 07/27/2021] [Indexed: 12/16/2022] Open
Abstract
This genome-wide association study (GWAS) aimed to identify sequence variants (SVs) and candidate genes associated with fertility and health in endangered German Black Pied cattle (DSN) based on whole-genome sequence (WGS) data. We used 304 sequenced DSN cattle for the imputation of 1797 genotyped DSN to WGS. The final dataset included 11,413,456 SVs of 1886 cows. Cow traits were calving-to-first service interval (CTFS), non-return after 56 days (NR56), somatic cell score (SCS), fat-to-protein ratio (FPR), and three pre-corrected endoparasite infection traits. We identified 40 SVs above the genome-wide significance and suggestive threshold associated with CTFS and NR56, and three important potential candidate genes (ARHGAP21, MARCH11, and ZNF462). For SCS, most associations were observed on BTA 25. The GWAS revealed 61 SVs, a cluster of 10 candidate genes on BTA 13, and 7 pathways for FPR, including key mediators involved in milk fat synthesis. The strongest associations for gastrointestinal nematode and Dictyocaulus viviparus infections were detected on BTA 8 and 24, respectively. For Fasciola hepatica infections, the strongest associated SVs were located on BTA 4 and 7. We detected 200 genes for endoparasite infection traits, related to 16 pathways involved in host immune response during infection.
|
73
|
Leocadio-Miguel MA, Ruiz FS, Ahmed SS, Taporoski TP, Horimoto ARVR, Beijamini F, Pedrazzoli M, Knutson KL, Pereira AC, von Schantz M. Compared Heritability of Chronotype Instruments in a Single Population Sample. J Biol Rhythms 2021; 36:483-490. [PMID: 34313481 PMCID: PMC8442136 DOI: 10.1177/07487304211030420] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/12/2022]
Abstract
It is well established that the oldest chronotype questionnaire, the morningness-eveningness questionnaire (MEQ), has significant heritability, and several associations have been reported between MEQ score and polymorphisms in candidate clock genes, a number of them reproducibly across populations. By contrast, there are no reports of heritability and genetic associations for the Munich chronotype questionnaire (MCTQ). Recent genome-wide association studies (GWAS) from large cohorts have reported multiple associations with chronotype as assessed by a single self-evaluation question. We have taken advantage of the availability of data from all these instruments from a single sample of 597 participants from the Brazilian Baependi Heart Study. The family-based design of the cohort allowed us to calculate the heritability (h²) for these measures. Heritability values for the best-fitted models were 0.37 for MEQ, 0.32 for MCTQ, and 0.28 for single-question chronotype (MEQ Question 19). We also calculated the heritability for the two major factors recently derived from MEQ, “Dissipation of sleep pressure” (0.32) and “Build-up of sleep pressure” (0.28). This first heritability comparison of the major chronotype instruments in current use provides the first quantification of the genetic component of MCTQ score, supporting its future use in genetic analysis. Our findings also suggest that the single chronotype question that has been used for large GWAS analyses captures a larger proportion of the dimensions of chronotype than previously thought.
|
74
|
Chawla A, Nagy C, Turecki G. Chromatin Profiling Techniques: Exploring the Chromatin Environment and Its Contributions to Complex Traits. Int J Mol Sci 2021; 22:7612. [PMID: 34299232 PMCID: PMC8305586 DOI: 10.3390/ijms22147612] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/22/2021] [Revised: 07/09/2021] [Accepted: 07/13/2021] [Indexed: 01/04/2023] Open
Abstract
The genetic architecture of complex traits is multifactorial. Genome-wide association studies (GWASs) have identified risk loci for complex traits and diseases that are disproportionately located at the non-coding regions of the genome. On the other hand, we have just begun to understand the regulatory roles of the non-coding genome, making it challenging to precisely interpret the functions of non-coding variants associated with complex diseases. Additionally, the epigenome plays an active role in mediating cellular responses to fluctuations of sensory or environmental stimuli. However, it remains unclear how exactly non-coding elements associate with epigenetic modifications to regulate gene expression changes and mediate phenotypic outcomes. Therefore, finer interrogations of the human epigenomic landscape in associating with non-coding variants are warranted. Recently, chromatin-profiling techniques have vastly improved our understanding of the numerous functions mediated by the epigenome and DNA structure. Here, we review various chromatin-profiling techniques, such as assays of chromatin accessibility, nucleosome distribution, histone modifications, and chromatin topology, and discuss their applications in unraveling the brain epigenome and etiology of complex traits at tissue homogenate and single-cell resolution. These techniques have elucidated compositional and structural organizing principles of the chromatin environment. Taken together, we believe that high-resolution epigenomic and DNA structure profiling will be one of the best ways to elucidate how non-coding genetic variations impact complex diseases, ultimately allowing us to pinpoint cell-type targets with therapeutic potential.
|
75
|
Zong SB, Li YL, Liu JX. Genomic Architecture of Rapid Parallel Adaptation to Fresh Water in a Wild Fish. Mol Biol Evol 2021; 38:1317-1329. [PMID: 33146383 PMCID: PMC8480189 DOI: 10.1093/molbev/msaa290] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 02/07/2023] Open
Abstract
Rapid adaptation to novel environments may drive changes in genomic regions through natural selection. However, the genetic architecture underlying these adaptive changes is still poorly understood. Using population genomic approaches, we investigated the genomic architecture that underlies rapid parallel adaptation of Coilia nasus to fresh water by comparing four freshwater-resident populations with their ancestral anadromous population. Linkage disequilibrium network analysis and population genetic analyses revealed two putative large chromosome inversions on LG6 and LG22, which were enriched for outlier loci and exhibited parallel association with freshwater adaptation. Drastic frequency shifts and elevated genetic differentiation were observed for the two chromosome inversions among populations, suggesting that both inversions would undergo divergent selection between anadromous and resident ecotypes. Enrichment analysis of genes within chromosome inversions showed significant enrichment of genes involved in metabolic process, immunoregulation, growth, maturation, osmoregulation, and so forth, which probably underlay differences in morphology, physiology and behavior between the anadromous and freshwater-resident forms. The availability of beneficial standing genetic variation, large optimum shift between marine and freshwater habitats, and high efficiency of selection with large population size could lead to the observed rapid parallel adaptive genomic change. We propose that chromosomal inversions might have played an important role during the evolution of rapid parallel ecological divergence in the face of environmental heterogeneity in C. nasus. Our study provides insights into the genomic basis of rapid adaptation of complex traits in novel habitats and highlights the importance of structural genomic variants in analyses of ecological adaptation.
|