26
|
Li J, Bank C. Dominance and multi-locus interaction. Trends Genet 2024; 40:364-378. [PMID: 38453542 DOI: 10.1016/j.tig.2023.12.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 12/03/2023] [Accepted: 12/04/2023] [Indexed: 03/09/2024]
Abstract
Dominance is usually considered a constant value that describes the relative difference in fitness or phenotype between heterozygotes and the average of homozygotes at a focal polymorphic locus. However, the observed dominance can vary with the genetic background of the focal locus. Here, alleles at other loci modify the observed phenotype through position effects or dominance modifiers that are sometimes associated with pathogen resistance, lineage, sex, or mating type. Theoretical models have illustrated how variable dominance appears in the context of multi-locus interaction (epistasis). Here, we review empirical evidence for variable dominance and how the observed patterns may be captured by proposed epistatic models. We highlight how integrating epistasis and dominance is crucial for comprehensively understanding adaptation and speciation.
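As a rough illustration of the quantity this abstract describes, the sketch below computes a degree of dominance as the heterozygote's deviation from the homozygote midpoint, scaled by half the difference between the homozygotes. This parametrization, the function name `dominance_ratio`, and the phenotype values for the two genetic backgrounds are assumptions made for illustration; they are not taken from the review.

```python
def dominance_ratio(hom_ref, het, hom_alt):
    """Degree of dominance: 0 = additive, +1 = alt allele fully dominant,
    -1 = alt allele fully recessive (one common convention, assumed here)."""
    midpoint = (hom_ref + hom_alt) / 2      # average of the two homozygotes
    additive = (hom_alt - hom_ref) / 2      # half the homozygote difference
    if additive == 0:
        raise ValueError("homozygotes are identical; dominance is undefined")
    return (het - midpoint) / additive


# Hypothetical phenotypes (hom_ref, het, hom_alt) at the focal locus,
# measured in two different genetic backgrounds.
backgrounds = {
    "background_1": (1.0, 1.5, 2.0),  # heterozygote sits at the midpoint
    "background_2": (1.0, 2.0, 2.0),  # heterozygote matches hom_alt
}

ratios = {name: dominance_ratio(*values) for name, values in backgrounds.items()}
print(ratios)
```

With these made-up values the same locus looks additive in one background and fully dominant in the other, which is the kind of background dependence the abstract refers to as variable dominance.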
|
27
|
Dibyachintan S, Dube AK, Bradley D, Lemieux P, Dionne U, Landry CR. Cryptic genetic variation shapes the fate of gene duplicates in a protein interaction network. bioRxiv 2024:2024.02.23.581840. [PMID: 38464075 PMCID: PMC10925128 DOI: 10.1101/2024.02.23.581840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
Paralogous genes are often redundant for long periods of time before they diverge in function. While their functions are preserved, paralogous proteins can accumulate mutations that, through epistasis, could impact their fate in the future. By quantifying the impact of all single-amino acid substitutions on the binding of two myosin proteins to their interaction partners, we find that the future evolution of these proteins is highly contingent on their regulatory divergence and the mutations that have silently accumulated in their protein binding domains. Differences in the promoter strength of the two paralogs amplify the impact of mutations on binding in the lowly expressed one. While some mutations would be sufficient to non-functionalize one paralog, they would have minimal impact on the other. Our results reveal how functionally equivalent protein domains could be destined to specific fates by regulatory and cryptic coding sequence changes that currently have little to no functional impact.
|
28
|
Burch J, Chin M, Fontenot BE, Mandal S, McKnight TD, Demuth JP, Blackmon H. Wright was right: leveraging old data and new methods to illustrate the critical role of epistasis in genetics and evolution. Evolution 2024; 78:624-634. [PMID: 38241518 DOI: 10.1093/evolut/qpae003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Revised: 12/19/2023] [Accepted: 01/17/2024] [Indexed: 01/21/2024]
Abstract
Much of evolutionary theory is predicated on assumptions about the relative importance of simple additive versus complex epistatic genetic architectures. Previous work suggests traits strongly associated with fitness will lack additive genetic variation, whereas traits less strongly associated with fitness are expected to exhibit more additive genetic variation. We use a quantitative genetics method, line cross analysis, to infer genetic architectures that contribute to trait divergence. By parsing over 1,600 datasets by trait type, clade, and cross divergence, we estimated the relative importance of epistasis across the tree of life. In our comparison between life-history traits and morphological traits, we found greater epistatic contributions to life-history traits. Our comparison between plants and animals showed that animals have more epistatic contribution to trait divergence than plants. In our comparison of within-species versus between-species crosses, we found that only animals exhibit a greater epistatic contribution to trait divergence as divergence increases. While many scientists have argued that epistasis is ultimately of little importance, our results show that epistasis underlies much of trait divergence and must be accounted for in theory and practical applications like domestication, conservation breeding design, and understanding complex diseases.
|
29
|
Rogozin IB, Saura A, Poliakov E, Bykova A, Roche-Lima A, Pavlov YI, Yurchenko V. Properties and Mechanisms of Deletions, Insertions, and Substitutions in the Evolutionary History of SARS-CoV-2. Int J Mol Sci 2024; 25:3696. [PMID: 38612505 PMCID: PMC11011937 DOI: 10.3390/ijms25073696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2024] [Revised: 03/22/2024] [Accepted: 03/23/2024] [Indexed: 04/14/2024]
Abstract
SARS-CoV-2 has accumulated many mutations since its emergence in late 2019. Nucleotide substitutions leading to amino acid replacements constitute the primary material for natural selection. Insertions, deletions, and substitutions appear to be critical for coronavirus's macro- and microevolution. Understanding the molecular mechanisms of mutations in the mutational hotspots (positions, loci with recurrent mutations, and nucleotide context) is important for disentangling roles of mutagenesis and selection. In the SARS-CoV-2 genome, deletions and insertions are frequently associated with repetitive sequences, whereas C>U substitutions are often surrounded by nucleotides resembling the APOBEC mutable motifs. We describe various approaches to mutation spectra analyses, including the context features of RNAs that are likely to be involved in the generation of recurrent mutations. We also discuss the interplay between mutations and natural selection as a complex evolutionary trend. The substantial variability and complexity of pipelines for the reconstruction of mutations and the huge number of genomic sequences are major problems for the analyses of mutations in the SARS-CoV-2 genome. As a solution, we advocate for the development of a centralized database of predicted mutations, which needs to be updated on a regular basis.
|
30
|
Judge A, Sankaran B, Hu L, Palaniappan M, Birgy A, Prasad BVV, Palzkill T. Network of epistatic interactions in an enzyme active site revealed by large-scale deep mutational scanning. Proc Natl Acad Sci U S A 2024; 121:e2313513121. [PMID: 38483989 PMCID: PMC10962969 DOI: 10.1073/pnas.2313513121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Accepted: 02/14/2024] [Indexed: 03/19/2024]
Abstract
Cooperative interactions between amino acids are critical for protein function. A genetic reflection of cooperativity is epistasis, which is when a change in the amino acid at one position changes the sequence requirements at another position. To assess epistasis within an enzyme active site, we utilized CTX-M β-lactamase as a model system. CTX-M hydrolyzes β-lactam antibiotics to provide antibiotic resistance, allowing a simple functional selection for rapid sorting of modified enzymes. We created all pairwise mutations across 17 active site positions in the β-lactamase enzyme and quantitated the function of variants against two β-lactam antibiotics using next-generation sequencing. Context-dependent sequence requirements were determined by comparing the antibiotic resistance function of double mutations across the CTX-M active site to their predicted function based on the constituent single mutations, revealing both positive epistasis (synergistic interactions) and negative epistasis (antagonistic interactions) between amino acid substitutions. The resulting trends demonstrate that positive epistasis is present throughout the active site, that epistasis between residues is mediated through substrate interactions, and that residues more tolerant to substitutions serve as generic compensators which are responsible for many cases of positive epistasis. Additionally, we show that a key catalytic residue (Glu166) is amenable to compensatory mutations, and we characterize one such double mutant (E166Y/N170G) that acts by an altered catalytic mechanism. These findings shed light on the unique biochemical factors that drive epistasis within an enzyme active site and will inform enzyme engineering efforts by bridging the gap between amino acid sequence and catalytic function.
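The comparison this abstract describes, a double mutant's measured function against the value predicted from its constituent single mutants, can be sketched as follows. The sketch assumes a multiplicative (log-additive) null model on function relative to wild type; the study's actual scoring scheme is not given in the abstract, and the function names, tolerance, and numeric values are hypothetical.

```python
import math


def pairwise_epistasis(w_wt, w_a, w_b, w_ab):
    """Epistasis as observed minus expected log relative function.

    Assumed null model: single-mutant effects combine additively on the
    log scale, so expected log(w_ab / w_wt) = log(w_a / w_wt) + log(w_b / w_wt).
    """
    log_a = math.log(w_a / w_wt)
    log_b = math.log(w_b / w_wt)
    log_ab = math.log(w_ab / w_wt)
    return log_ab - (log_a + log_b)


def classify(epsilon, tolerance=0.05):
    """Label the sign of epistasis; the tolerance is an arbitrary noise band."""
    if epsilon > tolerance:
        return "positive"
    if epsilon < -tolerance:
        return "negative"
    return "none"


# Hypothetical measurements: each single mutant halves function, but the
# double mutant is no worse than either single mutant.
example = pairwise_epistasis(w_wt=1.0, w_a=0.5, w_b=0.5, w_ab=0.5)
print(round(example, 3), classify(example))
```

Under this null, a double mutant that performs better than the product of its single-mutant effects scores as positive epistasis and one that performs worse scores as negative.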
|
31
|
Mehra P, Hintze A. Reducing Epistasis and Pleiotropy Can Avoid the Survival of the Flattest Tragedy. Biology 2024; 13:193. [PMID: 38534462 DOI: 10.3390/biology13030193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2024] [Revised: 03/10/2024] [Accepted: 03/15/2024] [Indexed: 03/28/2024]
Abstract
This study investigates whether reducing epistasis and pleiotropy enhances mutational robustness in evolutionary adaptation, utilizing an indirect encoded model within the "survival of the flattest" (SoF) fitness landscape. By simulating genetic variations and their phenotypic consequences, we explore organisms' adaptive mechanisms to maintain positions on higher, narrower evolutionary peaks amidst environmental and genetic pressures. Our results reveal that organisms can indeed sustain their advantageous positions by minimizing the complexity of genetic interactions, specifically by reducing the levels of epistasis and pleiotropy. This finding suggests a counterintuitive strategy for evolutionary stability: simpler genetic architectures, characterized by fewer gene interactions and multifunctional genes, confer a survival advantage by enhancing mutational robustness. This study contributes to our understanding of the genetic underpinnings of adaptability and robustness, challenging traditional views that equate complexity with fitness in dynamic environments.
|
32
|
Lin WY. Searching for gene-gene interactions through variance quantitative trait loci of 29 continuous Taiwan Biobank phenotypes. Front Genet 2024; 15:1357238. [PMID: 38516378 PMCID: PMC10956579 DOI: 10.3389/fgene.2024.1357238] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2023] [Accepted: 02/27/2024] [Indexed: 03/23/2024]
Abstract
Introduction: After the era of genome-wide association studies (GWAS), thousands of genetic variants have been identified to exhibit main effects on human phenotypes. The next critical issue would be to explore the interplay between genes, the so-called "gene-gene interactions" (GxG) or epistasis. An exhaustive search for all single-nucleotide polymorphism (SNP) pairs is not recommended because this will induce a harsh penalty of multiple testing. Limiting the search of epistasis on SNPs reported by previous GWAS may miss essential interactions between SNPs without significant marginal effects. Moreover, most methods are computationally intensive and can be challenging to implement genome-wide. Methods: I here searched for GxG through variance quantitative trait loci (vQTLs) of 29 continuous Taiwan Biobank (TWB) phenotypes. A discovery cohort of 86,536 and a replication cohort of 25,460 TWB individuals were analyzed, respectively. Results: A total of 18 nearly independent vQTLs with linkage disequilibrium measure r² < 0.01 were identified and replicated from nine phenotypes. 15 significant GxG were found with p-values <1.1E-5 (in the discovery cohort) and false discovery rates <2% (in the replication cohort). Among these 15 GxG, 11 were detected for blood traits including red blood cells, hemoglobin, and hematocrit; 2 for total bilirubin; 1 for fasting glucose; and 1 for total cholesterol (TCHO). All GxG were observed for gene pairs on the same chromosome, except for the APOA5 (chromosome 11)-TOMM40 (chromosome 19) interaction for TCHO. Discussion: This study provided a computationally feasible way to search for GxG genome-wide and applied this approach to 29 phenotypes.
|
33
|
Dickel L, Arcese P, Keller LF, Nietlisbach P, Goedert D, Jensen H, Reid JM. Multigenerational Fitness Effects of Natural Immigration Indicate Strong Heterosis and Epistatic Breakdown in a Wild Bird Population. Am Nat 2024; 203:411-431. [PMID: 38358807 DOI: 10.1086/728669] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/17/2024]
Abstract
The fitness of immigrants and their descendants produced within recipient populations fundamentally underpins the genetic and population dynamic consequences of immigration. Immigrants can in principle induce contrasting genetic effects on fitness across generations, reflecting multifaceted additive, dominance, and epistatic effects. Yet full multigenerational and sex-specific fitness effects of regular immigration have not been quantified within naturally structured systems, precluding inference on underlying genetic architectures and population outcomes. We used four decades of song sparrow (Melospiza melodia) life history and pedigree data to quantify fitness of natural immigrants, natives, and their F1, F2, and backcross descendants and test for evidence of nonadditive genetic effects. Values of key fitness components (including adult lifetime reproductive success and zygote survival) of F1 offspring of immigrant-native matings substantially exceeded their parent mean, indicating strong heterosis. Meanwhile, F2 offspring of F1-F1 matings had notably low values, indicating surprisingly strong epistatic breakdown. Furthermore, magnitudes of effects varied among fitness components and differed between female and male descendants. These results demonstrate that strong nonadditive genetic effects on fitness can arise within weakly structured and fragmented populations experiencing frequent natural immigration. Such effects will substantially affect the net degree of effective gene flow and resulting local genetic introgression and adaptation.
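The logic of reading heterosis from the F1 and epistatic breakdown from the F2 can be sketched with a textbook line-cross expectation. Under a purely additive-plus-dominance model (no epistasis, no linkage or parental effects), the F2 mean is expected to lie halfway between the parental midpoint and the F1 mean, so an F2 mean well below that value points to epistasis. The function name and trait values below are hypothetical, and the study's own pedigree-based analysis is more elaborate than this sketch.

```python
def line_cross_summary(p1, p2, f1, f2):
    """Summarise generation means against additive-dominance expectations.

    Assumes generation means on a scale where additive and dominance
    effects combine linearly, with no epistasis under the null.
    """
    midparent = (p1 + p2) / 2
    heterosis = f1 - midparent            # F1 excess over the parent mean
    f2_expected = (midparent + f1) / 2    # half of the F1 dominance gain is retained
    f2_deviation = f2 - f2_expected       # negative values suggest breakdown
    return {
        "midparent": midparent,
        "heterosis": heterosis,
        "f2_expected": f2_expected,
        "f2_deviation": f2_deviation,
    }


# Hypothetical mean fitness values for natives, immigrants, F1 and F2.
summary = line_cross_summary(p1=2.0, p2=4.0, f1=5.0, f2=3.0)
print(summary)
```

In this made-up example the F1 exceeds the parent mean by 2.0 while the F2 falls 1.0 below its additive-dominance expectation, mirroring the qualitative pattern of heterosis followed by breakdown.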
|
34
|
Filipow N, Mallon S, Shewaramani S, Kassen R, Wong A. The impact of genetic background during laboratory evolution of Pseudomonas aeruginosa in a cystic fibrosis-like environment. Evolution 2024; 78:566-578. [PMID: 37862583 DOI: 10.1093/evolut/qpad189] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/24/2023] [Revised: 10/04/2023] [Accepted: 10/18/2023] [Indexed: 10/22/2023]
Abstract
Genetic background has the potential to influence both the tempo and trajectory of adaptive change: Different genotypes of a given species may adopt varied solutions to the same environmental challenge, or they may approach the same solution at different rates. Laboratory selection has been widely used to experimentally examine the evolutionary consequences of variation in genetic background, although largely using genotypes differing by only a few mutations. Here, we leverage natural variation in the bacterium Pseudomonas aeruginosa to investigate whether different adaptive solutions are accessible from distant points of departure on an adaptive landscape. We evolved 17 diverse genotypes in a laboratory medium that partially mimics the lung sputum of cystic fibrosis patients, and we measured changes in 10 phenotypes as well as in fitness. Using phylogenetically informed analyses, we found that genetic background impacted the tempo, but not the trajectory, of phenotypic evolution: Different starting genotypes converged toward similar phenotypes, but at varying rates. Our findings add to a growing body of evidence supporting widespread diminishing return epistasis during adaptation. The importance of genetic background toward the trajectory of adaptation remains inconsistent across experimental systems and conditions.
|
35
|
Prokkola JM, Chew KK, Anttila K, Maamela KS, Yildiz A, Åsheim ER, Primmer CR, Aykanat T. Tissue-specific metabolic enzyme levels covary with whole-animal metabolic rates and life-history loci via epistatic effects. Philos Trans R Soc Lond B Biol Sci 2024; 379:20220482. [PMID: 38186275 PMCID: PMC10772610 DOI: 10.1098/rstb.2022.0482] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/16/2023] [Accepted: 12/03/2023] [Indexed: 01/09/2024]
Abstract
Metabolic rates, including standard (SMR) and maximum (MMR) metabolic rate have often been linked with life-history strategies. Variation in context- and tissue-level metabolism underlying SMR and MMR may thus provide a physiological basis for life-history variation. This raises a hypothesis that tissue-specific metabolism covaries with whole-animal metabolic rates and is genetically linked to life history. In Atlantic salmon (Salmo salar), variation in two loci, vgll3 and six6, affects life history via age-at-maturity as well as MMR. Here, using individuals with known SMR and MMR with different vgll3 and six6 genotype combinations, we measured proxies of mitochondrial density and anaerobic metabolism, i.e. maximal activities of the mitochondrial citrate synthase (CS) and lactate dehydrogenase (LDH) enzymes, in four tissues (heart, intestine, liver, white muscle) across low- and high-food regimes. We found enzymatic activities were related to metabolic rates, mainly SMR, in the intestine and heart. Individual loci were not associated with the enzymatic activities, but we found epistatic effects and genotype-by-environment interactions in CS activity in the heart and epistasis in LDH activity in the intestine. These effects suggest that mitochondrial density and anaerobic capacity in the heart and intestine may partly mediate variation in metabolic rates and life history via age-at-maturity. This article is part of the theme issue 'The evolutionary significance of variation in metabolic rates'.
|
36
|
Sasani TA, Quinlan AR, Harris K. Epistasis between mutator alleles contributes to germline mutation spectrum variability in laboratory mice. eLife 2024; 12:RP89096. [PMID: 38381482 PMCID: PMC10942616 DOI: 10.7554/elife.89096] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 02/22/2024]
Abstract
Maintaining germline genome integrity is essential and enormously complex. Although many proteins are involved in DNA replication, proofreading, and repair, mutator alleles have largely eluded detection in mammals. DNA replication and repair proteins often recognize sequence motifs or excise lesions at specific nucleotides. Thus, we might expect that the spectrum of de novo mutations - the frequencies of C>T, A>G, etc. - will differ between genomes that harbor either a mutator or wild-type allele. Previously, we used quantitative trait locus mapping to discover candidate mutator alleles in the DNA repair gene Mutyh that increased the C>A germline mutation rate in a family of inbred mice known as the BXDs (Sasani et al., 2022, Ashbrook et al., 2021). In this study we developed a new method to detect alleles associated with mutation spectrum variation and applied it to mutation data from the BXDs. We discovered an additional C>A mutator locus on chromosome 6 that overlaps Ogg1, a DNA glycosylase involved in the same base-excision repair network as Mutyh (David et al., 2007). Its effect depends on the presence of a mutator allele near Mutyh, and BXDs with mutator alleles at both loci have greater numbers of C>A mutations than those with mutator alleles at either locus alone. Our new methods for analyzing mutation spectra reveal evidence of epistasis between germline mutator alleles and may be applicable to mutation data from humans and other model organisms.
|
37
|
McCain JSP. Mapping combinatorial expression perturbations to growth in Escherichia coli. Cell Syst 2024; 15:106-108. [PMID: 38387440 DOI: 10.1016/j.cels.2024.01.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2024] [Revised: 01/21/2024] [Accepted: 01/22/2024] [Indexed: 02/24/2024]
Abstract
The connection between growth and gene expression has often been considered in a single gene. Repurposing a drug-drug interaction model, the multidimensional effects of several simultaneous gene expression perturbations on growth have been examined in the model bacteria Escherichia coli.
|
38
|
Otto RM, Turska-Nowak A, Brown PM, Reynolds KA. A continuous epistasis model for predicting growth rate given combinatorial variation in gene expression and environment. Cell Syst 2024; 15:134-148.e7. [PMID: 38340730 PMCID: PMC10885703 DOI: 10.1016/j.cels.2024.01.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Revised: 10/13/2023] [Accepted: 01/18/2024] [Indexed: 02/12/2024]
Abstract
Quantifying and predicting growth rate phenotype given variation in gene expression and environment is complicated by epistatic interactions and the vast combinatorial space of possible perturbations. We developed an approach for mapping expression-growth rate landscapes that integrates sparsely sampled experimental measurements with an interpretable machine learning model. We used mismatch CRISPRi across pairs and triples of genes to create over 8,000 titrated changes in E. coli gene expression under varied environmental contexts, exploring epistasis in up to 22 distinct environments. Our results show that a pairwise model previously used to describe drug interactions well-described these data. The model yielded interpretable parameters related to pathway architecture and generalized to predict the combined effect of up to four perturbations when trained solely on pairwise perturbation data. We anticipate this approach will be broadly applicable in optimizing bacterial growth conditions, generating pharmacogenomic models, and understanding the fundamental constraints on bacterial gene expression. A record of this paper's transparent peer review process is included in the supplemental information.
|
39
|
Zhang Q, Liu J, Liu H, Ao L, Xi Y, Chen D. Genome-wide epistasis analysis reveals gene-gene interaction network on an intermediate endophenotype P-tau/Aβ42 ratio in ADNI cohort. Sci Rep 2024; 14:3984. [PMID: 38368488 PMCID: PMC10874417 DOI: 10.1038/s41598-024-54541-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2023] [Accepted: 02/14/2024] [Indexed: 02/19/2024]
Abstract
Alzheimer's disease (AD) is a progressive neurodegenerative disorder and the most common cause of dementia in the elderly worldwide. The exact etiology of AD, particularly its genetic mechanisms, remains incompletely understood. Traditional genome-wide association studies (GWAS), which primarily focus on single-nucleotide polymorphisms (SNPs) with main effects, provide limited explanations for the "missing heritability" of AD, while there is growing evidence supporting the important role of epistasis. In this study, we performed a genome-wide SNP-SNP interaction detection using a linear regression model and employed multiple GPUs for parallel computing, significantly enhancing the speed of whole-genome analysis. The cerebrospinal fluid (CSF) phosphorylated tau (P-tau)/amyloid-β42 (Aβ42) ratio was used as a quantitative trait (QT) to enhance statistical power. Age, gender, and clinical diagnosis were included as covariates to control for potential non-genetic factors influencing AD. We identified 961 pairs of statistically significant SNP-SNP interactions, explaining a high-level variance of P-tau/Aβ42 level, all of which exhibited marginal main effects. Additionally, we replicated 432 previously reported AD-related genes and found 11 gene-gene interaction pairs overlapping with the protein-protein interaction (PPI) network. Our findings may contribute to partially explain the "missing heritability" of AD. The identified subnetwork may be associated with synaptic dysfunction, Wnt signaling pathway, oligodendrocytes, inflammation, hippocampus, and neuronal cells.
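A linear-regression test for one SNP pair, of the general kind this abstract mentions, can be sketched as an ordinary least-squares fit with a product term. The design below (intercept, two additively coded SNPs, their product, optional covariates) is an assumed form, since the abstract does not give the exact model; the helper names are hypothetical, and the sketch estimates coefficients only, without p-values or GPU parallelism.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_interaction_model(snp1, snp2, trait, covariates=None):
    """Least-squares fit of trait ~ 1 + snp1 + snp2 + snp1*snp2 (+ covariates).

    Returns [intercept, beta_snp1, beta_snp2, beta_interaction, ...covariates].
    """
    rows = []
    for i, (g1, g2) in enumerate(zip(snp1, snp2)):
        row = [1.0, float(g1), float(g2), float(g1 * g2)]
        if covariates is not None:
            row.extend(float(v) for v in covariates[i])
        rows.append(row)
    p = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * y for r, y in zip(rows, trait)) for j in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical data: all nine genotype combinations (0/1/2 allele counts)
# with a noise-free trait that includes an interaction effect of 0.3.
snp1 = [g1 for g1 in range(3) for _ in range(3)]
snp2 = [g2 for _ in range(3) for g2 in range(3)]
trait = [1.0 + 0.5 * a + 0.2 * b + 0.3 * a * b for a, b in zip(snp1, snp2)]

beta = fit_interaction_model(snp1, snp2, trait)
print([round(v, 6) for v in beta])
```

The fourth coefficient is the interaction (epistasis) term; in a genome-wide scan the same fit would be repeated for every SNP pair, which is why the authors turn to parallel hardware.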
|
40
|
Thompson AJ, Wu NC, Canales A, Kikuchi C, Zhu X, de Toro BF, Cañada FJ, Worth C, Wang S, McBride R, Peng W, Nycholat CM, Jiménez-Barbero J, Wilson IA, Paulson JC. Evolution of human H3N2 influenza virus receptor specificity has substantially expanded the receptor-binding domain site. Cell Host Microbe 2024; 32:261-275.e4. [PMID: 38307019 PMCID: PMC11057904 DOI: 10.1016/j.chom.2024.01.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2023] [Revised: 11/14/2023] [Accepted: 01/09/2024] [Indexed: 02/04/2024]
Abstract
Hemagglutinins (HAs) from human influenza viruses descend from avian progenitors that bind α2-3-linked sialosides and must adapt to glycans with α2-6-linked sialic acids on human airway cells to transmit within the human population. Since their introduction during the 1968 pandemic, H3N2 viruses have evolved over the past five decades to preferentially recognize human α2-6-sialoside receptors that are elongated through addition of poly-LacNAc. We show that more recent H3N2 viruses now make increasingly complex interactions with elongated receptors while continuously selecting for strains maintaining this phenotype. This change in receptor engagement is accompanied by an extension of the traditional receptor-binding site to include residues in key antigenic sites on the surface of HA trimers. These results help explain the propensity for selection of antigenic variants, leading to vaccine mismatching, when H3N2 viruses are propagated in chicken eggs or cells that do not contain such receptors.
|
41
|
Park Y, Metzger BP, Thornton JW. The simplicity of protein sequence-function relationships. bioRxiv 2024:2023.09.02.556057. [PMID: 37732229 PMCID: PMC10508729 DOI: 10.1101/2023.09.02.556057] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 09/22/2023]
Abstract
How complicated is the genetic architecture of proteins - the set of causal effects by which sequence determines function? High-order epistatic interactions among residues are thought to be pervasive, making a protein's function difficult to predict or understand from its sequence. Most studies, however, used methods that overestimate epistasis, because they analyze genetic architecture relative to a designated reference sequence - causing measurement noise and small local idiosyncrasies to propagate into pervasive high-order interactions - or have not effectively accounted for global nonlinearity in the sequence-function relationship. Here we present a new reference-free method that jointly estimates global nonlinearity and specific epistatic interactions across a protein's entire genotype-phenotype map. This method yields a maximally efficient explanation of a protein's genetic architecture and is more robust than existing methods to measurement noise, partial sampling, and model misspecification. We reanalyze 20 combinatorial mutagenesis experiments from a diverse set of proteins and find that additive and pairwise effects, along with a simple nonlinearity to account for limited dynamic range, explain a median of 96% of total variance in measured phenotypes (and >92% in every case). Only a tiny fraction of genotypes are strongly affected by third- or higher-order epistasis. Genetic architecture is also sparse: the number of terms required to explain the vast majority of variance is smaller than the number of genotypes by many orders of magnitude. The sequence-function relationship in most proteins is therefore far simpler than previously thought, opening the way for new and tractable approaches to characterize it.
|
42
|
Alvarez S, Nartey CM, Mercado N, de la Paz JA, Huseinbegovic T, Morcos F. In vivo functional phenotypes from a computational epistatic model of evolution. Proc Natl Acad Sci U S A 2024; 121:e2308895121. [PMID: 38285950 PMCID: PMC10861889 DOI: 10.1073/pnas.2308895121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Accepted: 12/19/2023] [Indexed: 01/31/2024]
Abstract
Computational models of evolution are valuable for understanding the dynamics of sequence variation, to infer phylogenetic relationships or potential evolutionary pathways and for biomedical and industrial applications. Despite these benefits, few have validated their propensities to generate outputs with in vivo functionality, which would enhance their value as accurate and interpretable evolutionary algorithms. We demonstrate the power of epistasis inferred from natural protein families to evolve sequence variants in an algorithm we developed called sequence evolution with epistatic contributions (SEEC). Utilizing the Hamiltonian of the joint probability of sequences in the family as fitness metric, we sampled and experimentally tested for in vivo β-lactamase activity in Escherichia coli TEM-1 variants. These evolved proteins can have dozens of mutations dispersed across the structure while preserving sites essential for both catalysis and interactions. Remarkably, these variants retain family-like functionality while being more active than their wild-type predecessor. We found that depending on the inference method used to generate the epistatic constraints, different parameters simulate diverse selection strengths. Under weaker selection, local Hamiltonian fluctuations reliably predict relative changes to variant fitness, recapitulating neutral evolution. SEEC has the potential to explore the dynamics of neofunctionalization, characterize viral fitness landscapes, and facilitate vaccine development.
|
43
|
Moran BM, Payne CY, Powell DL, Iverson ENK, Donny AE, Banerjee SM, Langdon QK, Gunn TR, Rodriguez-Soto RA, Madero A, Baczenas JJ, Kleczko KM, Liu F, Matney R, Singhal K, Leib RD, Hernandez-Perez O, Corbett-Detig R, Frydman J, Gifford C, Schartl M, Havird JC, Schumer M. A lethal mitonuclear incompatibility in complex I of natural hybrids. Nature 2024; 626:119-127. [PMID: 38200310 PMCID: PMC10830419 DOI: 10.1038/s41586-023-06895-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2021] [Accepted: 11/23/2023] [Indexed: 01/12/2024]
Abstract
The evolution of reproductive barriers is the first step in the formation of new species and can help us understand the diversification of life on Earth. These reproductive barriers often take the form of hybrid incompatibilities, in which alleles derived from two different species no longer interact properly in hybrids [1-3]. Theory predicts that hybrid incompatibilities may be more likely to arise at rapidly evolving genes [4-6] and that incompatibilities involving multiple genes should be common [7,8], but there has been sparse empirical data to evaluate these predictions. Here we describe a mitonuclear incompatibility involving three genes whose protein products are in physical contact within respiratory complex I of naturally hybridizing swordtail fish species. Individuals homozygous for mismatched protein combinations do not complete embryonic development or die as juveniles, whereas those heterozygous for the incompatibility have reduced complex I function and unbalanced representation of parental alleles in the mitochondrial proteome. We find that the effects of different genetic interactions on survival are non-additive, highlighting subtle complexity in the genetic architecture of hybrid incompatibilities. Finally, we document the evolutionary history of the genes involved, showing signals of accelerated evolution and evidence that an incompatibility has been transferred between species via hybridization.
|
44
|
Reguant R, O'Brien MJ, Bayat A, Hosking B, Jain Y, Twine NA, Bauer DC. PEPS: Polygenic Epistatic Phenotype Simulation. Stud Health Technol Inform 2024; 310:810-814. [PMID: 38269921 DOI: 10.3233/shti231077] [Indexed: 01/26/2024]
Abstract
Genetic data is limited and generating new datasets is often an expensive, time-consuming process, involving countless moving parts to genotype and phenotype individuals. While sharing data is beneficial for quality control and software development, privacy and security are of utmost importance. Generating synthetic data is a practical solution to mitigate the cost, time and sensitivities that hamper developers and researchers in producing and validating novel biotechnological solutions to data intensive problems. Existing methods focus on mutation frequencies at specific loci while ignoring epistatic interactions. Alternatively, programs that do consider epistasis are limited to two-way interactions or apply genomic constraints that make synthetic data generation arduous or computationally intensive. To solve this, we developed Polygenic Epistatic Phenotype Simulator (PEPS). Our tool is a probabilistic model that can generate synthetic phenotypes with a controllable level of complexity.
|
45
|
Tezuka T, Nagai S, Matsuo C, Okamori T, Iizuka T, Marubashi W. Genetic Cause of Hybrid Lethality Observed in Reciprocal Interspecific Crosses between Nicotiana simulans and N. tabacum. Int J Mol Sci 2024; 25:1226. [PMID: 38279225 PMCID: PMC10817076 DOI: 10.3390/ijms25021226] [Received: 12/22/2023] [Revised: 01/17/2024] [Accepted: 01/17/2024] [Indexed: 01/28/2024]
Abstract
Hybrid lethality, a type of postzygotic reproductive isolation, is an obstacle to wide hybridization breeding. Here, we report the hybrid lethality that was observed in crosses between the cultivated tobacco, Nicotiana tabacum (section Nicotiana), and the wild tobacco species, Nicotiana simulans (section Suaveolentes). Reciprocal hybrid seedlings were inviable at 28 °C, and the lethality was characterized by browning of the hypocotyl and roots, suggesting that hybrid lethality is due to the interaction of nuclear genomes derived from each parental species, and not to a cytoplasmic effect. Hybrid lethality was temperature-sensitive and suppressed at 36 °C. However, when hybrid seedlings cultured at 36 °C were transferred to 28 °C, all of them showed hybrid lethality. After crossing between an N. tabacum monosomic line missing one copy of the Q chromosome and N. simulans, hybrid seedlings with or without the Q chromosome were inviable and viable, respectively. These results indicated that gene(s) on the Q chromosome are responsible for hybrid lethality and also suggested that N. simulans has the same allele at the Hybrid Lethality A1 (HLA1) locus responsible for hybrid lethality as other species in the section Suaveolentes. Haplotype analysis around the HLA1 locus suggested that there are at least six and two haplotypes containing Hla1-1 and hla1-2 alleles, respectively, in the section Suaveolentes.
|
46
|
Han L, Shen B, Wu X, Zhang J, Wen YJ. Compressed variance component mixed model reveals epistasis associated with flowering in Arabidopsis. FRONTIERS IN PLANT SCIENCE 2024; 14:1283642. [PMID: 38259933 PMCID: PMC10800901 DOI: 10.3389/fpls.2023.1283642] [Received: 08/26/2023] [Accepted: 12/15/2023] [Indexed: 01/24/2024]
Abstract
Introduction: Epistasis is currently a topic of great interest in molecular and quantitative genetics. Arabidopsis thaliana, as a model organism, plays a crucial role in studying the fundamental biology of diverse plant species. However, there have been limited reports about identification of epistasis related to flowering in genome-wide association studies (GWAS). Therefore, it is important to conduct epistasis analysis in Arabidopsis. Method: In this study, we employed Levene's test and compressed variance component mixed model in GWAS to detect quantitative trait nucleotides (QTNs) and QTN-by-QTN interactions (QQIs) for 11 flowering-related traits of 199 Arabidopsis accessions with 216,130 markers. Results: Our analysis detected 89 QTNs and 130 pairs of QQIs. Around these loci, 34 known genes previously reported in Arabidopsis were confirmed to be associated with flowering-related traits, such as SPA4, which is involved in regulating photoperiodic flowering, and interacts with PAP1 and PAP2, affecting growth of Arabidopsis under light conditions. Among the previously unreported genes, 35 showed significant differential expression in response to variations in temperature, photoperiod, and vernalization treatments. Functional enrichment analysis revealed that 26 of these genes were associated with various biological processes. Finally, the haplotype and phenotypic difference analysis revealed 20 candidate genes exhibiting significant phenotypic variations across gene haplotypes, of which the candidate genes AT1G12990 and AT1G09950 around QQIs might have an interaction effect on flowering time regulation in Arabidopsis. Discussion: These findings may offer valuable insights for the identification and exploration of genes and gene-by-gene interactions associated with flowering-related traits in Arabidopsis, and may even provide a valuable reference and guidance for research on epistasis in other species.
|
47
|
Glagoleva AY, Kukoeva TV, Khlestkina EK, Shoeva OY. Polyphenol oxidase genes in barley (Hordeum vulgare L.): functional activity with respect to black grain pigmentation. FRONTIERS IN PLANT SCIENCE 2024; 14:1320770. [PMID: 38259950 PMCID: PMC10800887 DOI: 10.3389/fpls.2023.1320770] [Received: 10/12/2023] [Accepted: 12/18/2023] [Indexed: 01/24/2024]
Abstract
Polyphenol oxidase (PPO) is an oxidoreductase. In damaged plant tissues, it catalyzes enzymatic browning by oxidizing o-diphenols to highly reactive o-quinones, which polymerize producing heterogeneous dark polymer melanin. In intact tissues, functions of PPO are not well understood. The aim of the study was to investigate the barley PPO gene family and to reveal the possible involvement of Ppo genes in melanization of barley grain, which is controlled by the Blp1 gene. Based on known barley Ppo genes on chromosome 2H (Ppo1 and Ppo2), two additional genes, Ppo3 and Ppo4, were found on chromosomes 3H and 4H, respectively. These genes have one and two exons, respectively, contain a conserved tyrosinase domain and are thought to be functional. Comparative transcriptional analyses of the genes in samples of developing grains (combined hulls and pericarp tissues) were conducted in two barley lines differing by melanin pigmentation. The genes were found to be transcribed with increasing intensity (while grains mature) independently from the grain color, except for Ppo2, which is transcribed only in black-grained line i:BwBlp1 accumulating melanin in grains. Analysis of this gene's expression in detached hulls and pericarps showed its elevated transcription in both tissues in comparison with yellow ones, while it was significantly higher in hulls than in pericarp. Segregation analysis in two F2 populations obtained based on barley genotypes carrying dominant Blp1 and recessive ppo1 (I) and dominant Blp1 and recessive ppo1 and ppo2 (II) was carried out. In population I, only two phenotypic classes corresponding to parental black and white ones were observed; the segregation ratio was 3 black to 1 white, corresponding to monogenic inheritance. In population II, aside from descendants with black and white grains, hybrids with a gray phenotype (light hulls and dark pericarp) were observed; the segregation ratio was 9 black to 3 gray to 4 white, corresponding to the epistatic interaction of two genes. Most hybrids with the gray phenotype carry dominant Blp1 and a homozygous recessive allele of Ppo2. Based on transcription and segregation assays one may conclude involvement of Ppo2 but not Ppo1 in melanin formation in barley hulls.
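The 9:3:4 ratio reported for population II can be reproduced with a minimal two-locus sketch. The model below is an illustrative assumption, not the authors' analysis: it supposes two independently assorting loci, that plants lacking a dominant Blp1 allele are white regardless of Ppo2, and that plants with dominant Blp1 but homozygous recessive ppo2 are gray. The allele symbols `B`/`b` and `P`/`p` and the function names are hypothetical.

```python
from collections import Counter
from itertools import product


def phenotype(blp1_alleles, ppo2_alleles):
    """Assumed recessive-epistasis model: blp1/blp1 masks the Ppo2 locus."""
    if "B" not in blp1_alleles:   # no dominant Blp1 allele: white grain
        return "white"
    if "P" not in ppo2_alleles:   # Blp1 present, ppo2 homozygous recessive
        return "gray"
    return "black"                # dominant alleles at both loci


def f2_phenotype_counts():
    """Enumerate the 16 equally likely gamete pairings of a BbPp x BbPp cross."""
    gametes = list(product("Bb", "Pp"))  # BP, Bp, bP, bp
    counts = Counter()
    for g1, g2 in product(gametes, repeat=2):
        blp1 = (g1[0], g2[0])
        ppo2 = (g1[1], g2[1])
        counts[phenotype(blp1, ppo2)] += 1
    return counts


counts = f2_phenotype_counts()
print(counts["black"], counts["gray"], counts["white"])  # prints: 9 3 4
```

Under these assumptions the white class absorbs both the 3/16 and the 1/16 genotype classes that lack a dominant Blp1 allele, which is why the classic 9:3:3:1 ratio collapses to 9:3:4.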
|
48
|
Periyasamy S, Youssef P, John S, Thara R, Mowry BJ. Genetic interactions of schizophrenia using gene-based statistical epistasis exclusively identify nervous system-related pathways and key hub genes. Front Genet 2024; 14:1301150. [PMID: 38259618 PMCID: PMC10800577 DOI: 10.3389/fgene.2023.1301150] [Received: 09/24/2023] [Accepted: 12/12/2023] [Indexed: 01/24/2024]
Abstract
Background: The relationship between genotype and phenotype is governed by numerous genetic interactions (GIs), and the mapping of GI networks is of interest for two main reasons: 1) By modelling biological robustness, GIs provide a powerful opportunity to infer compensatory biological mechanisms via the identification of functional relationships between genes, which is of interest for biological discovery and translational research. Biological systems have evolved to compensate for genetic (i.e., variations and mutations) and environmental (i.e., drug efficacy) perturbations by exploiting compensatory relationships between genes, pathways and biological processes; 2) GI facilitates the identification of the direction (alleviating or aggravating interactions) and magnitude of epistatic interactions that influence the phenotypic outcome. The generation of GIs for human diseases is impossible using experimental biology approaches such as systematic deletion analysis. Moreover, the generation of disease-specific GIs has never been undertaken in humans. Methods: We used our Indian schizophrenia case-control (816 cases, 900 controls) genetic dataset to implement the workflow. Standard GWAS sample quality control procedure was followed. We used the imputed genetic data to increase the SNP coverage to analyse epistatic interactions across the genome comprehensively. Using the odds ratio (OR), we identified the GIs that increase or decrease the risk of a disease phenotype. The SNP-based epistatic results were transformed into gene-based epistatic results. Results: We have developed a novel approach by conducting gene-based statistical epistatic analysis using an Indian schizophrenia case-control genetic dataset and transforming these results to infer GIs that increase the risk of schizophrenia. There were ∼9.5 million GIs with a p-value ≤ 1 × 10^-5. Approximately 4.8 million GIs showed an increased risk (OR > 1.0), while ∼4.75 million GIs had a decreased risk (OR < 1.0) for schizophrenia. Conclusion: Unlike model organisms, this approach is specifically viable in humans due to the availability of abundant disease-specific genome-wide genotype datasets. The study exclusively identified brain/nervous system-related processes, affirming the findings. This computational approach fills a critical gap by generating practically non-existent heritable disease-specific human GIs from human genetic data. These novel datasets can train innovative deep-learning models, potentially surpassing the limitations of conventional GWAS.
|
49
|
Prakapenka D, Liang Z, Zaabza HB, VanRaden PM, Van Tassell CP, Da Y. A Million-Cow Validation of a Chromosome 14 Region Interacting with All Chromosomes for Fat Percentage in U.S. Holstein Cows. Int J Mol Sci 2024; 25:674. [PMID: 38203848 PMCID: PMC10779465 DOI: 10.3390/ijms25010674] [Received: 12/11/2023] [Revised: 12/29/2023] [Accepted: 01/03/2024] [Indexed: 01/12/2024]
Abstract
A genome-wide association study (GWAS) of fat percentage (FPC) using 1,231,898 first lactation cows and 75,198 SNPs confirmed a previous result that a Chr14 region about 9.38 Mb in size (0.14-9.52 Mb) had significant inter-chromosome additive × additive (A×A) effects with all chromosomes and revealed many new such effects. This study divides this 9.38 Mb region into two sub-regions, Chr14a at 0.14-0.88 Mb (0.74 Mb in size) with 78% and Chr14b at 2.21-9.52 Mb (7.31 Mb in size) with 22% of the 2761 significant A×A effects. These two sub-regions were separated by a 1.3 Mb gap at 0.9-2.2 Mb without significant inter-chromosome A×A effects. The PPP1R16A-FOXH1-CYHR1-TONSL (PFCT) region of Chr14a (29 Kb in size) with four SNPs had the largest number of inter-chromosome A×A effects (1141 pairs) with all chromosomes, including the most significant inter-chromosome A×A effects. The SLC4A4-GC-NPFFR2 (SGN) region of Chr06, known to have highly significant additive effects for some production, fertility and health traits, specifically interacted with the PFCT region and a Chr14a region with CPSF1, ADCK5, SLC52A2, DGAT1, SMPD5 and PARP10 (CASDSP) known to have highly significant additive effects for milk production traits. The most significant effects were between an SNP in SGN and four SNPs in PFCT. The CASDSP region mostly interacted with the SGN region. In the Chr14b region, the 2.28-2.42 Mb region (138.46 Kb in size) lacking coding genes had the largest cluster of A×A effects, interacting with seventeen chromosomes. The results from this study provide high-confidence evidence towards the understanding of the genetic mechanism of FPC in Holstein cows.
|
50
|
Schwab B, Yin J. Computational multigene interactions in virus growth and infection spread. Virus Evol 2023; 10:vead082. [PMID: 38361828 PMCID: PMC10868543 DOI: 10.1093/ve/vead082] [Received: 08/31/2023] [Revised: 11/29/2023] [Accepted: 12/19/2023] [Indexed: 02/17/2024]
Abstract
Viruses persist in nature owing to their extreme genetic heterogeneity and large population sizes, which enable them to evade host immune defenses, escape antiviral drugs, and adapt to new hosts. The persistence of viruses is challenging to study because mutations affect multiple virus genes, interactions among genes in their impacts on virus growth are seldom known, and measures of viral fitness are yet to be standardized. To address these challenges, we employed a data-driven computational model of cell infection by a virus. The infection model accounted for the kinetics of viral gene expression, functional gene-gene interactions, genome replication, and allocation of host cellular resources to produce progeny of vesicular stomatitis virus, a prototype RNA virus. We used this model to computationally probe how interactions among genes carrying up to eleven deleterious mutations affect different measures of virus fitness: single-cycle growth yields and multicycle rates of infection spread. Individual mutations were implemented by perturbing biophysical parameters associated with individual gene functions of the wild-type model. Our analysis revealed synergistic epistasis among deleterious mutations in their effects on virus yield; so adverse effects of single deleterious mutations were amplified by interaction. For the same mutations, multicycle infection spread indicated weak or negligible epistasis, where single mutations act alone in their effects on infection spread. These results were robust to simulation in high- and low-host resource environments. Our work highlights how different types and magnitudes of epistasis can arise for genetically identical virus variants, depending on the fitness measure. More broadly, gene-gene interactions can differently affect how viruses grow and spread.
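One common way to quantify the kind of epistasis described above is to compare the fitness of a double mutant with the product of the single-mutant fitnesses. The sketch below assumes that multiplicative null model and uses made-up relative fitness values; it is not the kinetic infection model used in the study, and the function names and tolerance are hypothetical.

```python
import math


def log_epistasis(w_a, w_b, w_ab):
    """Deviation of the double mutant from the multiplicative expectation.

    All fitness values are relative to wild type (wild type = 1.0).
    Returns log(w_ab) - (log(w_a) + log(w_b)).
    """
    return math.log(w_ab) - (math.log(w_a) + math.log(w_b))


def classify(eps, tol=1e-9):
    """Label epistasis between two deleterious mutations."""
    if eps < -tol:
        return "synergistic"   # combined effect worse than expected
    if eps > tol:
        return "antagonistic"  # combined effect milder than expected
    return "none"              # mutations act independently


# Hypothetical yields: single mutants at 0.8 and 0.5, expected double = 0.4
print(classify(log_epistasis(0.8, 0.5, 0.2)))  # prints: synergistic
print(classify(log_epistasis(0.8, 0.5, 0.4)))  # prints: none
```

Applying the same calculation to two different fitness measures for the same variants (for example, yield and spread rate) would show how the sign and size of epistasis can depend on the measure chosen.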
|