1
Dwivedi SL, Heslop-Harrison P, Amas J, Ortiz R, Edwards D. Epistasis and pleiotropy-induced variation for plant breeding. Plant Biotechnology Journal 2024. [PMID: 38875130 DOI: 10.1111/pbi.14405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2023] [Revised: 05/07/2024] [Accepted: 05/24/2024] [Indexed: 06/16/2024]
Abstract
Epistasis refers to nonallelic interaction between genes that cause bias in estimates of genetic parameters for a phenotype with interactions of two or more genes affecting the same trait. Partitioning of epistatic effects allows true estimation of the genetic parameters affecting phenotypes. Multigenic variation plays a central role in the evolution of complex characteristics, among which pleiotropy, where a single gene affects several phenotypic characters, has a large influence. While pleiotropic interactions provide functional specificity, they increase the challenge of gene discovery and functional analysis. Overcoming pleiotropy-based phenotypic trade-offs offers potential for assisting breeding for complex traits. Modelling higher order nonallelic epistatic interaction, pleiotropy and non-pleiotropy-induced variation, and genotype × environment interaction in genomic selection may provide new paths to increase the productivity and stress tolerance for next generation of crop cultivars. Advances in statistical models, software and algorithm developments, and genomic research have facilitated dissecting the nature and extent of pleiotropy and epistasis. We overview emerging approaches to exploit positive (and avoid negative) epistatic and pleiotropic interactions in a plant breeding context, including developing avenues of artificial intelligence, novel exploitation of large-scale genomics and phenomics data, and involvement of genes with minor effects to analyse epistatic interactions and pleiotropic quantitative trait loci, including missing heritability.
Affiliation(s)
- Pat Heslop-Harrison
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
- Department of Genetics and Genome Biology, Institute for Environmental Futures, University of Leicester, Leicester, UK
- Junrey Amas
- Centre for Applied Bioinformatics, School of Biological Sciences, University of Western Australia, Perth, WA, Australia
- Rodomiro Ortiz
- Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
- David Edwards
- Centre for Applied Bioinformatics, School of Biological Sciences, University of Western Australia, Perth, WA, Australia
2
Diaz-Colunga J, Skwara A, Vila JCC, Bajic D, Sanchez A. Global epistasis and the emergence of function in microbial consortia. Cell 2024; 187:3108-3119.e30. [PMID: 38776921 DOI: 10.1016/j.cell.2024.04.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Revised: 12/06/2023] [Accepted: 04/16/2024] [Indexed: 05/25/2024]
Abstract
The many functions of microbial communities emerge from a complex web of interactions between organisms and their environment. This poses a significant obstacle to engineering microbial consortia, hindering our ability to harness the potential of microorganisms for biotechnological applications. In this study, we demonstrate that the collective effect of ecological interactions between microbes in a community can be captured by simple statistical models that predict how adding a new species to a community will affect its function. These predictive models mirror the patterns of global epistasis reported in genetics, and they can be quantitatively interpreted in terms of pairwise interactions between community members. Our results illuminate an unexplored path to quantitatively predicting the function of microbial consortia from their composition, paving the way to optimizing desirable community properties and bringing the tasks of predicting biological function at the genetic, organismal, and ecological scales under the same quantitative formalism.
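The kind of "simple statistical model" described in this abstract can be sketched as a straight-line regression. The sketch below assumes that the change in community function caused by adding a focal species is a linear function of the function of the background community. The linear form, the function names, and the toy measurements are illustrative assumptions, not data or code from the paper.

```python
# Toy illustration of a global-epistasis-style regression (hypothetical data).
# Assumption: the functional effect of adding a focal species is modeled as
#     delta_F = intercept + slope * F_background


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = intercept + slope * xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_function_with_species(f_background, intercept, slope):
    """Predicted community function after adding the focal species."""
    return f_background + intercept + slope * f_background


# Hypothetical measurements: function of background communities without the
# focal species, and the change in function observed when it is added.
background_function = [1.0, 2.0, 3.0, 4.0]
delta_function = [1.5, 1.0, 0.5, 0.0]  # constructed as 2.0 - 0.5 * F

intercept, slope = fit_line(background_function, delta_function)
predicted = predict_function_with_species(5.0, intercept, slope)
```

A negative slope in such a fit would mean the focal species adds less function to communities that already perform well, which is the diminishing-returns pattern familiar from global epistasis in genetics.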
Affiliation(s)
- Juan Diaz-Colunga
- Department of Ecology & Evolutionary Biology, Yale University, New Haven, CT 06511, USA; Microbial Sciences Institute, Yale University, New Haven, CT 06511, USA; Department of Microbial Biotechnology, National Center for Biotechnology CNB-CSIC, 28049 Madrid, Spain; Institute of Functional Biology and Genomics IBFG-CSIC, University of Salamanca, 37007 Salamanca, Spain.
- Abigail Skwara
- Department of Ecology & Evolutionary Biology, Yale University, New Haven, CT 06511, USA; Microbial Sciences Institute, Yale University, New Haven, CT 06511, USA
- Jean C C Vila
- Department of Ecology & Evolutionary Biology, Yale University, New Haven, CT 06511, USA; Microbial Sciences Institute, Yale University, New Haven, CT 06511, USA; Department of Biology, Stanford University, Stanford, CA 94305, USA
- Djordje Bajic
- Department of Ecology & Evolutionary Biology, Yale University, New Haven, CT 06511, USA; Microbial Sciences Institute, Yale University, New Haven, CT 06511, USA; Department of Biotechnology, Delft University of Technology, Delft 2628 CD, the Netherlands.
- Alvaro Sanchez
- Department of Ecology & Evolutionary Biology, Yale University, New Haven, CT 06511, USA; Microbial Sciences Institute, Yale University, New Haven, CT 06511, USA; Department of Microbial Biotechnology, National Center for Biotechnology CNB-CSIC, 28049 Madrid, Spain; Institute of Functional Biology and Genomics IBFG-CSIC, University of Salamanca, 37007 Salamanca, Spain.
3
Hale JJ, Matsui T, Goldstein I, Mullis MN, Roy KR, Ville CN, Miller D, Wang C, Reynolds T, Steinmetz LM, Levy SF, Ehrenreich IM. Genome-scale analysis of interactions between genetic perturbations and natural variation. Nat Commun 2024; 15:4234. [PMID: 38762544 PMCID: PMC11102447 DOI: 10.1038/s41467-024-48626-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Accepted: 04/30/2024] [Indexed: 05/20/2024] Open
Abstract
Interactions between genetic perturbations and segregating loci can cause perturbations to show different phenotypic effects across genetically distinct individuals. To study these interactions on a genome scale in many individuals, we used combinatorial DNA barcode sequencing to measure the fitness effects of 8046 CRISPRi perturbations targeting 1721 distinct genes in 169 yeast cross progeny (or segregants). We identified 460 genes whose perturbation has different effects across segregants. Several factors caused perturbations to show variable effects, including baseline segregant fitness, the mean effect of a perturbation across segregants, and interacting loci. We mapped 234 interacting loci and found four hub loci that interact with many different perturbations. Perturbations that interact with a given hub exhibit similar epistatic relationships with the hub and show enrichment for cellular processes that may mediate these interactions. These results suggest that an individual's response to perturbations is shaped by a network of perturbation-locus interactions that cannot be measured by approaches that examine perturbations or natural variation alone.
Affiliation(s)
- Joseph J Hale
- Department of Biological Sciences, Molecular and Computational Biology Section, University of Southern California, Los Angeles, CA, 90089, USA
- Takeshi Matsui
- SLAC National Accelerator Laboratory, Menlo Park, CA, 94025, USA
- Ilan Goldstein
- Department of Biological Sciences, Molecular and Computational Biology Section, University of Southern California, Los Angeles, CA, 90089, USA
- Martin N Mullis
- Department of Biological Sciences, Molecular and Computational Biology Section, University of Southern California, Los Angeles, CA, 90089, USA
- Kevin R Roy
- Stanford Genome Technology Center, Stanford University, Palo Alto, CA, USA
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Christopher Ne Ville
- Department of Biological Sciences, Molecular and Computational Biology Section, University of Southern California, Los Angeles, CA, 90089, USA
- Darach Miller
- SLAC National Accelerator Laboratory, Menlo Park, CA, 94025, USA
- Charley Wang
- Department of Biological Sciences, Molecular and Computational Biology Section, University of Southern California, Los Angeles, CA, 90089, USA
- Trevor Reynolds
- Department of Biological Sciences, Molecular and Computational Biology Section, University of Southern California, Los Angeles, CA, 90089, USA
- Lars M Steinmetz
- Stanford Genome Technology Center, Stanford University, Palo Alto, CA, USA
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- Sasha F Levy
- SLAC National Accelerator Laboratory, Menlo Park, CA, 94025, USA.
- BacStitch DNA, Los Altos, CA, USA.
- Ian M Ehrenreich
- Department of Biological Sciences, Molecular and Computational Biology Section, University of Southern California, Los Angeles, CA, 90089, USA.
4
Schraiber JG, Edge MD, Pennell M. Unifying approaches from statistical genetics and phylogenetics for mapping phenotypes in structured populations. bioRxiv 2024:2024.02.10.579721. [PMID: 38496530 PMCID: PMC10942266 DOI: 10.1101/2024.02.10.579721] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]
Abstract
In both statistical genetics and phylogenetics, a major goal is to identify correlations between genetic loci or other aspects of the phenotype or environment and a focal trait. In these two fields, there are sophisticated but disparate statistical traditions aimed at these tasks. The disconnect between their respective approaches is becoming untenable as questions in medicine, conservation biology, and evolutionary biology increasingly rely on integrating data from within and among species, and once-clear conceptual divisions are becoming increasingly blurred. To help bridge this divide, we derive a general model describing the covariance between the genetic contributions to the quantitative phenotypes of different individuals. Taking this approach shows that standard models in both statistical genetics (e.g., Genome-Wide Association Studies; GWAS) and phylogenetic comparative biology (e.g., phylogenetic regression) can be interpreted as special cases of this more general quantitative-genetic model. The fact that these models share the same core architecture means that we can build a unified understanding of the strengths and limitations of different methods for controlling for genetic structure when testing for associations. We develop intuition for why and when spurious correlations may occur using analytical theory and conduct population-genetic and phylogenetic simulations of quantitative traits. The structural similarity of problems in statistical genetics and phylogenetics enables us to take methodological advances from one field and apply them in the other. We demonstrate this by showing how a standard GWAS technique-including both the genetic relatedness matrix (GRM) as well as its leading eigenvectors, corresponding to the principal components of the genotype matrix, in a regression model-can mitigate spurious correlations in phylogenetic analyses. 
As a case study of this, we re-examine an analysis testing for co-evolution of expression levels between genes across a fungal phylogeny, and show that including covariance matrix eigenvectors as covariates decreases the false positive rate while simultaneously increasing the true positive rate. More generally, this work provides a foundation for more integrative approaches for understanding the genetic architecture of phenotypes and how evolutionary processes shape it.
5
Hale JJ, Matsui T, Goldstein I, Mullis MN, Roy KR, Ville CN, Miller D, Wang C, Reynolds T, Steinmetz LM, Levy SF, Ehrenreich IM. Genome-scale analysis of interactions between genetic perturbations and natural variation. bioRxiv 2024:2023.05.06.539663. [PMID: 38293072 PMCID: PMC10827069 DOI: 10.1101/2023.05.06.539663] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2024]
Abstract
Interactions between genetic perturbations and segregating loci can cause perturbations to show different phenotypic effects across genetically distinct individuals. To study these interactions on a genome scale in many individuals, we used combinatorial DNA barcode sequencing to measure the fitness effects of 7,700 CRISPRi perturbations targeting 1,712 distinct genes in 169 yeast cross progeny (or segregants). We identified 460 genes whose perturbation has different effects across segregants. Several factors caused perturbations to show variable effects, including baseline segregant fitness, the mean effect of a perturbation across segregants, and interacting loci. We mapped 234 interacting loci and found four hub loci that interact with many different perturbations. Perturbations that interact with a given hub exhibit similar epistatic relationships with the hub and show enrichment for cellular processes that may mediate these interactions. These results suggest that an individual's response to perturbations is shaped by a network of perturbation-locus interactions that cannot be measured by approaches that examine perturbations or natural variation alone.
Affiliation(s)
- Joseph J. Hale
- Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
- Takeshi Matsui
- SLAC National Accelerator Laboratory, Menlo Park, CA, 94025, USA
- Ilan Goldstein
- Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
- Martin N. Mullis
- Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
- Kevin R. Roy
- Stanford Genome Technology Center, Stanford University, Palo Alto, California, USA
- Department of Genetics, Stanford University School of Medicine, Stanford, California, USA
- Chris Ne Ville
- Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
- Darach Miller
- SLAC National Accelerator Laboratory, Menlo Park, CA, 94025, USA
- Charley Wang
- Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
- Trevor Reynolds
- Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
- Lars M. Steinmetz
- Stanford Genome Technology Center, Stanford University, Palo Alto, California, USA
- Department of Genetics, Stanford University School of Medicine, Stanford, California, USA
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- Sasha F. Levy
- SLAC National Accelerator Laboratory, Menlo Park, CA, 94025, USA
- Present address: BacStitch DNA, Los Altos, California, USA
- Ian M. Ehrenreich
- Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
6
Coradini ALV, Ville CN, Krieger ZA, Roemer J, Hull C, Yang S, Lusk DT, Ehrenreich IM. Building synthetic chromosomes from natural DNA. Nat Commun 2023; 14:8337. [PMID: 38123566 PMCID: PMC10733283 DOI: 10.1038/s41467-023-44112-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Accepted: 11/30/2023] [Indexed: 12/23/2023] Open
Abstract
De novo chromosome synthesis is costly and time-consuming, limiting its use in research and biotechnology. Building synthetic chromosomes from natural components is an unexplored alternative with many potential applications. In this paper, we report CReATiNG (Cloning, Reprogramming, and Assembling Tiled Natural Genomic DNA), a method for constructing synthetic chromosomes from natural components in yeast. CReATiNG entails cloning segments of natural chromosomes and then programmably assembling them into synthetic chromosomes that can replace the native chromosomes in cells. We use CReATiNG to synthetically recombine chromosomes between strains and species, to modify chromosome structure, and to delete many linked, non-adjacent regions totaling 39% of a chromosome. The multiplex deletion experiment reveals that CReATiNG also enables recovery from flaws in synthetic chromosome design via recombination between a synthetic chromosome and its native counterpart. CReATiNG facilitates the application of chromosome synthesis to diverse biological problems.
Affiliation(s)
- Alessandro L V Coradini
- Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA, 90089, USA.
- Christopher Ne Ville
- Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA, 90089, USA
- Zachary A Krieger
- Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA, 90089, USA
- Joshua Roemer
- Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA, 90089, USA
- Cara Hull
- Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA, 90089, USA
- Shawn Yang
- Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA, 90089, USA
- Daniel T Lusk
- Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA, 90089, USA
- Ian M Ehrenreich
- Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA, 90089, USA.
7
Cyplik A, Piaskowska D, Czembor P, Bocianowski J. The use of weighted multiple linear regression to estimate QTL × QTL × QTL interaction effects of winter wheat (Triticum aestivum L.) doubled-haploid lines. J Appl Genet 2023; 64:679-693. [PMID: 37878169 PMCID: PMC10632291 DOI: 10.1007/s13353-023-00795-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/31/2023] [Revised: 09/25/2023] [Accepted: 10/04/2023] [Indexed: 10/26/2023]
Abstract
Knowledge of the magnitude of gene effects and their interactions, their nature, and contribution to determining quantitative traits is very important in conducting an effective breeding program. In traditional breeding, information on the parameter related to additive gene effect and additive-additive interaction (epistasis) and higher-order additive interactions would be useful. Although commonly overlooked in studies, higher-order interactions have a significant impact on phenotypic traits. Failure to account for the effect of triplet interactions in quantitative genetics can significantly underestimate additive QTL effects. Understanding the genetic architecture of quantitative traits is a major challenge in the post-genomic era, especially for quantitative trait locus (QTL) effects, QTL-QTL interactions, and QTL-QTL-QTL interactions. This paper proposes using weighted multiple linear regression to estimate the effects of triple interaction (additive-additive-additive) quantitative trait loci (QTL-QTL-QTL). The material for the study consisted of 126 doubled haploid lines of winter wheat (Mandub × Begra cross). The lines were analyzed for 18 traits, including percentage of necrosis leaf area, percentage of leaf area covered by pycnidia, heading date, and height. The number of genes (the number of effective factors) was lower than the number of QTLs for nine traits, higher for four traits and equal for five traits. The number of triples for unweighted regression ranged from 0 to 9, while for weighted regression, it ranged from 0 to 13. The total aaa_gu effect ranged from -14.74 to 15.61, while aaa_gw ranged from -23.39 to 21.65. The number of detected threes using weighted regression was higher for two traits and lower for four traits. Forty-nine statistically significant threes of the additive-by-additive-by-additive interaction effects were observed. The QTL most frequently occurring in threes was 4407404 (9 times). 
The use of weighted regression improved (in absolute value) the assessment of QTL-QTL-QTL interaction effects compared to the assessment based on unweighted regression. The coefficients of determination for the weighted regression model were higher, ranging from 0.8 to 15.5%, than for the unweighted regression. Based on the results, it can be concluded that the QTL-QTL-QTL triple interaction had a significant effect on the expression of quantitative traits. The use of weighted multiple linear regression proved to be a useful statistical tool for estimating additive-additive-additive (aaa) interaction effects. The weighted regression also provided results closer to phenotypic evaluations than estimator values obtained using unweighted regression, which is closer to the true values.
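As a rough illustration of the approach named in this abstract, the sketch below fits a weighted least-squares model with three additive QTL terms and one additive-by-additive-by-additive (aaa) term. It is a minimal sketch under stated assumptions: genotypes of doubled-haploid lines are coded as -1/+1, the weights are arbitrary positive numbers, and the phenotypes are simulated without noise. None of the names, weights, or values come from the paper, and the authors' actual weighting scheme is not reproduced here.

```python
from itertools import product


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def weighted_least_squares(rows, ys, weights):
    """Return beta minimizing sum_i w_i * (y_i - rows_i . beta)^2."""
    p = len(rows[0])
    xtwx = [[sum(w * r[j] * r[k] for r, w in zip(rows, weights))
             for k in range(p)] for j in range(p)]
    xtwy = [sum(w * r[j] * y for r, y, w in zip(rows, ys, weights))
            for j in range(p)]
    return solve_linear_system(xtwx, xtwy)


# Hypothetical doubled-haploid lines: every combination of three QTL (-1/+1).
genotypes = list(product([-1, 1], repeat=3))
# Design columns: mean, a1, a2, a3, and the triple interaction aaa.
design = [[1, g1, g2, g3, g1 * g2 * g3] for g1, g2, g3 in genotypes]

# Simulated noise-free phenotypes from assumed effects (illustrative only).
true_beta = [10.0, 1.0, -0.5, 0.25, 2.0]
phenotypes = [sum(b * x for b, x in zip(true_beta, row)) for row in design]

# Arbitrary positive weights, one per line (an assumption for this sketch).
weights = [1.0, 2.0, 0.5, 1.5, 1.0, 3.0, 0.8, 1.2]

beta = weighted_least_squares(design, phenotypes, weights)
aaa_effect = beta[4]
```

Because the simulated phenotypes contain no noise, the fit recovers the assumed effects whatever the weights are. With real, noisy data the choice of weights changes the estimates, which is the comparison the abstract reports.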
Affiliation(s)
- Adrian Cyplik
- Department of Mathematical and Statistical Methods, Poznań University of Life Sciences, Wojska Polskiego 28, 60-637, Poznań, Poland
- Dominika Piaskowska
- Plant Breeding and Acclimatization Institute - National Research Institute, Department of Applied Biology, Radzików, 05-870, Błonie, Poland
- Paweł Czembor
- Plant Breeding and Acclimatization Institute - National Research Institute, Department of Applied Biology, Radzików, 05-870, Błonie, Poland
- Jan Bocianowski
- Department of Mathematical and Statistical Methods, Poznań University of Life Sciences, Wojska Polskiego 28, 60-637, Poznań, Poland.
8
Nadi R, Juan-Vicente L, Mateo-Bonmatí E, Micol JL. The unequal functional redundancy of Arabidopsis INCURVATA11 and CUPULIFORMIS2 is not dependent on genetic background. Frontiers in Plant Science 2023; 14:1239093. [PMID: 38034561 PMCID: PMC10684699 DOI: 10.3389/fpls.2023.1239093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2023] [Accepted: 10/26/2023] [Indexed: 12/02/2023]
Abstract
The paralogous genes INCURVATA11 (ICU11) and CUPULIFORMIS2 (CP2) encode components of the epigenetic machinery in Arabidopsis and belong to the 2-oxoglutarate and Fe (II)-dependent dioxygenase superfamily. We previously inferred unequal functional redundancy between ICU11 and CP2 from a study of the synergistic phenotypes of the double mutant and sesquimutant combinations of icu11 and cp2 mutations, although they represented mixed genetic backgrounds. To avoid potential confounding effects arising from different genetic backgrounds, we generated the icu11-5 and icu11-6 mutants via CRISPR/Cas genome editing in the Col-0 background and crossed them to cp2 mutants in Col-0. The resulting mutants exhibited a postembryonic-lethal phenotype reminiscent of strong embryonic flower (emf) mutants. Double mutants involving icu11-5 and mutations affecting epigenetic machinery components displayed synergistic phenotypes, whereas cp2-3 did not besides icu11-5. Our results confirmed the unequal functional redundancy between ICU11 and CP2 and demonstrated that it is not allele or genetic background specific. An increase in sucrose content in the culture medium partially rescued the post-germinative lethality of icu11 cp2 double mutants and sesquimutants, facilitating the study of their morphological phenotypes throughout their life cycle, which include floral organ homeotic transformations. We thus established that the ICU11-CP2 module is required for proper flower organ identity.
Affiliation(s)
- José Luis Micol
- Instituto de Bioingeniería, Universidad Miguel Hernández, Elche, Spain
9
Fausett SR, Sandjak A, Billard B, Braendle C. Higher-order epistasis shapes natural variation in germ stem cell niche activity. Nat Commun 2023; 14:2824. [PMID: 37198172 DOI: 10.1038/s41467-023-38527-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/12/2022] [Accepted: 05/05/2023] [Indexed: 05/19/2023] Open
Abstract
To study how natural allelic variation explains quantitative developmental system variation, we characterized natural differences in germ stem cell niche activity, measured as progenitor zone (PZ) size, between two Caenorhabditis elegans isolates. Linkage mapping yielded candidate loci on chromosomes II and V, and we found that the isolate with a smaller PZ size harbours a 148 bp promoter deletion in the Notch ligand, lag-2/Delta, a central signal promoting germ stem cell fate. As predicted, introducing this deletion into the isolate with a large PZ resulted in a smaller PZ size. Unexpectedly, restoring the deleted ancestral sequence in the isolate with a smaller PZ did not increase-but instead further reduced-PZ size. These seemingly contradictory phenotypic effects are explained by epistatic interactions between the lag-2/Delta promoter, the chromosome II locus, and additional background loci. These results provide first insights into the quantitative genetic architecture regulating an animal stem cell system.
Affiliation(s)
- Sarah R Fausett
- Université Côte d'Azur, CNRS, Inserm, IBV, Nice, France.
- Department of Biology and Marine Biology, University of North Carolina Wilmington, Wilmington, NC, USA.
- Asma Sandjak
- Université Côte d'Azur, CNRS, Inserm, IBV, Nice, France
10
Ang RML, Chen SAA, Kern AF, Xie Y, Fraser HB. Widespread epistasis among beneficial genetic variants revealed by high-throughput genome editing. Cell Genomics 2023; 3:100260. [PMID: 37082144 PMCID: PMC10112194 DOI: 10.1016/j.xgen.2023.100260] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 06/06/2022] [Revised: 09/27/2022] [Accepted: 01/06/2023] [Indexed: 04/22/2023]
Abstract
The phenotypic effect of any genetic variant can be altered by variation at other genomic loci. Known as epistasis, these genetic interactions shape the genotype-phenotype map of every species, yet their origins remain poorly understood. To investigate this, we employed high-throughput genome editing to measure the fitness effects of 1,826 naturally polymorphic variants in four strains of Saccharomyces cerevisiae. About 31% of variants affect fitness, of which 24% have strain-specific fitness effects indicative of epistasis. We found that beneficial variants are more likely to exhibit genetic interactions and that these interactions can be mediated by specific traits such as flocculation ability. This work suggests that adaptive evolution will often involve trade-offs where a variant is only beneficial in some genetic backgrounds, potentially explaining why many beneficial variants remain polymorphic. In sum, we provide a framework to understand the factors influencing epistasis with single-nucleotide resolution, revealing widespread epistasis among beneficial variants.
Affiliation(s)
- Roy Moh Lik Ang
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Shi-An A. Chen
- Department of Biology, Stanford University, Stanford, CA 94305, USA
- Alexander F. Kern
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Yihua Xie
- Department of Biology, Stanford University, Stanford, CA 94305, USA
- Hunter B. Fraser
- Department of Biology, Stanford University, Stanford, CA 94305, USA
- Corresponding author
11
Evidence for Epistatic Interaction between HLA-G and LILRB1 in the Pathogenesis of Nonsegmental Vitiligo. Cells 2023; 12:cells12040630. [PMID: 36831297 PMCID: PMC9954564 DOI: 10.3390/cells12040630] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2022] [Revised: 12/31/2022] [Accepted: 01/29/2023] [Indexed: 02/18/2023] Open
Abstract
Vitiligo is the most frequent cause of depigmentation worldwide. Genetic association studies have discovered about 50 loci associated with disease, many with immunological functions. Among them is HLA-G, which modulates immunity by interacting with specific inhibitory receptors, mainly LILRB1 and LILRB2. Here we investigated the LILRB1 and LILRB2 association with vitiligo risk and evaluated the possible role of interactions between HLA-G and its receptors in this pathogenesis. We tested the association of the polymorphisms of HLA-G, LILRB1, and LILRB2 with vitiligo using logistic regression along with adjustment by ancestry. Further, methods based on the multifactor dimensionality reduction (MDR) approach (MDR v.3.0.2, GMDR v.0.9, and MB-MDR) were used to detect potential epistatic interactions between polymorphisms from the three genes. An interaction involving rs9380142 and rs2114511 polymorphisms was identified by all methods used. The polymorphism rs9380142 is an HLA-G 3'UTR variant (+3187) with a well-established role in mRNA stability. The polymorphism rs2114511 is located in the exonic region of LILRB1. Although no association involving this SNP has been reported, ChIP-Seq experiments have identified this position as an EBF1 binding site. These results highlight the role of an epistatic interaction between HLA-G and LILRB1 in vitiligo pathogenesis.
12
Learning high-order interactions for polygenic risk prediction. PLoS One 2023; 18:e0281618. [PMID: 36763605 PMCID: PMC9916647 DOI: 10.1371/journal.pone.0281618] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/29/2022] [Accepted: 01/27/2023] [Indexed: 02/11/2023] Open
Abstract
Within the framework of precision medicine, the stratification of individual genetic susceptibility based on inherited DNA variation has paramount relevance. However, one of the most relevant pitfalls of traditional Polygenic Risk Score (PRS) approaches is their inability to model complex high-order non-linear SNP-SNP interactions and their effect on the phenotype (e.g. epistasis). Indeed, they incur a computational challenge as the number of possible interactions grows exponentially with the number of SNPs considered, affecting the statistical reliability of the model parameters as well. In this work, we address this issue by proposing a novel PRS approach, called High-order Interactions-aware Polygenic Risk Score (hiPRS), that incorporates high-order interactions in modeling polygenic risk. The latter combines an interaction search routine based on frequent itemsets mining and a novel interaction selection algorithm based on Mutual Information, to construct a simple and interpretable weighted model of user-specified dimensionality that can predict a given binary phenotype. Compared to traditional PRS methods, hiPRS does not rely on GWAS summary statistics nor any external information. Moreover, hiPRS differs from Machine Learning-based approaches that can include complex interactions in that it provides a readable and interpretable model and it is able to control overfitting, even on small samples. In the present work we demonstrate through a comprehensive simulation study the superior performance of hiPRS with respect to state-of-the-art methods, both in terms of scoring performance and interpretability of the resulting model. We also test hiPRS against small sample size, class imbalance and the presence of noise, showcasing its robustness to extreme experimental settings. 
Finally, we apply hiPRS to a case study on real data from DACHS cohort, defining an interaction-aware scoring model to predict mortality of stage II-III Colon-Rectal Cancer patients treated with oxaliplatin.
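The idea of scoring candidate interactions by their mutual information with a binary phenotype can be illustrated with a minimal sketch. This is not the hiPRS implementation: the function names, the toy genotypes and the simple "rank by mutual information" step are all assumptions made for illustration, and the candidate patterns are supplied by hand rather than found by frequent itemsets mining.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information, in bits, between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def interaction_feature(genotypes, pattern):
    """Return 1 for samples matching every (snp_index, genotype) pair, else 0."""
    return [int(all(g[i] == v for i, v in pattern)) for g in genotypes]


def rank_interactions(genotypes, phenotype, candidates):
    """Sort hypothetical candidate patterns by mutual information with the phenotype."""
    scored = []
    for pattern in candidates:
        feature = interaction_feature(genotypes, pattern)
        scored.append((mutual_information(feature, phenotype), pattern))
    return sorted(scored, key=lambda item: item[0], reverse=True)


# Toy data (made up): two SNPs coded 0/1/2 and a binary phenotype.
genotypes = [(0, 1), (1, 1), (2, 0), (1, 1), (0, 0), (1, 1), (2, 2), (0, 1)]
phenotype = [0, 1, 0, 1, 0, 1, 0, 0]
candidates = [((0, 1), (1, 1)), ((0, 0),), ((1, 0),)]
ranked = rank_interactions(genotypes, phenotype, candidates)
print(ranked[0])
```

In this toy example the two-SNP pattern reproduces the phenotype exactly, so it ranks first. A real selection step would also need to penalize redundancy among the selected interactions.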
|
13
|
Morin MA, Morrison AJ, Harms MJ, Dutton RJ. Higher-order interactions shape microbial interactions as microbial community complexity increases. Sci Rep 2022; 12:22640. [PMID: 36587027 PMCID: PMC9805437 DOI: 10.1038/s41598-022-25303-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/21/2022] [Accepted: 11/28/2022] [Indexed: 01/01/2023]
Abstract
Non-pairwise interactions, or higher-order interactions (HOIs), in microbial communities have been described as significant drivers of emergent features in microbiomes. Yet, the re-organization of microbial interactions between pairwise cultures and larger communities remains largely unexplored from a molecular perspective but is central to our understanding and further manipulation of microbial communities. Here, we used a bottom-up approach to investigate microbial interaction mechanisms from pairwise cultures up to 4-species communities from a simple microbiome (Hafnia alvei, Geotrichum candidum, Penicillium camemberti and Escherichia coli). Specifically, we characterized the interaction landscape for each species combination involving E. coli by identifying E. coli's interaction-associated mutants using an RB-TnSeq-based interaction assay. We observed a deep reorganization of the interaction-associated mutants, with very few 2-species interactions conserved all the way up to a 4-species community and the emergence of multiple HOIs. We further used a quantitative genetics strategy to decipher how 2-species interactions were quantitatively conserved in higher community compositions. Epistasis-based analysis revealed that, of the interactions that are conserved at all levels of complexity, 82% follow an additive pattern. Altogether, we demonstrate the complex architecture of microbial interactions even within a simple microbiome, and provide a mechanistic and molecular explanation of HOIs.
Affiliation(s)
- Manon A. Morin
  - School of Biological Science, University of California San Diego, San Diego, 92093 USA
- Anneliese J. Morrison
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR USA
  - Institute of Molecular Biology, University of Oregon, Eugene, OR USA
- Michael J. Harms
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR USA
  - Institute of Molecular Biology, University of Oregon, Eugene, OR USA
- Rachel J. Dutton
  - School of Biological Science, University of California San Diego, San Diego, 92093 USA
|
14
|
Swamy KBS, Lee HY, Ladra C, Liu CFJ, Chao JC, Chen YY, Leu JY. Proteotoxicity caused by perturbed protein complexes underlies hybrid incompatibility in yeast. Nat Commun 2022; 13:4394. [PMID: 35906261 PMCID: PMC9338014 DOI: 10.1038/s41467-022-32107-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/04/2021] [Accepted: 07/18/2022] [Indexed: 02/08/2023]
Abstract
Dobzhansky–Muller incompatibilities represent a major driver of reproductive isolation between species. They are caused when interacting components encoded by alleles from different species cannot function properly when mixed. At incipient stages of speciation, complex incompatibilities involving multiple genetic loci with weak effects are frequently observed, but the underlying mechanisms remain elusive. Here we show perturbed proteostasis leading to compromised mitosis and meiosis in Saccharomyces cerevisiae hybrid lines carrying one or two chromosomes from Saccharomyces bayanus var. uvarum. Levels of proteotoxicity are correlated with the number of protein complexes on replaced chromosomes. Proteomic approaches reveal that multi-protein complexes with subunits encoded by replaced chromosomes tend to be unstable. Furthermore, hybrid defects can be alleviated or aggravated, respectively, by up- or down-regulating the ubiquitin-proteasomal degradation machinery, suggesting that destabilized complex subunits overburden the proteostasis machinery and compromise hybrid fitness. Our findings reveal the general role of impaired protein complex assembly in complex incompatibilities. Hybrid incompatibility can be an important element of reproductive isolation and speciation. Using chromosome replacement lines of yeast, the authors show that perturbed proteostasis caused by destabilized hybrid protein complexes may represent a general mechanism of hybrid incompatibility.
Affiliation(s)
- Krishna B S Swamy
  - Institute of Molecular Biology, Academia Sinica, Taipei, 11529, Taiwan
  - Division of Biological and Life Sciences, School of Arts and Sciences, Ahmedabad University, Ahmedabad, 380009, India
- Hsin-Yi Lee
  - Institute of Molecular Biology, Academia Sinica, Taipei, 11529, Taiwan
- Carmina Ladra
  - Institute of Molecular Biology, Academia Sinica, Taipei, 11529, Taiwan
- Chien-Fu Jeff Liu
  - Institute of Molecular Biology, Academia Sinica, Taipei, 11529, Taiwan
- Jung-Chi Chao
  - Institute of Molecular Biology, Academia Sinica, Taipei, 11529, Taiwan
- Yi-Yun Chen
  - Institute of Biological Chemistry, Academia Sinica, Taipei, 11529, Taiwan
- Jun-Yi Leu
  - Institute of Molecular Biology, Academia Sinica, Taipei, 11529, Taiwan
|
15
|
Genetic Parameters for Selected Traits of Inbred Lines of Maize (Zea mays L.). APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12146961] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]
Abstract
This paper presents an estimation of the parameters connected with the additive (a) effect, additive by additive (aa) epistatic effect, and additive by additive by additive (aaa) interaction gene effect for nine quantitative traits of maize (Zea mays L.) inbred lines. To our knowledge, this is the first report on aaa interaction in maize inbred lines. An analysis was performed on 252 lines derived from Plant Breeding Smolice Ltd. (Smolice, Poland), Plant Breeding and Acclimatization Institute-National Research Institute Group (151 lines) and Małopolska Plant Breeding Ltd. (Kobierzyce, Poland) (101 lines). The total additive effects were significant for all studied cases. Significant two-way and three-way interactions were found in most analyzed cases, with a considerable impact on phenotype. Omitting higher-order interaction effects in quantitative genetics may result in a substantial underestimation of additive QTL effects. Expanding models with that information may also be helpful in future homozygous line crossing projects.
|
16
|
Mathew B, Hauptmann A, Léon J, Sillanpää MJ. NeuralLasso: Neural Networks Meet Lasso in Genomic Prediction. FRONTIERS IN PLANT SCIENCE 2022; 13:800161. [PMID: 35574107 PMCID: PMC9100816 DOI: 10.3389/fpls.2022.800161] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/22/2021] [Accepted: 03/18/2022] [Indexed: 06/15/2023]
Abstract
Prediction of complex traits based on genome-wide marker information is of central importance for both animal and plant breeding. Numerous models have been proposed for the prediction of complex traits, and considerable effort is still devoted to improving the prediction accuracy of these models, because various genetic factors such as additive, dominance and epistasis effects can influence their prediction accuracy. Recently machine learning (ML) methods have been widely applied for prediction in both animal and plant breeding programs. In this study, we propose a new algorithm for genomic prediction which is based on neural networks, but incorporates classical elements of LASSO. Our new method is able to account for the local epistasis (higher order interaction between the neighboring markers) in the prediction. We compare the prediction accuracy of our new method with the most commonly used prediction methods, such as BayesA, BayesB, Bayesian Lasso (BL), genomic BLUP and Elastic Net (EN) using the heterogeneous stock mouse and rice field data sets.
Affiliation(s)
- Boby Mathew
  - Bayer CropScience, Monheim am Rhein, Germany
  - Institute of Crop Science and Resource Conservation, University of Bonn, Bonn, Germany
- Andreas Hauptmann
  - Research Unit of Mathematical Sciences, University of Oulu, Oulu, Finland
  - Department of Computer Science, University College London, London, United Kingdom
- Jens Léon
  - Institute of Crop Science and Resource Conservation, University of Bonn, Bonn, Germany
- Mikko J. Sillanpää
  - Research Unit of Mathematical Sciences, University of Oulu, Oulu, Finland
|
17
|
González-Seoane B, Ponte-Fernández C, González-Domínguez J, Martín MJ. PyToxo: a Python tool for calculating penetrance tables of high-order epistasis models. BMC Bioinformatics 2022; 23:117. [PMID: 35366804 PMCID: PMC8977015 DOI: 10.1186/s12859-022-04645-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2021] [Accepted: 03/22/2022] [Indexed: 11/16/2022]
Abstract
Background: Epistasis is the interaction between different genes when expressing a certain phenotype. If epistasis involves more than two loci it is called high-order epistasis. High-order epistasis is an area under active research because it could be the cause of many complex traits. The most common way to specify an epistasis interaction is through a penetrance table. Results: This paper presents PyToxo, a Python tool for generating penetrance tables from any-order epistasis models. Unlike other tools available in the literature, PyToxo is able to work with high-order models and realistic penetrance and heritability values, achieving high-precision results in a short time. In addition, PyToxo is distributed as open-source software and includes several interfaces to ease its use. Conclusions: PyToxo provides the scientific community with a useful tool to evaluate algorithms and methods that can detect high-order epistasis to continue advancing in the discovery of the causes behind complex diseases. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04645-7.
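To make the notion of a penetrance table concrete, the following minimal sketch computes the prevalence and heritability implied by a table. It does not use the PyToxo API. It assumes Hardy-Weinberg proportions, independent loci, and the commonly used definition of heritability for a binary trait (variance of penetrance across genotypes divided by K(1 - K), where K is the prevalence). All names and the toy table are hypothetical.

```python
from itertools import product


def genotype_probs(maf):
    """Hardy-Weinberg frequencies for genotypes carrying 0, 1, 2 minor alleles."""
    q = maf
    p = 1.0 - q
    return (p * p, 2.0 * p * q, q * q)


def prevalence_and_heritability(penetrance, mafs):
    """penetrance maps a genotype tuple (one entry per locus) to P(affected)."""
    per_locus = [genotype_probs(m) for m in mafs]
    combos = list(product(range(3), repeat=len(mafs)))

    def prob(combo):
        # Joint genotype probability, assuming the loci are independent.
        result = 1.0
        for locus, genotype in enumerate(combo):
            result *= per_locus[locus][genotype]
        return result

    prevalence = sum(prob(c) * penetrance[c] for c in combos)
    variance = sum(prob(c) * (penetrance[c] - prevalence) ** 2 for c in combos)
    heritability = variance / (prevalence * (1.0 - prevalence))
    return prevalence, heritability


# Toy second-order table (made up): higher risk only when both loci
# carry at least one minor allele.
table = {c: (0.5 if min(c) >= 1 else 0.05) for c in product(range(3), repeat=2)}
k, h2 = prevalence_and_heritability(table, mafs=[0.3, 0.2])
print(f"prevalence={k:.4f} heritability={h2:.4f}")
```

A generator such as the one described in the abstract works in the opposite direction, searching for a table that matches target prevalence (or heritability) and allele frequencies. The sketch only shows the forward calculation.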
|
18
|
Matsui T, Mullis MN, Roy KR, Hale JJ, Schell R, Levy SF, Ehrenreich IM. The interplay of additivity, dominance, and epistasis on fitness in a diploid yeast cross. Nat Commun 2022; 13:1463. [PMID: 35304450 PMCID: PMC8933436 DOI: 10.1038/s41467-022-29111-z] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 08/09/2021] [Accepted: 02/22/2022] [Indexed: 12/27/2022]
Abstract
In diploid species, genetic loci can show additive, dominance, and epistatic effects. To characterize the contributions of these different types of genetic effects to heritable traits, we use a double barcoding system to generate and phenotype a panel of ~200,000 diploid yeast strains that can be partitioned into hundreds of interrelated families. This experiment enables the detection of thousands of epistatic loci, many whose effects vary across families. Here, we show traits are largely specified by a small number of hub loci with major additive and dominance effects, and pervasive epistasis. Genetic background commonly influences both the additive and dominance effects of loci, with multiple modifiers typically involved. The most prominent dominance modifier in our data is the mating locus, which has no effect on its own. Our findings show that the interplay between additivity, dominance, and epistasis underlies a complex genotype-to-phenotype map in diploids.
Affiliation(s)
- Takeshi Matsui
  - Joint Initiative for Metrology in Biology, Stanford, CA, 94305, USA
  - SLAC National Accelerator Laboratory, Menlo Park, CA, 94025, USA
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Martin N Mullis
  - Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA, 90089, USA
  - Twist Bioscience, 681 Gateway Blvd, South San Francisco, CA, 94080, USA
- Kevin R Roy
  - Joint Initiative for Metrology in Biology, Stanford, CA, 94305, USA
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, 94305, USA
  - Stanford Genome Technology Center, Stanford University, Palo Alto, CA, 94304, USA
- Joseph J Hale
  - Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA, 90089, USA
- Rachel Schell
  - Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA, 90089, USA
- Sasha F Levy
  - Joint Initiative for Metrology in Biology, Stanford, CA, 94305, USA
  - SLAC National Accelerator Laboratory, Menlo Park, CA, 94025, USA
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Ian M Ehrenreich
  - Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA, 90089, USA
|
19
|
Wu PY, Stich B, Weisweiler M, Shrestha A, Erban A, Westhoff P, Inghelandt DV. Improvement of prediction ability by integrating multi-omic datasets in barley. BMC Genomics 2022; 23:200. [PMID: 35279073 PMCID: PMC8917753 DOI: 10.1186/s12864-022-08337-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2021] [Accepted: 01/20/2022] [Indexed: 11/10/2022]
Abstract
Background: Genomic prediction (GP) based on single nucleotide polymorphisms (SNP) has become a broadly used tool to increase the gain of selection in plant breeding. However, using predictors that are biologically closer to the phenotypes such as transcriptome and metabolome may increase the prediction ability in GP. The objectives of this study were to (i) assess the prediction ability for three yield-related phenotypic traits using different omic datasets as single predictors compared to a SNP array, where these omic datasets included different types of sequence variants (full-SV, deleterious-dSV, and tolerant-tSV), different types of transcriptome (expression presence/absence variation-ePAV, gene expression-GE, and transcript expression-TE) sampled from two tissues, leaf and seedling, and metabolites (M); (ii) investigate the improvement in prediction ability when combining multiple omic datasets information to predict phenotypic variation in barley breeding programs; (iii) explore the predictive performance when using SV, GE, and ePAV from simulated 3’end mRNA sequencing of different lengths as predictors. Results: The prediction ability from genomic best linear unbiased prediction (GBLUP) for the three traits using dSV information was higher than when using tSV, all SV information, or the SNP array. Any predictors from the transcriptome (GE, TE, as well as ePAV) and metabolome provided higher prediction abilities compared to the SNP array and SV on average across the three traits. In addition, some (dis)similarity existed between different omic datasets, which therefore provided complementary biological perspectives on phenotypic variation. Optimally combining the information of dSV, TE, ePAV, as well as metabolites into GP models could improve the prediction ability over that of the single predictors alone. Conclusions: The use of integrated omic datasets in GP models is highly recommended. Furthermore, we evaluated a cost-effective approach generating 3’end mRNA sequencing with transcriptome data extracted from seedling without losing prediction ability in comparison to the full-length mRNA sequencing, paving the path for the use of such prediction methods in commercial breeding programs. Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-022-08337-7.
|
20
|
Ponte-Fernandez C, Gonzalez-Dominguez J, Carvajal-Rodriguez A, Martin MJ. Evaluation of Existing Methods for High-Order Epistasis Detection. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:912-926. [PMID: 33055017 DOI: 10.1109/tcbb.2020.3030312] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 06/11/2023]
Abstract
Finding epistatic interactions among loci when expressing a phenotype is a widely employed strategy to understand the genetic architecture of complex traits in GWAS. The abundance of methods dedicated to the same purpose, however, makes it increasingly difficult for scientists to decide which method is more suitable for their studies. This work compares the different epistasis detection methods published during the last decade in terms of runtime, detection power and type I error rate, with a special emphasis on high-order interactions. Results show that in terms of detection power, the only methods that perform well across all experiments are the exhaustive methods, although their computational cost may be prohibitive in large-scale studies. Regarding non-exhaustive methods, not one could consistently find epistasis interactions when marginal effects are absent. If marginal effects are present, there are methods that perform well for high-order interactions, such as BADTrees, FDHE-IW, SingleMI or SNPHarvester. As for false-positive control, only SNPHarvester, FDHE-IW and DCHE show good results. The study concludes that there is no single epistasis detection method to recommend in all scenarios. Authors should prioritize exhaustive methods when sufficient computational resources are available considering the data set size, and resort to non-exhaustive methods when the analysis time is prohibitive.
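The computational cost of exhaustive methods mentioned above follows from simple counting, sketched below. The panel size of 1,000 SNPs is an arbitrary assumption chosen for illustration, and the function name is hypothetical.

```python
import math


def n_candidate_interactions(n_snps, order):
    """Number of distinct SNP combinations an exhaustive search must test."""
    return math.comb(n_snps, order)


n_snps = 1000  # assumed panel size, for illustration only
for order in (2, 3, 4):
    combos = n_candidate_interactions(n_snps, order)
    cells = 3 ** order  # genotype cells per combination (0/1/2 at each locus)
    print(f"order {order}: {combos:,} combinations, {cells} genotype cells each")
```

Each step up in interaction order multiplies the number of tests by roughly n/order, which is why exhaustive search becomes prohibitive for genome-scale panels.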
|
21
|
Maximal Information Coefficient-Based Testing to Identify Epistasis in Case-Control Association Studies. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2022; 2022:7843990. [PMID: 35211187 PMCID: PMC8863443 DOI: 10.1155/2022/7843990] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/05/2021] [Revised: 01/12/2022] [Accepted: 01/27/2022] [Indexed: 12/18/2022]
Abstract
Interactions between genetic variants (epistasis) are ubiquitous in model systems and can significantly affect evolutionary adaptation, genetic mapping, and precision medicine efforts. In this paper, we propose a method for epistasis detection, called EpiMIC (epistasis detection through a maximal information coefficient (MIC)). MIC is a promising bivariate dependence measure explicitly designed for rapidly exploring various function types equally and for interpreting and comparing them on the same scale. Most epistasis detection approaches make assumptions about the form of the association between genetic variants, resulting in limited statistical performance. Based on the notion that, if two SNPs do not interact, their joint distribution in all samples and in cases only should not be substantially different, we developed a statistic that uses the difference in MIC as a signal of epistasis and combined it with a permutation resampling strategy to estimate the empirical distribution of the statistic. Results on simulated and real-world data sets showed that EpiMIC outperformed previous approaches for identifying epistasis at varying degrees of heredity.
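The testing logic described in this abstract (compare the dependence between two SNPs in cases with that in all samples, then calibrate by permuting labels) can be sketched as follows. This is not EpiMIC: plain mutual information is substituted for MIC to keep the example dependency-free, and every name and the toy data are assumptions.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information in bits (a stand-in for MIC in this sketch)."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x, count_y = Counter(xs), Counter(ys)
    return sum(
        (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in joint.items()
    )


def dependence_gap(snp_a, snp_b, is_case):
    """Absolute difference in SNP-SNP dependence: cases only versus all samples."""
    cases_a = [a for a, c in zip(snp_a, is_case) if c]
    cases_b = [b for b, c in zip(snp_b, is_case) if c]
    return abs(mutual_information(cases_a, cases_b) - mutual_information(snp_a, snp_b))


def permutation_p_value(snp_a, snp_b, is_case, n_perm=200, seed=0):
    """Empirical p-value obtained by shuffling the case/control labels."""
    rng = random.Random(seed)
    observed = dependence_gap(snp_a, snp_b, is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if dependence_gap(snp_a, snp_b, labels) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (made up): SNPs are independent in controls but identical in cases.
controls = [(a, b) for a in (0, 1) for b in (0, 1)] * 10
cases = [(0, 0)] * 10 + [(1, 1)] * 10
snp_a = [a for a, _ in controls + cases]
snp_b = [b for _, b in controls + cases]
is_case = [0] * len(controls) + [1] * len(cases)
observed, p_value = permutation_p_value(snp_a, snp_b, is_case)
print(f"gap={observed:.3f} p={p_value:.4f}")
```

Shuffling the labels keeps the number of cases fixed while breaking any link between case status and the SNP pair, which is what the null hypothesis of no interaction requires.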
|
22
|
Ji X, Lin L, Fan J, Li Y, Wei Y, Shen S, Su L, Shafer A, Bjaanæs MM, Karlsson A, Planck M, Staaf J, Helland Å, Esteller M, Zhang R, Chen F, Christiani DC. Epigenome-wide three-way interaction study identifies a complex pattern between TRIM27, KIAA0226, and smoking associated with overall survival of early-stage NSCLC. Mol Oncol 2022; 16:717-731. [PMID: 34932879 PMCID: PMC8807353 DOI: 10.1002/1878-0261.13167] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/03/2021] [Revised: 11/23/2021] [Accepted: 12/20/2021] [Indexed: 01/12/2023]
Abstract
The interaction between DNA methylation of tripartite motif containing 27 (cg05293407TRIM27) and smoking has previously been identified to reveal histologically heterogeneous effects of TRIM27 DNA methylation on early-stage non-small-cell lung cancer (NSCLC) survival. However, to understand the complex mechanisms underlying NSCLC progression, we searched for three-way interactions. A two-phase study was adopted to identify three-way interactions in the form of pack-year of smoking (number of cigarettes smoked per day × number of years smoked) × cg05293407TRIM27 × epigenome-wide DNA methylation CpG probe. Two CpG probes were identified with FDR-q ≤ 0.05 in the discovery phase and P ≤ 0.05 in the validation phase: cg00060500KIAA0226 and cg17479956EXT2. Compared to a prediction model with only clinical information, a model that added 42 significant three-way interactions selected using a looser criterion (discovery: FDR-q ≤ 0.10, validation: P ≤ 0.05) substantially improved the area under the receiver operating characteristic curve (AUC) of the prognostic prediction model for both 3-year and 5-year survival. Our research identified complex interaction effects among multiple environmental and epigenetic factors, and provided therapeutic targets for NSCLC patients.
Affiliation(s)
- Xinyu Ji
  - Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
- Lijuan Lin
  - Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
- Juanjuan Fan
  - Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
- Yi Li
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
- Yongyue Wei
  - Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
  - Department of Environmental Health, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - China International Cooperation Center for Environment and Human Health, Nanjing Medical University, Nanjing, China
- Sipeng Shen
  - Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
- Li Su
  - Department of Environmental Health, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Andrea Shafer
  - Pulmonary and Critical Care Division, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Maria Moksnes Bjaanæs
  - Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
- Anna Karlsson
  - Division of Oncology, Department of Clinical Sciences Lund and CREATE Health Strategic Center for Translational Cancer Research, Lund University, Lund, Sweden
- Maria Planck
  - Division of Oncology, Department of Clinical Sciences Lund and CREATE Health Strategic Center for Translational Cancer Research, Lund University, Lund, Sweden
- Johan Staaf
  - Division of Oncology, Department of Clinical Sciences Lund and CREATE Health Strategic Center for Translational Cancer Research, Lund University, Lund, Sweden
- Åslaug Helland
  - Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
  - Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Manel Esteller
  - Josep Carreras Leukaemia Research Institute, Barcelona, Spain
  - Centro de Investigacion Biomedica en Red Cancer, Madrid, Spain
  - Institucio Catalana de Recerca i Estudis Avançats, Barcelona, Spain
  - Physiological Sciences Department, School of Medicine and Health Sciences, University of Barcelona, Barcelona, Spain
- Ruyang Zhang
  - Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
  - Department of Environmental Health, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - China International Cooperation Center for Environment and Human Health, Nanjing Medical University, Nanjing, China
- Feng Chen
  - Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
  - China International Cooperation Center for Environment and Human Health, Nanjing Medical University, Nanjing, China
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, China
  - Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Cancer Center, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, China
- David C. Christiani
  - Department of Environmental Health, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Pulmonary and Critical Care Division, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
|
23
|
Schell R, Hale JJ, Mullis MN, Matsui T, Foree R, Ehrenreich IM. Genetic basis of a spontaneous mutation’s expressivity. Genetics 2022; 220:6515283. [PMID: 35078232 PMCID: PMC8893249 DOI: 10.1093/genetics/iyac013] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/08/2021] [Accepted: 01/19/2022] [Indexed: 11/12/2022]
Abstract
Genetic background often influences the phenotypic consequences of mutations, resulting in variable expressivity. How standing genetic variants collectively cause this phenomenon is not fully understood. Here, we comprehensively identify loci in a budding yeast cross that impact the growth of individuals carrying a spontaneous missense mutation in the nuclear-encoded mitochondrial ribosomal gene MRP20. Initial results suggested that a single large effect locus influences the mutation’s expressivity, with one allele causing inviability in mutants. However, further experiments revealed this simplicity was an illusion. In fact, many additional loci shape the mutation’s expressivity, collectively leading to a wide spectrum of mutational responses. These results exemplify how complex combinations of alleles can produce a diversity of qualitative and quantitative responses to the same mutation.
Affiliation(s)
- Rachel Schell
  - Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
- Joseph J Hale
  - Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
- Martin N Mullis
  - Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
- Takeshi Matsui
  - Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
- Ryan Foree
  - Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
- Ian M Ehrenreich
  - Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
|
24
|
Analytical and numerical comparisons of two methods of estimation of additive × additive × additive interaction of QTL effects. J Appl Genet 2021; 63:213-221. [PMID: 34940940 PMCID: PMC8979904 DOI: 10.1007/s13353-021-00676-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/06/2021] [Revised: 11/07/2021] [Accepted: 12/13/2021] [Indexed: 12/27/2022]
Abstract
This paper presents the analytical and numerical comparison of two methods of estimation of additive × additive × additive (aaa) interaction of QTL effects. The first method takes into account only the plant phenotype, while the second also includes genotypic information from molecular marker observations. Analysis was made on 150 doubled haploid (DH) lines of barley derived from the cross Steptoe × Morex and 145 DH lines from the cross Harrington × TR306. In total, 153 sets of observations were analyzed. In most cases, aaa interactions were found to exert an effect on QTL. Results also show that, with molecular marker observations, the obtained estimators had smaller absolute values than the phenotypic estimators.
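For readers unfamiliar with what an aaa effect measures, the sketch below estimates it for three marker loci in homozygous (doubled haploid) lines as the three-way contrast of genotype-class means under -1/+1 coding. This is a generic saturated-model contrast, not either of the two estimators compared in the paper. The function name and the simulated data are assumptions.

```python
from itertools import product
from statistics import mean


def aaa_effect(genotypes, phenotypes):
    """Three-way additive interaction from the 8 genotype-class means.

    genotypes: tuples of three values in {-1, +1}, one per homozygous line.
    """
    classes = {}
    for g, y in zip(genotypes, phenotypes):
        classes.setdefault(tuple(g), []).append(y)
    total = 0.0
    for combo in product((-1, 1), repeat=3):
        if combo not in classes:
            raise ValueError(f"no lines observed for genotype class {combo}")
        sign = combo[0] * combo[1] * combo[2]
        total += sign * mean(classes[combo])
    return total / 8.0


# Simulated lines (made up): known effects, no noise, two lines per class.
lines = list(product((-1, 1), repeat=3)) * 2
trait = [
    10 + 2.0 * a - 1.0 * b + 0.5 * a * b + 0.75 * a * b * c
    for a, b, c in lines
]
print(aaa_effect(lines, trait))
```

Because the contrast is orthogonal to the main effects and two-way terms under this coding, the simulated value of 0.75 is recovered exactly. With real data the estimate would carry sampling error and require all eight genotype classes to be represented.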
|
25
|
Abstract
Trade-offs and constraints are inherent to life, and studies of these phenomena play a central role in both organismal and evolutionary biology. Trade-offs can be defined, categorized, and studied in at least six, not mutually exclusive, ways. (1) Allocation constraints are caused by a limited resource (e.g., energy, time, space, essential nutrients), such that increasing allocation to one component necessarily requires a decrease in another (if only two components are involved, this is referred to as the Y-model, e.g., energy devoted to size versus number of offspring). (2) Functional conflicts occur when features that enhance performance of one task decrease performance of another (e.g., relative lengths of in-levers and out-levers, force-velocity trade-offs related to muscle fiber type composition). (3) Shared biochemical pathways, often involving integrator molecules (e.g., hormones, neurotransmitters, transcription factors), can simultaneously affect multiple traits, with some effects being beneficial for one or more components of Darwinian fitness (e.g., survival, age at first reproduction, fecundity) and others detrimental. (4) Antagonistic pleiotropy describes genetic variants that increase one component of fitness (or a lower-level trait) while simultaneously decreasing another. (5) Ecological circumstances (or selective regime) may impose trade-offs, such as when foraging behavior increases energy availability yet also decreases survival. (6) Sexual selection may lead to the elaboration of (usually male) secondary sexual characters that improve mating success but handicap survival and/or impose energetic costs that reduce other fitness components. Empirical studies of trade-offs often search for negative correlations between two traits that are the expected outcomes of the trade-offs, but this will generally be inadequate if more than two traits are involved and especially for complex physiological networks of interacting traits.
Moreover, trade-offs often occur only in populations that are experiencing harsh environmental conditions or energetic challenges at the extremes of phenotypic distributions, such as among individuals or species that have exceptional athletic abilities. Trade-offs may be (partially) circumvented through various compensatory mechanisms, depending on the timescale involved, ranging from acute to evolutionary. Going forward, a pluralistic view of trade-offs and constraints, combined with integrative analyses that cross levels of biological organization and traditional boundaries among disciplines, will enhance the study of evolutionary organismal biology.
|
26
|
Park S, Supek F, Lehner B. Higher order genetic interactions switch cancer genes from two-hit to one-hit drivers. Nat Commun 2021; 12:7051. [PMID: 34862370 PMCID: PMC8642467 DOI: 10.1038/s41467-021-27242-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/17/2021] [Accepted: 11/09/2021] [Indexed: 11/10/2022]
Abstract
The classic two-hit model posits that both alleles of a tumor suppressor gene (TSG) must be inactivated to cause cancer. In contrast, for some oncogenes and haploinsufficient TSGs, a single genetic alteration can suffice to increase tumor fitness. Here, by quantifying the interactions between mutations and copy number alterations (CNAs) across 10,000 tumors, we show that many cancer genes actually switch between acting as one-hit or two-hit drivers. Third order genetic interactions identify the causes of some of these switches in dominance and dosage sensitivity as mutations in other genes in the same biological pathway. The correct genetic model for a gene thus depends on the other mutations in a genome, with a second hit in the same gene or an alteration in a different gene in the same pathway sometimes representing alternative evolutionary paths to cancer.
Affiliation(s)
- Solip Park
  - Centro Nacional de Investigaciones Oncológicas (CNIO), Madrid, Spain
- Fran Supek
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
  - Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain
- Ben Lehner
  - Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
|
27
|
Expression level is a major modifier of the fitness landscape of a protein coding gene. Nat Ecol Evol 2021; 6:103-115. [PMID: 34795386 DOI: 10.1038/s41559-021-01578-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 02/22/2021] [Accepted: 10/01/2021] [Indexed: 11/09/2022]
Abstract
The phenotypic consequence of a genetic mutation depends on many factors including the expression level of a gene. However, a comprehensive quantification of this expression effect is still lacking, as is a further general mechanistic understanding of the effect. Here, we measured the fitness effect of almost all (>97.5%) single-nucleotide mutations in GFP, an exogenous gene with no physiological function, and URA3, a conditionally essential gene. Both genes were driven by two promoters whose expression levels differed by around tenfold. The resulting fitness landscapes revealed that the fitness effects of at least 42% of all single-nucleotide mutations within the genes were expression dependent. Although only a small fraction of variation in fitness effects among different mutations can be explained by biophysical properties of the protein and messenger RNA of the gene, our analyses revealed that the avoidance of stochastic molecular errors generally underlies the expression dependency of mutational effects and suggested protein misfolding as the most important type of molecular error among those examined. Our results therefore directly explained the slower evolution of highly expressed genes and highlighted cytotoxicity due to stochastic molecular errors as a non-negligible component for understanding the phenotypic consequence of mutations.
|
28
|
Calvert MB, Doellman MM, Feder JL, Hood GR, Meyers P, Egan SP, Powell THQ, Glover MM, Tait C, Schuler H, Berlocher SH, Smith JJ, Nosil P, Hahn DA, Ragland GJ. Genomically correlated trait combinations and antagonistic selection contributing to counterintuitive genetic patterns of adaptive diapause divergence in Rhagoletis flies. J Evol Biol 2021; 35:146-163. [PMID: 34670006 DOI: 10.1111/jeb.13952] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/21/2019] [Revised: 10/08/2021] [Accepted: 10/15/2021] [Indexed: 11/27/2022]
Abstract
Adaptation to novel environments can result in unanticipated genomic responses to selection. Here, we illustrate how multifarious, correlational selection helps explain a counterintuitive pattern of genetic divergence between the recently derived apple- and ancestral hawthorn-infesting host races of Rhagoletis pomonella (Diptera: Tephritidae). The apple host race terminates diapause and emerges as adults earlier in the season than the hawthorn host race, to coincide with the earlier fruiting phenology of their apple hosts. However, alleles at many loci associated with later emergence paradoxically occur at higher frequencies in sympatric populations of the apple compared to the hawthorn race. We present genomic evidence that historical selection over geographically varying environmental gradients across North America generated genetic correlations between two life history traits, diapause intensity and diapause termination, in the hawthorn host race. Moreover, the loci associated with these life history traits are concentrated in genomic regions in high linkage disequilibrium (LD). These genetic correlations are antagonistic to contemporary selection on local apple host race populations that favours increased initial diapause depth and earlier, not later, diapause termination. Thus, the paradox of apple flies appears due, in part, to pleiotropy or linkage of alleles associated with later adult emergence and increased initial diapause intensity, the latter trait strongly selected for by the earlier phenology of apples. Our results demonstrate how understanding of multivariate trait combinations and the correlative nature of selective forces acting on them can improve predictions concerning adaptive evolution and help explain seemingly counterintuitive patterns of genetic diversity in nature.
Affiliation(s)
- McCall B Calvert
  - Department of Integrative Biology, University of Colorado Denver, Denver, Colorado, USA
- Meredith M Doellman
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana, USA
- Jeffrey L Feder
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana, USA
  - Environmental Change Initiative, University of Notre Dame, Notre Dame, Indiana, USA
  - Advanced Diagnostics and Therapeutics Initiative, University of Notre Dame, Notre Dame, Indiana, USA
- Glen R Hood
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana, USA
  - Department of Biosciences, Rice University, Houston, Texas, USA
  - Department of Biological Sciences, Wayne State University, Detroit, Michigan, USA
- Peter Meyers
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana, USA
- Scott P Egan
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana, USA
  - Environmental Change Initiative, University of Notre Dame, Notre Dame, Indiana, USA
  - Advanced Diagnostics and Therapeutics Initiative, University of Notre Dame, Notre Dame, Indiana, USA
  - Department of Biosciences, Rice University, Houston, Texas, USA
- Thomas H Q Powell
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana, USA
  - Department of Biological Sciences, Binghamton University (State University of New York), Binghamton, New York, USA
- Mary M Glover
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana, USA
- Cheyenne Tait
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana, USA
- Hannes Schuler
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana, USA
  - Faculty of Science and Technology, Free University of Bozen-Bolzano, Bozen, Italy
- Stewart H Berlocher
  - Department of Entomology, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- James J Smith
  - Department of Entomology, Lyman Briggs College, Michigan State University, East Lansing, Michigan, USA
- Patrik Nosil
  - Department of Animal and Plant Sciences, University of Sheffield, Sheffield, UK
  - CEFE, CNRS, EPHE, IRD, Univ Montpellier, Univ Paul Valéry Montpellier 3, Montpellier, France
- Daniel A Hahn
  - Department of Entomology and Nematology, University of Florida, Gainesville, Florida, USA
- Gregory J Ragland
  - Department of Integrative Biology, University of Colorado Denver, Denver, Colorado, USA
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana, USA
  - Advanced Diagnostics and Therapeutics Initiative, University of Notre Dame, Notre Dame, Indiana, USA
|
29
|
Lahens NF, Brooks TG, Sarantopoulou D, Nayak S, Lawrence C, Mrčela A, Srinivasan A, Schug J, Hogenesch JB, Barash Y, Grant GR. CAMPAREE: a robust and configurable RNA expression simulator. BMC Genomics 2021; 22:692. [PMID: 34563123 PMCID: PMC8467241 DOI: 10.1186/s12864-021-07934-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/03/2021] [Accepted: 08/17/2021] [Indexed: 11/10/2022]
Abstract
Background: The accurate interpretation of RNA-Seq data presents a moving target as scientists continue to introduce new experimental techniques and analysis algorithms. Simulated datasets are an invaluable tool to accurately assess the performance of RNA-Seq analysis methods. However, existing RNA-Seq simulators focus on modeling the technical biases and artifacts of sequencing, rather than on simulating the original RNA samples. A first step in simulating RNA-Seq is to simulate RNA. Results: To fill this need, we developed the Configurable And Modular Program Allowing RNA Expression Emulation (CAMPAREE), a simulator using empirical data to simulate diploid RNA samples at the level of individual molecules. We demonstrated CAMPAREE’s use for generating idealized coverage plots from real data, and for adding the ability to generate allele-specific data to existing RNA-Seq simulators that do not natively support this feature. Conclusions: Separating input sample modeling from library preparation/sequencing offers added flexibility for both users and developers to mix-and-match different sample and sequencing simulators to suit their specific needs. Furthermore, the ability to maintain sample and sequencing simulators independently provides greater agility to incorporate new biological findings about transcriptomics and new developments in sequencing technologies. Additionally, by simulating at the level of individual molecules, CAMPAREE has the potential to model molecules transcribed from the same genes as a heterogeneous population of transcripts with different states of degradation and processing (splicing, editing, etc.). CAMPAREE was developed in Python, is open source, and freely available at https://github.com/itmat/CAMPAREE. Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-021-07934-2.
Affiliation(s)
- Nicholas F Lahens
  - The Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Thomas G Brooks
  - The Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Dimitra Sarantopoulou
  - The Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
  - Present address: National Institute on Aging, National Institutes of Health, Baltimore, Maryland, USA
- Soumyashant Nayak
  - Statistics and Mathematics Unit, Indian Statistical Institute, Bengaluru, Karnataka, India
- Cris Lawrence
  - The Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Antonijo Mrčela
  - The Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Anand Srinivasan
  - Perelman School of Medicine, Enterprise Research Applications and High Performance Computing, Penn Medicine Academic Computing Services, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Jonathan Schug
  - The Institute for Diabetes, Obesity and Metabolism, The Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- John B Hogenesch
  - Division of Human Genetics, Department of Pediatrics, Center for Chronobiology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
- Yoseph Barash
  - The Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Gregory R Grant
  - The Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
  - The Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
|
30
|
Yen PTW, Xia K, Cheong SA. Understanding Changes in the Topology and Geometry of Financial Market Correlations during a Market Crash. Entropy (Basel) 2021; 23:1211. [PMID: 34573837 PMCID: PMC8467365 DOI: 10.3390/e23091211] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/19/2021] [Revised: 09/05/2021] [Accepted: 09/06/2021] [Indexed: 12/24/2022]
Abstract
In econophysics, the achievements of information filtering methods over the past 20 years, such as the minimal spanning tree (MST) by Mantegna and the planar maximally filtered graph (PMFG) by Tumminello et al., should be celebrated. Here, we show how one can systematically improve upon this paradigm along two separate directions. First, we used topological data analysis (TDA) to extend the notions of nodes and links in networks to faces, tetrahedrons, or k-simplices in simplicial complexes. Second, we used the Ollivier-Ricci curvature (ORC) to acquire geometric information that cannot be provided by simple information filtering. In this sense, MSTs and PMFGs are but first steps to revealing the topological backbones of financial networks. This is something that TDA can elucidate more fully, following which the ORC can help us flesh out the geometry of financial networks. We applied these two approaches to a recent stock market crash in Taiwan and found that, beyond fusions and fissions, other non-fusion/fission processes such as cavitation, annihilation, rupture, healing, and puncture might also be important. We also successfully identified neck regions that emerged during the crash, based on their negative ORCs, and performed a case study on one such neck region.
Affiliation(s)
- Peter Tsung-Wen Yen
  - Center for Crystal Researches, National Sun Yat-sen University, No. 70, Lien-hai Rd., Kaohsiung 80424, Taiwan
- Kelin Xia
  - Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, 21 Nanyang Link, Singapore 637371, Singapore
- Siew Ann Cheong
  - Division of Physics and Applied Physics, School of Physical and Mathematical Sciences, Nanyang Technological University, 21 Nanyang Link, Singapore 637371, Singapore
|
31
|
Bayat A, Hosking B, Jain Y, Hosking C, Kodikara M, Reti D, Twine NA, Bauer DC. Fast and accurate exhaustive higher-order epistasis search with BitEpi. Sci Rep 2021; 11:15923. [PMID: 34354094 PMCID: PMC8342486 DOI: 10.1038/s41598-021-94959-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/19/2021] [Accepted: 07/20/2021] [Indexed: 01/03/2023]
Abstract
Complex genetic diseases may be modulated by a large number of epistatic interactions affecting a polygenic phenotype. Identifying these interactions is difficult due to computational complexity, especially in the case of higher-order interactions where more than two genomic variants are involved. In this paper, we present BitEpi, a fast and accurate method to test all possible combinations of up to four bi-allelic variants (i.e. Single Nucleotide Variant or SNV for short). BitEpi introduces a novel bitwise algorithm that is 1.7 and 56 times faster for 3-SNV and 4-SNV search, than established software. The novel entropy statistic used in BitEpi is 44% more accurate to identify interactive SNVs, incorporating a p-value-based significance testing. We demonstrate BitEpi on real world data of 4900 samples and 87,000 SNPs. We also present EpiExplorer to visualize the potentially large number of individual and interacting SNVs in an interactive Cytoscape graph. EpiExplorer uses various visual elements to facilitate the discovery of true biological events in a complex polygenic environment.
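The idea of counting genotype combinations with bitwise operations and scoring them with an entropy-based statistic can be illustrated with a minimal sketch. This is not BitEpi's actual algorithm or statistic: the function names, the bitset layout (one Python integer per genotype, one bit per sample), and the information-gain score are all assumptions chosen for illustration, and only the pairwise (2-SNV) case is shown.

```python
from itertools import product
from math import log2


def bitsets(genotypes):
    """Return three ints; bit i of masks[g] is set when sample i has genotype g (0, 1 or 2)."""
    masks = [0, 0, 0]
    for i, g in enumerate(genotypes):
        masks[g] |= 1 << i
    return masks


def popcount(x):
    """Number of set bits, i.e. number of samples in the bitset."""
    return bin(x).count("1")


def pair_table(snv_a, snv_b, case_mask):
    """Map each of the 9 genotype pairs to (case count, control count)."""
    a, b = bitsets(snv_a), bitsets(snv_b)
    table = {}
    for ga, gb in product(range(3), repeat=2):
        both = a[ga] & b[gb]  # samples carrying this genotype pair
        cases = popcount(both & case_mask)
        table[(ga, gb)] = (cases, popcount(both) - cases)
    return table


def entropy(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -(p * log2(p) + (1 - p) * log2(1 - p))


def information_gain(table):
    """Phenotype entropy minus its entropy conditional on the genotype pair."""
    n = sum(c + k for c, k in table.values())
    n_cases = sum(c for c, _ in table.values())
    conditional = sum(
        (c + k) / n * entropy(c / (c + k)) for c, k in table.values() if c + k
    )
    return entropy(n_cases / n) - conditional
```

With a purely interactive (XOR-like) phenotype, where neither SNV is informative alone, the pair table separates cases from controls completely and the score reaches its maximum of one bit. Extending the same AND-and-count step to three or four bitsets gives the higher-order tables.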
Affiliation(s)
- Arash Bayat
  - Transformations Bioinformatics, Health and Biosecurity, Commonwealth Scientific and Industrial Research Organisation (CSIRO), North Ryde, NSW, 2113, Australia
  - The Kinghorn Cancer Centre, Darlinghurst, NSW, 2010, Australia
- Brendan Hosking
  - Transformations Bioinformatics, Health and Biosecurity, Commonwealth Scientific and Industrial Research Organisation (CSIRO), North Ryde, NSW, 2113, Australia
- Yatish Jain
  - Transformations Bioinformatics, Health and Biosecurity, Commonwealth Scientific and Industrial Research Organisation (CSIRO), North Ryde, NSW, 2113, Australia
  - Department of Biomedical Sciences, Macquarie University, Macquarie Park, NSW, 2113, Australia
- Cameron Hosking
  - Transformations Bioinformatics, Health and Biosecurity, Commonwealth Scientific and Industrial Research Organisation (CSIRO), North Ryde, NSW, 2113, Australia
- Milindi Kodikara
  - Transformations Bioinformatics, Health and Biosecurity, Commonwealth Scientific and Industrial Research Organisation (CSIRO), North Ryde, NSW, 2113, Australia
- Daniel Reti
  - Transformations Bioinformatics, Health and Biosecurity, Commonwealth Scientific and Industrial Research Organisation (CSIRO), North Ryde, NSW, 2113, Australia
  - Applied BioSciences, Faculty of Science and Engineering, Macquarie University, Macquarie Park, NSW, 2113, Australia
- Natalie A Twine
  - Transformations Bioinformatics, Health and Biosecurity, Commonwealth Scientific and Industrial Research Organisation (CSIRO), North Ryde, NSW, 2113, Australia
  - Applied BioSciences, Faculty of Science and Engineering, Macquarie University, Macquarie Park, NSW, 2113, Australia
- Denis C Bauer
  - Transformations Bioinformatics, Health and Biosecurity, Commonwealth Scientific and Industrial Research Organisation (CSIRO), North Ryde, NSW, 2113, Australia
  - Department of Biomedical Sciences, Macquarie University, Macquarie Park, NSW, 2113, Australia
  - Applied BioSciences, Faculty of Science and Engineering, Macquarie University, Macquarie Park, NSW, 2113, Australia
|
32
|
Sakai T, Abe A, Shimizu M, Terauchi R. RIL-StEp: epistasis analysis of rice recombinant inbred lines reveals candidate interacting genes that control seed hull color and leaf chlorophyll content. G3 (Bethesda) 2021; 11:jkab130. [PMID: 33871605 PMCID: PMC8496299 DOI: 10.1093/g3journal/jkab130] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/01/2021] [Accepted: 04/10/2021] [Indexed: 11/19/2022]
Abstract
Characterizing epistatic gene interactions is fundamental for understanding the genetic architecture of complex traits. However, due to the large number of potential gene combinations, detecting epistatic gene interactions is computationally demanding. A simple, easy-to-perform method for sensitive detection of epistasis is required. Due to their homozygous nature, use of recombinant inbred lines excludes the dominance effect of alleles and interactions involving heterozygous genotypes, thereby allowing detection of epistasis in a simple and interpretable model. Here, we present an approach called RIL-StEp (recombinant inbred lines stepwise epistasis detection) to detect epistasis using single-nucleotide polymorphisms in the genome. We applied the method to reveal epistasis affecting rice (Oryza sativa) seed hull color and leaf chlorophyll content and successfully identified pairs of genomic regions that presumably control these phenotypes. This method has the potential to improve our understanding of the genetic architecture of various traits of crops and other organisms.
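Because recombinant inbred lines are homozygous, each locus takes one of two states, so a pair of loci defines only four genotype classes and epistasis reduces to a single departure-from-additivity contrast. The sketch below illustrates that simplification with a plain two-locus t statistic. It is not the RIL-StEp procedure itself (its stepwise model comparison is not described in enough detail here to reproduce); the function name, the 0/1 allele coding, and the pooled-variance test are assumptions for illustration.

```python
from collections import defaultdict
from math import sqrt

# The four two-locus classes available in homozygous lines (0 = parent A, 1 = parent B).
CLASSES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def epistasis_contrast(g1, g2, y):
    """Interaction contrast and its t statistic for two biallelic loci in RILs.

    g1, g2: sequences of 0/1 giving the parental allele carried by each line.
    y: phenotype values. Returns (contrast, t).
    Assumes some phenotypic variation within classes (nonzero pooled variance).
    """
    cells = defaultdict(list)
    for a, b, value in zip(g1, g2, y):
        cells[(a, b)].append(value)
    if any(len(cells[k]) < 2 for k in CLASSES):
        raise ValueError("need at least two lines in each of the four genotype classes")

    mean = {k: sum(cells[k]) / len(cells[k]) for k in CLASSES}
    # Departure from additivity: zero when the two loci act independently.
    contrast = mean[(1, 1)] - mean[(1, 0)] - mean[(0, 1)] + mean[(0, 0)]

    # Pooled within-class variance and the standard error of the contrast.
    ss_within = sum((x - mean[k]) ** 2 for k in CLASSES for x in cells[k])
    df = sum(len(cells[k]) for k in CLASSES) - 4
    se = sqrt(ss_within / df * sum(1 / len(cells[k]) for k in CLASSES))
    return contrast, contrast / se
```

A genome-wide scan would apply such a test to many locus pairs, which is where the computational burden and the need for multiple-testing control arise; neither is handled in this sketch.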
Affiliation(s)
- Toshiyuki Sakai
  - Laboratory of Crop Evolution, Graduate School of Agriculture, Kyoto University, Mozume, Muko, Kyoto 617-0001, Japan
  - The Sainsbury Laboratory, University of East Anglia, Norwich Research Park, Norwich NR4 7UH, UK
- Akira Abe
  - Iwate Biotechnology Research Center, Kitakami, Iwate 024-0003, Japan
- Motoki Shimizu
  - Iwate Biotechnology Research Center, Kitakami, Iwate 024-0003, Japan
- Ryohei Terauchi
  - Laboratory of Crop Evolution, Graduate School of Agriculture, Kyoto University, Mozume, Muko, Kyoto 617-0001, Japan
  - Iwate Biotechnology Research Center, Kitakami, Iwate 024-0003, Japan
|
33
|
Diehl V, Wegner M, Grumati P, Husnjak K, Schaubeck S, Gubas A, Shah V, Polat I, Langschied F, Prieto-Garcia C, Müller K, Kalousi A, Ebersberger I, Brandts C, Dikic I, Kaulich M. Minimized combinatorial CRISPR screens identify genetic interactions in autophagy. Nucleic Acids Res 2021; 49:5684-5704. [PMID: 33956155 PMCID: PMC8191801 DOI: 10.1093/nar/gkab309] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 02/16/2021] [Revised: 04/01/2021] [Accepted: 04/14/2021] [Indexed: 12/13/2022]
Abstract
Combinatorial CRISPR-Cas screens have advanced the mapping of genetic interactions, but their experimental scale limits the number of targetable gene combinations. Here, we describe 3Cs multiplexing, a rapid and scalable method to generate highly diverse and uniformly distributed combinatorial CRISPR libraries. We demonstrate that the library distribution skew is the critical determinant of its required screening coverage. By circumventing iterative cloning of PCR-amplified oligonucleotides, 3Cs multiplexing facilitates the generation of combinatorial CRISPR libraries with low distribution skews. We show that combinatorial 3Cs libraries can be screened with minimal coverages, reducing associated efforts and costs at least 10-fold. We apply a 3Cs multiplexing library targeting 12,736 autophagy gene combinations with 247,032 paired gRNAs in viability and reporter-based enrichment screens. In the viability screen, we identify, among others, the synthetic lethal WDR45B-PIK3R4 and the proliferation-enhancing ATG7-KEAP1 genetic interactions. In the reporter-based screen, we identify over 1,570 essential genetic interactions for autophagy flux, including interactions among paralogous genes, namely ATG2A-ATG2B, GABARAP-MAP1LC3B and GABARAP-GABARAPL2. However, we only observe few genetic interactions within paralogous gene families of more than two members, indicating functional compensation between them. This work establishes 3Cs multiplexing as a platform for genetic interaction screens at scale.
Affiliation(s)
- Valentina Diehl
  - Institute of Biochemistry II, Faculty of Medicine, Goethe University Frankfurt, Theodor-Stern-Kai 7, 60590 Frankfurt am Main, Germany
- Martin Wegner
  - Institute of Biochemistry II, Faculty of Medicine, Goethe University Frankfurt, Theodor-Stern-Kai 7, 60590 Frankfurt am Main, Germany
- Paolo Grumati
  - Institute of Biochemistry II, Faculty of Medicine, Goethe University Frankfurt, Theodor-Stern-Kai 7, 60590 Frankfurt am Main, Germany
- Koraljka Husnjak
  - Institute of Biochemistry II, Faculty of Medicine, Goethe University Frankfurt, Theodor-Stern-Kai 7, 60590 Frankfurt am Main, Germany
- Simone Schaubeck
  - Institute of Biochemistry II, Faculty of Medicine, Goethe University Frankfurt, Theodor-Stern-Kai 7, 60590 Frankfurt am Main, Germany
- Andrea Gubas
  - Institute of Biochemistry II, Faculty of Medicine, Goethe University Frankfurt, Theodor-Stern-Kai 7, 60590 Frankfurt am Main, Germany
- Varun Jayeshkumar Shah
  - Institute of Biochemistry II, Faculty of Medicine, Goethe University Frankfurt, Theodor-Stern-Kai 7, 60590 Frankfurt am Main, Germany
- Ibrahim H Polat
  - Department of Medicine, Hematology/Oncology, University Hospital, Goethe University, 60590 Frankfurt am Main, Germany
- Felix Langschied
  - Applied Bioinformatics Group, Institute of Cell Biology and Neuroscience, Goethe University Frankfurt, 60438 Frankfurt am Main, Germany
- Cristian Prieto-Garcia
  - Institute of Biochemistry II, Faculty of Medicine, Goethe University Frankfurt, Theodor-Stern-Kai 7, 60590 Frankfurt am Main, Germany
- Konstantin Müller
  - Institute of Biochemistry II, Faculty of Medicine, Goethe University Frankfurt, Theodor-Stern-Kai 7, 60590 Frankfurt am Main, Germany
- Alkmini Kalousi
  - Institute of Biochemistry II, Faculty of Medicine, Goethe University Frankfurt, Theodor-Stern-Kai 7, 60590 Frankfurt am Main, Germany
- Ingo Ebersberger
  - Applied Bioinformatics Group, Institute of Cell Biology and Neuroscience, Goethe University Frankfurt, 60438 Frankfurt am Main, Germany
  - Senckenberg Biodiversity and Climate Research Centre (S-BIK-F), Frankfurt am Main, Germany
  - LOEWE Centre for Translational Biodiversity Genomics (TBG), Frankfurt am Main, Germany
- Christian H Brandts
  - Department of Medicine, Hematology/Oncology, University Hospital, Goethe University, 60590 Frankfurt am Main, Germany
  - Frankfurt Cancer Institute, 60596 Frankfurt am Main, Germany
  - University Cancer Center Frankfurt (UCT), University Hospital, Goethe University, Frankfurt, Germany
- Ivan Dikic
  - Institute of Biochemistry II, Faculty of Medicine, Goethe University Frankfurt, Theodor-Stern-Kai 7, 60590 Frankfurt am Main, Germany
  - Frankfurt Cancer Institute, 60596 Frankfurt am Main, Germany
  - Cardio-Pulmonary Institute, 60590 Frankfurt am Main, Germany
  - Buchmann Institute for Molecular Life Sciences, Goethe University Frankfurt, 60438 Frankfurt am Main, Germany
- Manuel Kaulich
  - Institute of Biochemistry II, Faculty of Medicine, Goethe University Frankfurt, Theodor-Stern-Kai 7, 60590 Frankfurt am Main, Germany
  - Frankfurt Cancer Institute, 60596 Frankfurt am Main, Germany
  - Cardio-Pulmonary Institute, 60590 Frankfurt am Main, Germany
|
34
|
Parts L, Batté A, Lopes M, Yuen MW, Laver M, San Luis B, Yue J, Pons C, Eray E, Aloy P, Liti G, van Leeuwen J. Natural variants suppress mutations in hundreds of essential genes. Mol Syst Biol 2021; 17:e10138. [PMID: 34042294 PMCID: PMC8156963 DOI: 10.15252/msb.202010138] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 11/24/2020] [Revised: 04/22/2021] [Accepted: 04/23/2021] [Indexed: 01/04/2023]
Abstract
The consequence of a mutation can be influenced by the context in which it operates. For example, loss of gene function may be tolerated in one genetic background, and lethal in another. The extent to which mutant phenotypes are malleable, the architecture of modifiers and the identities of causal genes remain largely unknown. Here, we measure the fitness effects of ~1,100 temperature-sensitive alleles of yeast essential genes in the context of variation from ten different natural genetic backgrounds and map the modifiers for 19 combinations. Altogether, fitness defects for 149 of the 580 tested genes (26%) could be suppressed by genetic variation in at least one yeast strain. Suppression was generally driven by gain-of-function of a single, strong modifier gene, and involved both genes encoding complex or pathway partners suppressing specific temperature-sensitive alleles, as well as general modifiers altering the effect of many alleles. The emerging frequency of suppression and range of possible mechanisms suggest that a substantial fraction of monogenic diseases could be managed by modulating other gene products.
Affiliation(s)
- Leopold Parts
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
  - Department of Computer Science, University of Tartu, Tartu, Estonia
- Amandine Batté
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
- Maykel Lopes
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
- Michael W Yuen
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
- Meredith Laver
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
- Bryan‐Joseph San Luis
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
- Jia‐Xing Yue
  - University of Côte d’Azur, CNRS, INSERM, IRCAN, Nice, France
- Carles Pons
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute for Science and Technology, Barcelona, Spain
- Elise Eray
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
- Patrick Aloy
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute for Science and Technology, Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Gianni Liti
  - University of Côte d’Azur, CNRS, INSERM, IRCAN, Nice, France
|
35
|
Kryazhimskiy S. Emergence and propagation of epistasis in metabolic networks. eLife 2021; 10:e60200. [PMID: 33527897 PMCID: PMC7924954 DOI: 10.7554/elife.60200] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 06/19/2020] [Accepted: 02/01/2021] [Indexed: 12/11/2022]
Abstract
Epistasis is often used to probe functional relationships between genes, and it plays an important role in evolution. However, we lack theory to understand how functional relationships at the molecular level translate into epistasis at the level of whole-organism phenotypes, such as fitness. Here, I derive two rules for how epistasis between mutations with small effects propagates from lower- to higher-level phenotypes in a hierarchical metabolic network with first-order kinetics and how such epistasis depends on topology. Most importantly, weak epistasis at a lower level may be distorted as it propagates to higher levels. Computational analyses show that epistasis in more realistic models likely follows similar, albeit more complex, patterns. These results suggest that pairwise inter-gene epistasis should be common, and it should generically depend on the genetic background and environment. Furthermore, the epistasis coefficients measured for high-level phenotypes may not be sufficient to fully infer the underlying functional relationships.
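A toy calculation can make the idea of epistasis changing between phenotypic levels concrete. The sketch below is not the paper's hierarchical metabolic model or its definition of the epistasis coefficient; it assumes a common additive-scale definition (double-mutant relative effect minus the sum of single-mutant relative effects) and a hypothetical saturating map from a lower-level phenotype ("flux") to a higher-level one.

```python
def epistasis(y0, ya, yb, yab):
    """Additive-scale epistasis from relative effects of single and double mutants."""
    da, db, dab = ((y - y0) / y0 for y in (ya, yb, yab))
    return dab - da - db


def saturating(flux):
    """Hypothetical map from a lower-level phenotype (flux) to a higher-level one."""
    return flux / (1.0 + flux)


# Lower level: the two mutations change flux additively, so epistasis is zero there.
flux = {"wt": 1.0, "a": 1.5, "b": 1.5, "ab": 2.0}
eps_low = epistasis(flux["wt"], flux["a"], flux["b"], flux["ab"])

# Higher level: the same genotypes, viewed through the nonlinear map.
eps_high = epistasis(*(saturating(flux[k]) for k in ("wt", "a", "b", "ab")))
```

Here `eps_low` is zero while `eps_high` is negative, so the value measured at the higher level does not by itself reveal how the mutations combine at the lower level.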
Affiliation(s)
- Sergey Kryazhimskiy
- Division of Biological Sciences, University of California, San Diego, La Jolla, United States
| |
|
36
|
Amatya S, Ye M, Yang L, Gandhi CK, Wu R, Nagourney B, Floros J. Single Nucleotide Polymorphisms Interactions of the Surfactant Protein Genes Associated With Respiratory Distress Syndrome Susceptibility in Preterm Infants. Front Pediatr 2021; 9:682160. [PMID: 34671583 PMCID: PMC8521105 DOI: 10.3389/fped.2021.682160] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/17/2021] [Accepted: 09/06/2021] [Indexed: 11/14/2022]
Abstract
Background: Neonatal respiratory distress syndrome (RDS), due to surfactant deficiency in preterm infants, is the most common cause of respiratory morbidity. Genetic variants of the surfactant proteins (SFTP) have been well studied in association with RDS; however, the impact of SNP-SNP (single nucleotide polymorphism) interactions on RDS has not been addressed. Therefore, this study uses a newer statistical model to determine the association of an SFTP single-SNP model and of SNP-SNP interactions, in a two-SNP and a three-SNP interaction model, with RDS susceptibility. Methods: This study used available genotype and clinical data in the Floros biobank at Penn State University. The patients consisted of 848 preterm infants, born <36 weeks of gestation, with 477 infants with RDS and 458 infants without RDS. Seventeen well-studied SFTPA1, SFTPA2, SFTPB, SFTPC, and SFTPD SNPs were investigated. Wang's statistical model was employed to test and identify significant associations in a case-control study. Results: Only the rs17886395 (C allele) of SFTPA2 was associated with protection against RDS in a single-SNP model (odds ratio 0.16, 95% CI 0.06-0.43, adjusted p = 0.03). The highest number of interactions (n = 27) in the three-SNP interactions was among SFTPA1 and SFTPA2. The three-SNP models showed intergenic and intragenic interactions among all SFTP SNPs except SFTPC. Conclusion: The single-SNP model and the two-SNP and three-SNP interaction models identified SFTP-SNP associations with RDS. However, the large number of significant associations containing SFTPA1 and/or SFTPA2 SNPs points to the importance of SFTPA1 and SFTPA2 in RDS susceptibility.
Affiliation(s)
- Shaili Amatya
- Department of Pediatrics, Center for Host Defense, Inflammation, and Lung Disease (CHILD) Research, Pennsylvania State University College of Medicine, Hershey, PA, United States
| | - Meixia Ye
- Center for Computational Biology, College of Biological Sciences and Technology, Beijing Forestry University, Beijing, China
| | - Lili Yang
- School of First Clinical Medicine, Nanjing University of Chinese Medicine, Nanjing, China
| | - Chintan K Gandhi
- Department of Pediatrics, Center for Host Defense, Inflammation, and Lung Disease (CHILD) Research, Pennsylvania State University College of Medicine, Hershey, PA, United States
| | - Rongling Wu
- Public Health Science, Pennsylvania State University College of Medicine, Hershey, PA, United States
| | - Beth Nagourney
- Albert Einstein College of Medicine, New York, NY, United States
| | - Joanna Floros
- Department of Pediatrics, Center for Host Defense, Inflammation, and Lung Disease (CHILD) Research, Pennsylvania State University College of Medicine, Hershey, PA, United States; Obstetrics and Gynecology, Pennsylvania State University College of Medicine, Hershey, PA, United States
| |
|
37
|
Goldstein I, Ehrenreich IM. The complex role of genetic background in shaping the effects of spontaneous and induced mutations. Yeast 2020; 38:187-196. [PMID: 33125810 PMCID: PMC7984271 DOI: 10.1002/yea.3530] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/27/2020] [Revised: 10/09/2020] [Accepted: 10/24/2020] [Indexed: 12/27/2022]
Abstract
Spontaneous and induced mutations frequently show different phenotypic effects across genetically distinct individuals. It is generally appreciated that these background effects mainly result from genetic interactions between the mutations and segregating loci. However, the architectures and molecular bases of these genetic interactions are not well understood. Recent work in a number of model organisms has tried to advance knowledge of background effects both by using large‐scale screens to find mutations that exhibit this phenomenon and by identifying the specific loci that are involved. Here, we review this body of research, emphasizing in particular the insights it provides into both the prevalence of background effects across different mutations and the mechanisms that cause these background effects. A large fraction of mutations show different effects in distinct individuals. These background effects are mainly caused by epistasis with segregating loci. Mapping studies show a diversity of genetic architectures can be involved. Genetically complex changes in gene expression are often, but not always, causative.
Affiliation(s)
- Ilan Goldstein
- Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, California, 90089-2910, USA
| | - Ian M Ehrenreich
- Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, California, 90089-2910, USA
| |
|
38
|
Kumar M, Srivastav AK, Parmar D. Genetic analysis and epistatic interaction association of lipid traits in a C57xBalb/c F2 mice. GENE REPORTS 2020. [DOI: 10.1016/j.genrep.2020.100729] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
|
39
|
Tekin E, Diamant ES, Cruz‐Loya M, Enriquez V, Singh N, Savage VM, Yeh PJ. Using a newly introduced framework to measure ecological stressor interactions. Ecol Lett 2020; 23:1391-1403. [DOI: 10.1111/ele.13533] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 01/13/2020] [Revised: 02/13/2020] [Accepted: 04/16/2020] [Indexed: 12/30/2022]
Affiliation(s)
- Elif Tekin
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095, USA
- Department of Computational Medicine, the David Geffen School of Medicine, University of California, Los Angeles, CA, USA
| | - Eleanor S. Diamant
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095, USA
| | - Mauricio Cruz‐Loya
- Department of Computational Medicine, the David Geffen School of Medicine, University of California, Los Angeles, CA, USA
| | - Vivien Enriquez
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095, USA
| | - Nina Singh
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095, USA
| | - Van M. Savage
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095, USA
- Department of Computational Medicine, the David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Santa Fe Institute, Santa Fe, NM 87501, USA
| | - Pamela J. Yeh
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095, USA
- Santa Fe Institute, Santa Fe, NM 87501, USA
| |
|
40
|
Affiliation(s)
- Ian M Ehrenreich
- Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA.
| |
|
41
|
Chanda P, Costa E, Hu J, Sukumar S, Van Hemert J, Walia R. Information Theory in Computational Biology: Where We Stand Today. ENTROPY (BASEL, SWITZERLAND) 2020; 22:E627. [PMID: 33286399 PMCID: PMC7517167 DOI: 10.3390/e22060627] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 04/30/2020] [Revised: 05/31/2020] [Accepted: 06/03/2020] [Indexed: 12/30/2022]
Abstract
"A Mathematical Theory of Communication" was published in 1948 by Claude Shannon to address the problems in the field of data compression and communication over (noisy) communication channels. Since then, the concepts and ideas developed in Shannon's work have formed the basis of information theory, a cornerstone of statistical learning and inference, and have played a key role in disciplines such as physics and thermodynamics, probability and statistics, computational sciences and biological sciences. In this article we review the basic information-theoretic concepts and describe their key applications in multiple major areas of research in computational biology: gene expression and transcriptomics, alignment-free sequence comparison, sequencing and error correction, genome-wide disease-gene association mapping, metabolic networks and metabolomics, and protein sequence, structure and interaction analysis.
Affiliation(s)
- Pritam Chanda
- Corteva Agriscience™, Indianapolis, IN 46268, USA
- Computer and Information Science, Indiana University-Purdue University, Indianapolis, IN 46202, USA
| | - Eduardo Costa
- Corteva Agriscience™, Mogi Mirim, Sao Paulo 13801-540, Brazil
| | - Jie Hu
- Corteva Agriscience™, Indianapolis, IN 46268, USA
| | | | | | - Rasna Walia
- Corteva Agriscience™, Johnston, IA 50131, USA
| |
|
42
|
Toxo: a library for calculating penetrance tables of high-order epistasis models. BMC Bioinformatics 2020; 21:138. [PMID: 32272874 PMCID: PMC7147067 DOI: 10.1186/s12859-020-3456-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/11/2019] [Accepted: 03/18/2020] [Indexed: 12/12/2022]
Abstract
Background: Epistasis is defined as the interaction between different genes when expressing a specific phenotype. The most common way to characterize an epistatic relationship is using a penetrance table, which contains the probability of expressing the phenotype under study given a particular allele combination. Available simulators can only create penetrance tables for well-known epistasis models involving a small number of genes and under a large number of limitations. Results: Toxo is a MATLAB library designed to calculate penetrance tables of epistasis models of any interaction order which resemble real data more closely. The user specifies the desired heritability (or prevalence) and the program maximizes the table’s prevalence (or heritability) according to the input epistatic model boundaries. Conclusions: Toxo extends the capabilities of existing simulators that define epistasis using penetrance tables. These tables can be directly used as input for software simulators such as GAMETES so that they are able to generate data samples with larger interactions and more realistic prevalences/heritabilities.
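As an illustration of the quantities named in this abstract, the following Python sketch computes the prevalence and heritability implied by a given penetrance table. It is not Toxo's MATLAB code or API. It assumes Hardy-Weinberg proportions at each locus, independence between loci, and the commonly used definition of heritability as the variance of penetrance across genotypes divided by P(1 - P), where P is the prevalence. The function names and the toy two-locus table are hypothetical.

```python
from itertools import product


def genotype_freqs(maf):
    """Hardy-Weinberg frequencies for 0, 1 and 2 copies of the minor allele (assumed model)."""
    q = maf
    p = 1.0 - q
    return [p * p, 2 * p * q, q * q]


def prevalence_and_heritability(penetrance, mafs):
    """penetrance maps a tuple of per-locus genotype codes (0, 1, 2) to P(phenotype | genotype)."""
    per_locus = [genotype_freqs(m) for m in mafs]
    prevalence = 0.0
    second_moment = 0.0
    for combo in product(range(3), repeat=len(mafs)):
        # Joint genotype frequency, assuming the loci are independent.
        freq = 1.0
        for locus, genotype in enumerate(combo):
            freq *= per_locus[locus][genotype]
        pen = penetrance[combo]
        prevalence += freq * pen
        second_moment += freq * pen * pen
    variance = second_moment - prevalence ** 2
    heritability = variance / (prevalence * (1.0 - prevalence))
    return prevalence, heritability


# Hypothetical two-locus table: elevated risk only when both loci carry a minor allele.
table = {(a, b): (0.5 if a > 0 and b > 0 else 0.1) for a in range(3) for b in range(3)}
prev, h2 = prevalence_and_heritability(table, mafs=[0.5, 0.5])
```

A library such as the one described solves the inverse problem (choosing table entries that reach a target prevalence or heritability); the sketch covers only the forward calculation.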
|
43
|
Hekselman I, Yeger-Lotem E. Mechanisms of tissue and cell-type specificity in heritable traits and diseases. Nat Rev Genet 2020; 21:137-150. [DOI: 10.1038/s41576-019-0200-9] [Citation(s) in RCA: 67] [Impact Index Per Article: 16.8] [Accepted: 11/12/2019] [Indexed: 02/07/2023]
|
44
|
Testing the Significance of Interactions in Genetic Studies Using Interaction Information and Resampling Technique. LECTURE NOTES IN COMPUTER SCIENCE 2020. [PMCID: PMC7304020 DOI: 10.1007/978-3-030-50420-5_38] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/01/2022]
Abstract
Interaction information is a model-free, non-parametric measure used for detecting interactions among variables. It frequently finds interactions which remain undetected by standard model-based methods. However, in previous studies the application of interaction information was limited by the lack of appropriate statistical tests. We study the challenging problem of testing the positiveness of interaction information, which makes it possible to confirm the statistical significance of the investigated interactions. It turns out that the commonly used chi-squared test detects too many spurious interactions when the dependence between the variables (e.g. between two genetic markers) is strong. To overcome this problem we consider a permutation test and also propose a novel HYBRID method that combines permutation and chi-squared tests and takes into account the dependence between the studied variables. We show in numerical experiments that, in contrast to the chi-squared based test, the proposed method controls the actual significance level well and in many situations detects interactions which are undetected by standard methods. Moreover, the HYBRID method outperforms the permutation test with respect to power and computational efficiency. The method is applied to find interactions among single nucleotide polymorphisms as well as among gene expression levels of human immune cells.
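For readers unfamiliar with the measure, the Python sketch below computes interaction information for two discrete markers and a phenotype, defined here as I(X1, X2; Y) - I(X1; Y) - I(X2; Y), and attaches a simple permutation p-value. It is an illustration under stated assumptions, not the paper's procedure. The permutation scheme (shuffling the phenotype labels) is an assumption and tests against a null of no association at all; it is not the HYBRID method. All function names and the toy XOR data are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(samples):
    """Plug-in (empirical) Shannon entropy in nats."""
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in Counter(samples).values())


def mutual_information(x, y):
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def interaction_information(x1, x2, y):
    """Positive values indicate synergy between x1 and x2 with respect to y."""
    joint = list(zip(x1, x2))
    return mutual_information(joint, y) - mutual_information(x1, y) - mutual_information(x2, y)


def permutation_p_value(x1, x2, y, n_perm=200, seed=0):
    """One-sided p-value from shuffling the phenotype labels (assumed, simplified scheme)."""
    rng = random.Random(seed)
    observed = interaction_information(x1, x2, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if interaction_information(x1, x2, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data: the phenotype is the XOR of two binary markers, a pure interaction.
rng = random.Random(1)
x1 = [rng.randint(0, 1) for _ in range(200)]
x2 = [rng.randint(0, 1) for _ in range(200)]
y = [a ^ b for a, b in zip(x1, x2)]
ii, p = permutation_p_value(x1, x2, y)
```

In the XOR example neither marker is informative on its own, so nearly all of the joint information appears as interaction information and the permutation p-value is small.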
|
45
|
Sanchez-Gorostiaga A, Bajić D, Osborne ML, Poyatos JF, Sanchez A. High-order interactions distort the functional landscape of microbial consortia. PLoS Biol 2019; 17:e3000550. [PMID: 31830028 PMCID: PMC6932822 DOI: 10.1371/journal.pbio.3000550] [Citation(s) in RCA: 76] [Impact Index Per Article: 15.2] [Received: 01/30/2019] [Revised: 12/26/2019] [Accepted: 11/15/2019] [Indexed: 12/11/2022]
Abstract
Understanding the link between community composition and function is a major challenge in microbial population biology, with implications for the management of natural microbiomes and the design of synthetic consortia. Specifically, it is poorly understood whether community functions can be quantitatively predicted from traits of species in monoculture. Inspired by the study of complex genetic interactions, we have examined how the amylolytic rate of combinatorial assemblages of six starch-degrading soil bacteria depends on the separate functional contributions from each species and their interactions. Filtering our results through the theory of biochemical kinetics, we show that this simple function is additive in the absence of interactions among community members. For about half of the combinatorially assembled consortia, the amylolytic function is dominated by pairwise and higher-order interactions. For the other half, the function is additive despite the presence of strong competitive interactions. We explain the mechanistic basis of these findings and propose a quantitative framework that allows us to separate the effect of behavioral and population dynamics interactions. Our results suggest that the functional robustness of a consortium to pairwise and higher-order interactions critically affects our ability to predict and bottom-up engineer ecosystem function in complex communities.
Affiliation(s)
- Alicia Sanchez-Gorostiaga
- Department of Ecology & Evolutionary Biology, Yale University, New Haven, Connecticut, United States of America
- Microbial Sciences Institute, Yale University, West Haven, Connecticut, United States of America
| | - Djordje Bajić
- Department of Ecology & Evolutionary Biology, Yale University, New Haven, Connecticut, United States of America
- Microbial Sciences Institute, Yale University, West Haven, Connecticut, United States of America
| | - Melisa L. Osborne
- The Rowland Institute at Harvard, Harvard University, Cambridge, Massachusetts, United States of America
- Biological Design Center, Boston University, Boston, Massachusetts, United States of America
| | - Juan F. Poyatos
- The Rowland Institute at Harvard, Harvard University, Cambridge, Massachusetts, United States of America
- Logic of Genomic Systems Laboratory, Spanish National Biotechnology Centre (CNB-CSIC), Madrid, Spain
| | - Alvaro Sanchez
- Department of Ecology & Evolutionary Biology, Yale University, New Haven, Connecticut, United States of America
- Microbial Sciences Institute, Yale University, West Haven, Connecticut, United States of America
- The Rowland Institute at Harvard, Harvard University, Cambridge, Massachusetts, United States of America
| |
|
46
|
Sanchez-Gorostiaga A, Bajić D, Osborne ML, Poyatos JF, Sanchez A. High-order interactions distort the functional landscape of microbial consortia. PLoS Biol 2019; 17:e3000550. [PMID: 31830028 DOI: 10.1101/333534] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 01/30/2019] [Revised: 12/26/2019] [Accepted: 11/15/2019] [Indexed: 05/23/2023]
|
47
|
Kontio JAJ, Sillanpää MJ. Scalable Nonparametric Prescreening Method for Searching Higher-Order Genetic Interactions Underlying Quantitative Traits. Genetics 2019; 213:1209-1224. [PMID: 31585953 PMCID: PMC6893368 DOI: 10.1534/genetics.119.302658] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 08/23/2019] [Accepted: 09/27/2019] [Indexed: 02/07/2023]
Abstract
Gaussian process (GP)-based automatic relevance determination (ARD) is known to be an efficient technique for identifying determinants of gene-by-gene interactions important to trait variation. However, the estimation of GP models is feasible only for low-dimensional datasets (∼200 variables), which severely limits application of the GP-based ARD method for high-throughput sequencing data. In this paper, we provide a nonparametric prescreening method that preserves virtually all the major benefits of the GP-based ARD method and extends its scalability to the typical high-dimensional datasets used in practice. In several simulated test scenarios, the proposed method compared favorably with existing nonparametric dimension reduction/prescreening methods suitable for higher-order interaction searches. As a real-data example, the proposed method was applied to a high-throughput dataset downloaded from The Cancer Genome Atlas (TCGA) with measured expression levels of 16,976 genes (after preprocessing) from patients diagnosed with acute myeloid leukemia.
Affiliation(s)
- Juho A J Kontio
- Research Unit of Mathematical Sciences, Biocenter Oulu, University of Oulu, 90014, Finland
| | - Mikko J Sillanpää
- Research Unit of Mathematical Sciences, Biocenter Oulu, University of Oulu, 90014, Finland and
- Infotech Oulu, University of Oulu, 90014, Finland
| |
|
48
|
Rapp JP, Joe B. Dissecting Epistatic QTL for Blood Pressure in Rats: Congenic Strains versus Heterogeneous Stocks, a Reality Check. Compr Physiol 2019; 9:1305-1337. [PMID: 31688958 DOI: 10.1002/cphy.c180038] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 12/27/2022]
Abstract
Advances in molecular genetics have provided well-defined physical genetic maps and large numbers of genetic markers for both model organisms and humans. It is now possible to gain a fundamental understanding of the genetic architecture underlying quantitative traits, of which blood pressure (BP) is an important example. This review emphasizes analytical techniques and results obtained using the Dahl salt-sensitive (S) rat as a model of hypertension by presenting results in detail for three specific chromosomal regions harboring genetic elements of increasing complexity controlling BP. These results highlight the critical importance of genetic interactions (epistasis) on BP at all levels of structure, intragenic, intergenic, intrachromosomal, interchromosomal, and across whole genomes. In two of the three examples presented, specific DNA structural variations leading to biochemical, physiological, and pathological mechanisms are well defined. This proves the usefulness of the techniques involving interval mapping followed by substitution mapping using congenic strains. These classic techniques are compared to newer approaches using sophisticated statistical analysis on various segregating or outbred model-organism populations, which in some cases are uniquely useful in demonstrating the existence of higher-order interactions. It is speculated that hypertension as an outlier quantitative phenotype is dependent on higher-order genetic interactions. The obstacle to the identification of genetic elements and the biochemical/physiological mechanisms involved in higher-order interactions is not theoretical or technical but the lack of future resources to finish the job of identifying the individual genetic elements underlying the quantitative trait loci for BP and ascertaining their molecular functions. © 2019 American Physiological Society. Compr Physiol 9:1305-1337, 2019.
Affiliation(s)
- John P Rapp
- Physiological Genomics Laboratory, Department of Physiology and Pharmacology, Center for Hypertension and Precision Medicine, University of Toledo College of Medicine and Life Sciences, Toledo, OH, USA
| | - Bina Joe
- Physiological Genomics Laboratory, Department of Physiology and Pharmacology, Center for Hypertension and Precision Medicine, University of Toledo College of Medicine and Life Sciences, Toledo, OH, USA
| |
|
49
|
Khatri BS, Goldstein RA. Biophysics and population size constrains speciation in an evolutionary model of developmental system drift. PLoS Comput Biol 2019; 15:e1007177. [PMID: 31335870 PMCID: PMC6677325 DOI: 10.1371/journal.pcbi.1007177] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 11/30/2018] [Revised: 08/02/2019] [Accepted: 06/13/2019] [Indexed: 02/06/2023]
Abstract
Developmental system drift is a likely mechanism for the origin of hybrid incompatibilities between closely related species. We examine here the detailed mechanistic basis of hybrid incompatibilities between two allopatric lineages, for a genotype-phenotype map of developmental system drift under stabilising selection, where an organismal phenotype is conserved, but the underlying molecular phenotypes and genotype can drift. This leads to a number of emergent phenomena not obtainable by modelling genotype or phenotype alone. Our results show that: 1) speciation is more rapid at smaller population sizes with a characteristic, Orr-like, power law, but slow at large population sizes, characterised by a sub-diffusive growth law; 2) the molecular phenotypes under weakest selection contribute to the earliest incompatibilities; and 3) pair-wise incompatibilities dominate over higher order, contrary to previous predictions that the latter should dominate. The population size effect we find is consistent with previous results on allopatric divergence of transcription factor-DNA binding, where smaller populations have common ancestors with a larger drift load because genetic drift favours phenotypes which have a larger number of genotypes (higher sequence entropy) over more fit phenotypes which have far fewer genotypes; this means fewer substitutions are required in either lineage before incompatibilities arise. Overall, our results indicate that biophysics and population size provide a much stronger constraint to speciation than suggested by previous models, and point to a general mechanistic principle of how incompatibilities arise under stabilising selection for an organismal phenotype.
The process of speciation is of fundamental importance to the field of evolution as it is intimately connected to understanding the immense bio-diversity of life. There is still relatively little understanding of the underlying genetic mechanisms that give rise to hybrid incompatibilities, with results suggesting that divergence in transcription factor DNA binding and gene expression plays an important role. A key finding from the field of evo-devo is that organismal phenotypes show developmental system drift, where species maintain the same phenotype, but diverge in developmental pathways; this is an important potential source of hybrid incompatibilities. Here, we explore a theoretical framework to understand how incompatibilities arise due to developmental system drift, using a tractable biophysically inspired genotype-phenotype map for spatial gene expression. Modelling the evolution of phenotypes in this way has the key advantage that it mirrors how selection works in nature, i.e. that selection acts on phenotypes, but variation (mutation) arises at the level of genotypes. This results, as we demonstrate, in a number of non-trivial and testable predictions concerning speciation due to developmental system drift, which would not be obtainable by modelling evolution of genotypes or phenotypes alone.
Affiliation(s)
| | - Richard A. Goldstein
- Division of Infection & Immunity, University College London, London, United Kingdom
| |
|
50
|
Polte C, Wedemeyer D, Oliver KE, Wagner J, Bijvelds MJC, Mahoney J, de Jonge HR, Sorscher EJ, Ignatova Z. Assessing cell-specific effects of genetic variations using tRNA microarrays. BMC Genomics 2019; 20:549. [PMID: 31307398 PMCID: PMC6632033 DOI: 10.1186/s12864-019-5864-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 12/21/2022]
Abstract
Background: By definition, the effects of synonymous single-nucleotide variants (SNVs) on protein folding and function are neutral, as they alter the codon and not the encoded amino acid. Recent examples indicate tissue-specific and transfer RNA (tRNA)-dependent effects of some genetic variations, arguing against neutrality of synonymous SNVs for protein biogenesis. Results: We performed a systematic analysis of tRNA abundance across various models used in cystic fibrosis (CF) research and drug development, including Fischer rat thyroid (FRT) cells, patient-derived primary human bronchial epithelia (HBE) from lung biopsies, primary human nasal epithelia (HNE) from nasal curettage, intestinal organoids, and airway progenitor-directed differentiation of human induced pluripotent stem cells (iPSCs). These were compared to an immortalized CF bronchial cell model (CFBE41o−) and two widely used laboratory cell lines, HeLa and HEK293. We discovered that specific synonymous SNVs exhibited differential effects which correlated with variable concentrations of cognate tRNAs. Conclusions: Our results highlight ways in which the presence of synonymous SNVs may alter local kinetics of mRNA translation and thus impact protein biogenesis and function. This effect is likely to influence results from mechanistic analysis and/or drug screening efforts, and establishes the importance of careful model system selection based on genetic variation profile.
Affiliation(s)
- Christine Polte
- Biochemistry and Molecular Biology, Department of Chemistry, University of Hamburg, 20146, Hamburg, Germany
| | - Daniel Wedemeyer
- Biochemistry and Molecular Biology, Department of Chemistry, University of Hamburg, 20146, Hamburg, Germany
| | - Kathryn E Oliver
- Emory University School of Medicine, Atlanta, GA, 30322, USA; Children's Healthcare of Atlanta, Atlanta, GA, 30322, USA
| | - Johannes Wagner
- Biochemistry and Molecular Biology, Department of Chemistry, University of Hamburg, 20146, Hamburg, Germany
| | - Marcel J C Bijvelds
- Gastroenterology and Hepatology Erasmus MC University Medical Center, Rotterdam, The Netherlands
| | - John Mahoney
- Cystic Fibrosis Foundation CFFT Lab, Lexington, MA, 02421, USA
| | - Hugo R de Jonge
- Gastroenterology and Hepatology Erasmus MC University Medical Center, Rotterdam, The Netherlands
| | - Eric J Sorscher
- Emory University School of Medicine, Atlanta, GA, 30322, USA; Children's Healthcare of Atlanta, Atlanta, GA, 30322, USA
| | - Zoya Ignatova
- Biochemistry and Molecular Biology, Department of Chemistry, University of Hamburg, 20146, Hamburg, Germany.
| |
|