1. Jacobus AP, Cavassana SD, de Oliveira II, Barreto JA, Rohwedder E, Frazzon J, Basso TP, Basso LC, Gross J. Optimal trade-off between boosted tolerance and growth fitness during adaptive evolution of yeast to ethanol shocks. Biotechnology for Biofuels and Bioproducts 2024; 17:63. [PMID: 38730312 PMCID: PMC11088041 DOI: 10.1186/s13068-024-02503-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Accepted: 04/05/2024] [Indexed: 05/12/2024]
Abstract
BACKGROUND: The selection of Saccharomyces cerevisiae strains with higher alcohol tolerance can potentially increase the industrial production of ethanol fuel. However, the design of selection protocols to obtain bioethanol yeasts with higher alcohol tolerance poses the challenge of improving industrial strains that are already robust to high ethanol levels. Furthermore, yeasts subjected to mutagenesis and selection, or laboratory evolution, often present adaptation trade-offs wherein higher stress tolerance is attained at the expense of growth and fermentation performance. Although these undesirable side effects are often associated with acute selection regimes, the utility of using harsh ethanol treatments to obtain robust ethanologenic yeasts still has not been fully investigated.
RESULTS: We conducted an adaptive laboratory evolution by challenging four populations (P1-P4) of the Brazilian bioethanol yeast, Saccharomyces cerevisiae PE-2_H4, through 68-82 cycles of 2-h ethanol shocks (19-30% v/v) and outgrowths. Colonies isolated from the final evolved populations (P1c-P4c) were subjected to whole-genome sequencing, revealing mutations in genes enriched for the cAMP/PKA and trehalose degradation pathways. Fitness analyses of the isolated clones P1c-P3c and reverse-engineered strains demonstrated that mutations were primarily selected for cell viability under ethanol stress, at the cost of decreased growth rates in cultures with or without ethanol. Under this selection regime for stress survival, the population P4 evolved a protective snowflake phenotype resulting from BUD3 disruption. Despite marked adaptation trade-offs, the combination of reverse-engineered mutations cyr1A1474T/usv1Δ conferred 5.46% higher fitness than the parental PE-2_H4 for propagation in 8% (v/v) ethanol, with only a 1.07% fitness cost in a culture medium without alcohol. The cyr1A1474T/usv1Δ strain and evolved P1c displayed robust fermentations of sugarcane molasses using cell recycling and sulfuric acid treatments, mimicking Brazilian bioethanol production.
CONCLUSIONS: Our study combined genomic, mutational, and fitness analyses to understand the genetic underpinnings of yeast evolution to ethanol shocks. Although fitness analyses revealed that most evolved mutations impose a cost for cell propagation, combination of key mutations cyr1A1474T/usv1Δ endowed yeasts with higher tolerance for growth in the presence of ethanol. Moreover, alleles selected for acute stress survival comprising the P1c genotype conferred stress tolerance and optimal performance under conditions simulating the Brazilian industrial ethanol production.
Affiliation(s)
- Ana Paula Jacobus
- Bioenergy Research Institute, São Paulo State University, Rio Claro, Brazil
- SENAI Innovation Institute for Biotechnology, São Paulo, Brazil
- Ewerton Rohwedder
- Biological Science Department, "Luiz de Queiroz" College of Agriculture, University of Sao Paulo, Piracicaba, Brazil
- Jeverson Frazzon
- Institute of Food Science and Technology, Federal University of Rio Grande do Sul, Porto Alegre, Brazil
- Thalita Peixoto Basso
- Department of Agri-Food Industry, Food and Nutrition, "Luiz de Queiroz" College of Agriculture, University of Sao Paulo, Piracicaba, Brazil
- Luiz Carlos Basso
- Biological Science Department, "Luiz de Queiroz" College of Agriculture, University of Sao Paulo, Piracicaba, Brazil
- Jeferson Gross
- Bioenergy Research Institute, São Paulo State University, Rio Claro, Brazil.
2. Lawir DF, Soza-Ried C, Iwanami N, Siamishi I, Bylund GO, O Meara C, Sikora K, Kanzler B, Johansson E, Schorpp M, Cauchy P, Boehm T. Antagonistic interactions safeguard mitotic propagation of genetic and epigenetic information in zebrafish. Commun Biol 2024; 7:31. [PMID: 38182651 PMCID: PMC10770094 DOI: 10.1038/s42003-023-05692-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Accepted: 12/11/2023] [Indexed: 01/07/2024]
Abstract
The stability of cellular phenotypes in developing organisms depends on error-free transmission of epigenetic and genetic information during mitosis. Methylation of cytosine residues in genomic DNA is a key epigenetic mark that modulates gene expression and prevents genome instability. Here, we report on a genetic test of the relationship between DNA replication and methylation in the context of the developing vertebrate organism instead of cell lines. Our analysis is based on the identification of hypomorphic alleles of dnmt1, encoding the DNA maintenance methylase Dnmt1, and pole1, encoding the catalytic subunit of leading-strand DNA polymerase epsilon holoenzyme (Pole). Homozygous dnmt1 mutants exhibit genome-wide DNA hypomethylation, whereas the pole1 mutation is associated with increased DNA methylation levels. In dnmt1/pole1 double-mutant zebrafish larvae, DNA methylation levels are restored to near normal values, associated with partial rescue of mutant-associated transcriptional changes and phenotypes. Hence, a balancing antagonism between DNA replication and maintenance methylation buffers against replicative errors contributing to the robustness of vertebrate development.
Affiliation(s)
- Divine-Fondzenyuy Lawir
- Department of Developmental Immunology, Max Planck Institute of Immunobiology and Epigenetics, Freiburg, Germany
- Cristian Soza-Ried
- Department of Developmental Immunology, Max Planck Institute of Immunobiology and Epigenetics, Freiburg, Germany
- Norimasa Iwanami
- Department of Developmental Immunology, Max Planck Institute of Immunobiology and Epigenetics, Freiburg, Germany
- Iliana Siamishi
- Department of Developmental Immunology, Max Planck Institute of Immunobiology and Epigenetics, Freiburg, Germany
- Göran O Bylund
- Department of Medical Biochemistry and Biophysics, Umeå University, Umeå, Sweden
- Connor O Meara
- Department of Developmental Immunology, Max Planck Institute of Immunobiology and Epigenetics, Freiburg, Germany
- Katarzyna Sikora
- Department of Developmental Immunology, Max Planck Institute of Immunobiology and Epigenetics, Freiburg, Germany
- Bioinformatic Unit, Max Planck Institute of Immunobiology and Epigenetics, Freiburg, Germany
- Benoît Kanzler
- Transgenic Mouse Core Facility, Max Planck Institute of Immunobiology and Epigenetics, Freiburg, Germany
- Erik Johansson
- Department of Medical Biochemistry and Biophysics, Umeå University, Umeå, Sweden
- Michael Schorpp
- Department of Developmental Immunology, Max Planck Institute of Immunobiology and Epigenetics, Freiburg, Germany
- Pierre Cauchy
- Department of Developmental Immunology, Max Planck Institute of Immunobiology and Epigenetics, Freiburg, Germany
- Thomas Boehm
- Department of Developmental Immunology, Max Planck Institute of Immunobiology and Epigenetics, Freiburg, Germany.
- Institute for Immunodeficiency, Center for Chronic Immunodeficiency (CCI), University Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany.
3. Lütge A, Lu J, Hüllein J, Walther T, Sellner L, Wu B, Rosenquist R, Oakes CC, Dietrich S, Huber W, Zenz T. Subgroup-specific gene expression profiles and mixed epistasis in chronic lymphocytic leukemia. Haematologica 2023; 108:2664-2676. [PMID: 37226709 PMCID: PMC10614035 DOI: 10.3324/haematol.2022.281869] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2022] [Accepted: 05/18/2023] [Indexed: 05/26/2023]
Abstract
Understanding the molecular and phenotypic heterogeneity of cancer is a prerequisite for effective treatment. For chronic lymphocytic leukemia (CLL), recurrent genetic driver events have been extensively cataloged, but this does not suffice to explain the disease's diverse course. Here, we performed RNA sequencing on 184 CLL patient samples. Unsupervised analysis revealed two major, orthogonal axes of gene expression variation: the first one represented the mutational status of the immunoglobulin heavy variable (IGHV) genes, and concomitantly, the three-group stratification of CLL by global DNA methylation. The second axis aligned with trisomy 12 status and affected chemokine, MAPK and mTOR signaling. We discovered non-additive effects (epistasis) of IGHV mutation status and trisomy 12 on multiple phenotypes, including the expression of 893 genes. Multiple types of epistasis were observed, including synergy, buffering, suppression and inversion, suggesting that molecular understanding of disease heterogeneity requires studying such genetic events not only individually but in combination. We detected strong differentially expressed gene signatures associated with major gene mutations and copy number aberrations including SF3B1, BRAF and TP53, as well as del(17)(p13), del(13)(q14) and del(11)(q22.3) beyond dosage effect. Our study reveals previously underappreciated gene expression signatures for the major molecular subtypes in CLL and the presence of epistasis between them.
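The abstract above describes epistasis as a non-additive effect of two genetic features on a phenotype. As a minimal sketch, assuming a plain 2x2 design on log-scale expression values (the function name and the group layout are illustrative assumptions, not the model used in the study), the interaction term is the gap between the observed double-feature mean and the additive expectation:

```python
from statistics import mean


def interaction_effect(neither, only_a, only_b, both):
    """Return the non-additive (interaction) term for a 2x2 design.

    Each argument is a list of log-scale expression values for one group
    of samples: neither feature, feature A only, feature B only, both.
    This is a generic illustration, not the study's statistical model.
    """
    m00, m10, m01, m11 = (mean(g) for g in (neither, only_a, only_b, both))
    effect_a = m10 - m00          # marginal effect of feature A alone
    effect_b = m01 - m00          # marginal effect of feature B alone
    expected_both = m00 + effect_a + effect_b  # purely additive expectation
    return m11 - expected_both    # zero means no epistasis on this scale


# Hypothetical values: the first case is additive, the second is not.
additive = interaction_effect([1.0], [2.0], [3.0], [4.0])
epistatic = interaction_effect([1.0], [2.0], [3.0], [6.0])
```

A value near zero indicates additivity on the chosen scale. The sign and size of a non-zero term, read together with the single-feature effects, is what separates categories such as synergy, buffering, suppression and inversion.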
Affiliation(s)
- Almut Lütge
- Genome Biology Unit, EMBL, Heidelberg, Germany; Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland; SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich
- Junyan Lu
- Genome Biology Unit, EMBL, Heidelberg, Germany; Medical Faculty Heidelberg, Heidelberg University, Heidelberg
- Tatjana Walther
- Molecular Therapy in Hematology and Oncology and Department of Translational Oncology, NCT and DKFZ, Heidelberg
- Leopold Sellner
- Molecular Therapy in Hematology and Oncology and Department of Translational Oncology, NCT and DKFZ, Heidelberg, Germany; Department of Medicine V, Heidelberg University Hospital, Heidelberg
- Bian Wu
- Molecular Therapy in Hematology and Oncology and Department of Translational Oncology, NCT and DKFZ, Heidelberg, Germany; Cancer Center, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan
- Richard Rosenquist
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden; Clinical Genetics, Karolinska University Hospital, Solna
- Christopher C Oakes
- Department of Internal Medicine, Division of Hematology, The Ohio State University, Columbus
- Sascha Dietrich
- Department of Medicine V, Heidelberg University Hospital, Heidelberg
- Thorsten Zenz
- Molecular Therapy in Hematology and Oncology and Department of Translational Oncology, NCT and DKFZ, Heidelberg, Germany; Department of Medical Oncology and Hematology, University Hospital Zurich, Zurich.
4. Phenomics approaches to understand genetic networks and gene function in yeast. Biochem Soc Trans 2022; 50:713-721. [PMID: 35285506 PMCID: PMC9162466 DOI: 10.1042/bst20210285] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/10/2021] [Revised: 02/14/2022] [Accepted: 02/18/2022] [Indexed: 01/03/2023]
Abstract
Over the past decade, major efforts have been made to systematically survey the characteristics or phenotypes associated with genetic variation in a variety of model systems. These so-called phenomics projects involve the measurement of 'phenomes', or the set of phenotypic information that describes an organism or cell, in various genetic contexts or states, and in response to external factors, such as environmental signals. Our understanding of the phenome of an organism depends on the availability of reagents that enable systematic evaluation of the spectrum of possible phenotypic variation and the types of measurements that can be taken. Here, we highlight phenomics studies that use the budding yeast, a pioneer model organism for functional genomics research. We focus on genetic perturbation screens designed to explore genetic interactions, using a variety of phenotypic read-outs, from cell growth to subcellular morphology.
5. Van Dyke K, Lutz S, Mekonnen G, Myers CL, Albert FW. Trans-acting genetic variation affects the expression of adjacent genes. Genetics 2021; 217:6126816. [PMID: 33789351 DOI: 10.1093/genetics/iyaa051] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/13/2020] [Accepted: 12/16/2020] [Indexed: 11/13/2022]
Abstract
Gene expression differences among individuals are shaped by trans-acting expression quantitative trait loci (eQTLs). Most trans-eQTLs map to hotspot locations that influence many genes. The molecular mechanisms perturbed by hotspots are often assumed to involve "vertical" cascades of effects in pathways that can ultimately affect the expression of thousands of genes. Here, we report that trans-eQTLs can affect the expression of adjacent genes via "horizontal" mechanisms that extend along a chromosome. Genes affected by trans-eQTL hotspots in the yeast Saccharomyces cerevisiae were more likely to be located next to each other than expected by chance. These paired hotspot effects tended to occur at adjacent genes that also show coexpression in response to genetic and environmental perturbations, suggesting shared mechanisms. Physical proximity and shared chromatin state, in addition to regulation of adjacent genes by similar transcription factors, were independently associated with paired hotspot effects among adjacent genes. Paired effects of trans-eQTLs can occur at neighboring genes even when these genes do not share a common function. This phenomenon could result in unexpected connections between regulatory genetic variation and phenotypes.
Affiliation(s)
- Krisna Van Dyke
- Department of Genetics, Cell Biology, and Development, University of Minnesota, Minneapolis, MN 55455, USA
- Sheila Lutz
- Department of Genetics, Cell Biology, and Development, University of Minnesota, Minneapolis, MN 55455, USA
- Gemechu Mekonnen
- Department of Genetics, Cell Biology, and Development, University of Minnesota, Minneapolis, MN 55455, USA
- Chad L Myers
- Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN 55455, USA
- Frank W Albert
- Department of Genetics, Cell Biology, and Development, University of Minnesota, Minneapolis, MN 55455, USA
6. Ma CZ, Brent MR. Inferring TF activities and activity regulators from gene expression data with constraints from TF perturbation data. Bioinformatics 2021; 37:1234-1245. [PMID: 33135076 PMCID: PMC8189679 DOI: 10.1093/bioinformatics/btaa947] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 07/03/2020] [Revised: 09/26/2020] [Accepted: 10/27/2020] [Indexed: 12/20/2022]
Abstract
Motivation: The activity of a transcription factor (TF) in a sample of cells is the extent to which it is exerting its regulatory potential. Many methods of inferring TF activity from gene expression data have been described, but due to the lack of appropriate large-scale datasets, systematic and objective validation has not been possible until now.
Results: We systematically evaluate and optimize the approach to TF activity inference in which a gene expression matrix is factored into a condition-independent matrix of control strengths and a condition-dependent matrix of TF activity levels. We find that expression data in which the activities of individual TFs have been perturbed are both necessary and sufficient for obtaining good performance. To a considerable extent, control strengths inferred using expression data from one growth condition carry over to other conditions, so the control strength matrices derived here can be used by others. Finally, we apply these methods to gain insight into the upstream factors that regulate the activities of yeast TFs Gcr2, Gln3, Gcn4 and Msn2.
Availability and implementation: Evaluation code and data are available at https://doi.org/10.5281/zenodo.4050573.
Supplementary information: Supplementary data are available at Bioinformatics online.
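The factored form described in this abstract can be written out directly. The sketch below covers only the forward model (expression predicted as control strengths times TF activities) and a squared-error score; it does not reproduce the inference or optimization procedure of the paper, and the function names and toy matrices are assumptions made for illustration.

```python
def predict_expression(control_strengths, tf_activities):
    """Multiply a genes x TFs matrix by a TFs x samples matrix.

    control_strengths[g][t]: assumed condition-independent effect of TF t on gene g.
    tf_activities[t][s]: assumed activity level of TF t in sample s.
    Returns the predicted genes x samples expression matrix.
    """
    n_tfs = len(tf_activities)
    n_samples = len(tf_activities[0])
    return [
        [sum(row[t] * tf_activities[t][s] for t in range(n_tfs))
         for s in range(n_samples)]
        for row in control_strengths
    ]


def squared_error(observed, predicted):
    """Sum of squared differences between two equally shaped matrices."""
    return sum(
        (o - p) ** 2
        for obs_row, pred_row in zip(observed, predicted)
        for o, p in zip(obs_row, pred_row)
    )


# Toy example: 2 genes, 2 TFs, 2 samples (all values hypothetical).
C = [[1.0, 0.0], [0.5, -1.0]]
A = [[2.0, 0.0], [1.0, 3.0]]
E_hat = predict_expression(C, A)
```

Fitting would then mean choosing the two matrices to minimize `squared_error` against measured expression, subject to whatever constraints the method imposes. That step is deliberately left out here.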
Affiliation(s)
- Cynthia Z Ma
- Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO 63110, USA; Department of Computer Science and Engineering, Washington University, St. Louis, MO 63130, USA
- Michael R Brent
- Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO 63110, USA; Department of Computer Science and Engineering, Washington University, St. Louis, MO 63130, USA; Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
7. A mechanism-aware and multiomic machine-learning pipeline characterizes yeast cell growth. Proc Natl Acad Sci U S A 2020; 117:18869-18879. [PMID: 32675233 DOI: 10.1073/pnas.2002959117] [Citation(s) in RCA: 54] [Impact Index Per Article: 13.5] [Indexed: 11/18/2022]
Abstract
Metabolic modeling and machine learning are key components in the emerging next generation of systems and synthetic biology tools, targeting the genotype-phenotype-environment relationship. Rather than being used in isolation, it is becoming clear that their value is maximized when they are combined. However, the potential of integrating these two frameworks for omic data augmentation and integration is largely unexplored. We propose, rigorously assess, and compare machine-learning-based data integration techniques, combining gene expression profiles with computationally generated metabolic flux data to predict yeast cell growth. To this end, we create strain-specific metabolic models for 1,143 Saccharomyces cerevisiae mutants and we test 27 machine-learning methods, incorporating state-of-the-art feature selection and multiview learning approaches. We propose a multiview neural network using fluxomic and transcriptomic data, showing that the former increases the predictive accuracy of the latter and reveals functional patterns that are not directly deducible from gene expression alone. We test the proposed neural network on a further 86 strains generated in a different experiment, therefore verifying its robustness to an additional independent dataset. Finally, we show that introducing mechanistic flux features improves the predictions also for knockout strains whose genes were not modeled in the metabolic reconstruction. Our results thus demonstrate that fusing experimental cues with in silico models, based on known biochemistry, can contribute with disjoint information toward biologically informed and interpretable machine learning. Overall, this study provides tools for understanding and manipulating complex phenotypes, increasing both the prediction accuracy and the extent of discernible mechanistic biological insights.
8. Jacobsen A, Ivanova O, Amini S, Heringa J, Kemmeren P, Feenstra KA. A framework for exhaustive modelling of genetic interaction patterns using Petri nets. Bioinformatics 2020; 36:2142-2149. [PMID: 31845959 DOI: 10.1093/bioinformatics/btz917] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/16/2018] [Revised: 07/09/2019] [Accepted: 12/13/2019] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Genetic interaction (GI) patterns are characterized by the phenotypes of interacting single and double mutated gene pairs. Uncovering the regulatory mechanisms of GIs would provide a better understanding of their role in biological processes, diseases and drug response. Computational analyses can provide insights into the underpinning mechanisms of GIs.
RESULTS: In this study, we present a framework for exhaustive modelling of GI patterns using Petri nets (PN). Four-node models were defined and generated on three levels with restrictions, to enable an exhaustive approach. Simulations suggest ∼5 million models of GIs. Generalizing these we propose putative mechanisms for the GI patterns, inversion and suppression. We demonstrate that exhaustive PN modelling enables reasoning about mechanisms of GIs when only the phenotypes of gene pairs are known. The framework can be applied to other GI or genetic regulatory datasets.
AVAILABILITY AND IMPLEMENTATION: The framework is available at http://www.ibi.vu.nl/programs/ExhMod.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Annika Jacobsen
- Department of Computer Science, Centre for Integrative Bioinformatics (IBIVU), Vrije Universiteit Amsterdam, 1081 HV Amsterdam, Netherlands
- Olga Ivanova
- Department of Computer Science, Centre for Integrative Bioinformatics (IBIVU), Vrije Universiteit Amsterdam, 1081 HV Amsterdam, Netherlands
- Saman Amini
- Princess Máxima Center for Pediatric Oncology, 3584 CS Utrecht, Netherlands; Division of Biomedical Genetics, Center for Molecular Medicine, University Medical Centre Utrecht, 3584 CX Utrecht, Netherlands
- Jaap Heringa
- Department of Computer Science, Centre for Integrative Bioinformatics (IBIVU), Vrije Universiteit Amsterdam, 1081 HV Amsterdam, Netherlands
- Patrick Kemmeren
- Princess Máxima Center for Pediatric Oncology, 3584 CS Utrecht, Netherlands; Division of Biomedical Genetics, Center for Molecular Medicine, University Medical Centre Utrecht, 3584 CX Utrecht, Netherlands
- K Anton Feenstra
- Department of Computer Science, Centre for Integrative Bioinformatics (IBIVU), Vrije Universiteit Amsterdam, 1081 HV Amsterdam, Netherlands
9. Hekselman I, Yeger-Lotem E. Mechanisms of tissue and cell-type specificity in heritable traits and diseases. Nat Rev Genet 2020; 21:137-150. [DOI: 10.1038/s41576-019-0200-9] [Citation(s) in RCA: 67] [Impact Index Per Article: 16.8] [Accepted: 11/12/2019] [Indexed: 02/07/2023]
10. Kemble H, Nghe P, Tenaillon O. Recent insights into the genotype-phenotype relationship from massively parallel genetic assays. Evol Appl 2019; 12:1721-1742. [PMID: 31548853 PMCID: PMC6752143 DOI: 10.1111/eva.12846] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 03/27/2019] [Revised: 06/21/2019] [Accepted: 07/02/2019] [Indexed: 12/20/2022]
Abstract
With the molecular revolution in biology, a mechanistic understanding of the genotype-phenotype relationship became possible. Recently, advances in DNA synthesis and sequencing have enabled the development of deep mutational scanning assays, capable of scoring comprehensive libraries of genotypes for fitness and a variety of phenotypes in massively parallel fashion. The resulting empirical genotype-fitness maps pave the way to predictive models, potentially accelerating our ability to anticipate the behaviour of pathogen and cancerous cell populations from sequencing data. Besides cellular fitness, phenotypes of direct application in industry (e.g. enzyme activity) and medicine (e.g. antibody binding) can be quantified and even selected directly by these assays. This review discusses the technological basis of and recent developments in massively parallel genetics, along with the trends it is uncovering in the genotype-phenotype relationship (distribution of mutation effects, epistasis), their possible mechanistic bases and future directions for advancing towards the goal of predictive genetics.
Affiliation(s)
- Harry Kemble
- Infection, Antimicrobials, Modelling, Evolution, INSERM, Unité Mixte de Recherche 1137, Université Paris Diderot, Université Paris Nord, Paris, France
- École Supérieure de Physique et de Chimie Industrielles de la Ville de Paris (ESPCI Paris), UMR CNRS-ESPCI CBI 8231, PSL Research University, Paris Cedex 05, France
- Philippe Nghe
- École Supérieure de Physique et de Chimie Industrielles de la Ville de Paris (ESPCI Paris), UMR CNRS-ESPCI CBI 8231, PSL Research University, Paris Cedex 05, France
- Olivier Tenaillon
- Infection, Antimicrobials, Modelling, Evolution, INSERM, Unité Mixte de Recherche 1137, Université Paris Diderot, Université Paris Nord, Paris, France
11. Maj C, Azevedo T, Giansanti V, Borisov O, Dimitri GM, Spasov S, Lió P, Merelli I. Integration of Machine Learning Methods to Dissect Genetically Imputed Transcriptomic Profiles in Alzheimer's Disease. Front Genet 2019; 10:726. [PMID: 31552082 PMCID: PMC6735530 DOI: 10.3389/fgene.2019.00726] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 04/18/2019] [Accepted: 07/10/2019] [Indexed: 12/12/2022]
Abstract
The genetic component of many common traits is associated with gene expression, and several variants act as expression quantitative trait loci, regulating gene expression in a tissue-specific manner. In this work, we applied tissue-specific cis-eQTL gene expression prediction models on the genotype of 808 samples including controls, subjects with mild cognitive impairment, and patients with Alzheimer's Disease. We then dissected the imputed transcriptomic profiles by means of different unsupervised and supervised machine learning approaches to identify potential biological associations. Our analysis suggests that unsupervised and supervised methods can provide complementary information, which can be integrated for a better characterization of the underlying biological system. In particular, a variational autoencoder representation of the transcriptomic profiles, followed by a support vector machine classification, has been used for tissue-specific gene prioritizations. Interestingly, the achieved gene prioritizations can be efficiently integrated as a feature selection step for improving the accuracy of deep learning classifier networks. The identified gene-tissue information suggests a potential role for inflammatory and regulatory processes in gut-brain axis-related tissues. In line with the expected low heritability that can be apportioned to eQTL variants, we were able to achieve only relatively low prediction capability with deep learning classification models. However, our analysis revealed that the classification power strongly depends on the network structure, with recurrent neural networks being the best performing network class. Interestingly, cross-tissue analysis suggests a potentially greater role of models trained in brain tissues also by considering dementia-related endophenotypes. Overall, the present analysis suggests that the combination of supervised and unsupervised machine learning techniques can be used for the evaluation of high dimensional omics data.
Affiliation(s)
- Carlo Maj
- Institute for Genomic Statistics and Bioinformatics, University Hospital Bonn, Bonn, Germany
- Tiago Azevedo
- Department of Computer Science and Technology, University of Cambridge, Cambridge, United Kingdom
- Valentina Giansanti
- National Research Council, Institute for Biomedical Technologies, Milan, Italy
- Oleg Borisov
- Institute for Genomic Statistics and Bioinformatics, University Hospital Bonn, Bonn, Germany
- Giovanna Maria Dimitri
- Department of Computer Science and Technology, University of Cambridge, Cambridge, United Kingdom
- Simeon Spasov
- Department of Computer Science and Technology, University of Cambridge, Cambridge, United Kingdom
- Pietro Lió
- Department of Computer Science and Technology, University of Cambridge, Cambridge, United Kingdom
- Ivan Merelli
- National Research Council, Institute for Biomedical Technologies, Milan, Italy
12. Amini S, Jacobsen A, Ivanova O, Lijnzaad P, Heringa J, Holstege FCP, Feenstra KA, Kemmeren P. The ability of transcription factors to differentially regulate gene expression is a crucial component of the mechanism underlying inversion, a frequently observed genetic interaction pattern. PLoS Comput Biol 2019; 15:e1007061. [PMID: 31083661 PMCID: PMC6532943 DOI: 10.1371/journal.pcbi.1007061] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 06/18/2018] [Revised: 05/23/2019] [Accepted: 04/30/2019] [Indexed: 12/21/2022]
Abstract
Genetic interactions, a phenomenon whereby combinations of mutations lead to unexpected effects, reflect how cellular processes are wired and play an important role in complex genetic diseases. Understanding the molecular basis of genetic interactions is crucial for deciphering pathway organization as well as understanding the relationship between genetic variation and disease. Several hypothetical molecular mechanisms have been linked to different genetic interaction types. However, differences in genetic interaction patterns and their underlying mechanisms have not yet been compared systematically between different functional gene classes. Here, differences in the occurrence and types of genetic interactions are compared for two classes, gene-specific transcription factors (GSTFs) and signaling genes (kinases and phosphatases). Genome-wide gene expression data for 63 single and double deletion mutants in baker's yeast reveal that the two most common genetic interaction patterns are buffering and inversion. Buffering is typically associated with redundancy and is well understood. In inversion, genes show opposite behavior in the double mutant compared to the corresponding single mutants. The underlying mechanism is poorly understood. Although both classes show buffering and inversion patterns, the prevalence of inversion is much stronger in GSTFs. To decipher potential mechanisms, a Petri Net modeling approach was employed, where genes are represented as nodes and relationships between genes as edges. This allowed over 9 million possible three- and four-node models to be exhaustively enumerated. The models show that a quantitative difference in interaction strength is a strict requirement for obtaining inversion. In addition, this difference is frequently accompanied by a second gene that shows buffering. Taken together, these results provide a mechanistic explanation for inversion. Furthermore, the ability of transcription factors to differentially regulate expression of their targets provides a likely explanation why inversion is more prevalent for GSTFs compared to kinases and phosphatases.
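The two patterns named in this abstract can be illustrated with a small classifier over expression changes of one target gene. This is a sketch under stated assumptions: inversion follows the abstract's description (the double mutant changes in the opposite direction to both single mutants), while the buffering rule used here (no change in either single mutant, a change in the double) is an assumed working definition, and the threshold of 1.0 on log2 fold change is hypothetical.

```python
def direction(log_fold_change, threshold):
    """Return +1, -1 or 0 for up, down or unchanged at the given threshold."""
    if log_fold_change > threshold:
        return 1
    if log_fold_change < -threshold:
        return -1
    return 0


def classify_pattern(single_a, single_b, double, threshold=1.0):
    """Label one target gene's response as buffering, inversion or other.

    Inputs are log2 fold changes versus wild type in the two single
    mutants and in the double mutant. The rules are illustrative
    assumptions, not the published classification scheme.
    """
    a, b, d = (direction(v, threshold) for v in (single_a, single_b, double))
    if a == 0 and b == 0 and d != 0:
        return "buffering"   # effect appears only when both genes are lost
    if a != 0 and a == b and d == -a:
        return "inversion"   # double mutant flips the shared single-mutant effect
    return "other"


# Hypothetical fold changes for three target genes.
labels = [
    classify_pattern(0.1, -0.2, 2.0),
    classify_pattern(1.5, 2.0, -1.8),
    classify_pattern(1.5, 0.0, 1.5),
]
```

Applied per target gene across a mutant pair, counts of each label would give the kind of pattern prevalence the abstract compares between gene classes.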
Affiliation(s)
- Saman Amini
- Princess Máxima Center for Pediatric Oncology, Utrecht, The Netherlands
- Center for Molecular Medicine, University Medical Centre Utrecht, Utrecht, The Netherlands
- Annika Jacobsen
- Centre for Integrative Bioinformatics (IBIVU), Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Olga Ivanova
- Centre for Integrative Bioinformatics (IBIVU), Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Philip Lijnzaad
- Princess Máxima Center for Pediatric Oncology, Utrecht, The Netherlands
- Jaap Heringa
- Centre for Integrative Bioinformatics (IBIVU), Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- K. Anton Feenstra
- Centre for Integrative Bioinformatics (IBIVU), Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Patrick Kemmeren
- Princess Máxima Center for Pediatric Oncology, Utrecht, The Netherlands
- Center for Molecular Medicine, University Medical Centre Utrecht, Utrecht, The Netherlands
|
13
|
Pirkl M, Diekmann M, van der Wees M, Beerenwinkel N, Fröhlich H, Markowetz F. Inferring modulators of genetic interactions with epistatic nested effects models. PLoS Comput Biol 2017; 13:e1005496. [PMID: 28406896 PMCID: PMC5407847 DOI: 10.1371/journal.pcbi.1005496] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 05/06/2016] [Revised: 04/27/2017] [Accepted: 04/03/2017] [Indexed: 12/27/2022]
Abstract
Maps of genetic interactions can dissect functional redundancies in cellular networks. Gene expression profiles as high-dimensional molecular readouts of combinatorial perturbations provide a detailed view of genetic interactions, but can be hard to interpret if different gene sets respond in different ways (called mixed epistasis). Here we test the hypothesis that mixed epistasis between a gene pair can be explained by the action of a third gene that modulates the interaction. We have extended the framework of Nested Effects Models (NEMs), a type of graphical model specifically tailored to analyze high-dimensional gene perturbation data, to incorporate logical functions that describe interactions between regulators on downstream genes and proteins. We benchmark our approach in the controlled setting of a simulation study and show high accuracy in inferring the correct model. In an application to data from deletion mutants of kinases and phosphatases in S. cerevisiae we show that epistatic NEMs can point to modulators of genetic interactions. Our approach is implemented in the R-package 'epiNEM' available from https://github.com/cbg-ethz/epiNEM and https://bioconductor.org/packages/epiNEM/.
Affiliation(s)
- Martin Pirkl
- ETH Zurich, Department of Biosystems Science and Engineering, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Madeline Diekmann
- ETH Zurich, Department of Biosystems Science and Engineering, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Niko Beerenwinkel
- ETH Zurich, Department of Biosystems Science and Engineering, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Holger Fröhlich
- Bonn-Aachen International Center for IT (B-IT), University of Bonn, Bonn, Germany
- UCB Biosciences GmbH, Monheim, Germany
- Florian Markowetz
- University of Cambridge, Cancer Research UK Cambridge Institute, Cambridge, United Kingdom
|
14
|
Amini S, Holstege FCP, Kemmeren P. Growth condition dependency is the major cause of non-responsiveness upon genetic perturbation. PLoS One 2017; 12:e0173432. [PMID: 28257504 PMCID: PMC5336285 DOI: 10.1371/journal.pone.0173432] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 12/23/2016] [Accepted: 02/10/2017] [Indexed: 11/29/2022]
Abstract
Investigating the role and interplay between individual proteins in biological processes is often performed by assessing the functional consequences of gene inactivation or removal. Depending on the sensitivity of the assay used for determining phenotype, between 66% (growth) and 53% (gene expression) of Saccharomyces cerevisiae gene deletion strains show no defect when analyzed under a single condition. Although it is well known that this non-responsive behavior is caused by different types of redundancy mechanisms or by growth condition/cell type dependency, it is not known what the relative contribution of these different causes is. Understanding the underlying causes of and their relative contribution to non-responsive behavior upon genetic perturbation is extremely important for designing efficient strategies aimed at elucidating gene function and unraveling complex cellular systems. Here, we provide a systematic classification of the underlying causes of and their relative contribution to non-responsive behavior upon gene deletion. The overall contribution of redundancy to non-responsive behavior is estimated at 29%, of which approximately 17% is due to homology-based redundancy and 12% is due to pathway-based redundancy. The major determinant of non-responsiveness is condition dependency (71%). For approximately 14% of protein complexes, just-in-time assembly can be put forward as a potential mechanistic explanation for how proteins can be regulated in a condition-dependent manner. Taken together, the results underscore the large contribution of growth condition requirement to non-responsive behavior, which needs to be taken into account for strategies aimed at determining gene function. The classification provided here can also be further harnessed in systematic analyses of complex cellular systems.
Affiliation(s)
- Saman Amini
- Princess Máxima Center for Pediatric Oncology, Utrecht, The Netherlands
- Center for Molecular Medicine, University Medical Centre Utrecht, Utrecht, The Netherlands
- Patrick Kemmeren
- Princess Máxima Center for Pediatric Oncology, Utrecht, The Netherlands
- Center for Molecular Medicine, University Medical Centre Utrecht, Utrecht, The Netherlands
|
15
|
Wang F, Li Y, Wu X, Yang M, Cong W, Fan Z, Wang J, Zhang C, Du J, Wang S. Transcriptome analysis of coding and long non-coding RNAs highlights the regulatory network of cascade initiation of permanent molars in miniature pigs. BMC Genomics 2017; 18:148. [PMID: 28187707 PMCID: PMC5303240 DOI: 10.1186/s12864-017-3546-4] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 09/06/2016] [Accepted: 02/02/2017] [Indexed: 12/12/2022]
Abstract
Background In diphyodont mammals, the additional molars (permanent molars) bud off from the posterior-free end of the primary dental lamina, whereas successional teeth (replacement teeth) bud off from the secondary dental lamina. The diphyodont miniature pig has proved to be a valuable model for studying human molar morphogenesis. In miniature pigs, the additional molars show a sequential initiation pattern related to the specific stage of tooth development during their morphogenesis. However, the molecular mechanisms of the regulatory network of mRNAs and long non-coding RNAs during sequential formation of additional molars remain poorly characterized in diphyodont mammals. Here, we performed RNA-seq and microarray on miniature pigs at three key molar developmental stages to examine their differential gene expression profiles and potential regulatory networks during additional molar morphogenesis. Results We have profiled the differential transcript expression and functional networks during morphogenesis of additional molars in miniature pigs. We also have identified the coding and long non-coding transcripts using Coding-Non-Coding Index (CNCI) and annotated transcripts through mapping to the porcine, Wuzhishan miniature pig, mouse, cow and human genomes. Many new unannotated genes plus 450 putative long intergenic non-coding RNAs (lincRNAs) were identified. Detailed regulatory network analyses reveal that WNT and TGF-β pathways are critical in regulating sequential morphogenesis of additional molars. Conclusions This is the first study to comprehensively analyze the spatiotemporal dynamics of coding and long non-coding transcripts during morphogenesis of additional molars in diphyodont mammals. The miniature pig serves as a large model animal to elucidate the relationship between morphogenesis and transcript level during the cascade initiation of additional molars. Our data provide fundamental knowledge and a basis for understanding the molecular mechanisms governing cascade initiation of additional molars, and also provide an important resource for developmental biology research.
Affiliation(s)
- Fu Wang
- Molecular Laboratory for Gene Therapy & Tooth Regeneration, Beijing Key Laboratory of Tooth Regeneration and Function Reconstruction, School of Stomatology, Capital Medical University, Beijing, 100050, China; Department of Oral Basic Science, School of Stomatology, Dalian Medical University, Liaoning, 116044, China
- Yang Li
- Molecular Laboratory for Gene Therapy & Tooth Regeneration, Beijing Key Laboratory of Tooth Regeneration and Function Reconstruction, School of Stomatology, Capital Medical University, Beijing, 100050, China
- Xiaoshan Wu
- Molecular Laboratory for Gene Therapy & Tooth Regeneration, Beijing Key Laboratory of Tooth Regeneration and Function Reconstruction, School of Stomatology, Capital Medical University, Beijing, 100050, China
- Min Yang
- Department of Oral Basic Science, School of Stomatology, Dalian Medical University, Liaoning, 116044, China
- Wei Cong
- Department of Oral Basic Science, School of Stomatology, Dalian Medical University, Liaoning, 116044, China
- Zhipeng Fan
- Laboratory of Molecular Signaling and Stem Cells Therapy, Beijing Key Laboratory of Tooth Regeneration and Function Reconstruction, School of Stomatology, Capital Medical University, Beijing, 100050, China
- Jinsong Wang
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Capital Medical University, Beijing, 100069, China
- Chunmei Zhang
- Molecular Laboratory for Gene Therapy & Tooth Regeneration, Beijing Key Laboratory of Tooth Regeneration and Function Reconstruction, School of Stomatology, Capital Medical University, Beijing, 100050, China
- Jie Du
- Department of Physiology and Pathophysiology, Beijing AnZhen Hospital the Key Laboratory of Remodeling-Related Cardiovascular Diseases, School of Basic Medical Sciences, Capital Medical University, No.10 Xitoutiao, You An Men, Beijing, 100069, China
- Songlin Wang
- Molecular Laboratory for Gene Therapy & Tooth Regeneration, Beijing Key Laboratory of Tooth Regeneration and Function Reconstruction, School of Stomatology, Capital Medical University, Beijing, 100050, China; Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Capital Medical University, Beijing, 100069, China
|