1
Zhang X, Fang B, Huang YF. Transcription factor binding sites are frequently under accelerated evolution in primates. Nat Commun 2023; 14:783. [PMID: 36774380 PMCID: PMC9922303 DOI: 10.1038/s41467-023-36421-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2022] [Accepted: 01/31/2023] [Indexed: 02/13/2023] Open
Abstract
Recent comparative genomic studies have identified many human accelerated elements (HARs) with elevated substitution rates in the human lineage. However, it remains unknown to what extent transcription factor binding sites (TFBSs) are under accelerated evolution in humans and other primates. Here, we introduce two pooling-based phylogenetic methods with dramatically enhanced sensitivity to examine accelerated evolution in TFBSs. Using these new methods, we show that more than 6000 TFBSs annotated in the human genome have experienced accelerated evolution in Hominini, apes, and Old World monkeys. Although these TFBSs individually show relatively weak signals of accelerated evolution, they collectively are more abundant than HARs. Also, we show that accelerated evolution in Pol III binding sites may be driven by lineage-specific positive selection, whereas accelerated evolution in other TFBSs might be driven by nonadaptive evolutionary forces. Finally, the accelerated TFBSs are enriched around developmental genes, suggesting that accelerated evolution in TFBSs may drive the divergence of developmental processes between primates.
Affiliation(s)
- Xinru Zhang
  - Department of Biology, Pennsylvania State University, University Park, PA, 16802, USA
  - Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16802, USA
  - Bioinformatics and Genomics Graduate Program, Pennsylvania State University, University Park, PA, 16802, USA
- Bohao Fang
  - Department of Organismic and Evolutionary Biology and the Museum of Comparative Zoology, Harvard University, Boston, MA, 02135, USA
- Yi-Fei Huang
  - Department of Biology, Pennsylvania State University, University Park, PA, 16802, USA
  - Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16802, USA
2
Macquet J, Mounichetty S, Raffaele S. Genetic co-option into plant-filamentous pathogen interactions. Trends Plant Sci 2022; 27:1144-1158. [PMID: 35909010 DOI: 10.1016/j.tplants.2022.06.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2022] [Revised: 06/16/2022] [Accepted: 06/30/2022] [Indexed: 06/15/2023]
Abstract
Plants are engaged in a coevolutionary arms race with their pathogens that drives rapid diversification and specialization of genes involved in resistance and virulence. However, some major innovations in plant-pathogen interactions, such as molecular decoys, trans-kingdom RNA interference, two-speed genomes, and receptor networks, evolved through the expansion of the functional landscape of genes. This is a typical outcome of genetic co-option, the evolutionary process by which available genes are recruited into new biological functions. Co-option into plant-pathogen interactions emerges generally from (i) cis-regulatory variation, (ii) horizontal gene transfer (HGT), (iii) mutations altering molecular promiscuity, and (iv) rewiring of gene networks and protein complexes. Understanding these molecular mechanisms is key for the functional and predictive biology of plant-pathogen interactions.
Affiliation(s)
- Joris Macquet
  - Laboratoire des Interactions Plante-Microbe-Environnement (LIPME), Université de Toulouse, Institut National de Recherche pour l'Agriculture, l'Alimentation, et l'Environnement (INRAE), Centre National de la Recherche Scientifique (CNRS), Castanet Tolosan, France
- Shantala Mounichetty
  - Laboratoire des Interactions Plante-Microbe-Environnement (LIPME), Université de Toulouse, Institut National de Recherche pour l'Agriculture, l'Alimentation, et l'Environnement (INRAE), Centre National de la Recherche Scientifique (CNRS), Castanet Tolosan, France
- Sylvain Raffaele
  - Laboratoire des Interactions Plante-Microbe-Environnement (LIPME), Université de Toulouse, Institut National de Recherche pour l'Agriculture, l'Alimentation, et l'Environnement (INRAE), Centre National de la Recherche Scientifique (CNRS), Castanet Tolosan, France
3
Shih CH, Fay J. Cis-regulatory variants affect gene expression dynamics in yeast. eLife 2021; 10:e68469. [PMID: 34369376 PMCID: PMC8367379 DOI: 10.7554/elife.68469] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/17/2021] [Accepted: 08/06/2021] [Indexed: 12/14/2022] Open
Abstract
Evolution of cis-regulatory sequences depends on how they affect gene expression and motivates both the identification and prediction of cis-regulatory variants responsible for expression differences within and between species. While much progress has been made in relating cis-regulatory variants to expression levels, the timing of gene activation and repression may also be important to the evolution of cis-regulatory sequences. We investigated allele-specific expression (ASE) dynamics within and between Saccharomyces species during the diauxic shift and found appreciable cis-acting variation in gene expression dynamics. Within-species ASE is associated with intergenic variants, and ASE dynamics are more strongly associated with insertions and deletions than ASE levels. To refine these associations, we used a high-throughput reporter assay to test promoter regions and individual variants. Within the subset of regions that recapitulated endogenous expression, we identified and characterized cis-regulatory variants that affect expression dynamics. Between species, chimeric promoter regions generate novel patterns and indicate constraints on the evolution of gene expression dynamics. We conclude that changes in cis-regulatory sequences can tune gene expression dynamics and that the interplay between expression dynamics and other aspects of expression is relevant to the evolution of cis-regulatory sequences.
Affiliation(s)
- Ching-Hua Shih
  - Department of Biology, University of Rochester, Rochester, United States
- Justin Fay
  - Department of Biology, University of Rochester, Rochester, United States
4
Fagny M, Austerlitz F. Polygenic Adaptation: Integrating Population Genetics and Gene Regulatory Networks. Trends Genet 2021; 37:631-638. [PMID: 33892958 DOI: 10.1016/j.tig.2021.03.005] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/21/2020] [Revised: 03/15/2021] [Accepted: 03/16/2021] [Indexed: 12/13/2022]
Abstract
The adaptation of populations to local environments often relies on the selection of optimal values for polygenic traits. Here, we first summarize the results obtained from different quantitative genetics and population genetics models, about the genetic architecture of polygenic traits and their response to directional selection. We then highlight the contribution of systems biology to the understanding of the molecular bases of polygenic traits and the evolution of gene regulatory networks involved in these traits. Finally, we discuss the need for a unifying framework merging the fields of population genetics, quantitative genetics and systems biology to better understand the molecular bases of polygenic traits adaptation.
Affiliation(s)
- Maud Fagny
  - UMR7206 Eco-Anthropologie, Muséum National d'Histoire Naturelle, Centre National de la Recherche Scientifique, Université de Paris, Paris, France
- Frédéric Austerlitz
  - UMR7206 Eco-Anthropologie, Muséum National d'Histoire Naturelle, Centre National de la Recherche Scientifique, Université de Paris, Paris, France
5
Cridland JM, Majane AC, Sheehy HK, Begun DJ. Polymorphism and Divergence of Novel Gene Expression Patterns in Drosophila melanogaster. Genetics 2020; 216:79-93. [PMID: 32737121 PMCID: PMC7463294 DOI: 10.1534/genetics.120.303515] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 05/06/2020] [Accepted: 07/27/2020] [Indexed: 12/14/2022] Open
Abstract
Transcriptomes may evolve by multiple mechanisms, including the evolution of novel genes, the evolution of transcript abundance, and the evolution of cell, tissue, or organ expression patterns. Here, we focus on the last of these mechanisms in an investigation of tissue and organ shifts in gene expression in Drosophila melanogaster. In contrast to most investigations of expression evolution, we seek to provide a framework for understanding the mechanisms of novel expression patterns on a short population genetic timescale. To do so, we generated population samples of D. melanogaster transcriptomes from five tissues: accessory gland, testis, larval salivary gland, female head, and first-instar larva. We combined these data with comparable data from two outgroups to characterize gains and losses of expression, both polymorphic and fixed, in D. melanogaster. We observed a large number of gain- or loss-of-expression phenotypes, most of which were polymorphic within D. melanogaster. Several polymorphic, novel expression phenotypes were strongly influenced by segregating cis-acting variants. In support of previous literature on the evolution of novelties functioning in male reproduction, we observed many more novel expression phenotypes in the testis and accessory gland than in other tissues. Additionally, genes showing novel expression phenotypes tend to exhibit greater tissue-specific expression. Finally, in addition to qualitatively novel expression phenotypes, we identified genes exhibiting major quantitative expression divergence in the D. melanogaster lineage.
Affiliation(s)
- Julie M Cridland
  - Department of Evolution and Ecology, University of California, Davis, California 95616
- Alex C Majane
  - Department of Evolution and Ecology, University of California, Davis, California 95616
- Hayley K Sheehy
  - Department of Evolution and Ecology, University of California, Davis, California 95616
- David J Begun
  - Department of Evolution and Ecology, University of California, Davis, California 95616
6
Dukler N, Huang YF, Siepel A. Phylogenetic Modeling of Regulatory Element Turnover Based on Epigenomic Data. Mol Biol Evol 2020; 37:2137-2152. [PMID: 32176292 PMCID: PMC7306682 DOI: 10.1093/molbev/msaa073] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 12/17/2022] Open
Abstract
Evolutionary changes in gene expression are often driven by gains and losses of cis-regulatory elements (CREs). The dynamics of CRE evolution can be examined using multispecies epigenomic data, but so far such analyses have generally been descriptive and model-free. Here, we introduce a probabilistic modeling framework for the evolution of CREs that operates directly on raw chromatin immunoprecipitation and sequencing (ChIP-seq) data and fully considers the phylogenetic relationships among species. Our framework includes a phylogenetic hidden Markov model, called epiPhyloHMM, for identifying the locations of multiply aligned CREs, and a combined phylogenetic and generalized linear model, called phyloGLM, for accounting for the influence of a rich set of genomic features in describing their evolutionary dynamics. We apply these methods to previously published ChIP-seq data for the H3K4me3 and H3K27ac histone modifications in liver tissue from nine mammals. We find that enhancers are gained and lost during mammalian evolution at about twice the rate of promoters, and that turnover rates are negatively correlated with DNA sequence conservation, expression level, and tissue breadth, and positively correlated with distance from the transcription start site, consistent with previous findings. In addition, we find that the predicted dosage sensitivity of target genes positively correlates with DNA sequence constraint in CREs but not with turnover rates, perhaps owing to differences in the effect sizes of the relevant mutations. Altogether, our probabilistic modeling framework enables a variety of powerful new analyses.
Affiliation(s)
- Noah Dukler
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
  - Physiology, Biophysics, and Systems Biology, Weill Cornell Medical College, New York, NY
- Yi-Fei Huang
  - Department of Biology and Huck Institute of Life Sciences, Pennsylvania State University, University Park, PA
- Adam Siepel
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
7
Li L, Zhang S, Li LM. Dual Eigen-modules of Cis-Element Regulation Profiles and Selection of Cognition-Language Eigen-direction along Evolution in Hominidae. Mol Biol Evol 2020; 37:1679-1693. [PMID: 32068872 PMCID: PMC10615152 DOI: 10.1093/molbev/msaa036] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/15/2022] Open
Abstract
To understand the genomic basis accounting for the phenotypic differences between human and apes, we compare the matrices consisting of the cis-element frequencies in the proximal regulatory regions of their genomes. One such frequency matrix is represented by a robust singular value decomposition. For each singular value, the negative and positive ends of the sorted motif eigenvector correspond to the dual ends of the sorted gene eigenvector, respectively, comprising a dual eigen-module defined by cis-regulatory element frequencies (CREF). The CREF eigen-modules at levels 1, 2, 3, and 6 are highly conserved across humans, chimpanzees, and orangutans. The key biological processes embedded in the top three CREF eigen-modules are reproduction versus embryogenesis, fetal maturation versus immune system, and stress responses versus mitosis. Although the divergence at the nucleotide level between the chimpanzee and human genome was small, their cis-element frequency matrices crossed a singularity point, at which the fourth and fifth singular values were identical. The CREF eigen-modules corresponding to the fourth and fifth singular values were reorganized along the evolution from apes to human. Interestingly, the fourth sorted gene eigenvector encodes the phenotypes unique to human such as long-term memory, language development, and social behavior. The number of motifs present on Alu elements increases substantially at the fourth level. The motif analysis together with the cases of human-specific Alu insertions suggests that mutations related to Alu elements play a critical role in the evolution of the human-phenotypic gene eigenvector.
Affiliation(s)
- Liang Li
  - National Center of Mathematics and Interdisciplinary Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences
- Sheng Zhang
  - National Center of Mathematics and Interdisciplinary Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences
- Lei M Li
  - National Center of Mathematics and Interdisciplinary Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China
8
Peng PC, Khoueiry P, Girardot C, Reddington JP, Garfield DA, Furlong EEM, Sinha S. The Role of Chromatin Accessibility in cis-Regulatory Evolution. Genome Biol Evol 2020; 11:1813-1828. [PMID: 31114856 PMCID: PMC6601868 DOI: 10.1093/gbe/evz103] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Accepted: 05/13/2019] [Indexed: 02/07/2023] Open
Abstract
Transcription factor (TF) binding is determined by sequence as well as chromatin accessibility. Although the role of accessibility in shaping TF-binding landscapes is well recorded, its role in evolutionary divergence of TF binding, which in turn can alter cis-regulatory activities, is not well understood. In this work, we studied the evolution of genome-wide binding landscapes of five major TFs in the core network of mesoderm specification, between Drosophila melanogaster and Drosophila virilis, and examined its relationship to accessibility and sequence-level changes. We generated chromatin accessibility data from three important stages of embryogenesis in both Drosophila melanogaster and Drosophila virilis and recorded conservation and divergence patterns. We then used multivariable models to correlate accessibility and sequence changes to TF-binding divergence. We found that accessibility changes can in some cases, for example, for the master regulator Twist and for earlier developmental stages, more accurately predict binding change than is possible using TF-binding motif changes between orthologous enhancers. Accessibility changes also explain a significant portion of the codivergence of TF pairs. We noted that accessibility and motif changes offer complementary views of the evolution of TF binding and developed a combined model that captures the evolutionary data much more accurately than either view alone. Finally, we trained machine learning models to predict enhancer activity from TF binding and used these functional models to argue that motif and accessibility-based predictors of TF-binding change can substitute for experimentally measured binding change, for the purpose of predicting evolutionary changes in enhancer activity.
Affiliation(s)
- Pei-Chen Peng
  - Department of Computer Science, University of Illinois at Urbana-Champaign
  - Center for Bioinformatics and Functional Genomics, Department of Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA
- Pierre Khoueiry
  - European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
  - American University of Beirut (AUB), Department of Biochemistry and Molecular Genetics, Beirut, Lebanon
- Charles Girardot
  - European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- James P Reddington
  - European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- David A Garfield
  - European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
  - IRI-Life Sciences, Humboldt Universität zu Berlin, Berlin, Germany
- Eileen E M Furlong
  - European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- Saurabh Sinha
  - Department of Computer Science, University of Illinois at Urbana-Champaign
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign
9
Bogan SN, Place SP. Accelerated evolution at chaperone promoters among Antarctic notothenioid fishes. BMC Evol Biol 2019; 19:205. [PMID: 31694524 PMCID: PMC6836667 DOI: 10.1186/s12862-019-1524-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 05/15/2019] [Accepted: 10/01/2019] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND: Antarctic fishes of the Notothenioidei suborder constitutively upregulate multiple inducible chaperones, a highly derived adaptation that preserves proteostasis in extreme cold, and represent a system for studying the evolution of gene frontloading. We screened for Hsf1-binding sites, as Hsf1 is a master transcription factor of the heat shock response, and highly-conserved non-coding elements within proximal promoters of chaperone genes across 10 Antarctic notothens, 2 subpolar notothens, and 17 perciform fishes. We employed phylogenetic models of molecular evolution to determine whether (i) changes in motifs associated with Hsf1-binding and/or (ii) relaxed purifying selection or exaptation at ancestral cis-regulatory elements coincided with the evolution of chaperone frontloading in Antarctic notothens. RESULTS: Antarctic notothens exhibited significantly fewer Hsf1-binding sites per bp at chaperone promoters than subpolar notothens and Serranoidei, the most closely-related suborder to Notothenioidei included in this study. 90% of chaperone promoters exhibited accelerated substitution rates among Antarctic notothens relative to other perciformes. The proportion of bases undergoing accelerated evolution (i) was significantly greater in Antarctic notothens than in subpolar notothens and Perciformes in 70% of chaperone genes and (ii) increased among bases that were more conserved among perciformes. Lastly, we detected evidence of relaxed purifying selection and exaptation acting on ancestrally conserved cis-regulatory elements in the Antarctic notothen lineage and its major branches. CONCLUSION: A large degree of turnover has occurred in Notothenioidei at chaperone promoter regions that are conserved among perciform fishes following adaptation to the cooling of the Southern Ocean. Additionally, derived reductions in Hsf1-binding site frequency suggest cis-regulatory modifications to the classical heat shock response. Of note, turnover events within chaperone promoters were less frequent in the ancestral node of Antarctic notothens relative to younger Antarctic lineages. This suggests that cis-regulatory divergence at chaperone promoters may be greater between Antarctic notothen lineages than between subpolar and Antarctic clades. These findings demonstrate that strong selective forces have acted upon cis-regulatory elements of chaperone genes among Antarctic notothens.
Affiliation(s)
- Samuel N Bogan
  - Department of Biology, Sonoma State University, Rohnert Park, CA, 94928, USA
  - Department of Ecology, Evolution and Marine Biology, University of California, Santa Barbara, CA, 93106, USA
- Sean P Place
  - Department of Biology, Sonoma State University, Rohnert Park, CA, 94928, USA
10
Boltz TA, Khuri S, Wuchty S. Promoter conservation in HDACs points to functional implications. BMC Genomics 2019; 20:613. [PMID: 31351464 PMCID: PMC6660948 DOI: 10.1186/s12864-019-5973-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 10/25/2018] [Accepted: 07/12/2019] [Indexed: 01/05/2023] Open
Abstract
Background: Histone deacetylases (HDACs) are the proteins responsible for removing the acetyl group from lysine residues of core histones in chromosomes, a crucial component of gene regulation. Eleven known HDACs exist in humans and most other vertebrates. While the basic function of HDACs has been well characterized and new discoveries are still being made, the transcriptional regulation of their corresponding genes is still poorly understood. Results: Here, we conducted a computational analysis of the eleven HDAC promoter sequences in 25 vertebrate species to determine whether transcription factor binding sites (TFBSs) are conserved in HDAC evolution, and if so, whether they provide useful information about HDAC expression and function. Furthermore, we used tissue-specific information of transcription factors to investigate the potential expression patterns of HDACs in different human tissues based on their transcription factor binding sites. We found that the TFBS profiles of most of the HDACs were well conserved in closely related species for all HDAC promoters except HDAC7 and HDAC10. HDAC5 had particularly strong conservation across over half of the species studied, with nearly identical profiles in the primate species. Our comparisons of TFBSs with the tissue-specific gene expression profiles of their corresponding TFs showed that most HDACs had the ability to be ubiquitously expressed. A few HDAC promoters exhibited the potential for preferential expression in certain tissues, most notably HDAC11 in gall bladder, while HDAC9 seemed to have less propensity for expression in the nervous system. Conclusions: In general, we found evolutionary conservation in HDAC promoters that seems to be more prominent for the ubiquitously expressed HDACs. In turn, when conservation did not follow usual phylogeny, human TFBS patterns indicated possible functional relevance. While we found that HDACs appear to be uniformly expressed, we confirm that the functional differences in HDACs may be less a matter of location of activity than a question of which proteins and which acetyl groups they may be acting on. Electronic supplementary material: The online version of this article (10.1186/s12864-019-5973-x) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Toni A Boltz
  - Department of Computer Science, University of Miami, Coral Gables, FL, USA
  - Present address: University of California, Los Angeles, Los Angeles, CA, USA
- Sawsan Khuri
  - University of Exeter College of Medicine and Health, Exeter, UK
- Stefan Wuchty
  - Department of Computer Science, University of Miami, Coral Gables, FL, USA
  - Department of Biology, University of Miami, Coral Gables, FL, USA
  - Center of Computational Science, University of Miami, Coral Gables, FL, USA
  - Sylvester Comprehensive Cancer Center, University of Miami, Miami, FL, USA
11
Nguyen DX, Sakaguchi T, Nakazawa T, Sakamoto M, Honda Y. A 14-bp stretch plays a critical role in regulating gene expression from β1-tubulin promoters of basidiomycetes. Curr Genet 2019; 66:217-228. [DOI: 10.1007/s00294-019-01014-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 02/20/2019] [Revised: 06/27/2019] [Accepted: 07/03/2019] [Indexed: 11/25/2022]
12
Kinney JB, McCandlish DM. Massively Parallel Assays and Quantitative Sequence-Function Relationships. Annu Rev Genomics Hum Genet 2019; 20:99-127. [PMID: 31091417 DOI: 10.1146/annurev-genom-083118-014845] [Citation(s) in RCA: 76] [Impact Index Per Article: 15.2] [Indexed: 11/09/2022]
Abstract
Over the last decade, a rich variety of massively parallel assays have revolutionized our understanding of how biological sequences encode quantitative molecular phenotypes. These assays include deep mutational scanning, high-throughput SELEX, and massively parallel reporter assays. Here, we review these experimental methods and how the data they produce can be used to quantitatively model sequence-function relationships. In doing so, we touch on a diverse range of topics, including the identification of clinically relevant genomic variants, the modeling of transcription factor binding to DNA, the functional and evolutionary landscapes of proteins, and cis-regulatory mechanisms in both transcription and mRNA splicing. We further describe a unified conceptual framework and a core set of mathematical modeling strategies that studies in these diverse areas can make use of. Finally, we highlight key aspects of experimental design and mathematical modeling that are important for the results of such studies to be interpretable and reproducible.
Affiliation(s)
- Justin B Kinney
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- David M McCandlish
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
13
del Olmo Toledo V, Puccinelli R, Fordyce PM, Pérez JC. Diversification of DNA binding specificities enabled SREBP transcription regulators to expand the repertoire of cellular functions that they govern in fungi. PLoS Genet 2018; 14:e1007884. [PMID: 30596634 PMCID: PMC6329520 DOI: 10.1371/journal.pgen.1007884] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 08/23/2018] [Revised: 01/11/2019] [Accepted: 12/08/2018] [Indexed: 01/08/2023] Open
Abstract
The Sterol Regulatory Element Binding Proteins (SREBPs) are basic-helix-loop-helix transcription regulators that control the expression of sterol biosynthesis genes in higher eukaryotes and some fungi. Surprisingly, SREBPs do not regulate sterol biosynthesis in the ascomycete yeasts (Saccharomycotina) as this role was handed off to an unrelated transcription regulator in this clade. The SREBPs, nonetheless, expanded in fungi such as the ascomycete yeasts Candida spp., raising questions about their role and evolution in these organisms. Here we report that the fungal SREBPs diversified their DNA binding preferences concomitantly with an expansion in function. We establish that several branches of fungal SREBPs preferentially bind non-palindromic DNA sequences, in contrast to the palindromic DNA motifs recognized by most basic-helix-loop-helix proteins (including SREBPs) in higher eukaryotes. Reconstruction and biochemical characterization of the likely ancestor protein suggest that an intrinsic DNA binding promiscuity in the family was resolved by alternative mechanisms in different branches of fungal SREBPs. Furthermore, we show that two SREBPs in the human commensal yeast Candida albicans drive a transcriptional cascade that inhibits a morphological switch under anaerobic conditions. Preventing this morphological transition enhances C. albicans colonization of the mammalian intestine, the fungus' natural niche. Thus, our results illustrate how diversification in DNA binding preferences enabled the functional expansion of a family of eukaryotic transcription regulators.
Collapse
Affiliation(s)
- Valentina del Olmo Toledo
- Interdisciplinary Center for Clinical Research, University Hospital Würzburg, Würzburg, Germany
- Institute for Molecular Infection Biology, University Würzburg, Würzburg, Germany
| | - Robert Puccinelli
- Department of Genetics, Stanford University, Stanford, California, United States of America
- Chan Zuckerberg Biohub, San Francisco, California, United States of America
| | - Polly M. Fordyce
- Department of Genetics, Stanford University, Stanford, California, United States of America
- Chan Zuckerberg Biohub, San Francisco, California, United States of America
- Department of Bioengineering, Stanford University, Stanford, California, United States of America
- Stanford CheM-H Institute, Stanford University, Stanford, California, United States of America
| | - J. Christian Pérez
- Interdisciplinary Center for Clinical Research, University Hospital Würzburg, Würzburg, Germany
- Institute for Molecular Infection Biology, University Würzburg, Würzburg, Germany
|
14
|
Nettling M, Treutler H, Cerquides J, Grosse I. Unrealistic phylogenetic trees may improve phylogenetic footprinting. Bioinformatics 2018; 33:1639-1646. [PMID: 28130227 PMCID: PMC5447242 DOI: 10.1093/bioinformatics/btx033] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 03/02/2016] [Accepted: 01/19/2017] [Indexed: 01/10/2023]
Abstract
Motivation: The computational investigation of DNA binding motifs from binding sites is one of the classic tasks in bioinformatics and a prerequisite for understanding gene regulation as a whole. Due to the development of sequencing technologies and the increasing number of available genomes, approaches based on phylogenetic footprinting become increasingly attractive. Phylogenetic footprinting requires phylogenetic trees with attached substitution probabilities for quantifying the evolution of binding sites, but these trees and substitution probabilities are typically not known and cannot be estimated easily. Results: Here, we investigate the influence of phylogenetic trees with different substitution probabilities on the classification performance of phylogenetic footprinting using synthetic and real data. For synthetic data we find that the classification performance is highest when the substitution probability used for phylogenetic footprinting is similar to that used for data generation. For real data, however, we typically find that the classification performance of phylogenetic footprinting surprisingly increases with increasing substitution probabilities and is often highest for unrealistically high substitution probabilities close to one. This finding suggests that choosing realistic model assumptions might not always yield optimal predictions in general and that choosing unrealistically high substitution probabilities close to one might actually improve the classification performance of phylogenetic footprinting. Availability and Implementation: The proposed PF is implemented in JAVA and can be downloaded from https://github.com/mgledi/PhyFoo. Supplementary information: Supplementary data are available at Bioinformatics online.
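To make the role of the substitution probability concrete, the following is a minimal sketch, not the PhyFoo implementation. It assumes a star tree and an F81-style model in which each species keeps the ancestral base with probability 1 - p and otherwise redraws it from the motif column's base distribution. The function name `column_likelihood` and the example distribution are hypothetical.

```python
def column_likelihood(observed, base_probs, sub_prob):
    """Likelihood of one alignment column under a star tree (illustrative).

    observed   -- string with one base per species, e.g. "AAC"
    base_probs -- dict mapping base to its equilibrium probability in this
                  motif column (also used as the ancestral distribution)
    sub_prob   -- probability that a species redraws its base (assumed model)
    """
    total = 0.0
    for ancestor, p_anc in base_probs.items():
        term = p_anc
        for base in observed:
            # Keep the ancestral base with probability (1 - sub_prob),
            # otherwise draw a fresh base from base_probs.
            keep = (1.0 - sub_prob) if base == ancestor else 0.0
            term *= keep + sub_prob * base_probs[base]
        total += term
    return total


# Hypothetical motif column that strongly prefers A.
probs = {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1}
print(column_likelihood("AAA", probs, 0.0))
print(column_likelihood("AC", probs, 1.0))
```

Under these assumptions, a substitution probability of one makes the species conditionally independent, so the column likelihood reduces to a product of per-species base probabilities. This corresponds to treating the aligned orthologs as independent samples from the motif.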
Affiliation(s)
- Martin Nettling
- Institute of Computer Science, Martin Luther University Halle-Wittenberg, Halle, Germany
| | - Hendrik Treutler
- Department of Stress and Developmental Biology, Leibniz Institute of Plant Biochemistry, Halle, Germany
| | - Jesus Cerquides
- Institut d'Investigació en Intel·ligència Artificial, IIIA-CSIC, Campus UAB, Cerdanyola, Spain
| | - Ivo Grosse
- Institute of Computer Science, Martin Luther University Halle-Wittenberg, Halle, Germany.,German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
|
15
|
Dynamic evolution of regulatory element ensembles in primate CD4+ T cells. Nat Ecol Evol 2018; 2:537-548. [PMID: 29379187 DOI: 10.1038/s41559-017-0447-5] [Citation(s) in RCA: 44] [Impact Index Per Article: 7.3] [Received: 10/31/2017] [Accepted: 12/08/2017] [Indexed: 12/12/2022]
Abstract
How evolutionary changes at enhancers affect the transcription of target genes remains an important open question. Previous comparative studies of gene expression have largely measured the abundance of messenger RNA, which is affected by post-transcriptional regulatory processes, hence limiting inferences about the mechanisms underlying expression differences. Here, we directly measured nascent transcription in primate species, allowing us to separate transcription from post-transcriptional regulation. We used precision run-on and sequencing to map RNA polymerases in resting and activated CD4+ T cells in multiple human, chimpanzee and rhesus macaque individuals, with rodents as outgroups. We observed general conservation in coding and non-coding transcription, punctuated by numerous differences between species, particularly at distal enhancers and non-coding RNAs. Genes regulated by larger numbers of enhancers are more frequently transcribed at evolutionarily stable levels, despite reduced conservation at individual enhancers. Adaptive nucleotide substitutions are associated with lineage-specific transcription and at one locus, SGPP2, we predict and experimentally validate that multiple substitutions contribute to human-specific transcription. Collectively, our findings suggest a pervasive role for evolutionary compensation across ensembles of enhancers that jointly regulate target genes.
|
16
|
Yi JK, Xu R, Jeong E, Mileva I, Truman JP, Lin CL, Wang K, Snider J, Wen S, Obeid LM, Hannun YA, Mao C. Aging-related elevation of sphingoid bases shortens yeast chronological life span by compromising mitochondrial function. Oncotarget 2018; 7:21124-44. [PMID: 27008706 PMCID: PMC5008273 DOI: 10.18632/oncotarget.8195] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 02/09/2016] [Accepted: 03/04/2016] [Indexed: 01/08/2023]
Abstract
Sphingoid bases (SBs), as bioactive sphingolipids, have been implicated in aging in yeast. However, we know neither how SBs are regulated during yeast aging nor how they, in turn, regulate it. Herein, we demonstrate that the yeast alkaline ceramidases (YPC1 and YDC1) and SB kinases (LCB4 and LCB5) cooperate in regulating SBs during the aging process and that SBs shorten chronological life span (CLS) by compromising mitochondrial functions. With a lipidomics approach, we found that SBs increased in a time-dependent manner during yeast aging. We also demonstrated that among the enzymes known to be responsible for the metabolism of SBs, YPC1 was upregulated whereas LCB4/5 were downregulated in the course of aging. This inverse regulation of YPC1 and LCB4/5 led to the aging-related upregulation of SBs in yeast and a reduction in CLS. With a proteomics-based approach (SILAC), we revealed that increased SBs altered the levels of proteins related to mitochondria. Further mechanistic studies demonstrated that increased SBs inhibited mitochondrial fusion and caused fragmentation, resulting in decreases in mtDNA copy numbers, ATP levels, mitochondrial membrane potentials, and oxygen consumption. Taken together, these results suggest that increased SBs mediate the aging process by impairing mitochondrial structural integrity and functions.
Affiliation(s)
- Jae Kyo Yi
- Graduate Program in Molecular and Cellular Biology, Stony Brook University, Stony Brook, NY, USA.,Department of Medicine, Stony Brook University, Stony Brook, NY, USA.,Stony Brook Cancer Center, Stony Brook, NY, USA
| | - Ruijuan Xu
- Department of Medicine, Stony Brook University, Stony Brook, NY, USA.,Stony Brook Cancer Center, Stony Brook, NY, USA
| | - Eunmi Jeong
- Department of Medicine, Stony Brook University, Stony Brook, NY, USA.,Stony Brook Cancer Center, Stony Brook, NY, USA
| | - Izolda Mileva
- Department of Medicine, Stony Brook University, Stony Brook, NY, USA.,Stony Brook Cancer Center, Stony Brook, NY, USA
| | | | - Chih-Li Lin
- Graduate Program in Molecular and Cellular Biology, Stony Brook University, Stony Brook, NY, USA.,Department of Medicine, Stony Brook University, Stony Brook, NY, USA.,Stony Brook Cancer Center, Stony Brook, NY, USA
| | - Kai Wang
- Department of Medicine, Stony Brook University, Stony Brook, NY, USA.,Stony Brook Cancer Center, Stony Brook, NY, USA
| | - Justin Snider
- Graduate Program in Molecular and Cellular Biology, Stony Brook University, Stony Brook, NY, USA.,Department of Medicine, Stony Brook University, Stony Brook, NY, USA
| | - Sally Wen
- Department of Medicine, Stony Brook University, Stony Brook, NY, USA.,Stony Brook Cancer Center, Stony Brook, NY, USA
| | - Lina M Obeid
- Department of Medicine, Stony Brook University, Stony Brook, NY, USA.,Stony Brook Cancer Center, Stony Brook, NY, USA.,Northport Veterans Affairs Medical Center, Northport, NY, USA
| | - Yusuf A Hannun
- Department of Medicine, Stony Brook University, Stony Brook, NY, USA.,Stony Brook Cancer Center, Stony Brook, NY, USA
| | - Cungui Mao
- Department of Medicine, Stony Brook University, Stony Brook, NY, USA.,Stony Brook Cancer Center, Stony Brook, NY, USA
|
17
|
Characterization of dFOXO binding sites upstream of the Insulin Receptor P2 promoter across the Drosophila phylogeny. PLoS One 2017; 12:e0188357. [PMID: 29200426 PMCID: PMC5714339 DOI: 10.1371/journal.pone.0188357] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 07/06/2017] [Accepted: 11/06/2017] [Indexed: 01/01/2023]
Abstract
The insulin/TOR signal transduction pathway plays a critical role in determining such important traits as body and organ size, metabolic homeostasis and life span. Although this pathway is highly conserved across the animal kingdom, the affected traits can exhibit important differences even between closely related species. Evolutionary studies of regulatory regions require the reliable identification of transcription factor binding sites. Here we have focused on the Insulin Receptor (InR) expression from its P2 promoter in the Drosophila genus, which in D. melanogaster is up-regulated by hypophosphorylated Drosophila FOXO (dFOXO). We have finely characterized these transcription factor binding sites in vitro along the 1.3 kb region upstream of the InR P2 promoter in five Drosophila species. Moreover, we have tested the effect of mutations in the characterized dFOXO sites of D. melanogaster in transgenic flies. The number of experimentally established binding sites varies across the 1.3 kb region of any particular species, and their distribution also differs among species. In D. melanogaster, InR expression from P2 is differentially affected by dFOXO binding sites at the proximal and distal halves of the species' 1.3 kb fragment. The observed uneven distribution of binding sites across this fragment might underlie their differential contribution to regulating InR transcription.
|
18
|
Wang Y, Ung MH, Xia T, Cheng W, Cheng C. Cancer cell line specific co-factors modulate the FOXM1 cistrome. Oncotarget 2017; 8:76498-76515. [PMID: 29100329 PMCID: PMC5652723 DOI: 10.18632/oncotarget.20405] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 04/12/2017] [Accepted: 08/14/2017] [Indexed: 12/11/2022]
Abstract
ChIP-seq has been commonly applied to identify genomic occupation of transcription factors (TFs) in a context-specific manner. It is generally assumed that a TF should have similar binding patterns in cells from the same or closely related tissues. Surprisingly, this assumption has not been carefully examined. To this end, we systematically compared the genomic binding of the cell cycle regulator FOXM1 in eight cell lines from seven different human tissues at the levels of binding signal, peaks and target genes. We found that FOXM1 binding sites in the ER-positive breast cancer cell line MCF-7 are distinct not only from those in non-breast cell lines but also from those in MDA-MB-231, an ER-negative breast cancer cell line. However, binding sites in MDA-MB-231 and non-breast cell lines were highly consistent. The recruitment of estrogen receptor alpha (ERα) caused the unique FOXM1 binding patterns in MCF-7. Moreover, the activity of FOXM1 in MCF-7 reflects the regulatory functions of ERα, while in MDA-MB-231 and non-breast cell lines, FOXM1 activities regulate cell proliferation. Our results suggest that tissue similarity, in some specific contexts, does not hold precedence over TF-cofactor interactions in determining transcriptional states and that the genomic binding of a TF can be dramatically affected by a particular co-factor under certain conditions.
Affiliation(s)
- Yue Wang
- School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China.,Department of Molecular and Systems Biology, Geisel School of Medicine at Dartmouth, Hanover, NH 03755, USA
| | - Matthew H Ung
- Department of Molecular and Systems Biology, Geisel School of Medicine at Dartmouth, Hanover, NH 03755, USA
| | - Tian Xia
- School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
| | - Wenqing Cheng
- School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
| | - Chao Cheng
- Department of Molecular and Systems Biology, Geisel School of Medicine at Dartmouth, Hanover, NH 03755, USA.,Norris Cotton Cancer Center, Geisel School of Medicine at Dartmouth, Lebanon, NH 03766, USA.,Department of Biomedical Data Sciences, Geisel School of Medicine at Dartmouth, Lebanon, NH 03766, USA
|
19
|
Cronjé HT, Nienaber-Rousseau C, Zandberg L, Chikowore T, de Lange Z, van Zyl T, Pieters M. Candidate gene analysis of the fibrinogen phenotype reveals the importance of polygenic co-regulation. Matrix Biol 2017; 60-61:16-26. [DOI: 10.1016/j.matbio.2016.10.005] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 06/07/2016] [Revised: 09/20/2016] [Accepted: 10/13/2016] [Indexed: 12/19/2022]
|
20
|
Almeida P, Barbosa R, Bensasson D, Gonçalves P, Sampaio JP. Adaptive divergence in wine yeasts and their wild relatives suggests a prominent role for introgressions and rapid evolution at noncoding sites. Mol Ecol 2017; 26:2167-2182. [PMID: 28231394 DOI: 10.1111/mec.14071] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 09/29/2016] [Revised: 02/13/2017] [Accepted: 02/14/2017] [Indexed: 12/17/2022]
Abstract
In Saccharomyces cerevisiae, the main yeast in wine fermentation, the opportunity to examine divergence at the molecular level between a domesticated lineage and its wild counterpart arose recently due to the identification of the closest relatives of wine strains, a wild population associated with Mediterranean oaks. As genomic data are available for a considerable number of representatives belonging to both groups, we used population genomics to estimate the degree and distribution of nucleotide variation between wine yeasts and their closest wild relatives. We found widespread genomewide divergence, particularly at noncoding sites, which, together with above average divergence in trans-acting DNA binding proteins, may suggest an important role for divergence at the level of transcriptional regulation. Nine outlier regions putatively under strong divergent selection were highlighted by a genomewide scan under stringent conditions. Several cases of introgressions, originating in the sibling species Saccharomyces paradoxus, were also identified in the Mediterranean oak population. FZF1 and SSU1, mostly known for conferring sulphite resistance in wine yeasts, were among the introgressed genes, although not fixed. Because the introgressions detected in our study are not found in wine strains, we hypothesize that ongoing divergent ecological selection segregates the two forms between the different niches. Together, our results provide a first insight into the extent and kind of divergence between wine yeasts and their closest wild relatives.
Affiliation(s)
- Pedro Almeida
- Departamento de Ciências da Vida, UCIBIO-REQUIMTE, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, 2829-516, Caparica, Portugal
| | - Raquel Barbosa
- Departamento de Ciências da Vida, UCIBIO-REQUIMTE, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, 2829-516, Caparica, Portugal
| | - Douda Bensasson
- Department of Plant Biology, University of Georgia, Athens, GA, 30602, USA.,Institute of Bioinformatics, University of Georgia, Athens, GA, 30602, USA
| | - Paula Gonçalves
- Departamento de Ciências da Vida, UCIBIO-REQUIMTE, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, 2829-516, Caparica, Portugal
| | - José Paulo Sampaio
- Departamento de Ciências da Vida, UCIBIO-REQUIMTE, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, 2829-516, Caparica, Portugal
|
21
|
Abstract
Alterations in regulatory networks contribute to evolutionary change. Transcriptional networks are reconfigured by changes in the binding specificity of transcription factors and their cognate sites. The evolution of RNA-protein regulatory networks is far less understood. The PUF (Pumilio and FBF) family of RNA regulatory proteins controls the translation, stability, and movements of hundreds of mRNAs in a single species. We probe the evolution of PUF-RNA networks by direct identification of the mRNAs bound to PUF proteins in budding and filamentous fungi and by computational analyses of orthologous RNAs from 62 fungal species. Our findings reveal that PUF proteins gain and lose mRNAs with related and emergent biological functions during evolution. We demonstrate at least two independent rewiring events for PUF3 orthologs, independent but convergent evolution of PUF4/5 binding specificity and the rewiring of the PUF4/5 regulons in different fungal lineages. These findings demonstrate plasticity in RNA regulatory networks and suggest ways in which their rewiring occurs.
|
22
|
Lim JH, Latysheva NS, Iggo RD, Barker D. Cluster Analysis of p53 Binding Site Sequences Reveals Subsets with Different Functions. Cancer Inform 2016; 15:199-209. [PMID: 27812278 PMCID: PMC5081245 DOI: 10.4137/cin.s39968] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2016] [Revised: 08/31/2016] [Accepted: 09/09/2016] [Indexed: 11/05/2022]
Abstract
p53 is an important regulator of cell cycle arrest, senescence, apoptosis and metabolism, and is frequently mutated in tumors. It functions as a tetramer, where each component dimer binds to a decameric DNA region known as a response element. We identify p53 binding site subtypes and examine the functional and evolutionary properties of these subtypes. We start with over 1700 known binding sites and, with no prior labeling, identify two sets of response elements by unsupervised clustering. When combined, they give rise to three types of p53 binding sites. We find that probabilistic and alignment-based assessments of cross-species conservation show no strong evidence of differential conservation between types of binding sites. In contrast, functional analysis of the genes most proximal to the binding sites provides strong bioinformatic evidence of functional differentiation between the three types of binding sites. Our results are consistent with recent structural data identifying two conformations of the L1 loop in the DNA binding domain, suggesting that they reflect biologically meaningful groups imposed by the p53 protein structure.
Affiliation(s)
- Ji-Hyun Lim
- School of Biology, University of St Andrews, St Andrews, UK
- School of Medicine, University of St Andrews, St Andrews, UK
- Current address: Alacris Theranostics GmbH, Berlin, Germany
| | - Natasha S. Latysheva
- School of Biology, University of St Andrews, St Andrews, UK
- Current address: MRC Laboratory of Molecular Biology, Cambridge, UK
| | - Richard D. Iggo
- School of Medicine, University of St Andrews, St Andrews, UK
- INSERM Unit U1218, University of Bordeaux, Institut Bergonie, Bordeaux, France
| | - Daniel Barker
- School of Biology, University of St Andrews, St Andrews, UK
- Current address: Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
|
23
|
Functional identification and evolutionary conservation of the yme1L1 mitochondrial integrity gene promoter. GENE REPORTS 2016. [DOI: 10.1016/j.genrep.2016.07.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
|
24
|
Osada N. Genetic diversity in humans and non-human primates and its evolutionary consequences. Genes Genet Syst 2016; 90:133-45. [PMID: 26510568 DOI: 10.1266/ggs.90.133] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Indexed: 11/23/2022]
Abstract
Genetic diversity is a key parameter in population genetics and is important for understanding the process of evolution and for the development of appropriate conservation strategies. Recent advances in sequencing technology have enabled the measurement of genetic diversity of various organisms at the nucleotide level and on a genome-wide scale, yielding more precise estimates than were previously achievable. In this review, I have compiled and summarized the estimates of genetic diversity in humans and non-human primates based on recent genome-wide studies. Although studies on population genetics demonstrated fluctuations in population sizes over time, general patterns have emerged. As shown previously, genetic diversity in humans is one of the lowest among primates; however, certain other primate species exhibit genetic diversity that is comparable to or even lower than that in humans. There exists greater than 10-fold variation in genetic diversity among primate species, and I found weak correlation with species fecundity but not with body or propagule size. I further discuss the potential evolutionary consequences of population size decline on the evolution of primate species. The level of genetic diversity negatively correlates with the ratio of non-synonymous to synonymous polymorphisms in a population, suggesting that proportionally greater numbers of slightly deleterious mutations segregate in small rather than large populations. Although population size decline is likely to promote the fixation of slightly deleterious mutations, there are molecular mechanisms, such as compensatory mutations at various molecular levels, which may prevent fitness decline at the population level. The effects of slightly deleterious mutations from theoretical and empirical studies and their relevance to conservation biology are also discussed in this review.
Affiliation(s)
- Naoki Osada
- Department of Population Genetics, National Institute of Genetics
|
25
|
Vincent BJ, Estrada J, DePace AH. The appeasement of Doug: a synthetic approach to enhancer biology. Integr Biol (Camb) 2016; 8:475-84. [DOI: 10.1039/c5ib00321k] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Indexed: 12/13/2022]
Affiliation(s)
- Ben J. Vincent
- Department of Systems Biology, Harvard Medical School, 200 Longwood Avenue, Boston, MA 02115, USA
| | - Javier Estrada
- Department of Systems Biology, Harvard Medical School, 200 Longwood Avenue, Boston, MA 02115, USA
| | - Angela H. DePace
- Department of Systems Biology, Harvard Medical School, 200 Longwood Avenue, Boston, MA 02115, USA
|
26
|
Gasch AP, Payseur BA, Pool JE. The Power of Natural Variation for Model Organism Biology. Trends Genet 2016; 32:147-154. [PMID: 26777596 PMCID: PMC4769656 DOI: 10.1016/j.tig.2015.12.003] [Citation(s) in RCA: 52] [Impact Index Per Article: 6.5] [Received: 09/15/2015] [Revised: 12/09/2015] [Accepted: 12/14/2015] [Indexed: 11/24/2022]
Abstract
Genetic background effects have long been recognized and, in some cases studied, but they are often viewed as a nuisance by molecular biologists. We suggest that genetic variation currently represents a critical frontier for molecular studies. Human genetics has seen a surge of interest in genetic variation and its contributions to disease, but insights into disease mechanisms are difficult since information about gene function is lacking. By contrast, model organism genetics has excelled at revealing molecular mechanisms of cellular processes, but often de-emphasizes genetic variation and its functional consequences. We argue that model organism biology would benefit from incorporating natural variation, both to capture how well laboratory lines exemplify the species they represent and to inform on molecular processes and their variability. Such a synthesis would also greatly expand the relevance of model systems for studies of complex trait variation, including disease.
Affiliation(s)
- Audrey P Gasch
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI 53706, USA.
| | - Bret A Payseur
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI 53706, USA.
| | - John E Pool
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI 53706, USA.
|
27
|
Bergen AC, Olsen GM, Fay JC. Divergent MLS1 Promoters Lie on a Fitness Plateau for Gene Expression. Mol Biol Evol 2016; 33:1270-9. [PMID: 26782997 PMCID: PMC4839218 DOI: 10.1093/molbev/msw010] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 12/13/2022]
Abstract
Qualitative patterns of gene activation and repression are often conserved despite an abundance of quantitative variation in expression levels within and between species. A major challenge to interpreting patterns of expression divergence is knowing which changes in gene expression affect fitness. To characterize the fitness effects of gene expression divergence, we placed orthologous promoters from eight yeast species upstream of malate synthase (MLS1) in Saccharomyces cerevisiae. As expected, we found these promoters varied in their expression level under activated and repressed conditions as well as in their dynamic response following loss of glucose repression. Despite these differences, only a single promoter driving near basal levels of expression caused a detectable loss of fitness. We conclude that the MLS1 promoter lies on a fitness plateau whereby even large changes in gene expression can be tolerated without a substantial loss of fitness.
Affiliation(s)
- Andrew C Bergen
- Molecular Genetics and Genomics Program, Washington University, St. Louis
| | | | - Justin C Fay
- Department of Genetics, Washington University, St. Louis.,Center for Genome Sciences and Systems Biology, Washington University, St. Louis
|
28
|
Abstract
Transcriptional control of gene expression requires interactions between the cis-regulatory elements (CREs) controlling gene promoters. We developed a sensitive computational method to identify CRE combinations with conserved spacing that does not require genome alignments. When applied to seven sensu stricto and sensu lato Saccharomyces species, 80% of the predicted interactions displayed some evidence of combinatorial transcriptional behavior in several existing datasets including: (1) chromatin immunoprecipitation data for colocalization of transcription factors, (2) gene expression data for coexpression of predicted regulatory targets, and (3) gene ontology databases for common pathway membership of predicted regulatory targets. We tested several predicted CRE interactions with chromatin immunoprecipitation experiments in a wild-type strain and strains in which a predicted cofactor was deleted. Our experiments confirmed that transcription factor (TF) occupancy at the promoters of the CRE combination target genes depends on the predicted cofactor while occupancy of other promoters is independent of the predicted cofactor. Our method has the additional advantage of identifying regulatory differences between species. By analyzing the S. cerevisiae and S. bayanus genomes, we identified differences in combinatorial cis-regulation between the species and showed that the predicted changes in gene regulation explain several of the species-specific differences seen in gene expression datasets. In some instances, the same CRE combinations appear to regulate genes involved in distinct biological processes in the two different species. The results of this research demonstrate that (1) combinatorial cis-regulation can be inferred by multi-genome analysis and (2) combinatorial cis-regulation can explain differences in gene expression between species.
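The idea of detecting motif pairs with conserved spacing without a genome alignment can be illustrated with a small sketch. This is not the authors' method: it simply assumes exact-match motifs, measures the gap between each upstream occurrence of motif A and each downstream occurrence of motif B in every species' promoter, and keeps the gaps shared by all species. The function names and the toy promoter sequences are hypothetical.

```python
import re


def spacings(seq, motif_a, motif_b):
    """Set of gaps (in bp) between the end of motif A and the start of motif B."""
    # Lookahead patterns find overlapping occurrences as well.
    starts_a = [m.start() for m in re.finditer("(?=" + re.escape(motif_a) + ")", seq)]
    starts_b = [m.start() for m in re.finditer("(?=" + re.escape(motif_b) + ")", seq)]
    gaps = set()
    for a in starts_a:
        end_a = a + len(motif_a)
        for b in starts_b:
            if b >= end_a:  # only count B downstream of A, no overlap
                gaps.add(b - end_a)
    return gaps


def conserved_spacings(promoters, motif_a, motif_b):
    """Gaps observed in every species; promoters maps species name to sequence."""
    per_species = [spacings(seq, motif_a, motif_b) for seq in promoters.values()]
    return set.intersection(*per_species) if per_species else set()


# Toy orthologous promoters (made up for illustration).
promoters = {
    "species_1": "AACACGTGTTTTGACTCAA",
    "species_2": "GGGCACGTGCCCTGACTC",
}
print(conserved_spacings(promoters, "CACGTG", "TGACTC"))
```

Because each promoter is scanned independently and only the gap sizes are compared, no alignment of the intervening sequence is needed, which is the property the abstract highlights.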
|
29
|
Tuğrul M, Paixão T, Barton NH, Tkačik G. Dynamics of Transcription Factor Binding Site Evolution. PLoS Genet 2015; 11:e1005639. [PMID: 26545200 PMCID: PMC4636380 DOI: 10.1371/journal.pgen.1005639] [Citation(s) in RCA: 68] [Impact Index Per Article: 7.6] [Received: 06/17/2015] [Accepted: 10/09/2015] [Indexed: 11/19/2022]
Abstract
Evolution of gene regulation is crucial for our understanding of the phenotypic differences between species, populations and individuals. Sequence-specific binding of transcription factors to the regulatory regions on the DNA is a key regulatory mechanism that determines gene expression and hence heritable phenotypic variation. We use a biophysical model for directional selection on gene expression to estimate the rates of gain and loss of transcription factor binding sites (TFBS) in finite populations under both point and insertion/deletion mutations. Our results show that these rates are typically slow for a single TFBS in an isolated DNA region, unless the selection is extremely strong. These rates decrease drastically with increasing TFBS length or increasingly specific protein-DNA interactions, making the evolution of sites longer than ∼ 10 bp unlikely on typical eukaryotic speciation timescales. Similarly, evolution converges to the stationary distribution of binding sequences very slowly, making the equilibrium assumption questionable. The availability of longer regulatory sequences in which multiple binding sites can evolve simultaneously, the presence of “pre-sites” or partially decayed old sites in the initial sequence, and biophysical cooperativity between transcription factors, can all facilitate gain of TFBS and reconcile theoretical calculations with timescales inferred from comparative genomics. Evolution has produced a remarkable diversity of living forms that manifests in qualitative differences as well as quantitative traits. An essential factor that underlies this variability is transcription factor binding sites, short pieces of DNA that control gene expression levels. Nevertheless, we lack a thorough theoretical understanding of the evolutionary times required for the appearance and disappearance of these sites. 
By combining a biophysically realistic model for how cells read out information in transcription factor binding sites with model for DNA sequence evolution, we explore these timescales and ask what factors crucially affect them. We find that the emergence of binding sites from a random sequence is generically slow under point and insertion/deletion mutational mechanisms. Strong selection, sufficient genomic sequence in which the sites can evolve, the existence of partially decayed old binding sites in the sequence, as well as certain biophysical mechanisms such as cooperativity, can accelerate the binding site gain times and make them consistent with the timescales suggested by comparative analyses of genomic data.
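A common way to express a biophysical read-out of a binding site is a mismatch-energy model with a sigmoidal occupancy. The sketch below is illustrative only and is not taken from the paper's code: it assumes each mismatch to the consensus adds a fixed energy `eps`, and that occupancy follows a Fermi-type function of the energy relative to a chemical potential `mu`. Both parameter values are arbitrary placeholders.

```python
import math


def mismatches(site, consensus):
    """Number of positions where the site differs from the consensus."""
    return sum(a != b for a, b in zip(site, consensus))


def binding_probability(site, consensus, eps=2.0, mu=4.0):
    """Occupancy of a site under a simple mismatch-energy model (assumed form).

    eps -- energy cost per mismatch, in units of kT (placeholder value)
    mu  -- chemical potential set by TF concentration (placeholder value)
    """
    energy = eps * mismatches(site, consensus)
    return 1.0 / (1.0 + math.exp(energy - mu))


consensus = "TGACTCAT"
for site in ("TGACTCAT", "TGACTCAA", "TGACTGAA", "AGTCTGAA"):
    print(site, mismatches(site, consensus), round(binding_probability(site, consensus), 3))
```

In this toy setting, longer sites or a larger `eps` shrink the set of sequences with appreciable occupancy, which gives some intuition for why the abstract reports that gain rates drop with TFBS length and binding specificity.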
Affiliation(s)
- Murat Tuğrul
- Institute of Science and Technology Austria, Klosterneuburg, Austria
- Tiago Paixão
- Institute of Science and Technology Austria, Klosterneuburg, Austria
- Gašper Tkačik
- Institute of Science and Technology Austria, Klosterneuburg, Austria
30
Maximal Expression of the Evolutionarily Conserved Slit2 Gene Promoter Requires Sp1. Cell Mol Neurobiol 2015; 36:955-964. [PMID: 26456684 DOI: 10.1007/s10571-015-0281-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 08/01/2015] [Accepted: 10/01/2015] [Indexed: 10/22/2022]
Abstract
Slit2 is a neural axon guidance and chemorepellent protein that stimulates motility in a variety of cell types. The role of Slit2 in neural development and neoplastic growth and migration has been well established, while the genetic mechanisms underlying regulation of the Slit2 gene have not. We identified the core and proximal promoter of Slit2 by mapping multiple transcriptional start sites, analyzing transcriptional activity, and confirming sequence homology for the Slit2 proximal promoter among a number of species. Deletion series and transient transfection identified the Slit2 proximal promoter as within 399 base pairs upstream of the start of transcription. A crucial region for full expression of the Slit2 proximal promoter lies between 399 base pairs and 296 base pairs upstream of the start of transcription. Computer modeling identified three transcription factor-binding consensus sites within this region, of which only site-directed mutagenesis of one of the two identified Sp1 consensus sites inhibited transcriptional activity of the Slit2 proximal promoter (-399 to +253). Bioinformatics analysis of the Slit2 proximal promoter -399 base pair to -296 base pair region shows high sequence conservation over twenty-two species, and that this region follows an expected pattern of sequence divergence through evolution.
31
Schaefke B, Wang TY, Wang CY, Li WH. Gains and Losses of Transcription Factor Binding Sites in Saccharomyces cerevisiae and Saccharomyces paradoxus. Genome Biol Evol 2015. [PMID: 26220934 PMCID: PMC4558856 DOI: 10.1093/gbe/evv138] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/22/2022] Open
Abstract
Gene expression evolution occurs through changes in cis- or trans-regulatory elements or both. Interactions between transcription factors (TFs) and their binding sites (TFBSs) constitute one of the most important points where these two regulatory components intersect. In this study, we investigated the evolution of TFBSs in the promoter regions of different Saccharomyces strains and species. We divided the promoter of a gene into the proximal region and the distal region, which are defined, respectively, as the 200-bp region upstream of the transcription starting site and as the 200-bp region upstream of the proximal region. We found that the predicted TFBSs in the proximal promoter regions tend to be evolutionarily more conserved than those in the distal promoter regions. Additionally, Saccharomyces cerevisiae strains used in the fermentation of alcoholic drinks have experienced more TFBS losses than gains compared with strains from other environments (wild strains, laboratory strains, and clinical strains). We also showed that differences in TFBSs correlate with the cis component of gene expression evolution between species (comparing S. cerevisiae and its sister species Saccharomyces paradoxus) and within species (comparing two closely related S. cerevisiae strains).
Affiliation(s)
- Bernhard Schaefke
- Institute of Biomedical Informatics, National Yang-Ming University, Taipei, Taiwan; National Yang-Ming University, Taipei, Taiwan; Bioinformatics Program, Institute of Information Science, Taiwan International Graduate Program, Academia Sinica, Taipei, Taiwan
- Wen-Hsiung Li
- National Yang-Ming University, Taipei, Taiwan; China Medical University Hospital, Taichung, Taiwan; Department of Ecology and Evolution, University of Chicago
32
Naval-Sánchez M, Potier D, Hulselmans G, Christiaens V, Aerts S. Identification of Lineage-Specific Cis-Regulatory Modules Associated with Variation in Transcription Factor Binding and Chromatin Activity Using Ornstein-Uhlenbeck Models. Mol Biol Evol 2015; 32:2441-55. [PMID: 25944915 PMCID: PMC4540964 DOI: 10.1093/molbev/msv107] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Indexed: 02/06/2023] Open
Abstract
Scoring the impact of noncoding variation on the function of cis-regulatory regions, on their chromatin state, and on the qualitative and quantitative expression levels of target genes is a fundamental problem in evolutionary genomics. A particular challenge is how to model the divergence of quantitative traits and to identify relationships between the changes across the different levels of the genome, the chromatin activity landscape, and the transcriptome. Here, we examine the use of the Ornstein-Uhlenbeck (OU) model to infer selection at the level of predicted cis-regulatory modules (CRMs), and link these with changes in transcription factor binding and chromatin activity. Using publicly available cross-species ChIP-Seq and STARR-Seq data we show how OU can be applied genome-wide to identify candidate transcription factors for which binding site and CRM turnover is correlated with changes in regulatory activity. Next, we profile open chromatin in the developing eye across three Drosophila species. We identify the recognition motifs of the chromatin remodelers, Trithorax-like and Grainyhead as mostly correlating with species-specific changes in open chromatin. In conclusion, we show in this study that CRM scores can be used as quantitative traits and that motif discovery approaches can be extended towards more complex models of divergence.
Affiliation(s)
- Marina Naval-Sánchez
- Laboratory of Computational Biology, Department of Human Genetics, University of Leuven, Leuven, Belgium
- Delphine Potier
- Laboratory of Computational Biology, Department of Human Genetics, University of Leuven, Leuven, Belgium
- Gert Hulselmans
- Laboratory of Computational Biology, Department of Human Genetics, University of Leuven, Leuven, Belgium
- Valerie Christiaens
- Laboratory of Computational Biology, Department of Human Genetics, University of Leuven, Leuven, Belgium
- Stein Aerts
- Laboratory of Computational Biology, Department of Human Genetics, University of Leuven, Leuven, Belgium
33
Nadimpalli S, Persikov AV, Singh M. Pervasive variation of transcription factor orthologs contributes to regulatory network evolution. PLoS Genet 2015; 11:e1005011. [PMID: 25748510 PMCID: PMC4351887 DOI: 10.1371/journal.pgen.1005011] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 06/13/2014] [Accepted: 01/18/2015] [Indexed: 01/17/2023] Open
Abstract
Differences in transcriptional regulatory networks underlie much of the phenotypic variation observed across organisms. Changes to cis-regulatory elements are widely believed to be the predominant means by which regulatory networks evolve, yet examples of regulatory network divergence due to transcription factor (TF) variation have also been observed. To systematically ascertain the extent to which TFs contribute to regulatory divergence, we analyzed the evolution of the largest class of metazoan TFs, Cys2-His2 zinc finger (C2H2-ZF) TFs, across 12 Drosophila species spanning ~45 million years of evolution. Remarkably, we uncovered that a significant fraction of all C2H2-ZF 1-to-1 orthologs in flies exhibit variations that can affect their DNA-binding specificities. In addition to loss and recruitment of C2H2-ZF domains, we found diverging DNA-contacting residues in ~44% of domains shared between D. melanogaster and the other fly species. These diverging DNA-contacting residues, found in ~70% of the D. melanogaster C2H2-ZF genes in our analysis and corresponding to ~26% of all annotated D. melanogaster TFs, show evidence of functional constraint: they tend to be conserved across phylogenetic clades and evolve slower than other diverging residues. These same variations were rarely found as polymorphisms within a population of D. melanogaster flies, indicating their rapid fixation. The predicted specificities of these dynamic domains gradually change across phylogenetic distances, suggesting stepwise evolutionary trajectories for TF divergence. Further, whereas proteins with conserved C2H2-ZF domains are enriched in developmental functions, those with varying domains exhibit no functional enrichments. Our work suggests that a subset of highly dynamic and largely unstudied TFs are a likely source of regulatory variation in Drosophila and other metazoans.
Affiliation(s)
- Shilpa Nadimpalli
- Department of Computer Science, Princeton University, Princeton, New Jersey, United States of America
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
- Anton V. Persikov
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
- Mona Singh
- Department of Computer Science, Princeton University, Princeton, New Jersey, United States of America
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
34
Bioinformatics tools for discovery and functional analysis of single nucleotide polymorphisms. Advances in Experimental Medicine and Biology 2015; 827:287-310. [PMID: 25387971 DOI: 10.1007/978-94-017-9245-5_17] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 01/26/2023]
Abstract
With high-speed DNA sequencing of genomes, databases of genome data continue to grow, and so does the understanding of genetic variation between individuals. Single nucleotide polymorphisms (SNPs), a main type of genetic variation, are an increasingly important resource for understanding the structure and function of the human genome and have become a valuable resource for investigating the genetic basis of disease. In recent years, in addition to experimental approaches to characterize specific variants, intensive bioinformatics techniques have been applied to understand the effects of these genetic changes. Genetic studies aim to understand the molecular basis of disease, and computational methods are becoming increasingly important for SNP selection, prediction, and understanding the downstream effects of genetic variation. The review provides systematic information on the available resources and methods for SNP discovery and analysis. We also report some new results on DNA sequence-based prediction of SNPs in human cytochrome P450, which serves as an example of computational methods to predict and discover SNPs. Additionally, annotation and prediction of functional SNPs, as well as a comprehensive list of existing tools and online resources, are reviewed and described.
35
Crocker J, Abe N, Rinaldi L, McGregor AP, Frankel N, Wang S, Alsawadi A, Valenti P, Plaza S, Payre F, Mann RS, Stern DL. Low affinity binding site clusters confer hox specificity and regulatory robustness. Cell 2014; 160:191-203. [PMID: 25557079 DOI: 10.1016/j.cell.2014.11.041] [Citation(s) in RCA: 245] [Impact Index Per Article: 24.5] [Received: 06/05/2014] [Revised: 09/11/2014] [Accepted: 11/13/2014] [Indexed: 11/26/2022]
Abstract
In animals, Hox transcription factors define regional identity in distinct anatomical domains. How Hox genes encode this specificity is a paradox, because different Hox proteins bind with high affinity in vitro to similar DNA sequences. Here, we demonstrate that the Hox protein Ultrabithorax (Ubx) in complex with its cofactor Extradenticle (Exd) bound specifically to clusters of very low affinity sites in enhancers of the shavenbaby gene of Drosophila. These low affinity sites conferred specificity for Ubx binding in vivo, but multiple clustered sites were required for robust expression when embryos developed in variable environments. Although most individual Ubx binding sites are not evolutionarily conserved, the overall enhancer architecture (clusters of low affinity binding sites) is maintained and required for enhancer function. Natural selection therefore works at the level of the enhancer, requiring a particular density of low affinity Ubx sites to confer both specific and robust expression.
Affiliation(s)
- Justin Crocker
- Janelia Research Campus, Howard Hughes Medical Institute, 19700 Helix Drive, Ashburn, VA 20147, USA
- Namiko Abe
- Columbia University Medical Center, 701 West 168th Street, HHSC 1104, New York, NY 10032, USA
- Lucrezia Rinaldi
- Columbia University Medical Center, 701 West 168th Street, HHSC 1104, New York, NY 10032, USA
- Alistair P McGregor
- Department of Biological and Medical Sciences, Oxford Brookes University, Gipsy Lane, Oxford OX3 0BP, UK
- Nicolás Frankel
- Departamento de Ecología, Genética y Evolución, IEGEBA-CONICET, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Ciudad Universitaria, Pabellón 2, 1428 Buenos Aires, Argentina
- Shu Wang
- New Jersey Neuroscience Institute, 65 James Street, Edison, NJ 08820, USA
- Ahmad Alsawadi
- Centre de Biologie du Développement, Université de Toulouse, UPS, 31062 Cedex 9, France; CNRS, UMR5547, Centre de Biologie du Développement, Toulouse, 31062 Cedex 9, France
- Philippe Valenti
- Centre de Biologie du Développement, Université de Toulouse, UPS, 31062 Cedex 9, France; CNRS, UMR5547, Centre de Biologie du Développement, Toulouse, 31062 Cedex 9, France
- Serge Plaza
- Centre de Biologie du Développement, Université de Toulouse, UPS, 31062 Cedex 9, France; CNRS, UMR5547, Centre de Biologie du Développement, Toulouse, 31062 Cedex 9, France
- François Payre
- Centre de Biologie du Développement, Université de Toulouse, UPS, 31062 Cedex 9, France; CNRS, UMR5547, Centre de Biologie du Développement, Toulouse, 31062 Cedex 9, France
- Richard S Mann
- Columbia University Medical Center, 701 West 168th Street, HHSC 1104, New York, NY 10032, USA
- David L Stern
- Janelia Research Campus, Howard Hughes Medical Institute, 19700 Helix Drive, Ashburn, VA 20147, USA
36
Sabouri N, Capra JA, Zakian VA. The essential Schizosaccharomyces pombe Pfh1 DNA helicase promotes fork movement past G-quadruplex motifs to prevent DNA damage. BMC Biol 2014; 12:101. [PMID: 25471935 PMCID: PMC4275981 DOI: 10.1186/s12915-014-0101-5] [Citation(s) in RCA: 54] [Impact Index Per Article: 5.4] [Received: 10/30/2014] [Accepted: 11/20/2014] [Indexed: 12/04/2022] Open
Abstract
BACKGROUND: G-quadruplexes (G4s) are stable non-canonical DNA secondary structures consisting of stacked arrays of four guanines, each held together by Hoogsteen hydrogen bonds. Sequences with the ability to form these structures in vitro, G4 motifs, are found throughout bacterial and eukaryotic genomes. The budding yeast Pif1 DNA helicase, as well as several bacterial Pif1 family helicases, unwind G4 structures robustly in vitro and suppress G4-induced DNA damage in S. cerevisiae in vivo.
RESULTS: We determined the genomic distribution and evolutionary conservation of G4 motifs in four fission yeast species and investigated the relationship between G4 motifs and Pfh1, the sole S. pombe Pif1 family helicase. Using chromatin immunoprecipitation combined with deep sequencing, we found that many G4 motifs in the S. pombe genome were associated with Pfh1. Cells depleted of Pfh1 had increased fork pausing and DNA damage near G4 motifs, as indicated by high DNA polymerase occupancy and phosphorylated histone H2A, respectively. In general, G4 motifs were underrepresented in genes. However, Pfh1-associated G4 motifs were located on the transcribed strand of highly transcribed genes significantly more often than expected, suggesting that Pfh1 has a function in replication or transcription at these sites.
CONCLUSIONS: In the absence of functional Pfh1, unresolved G4 structures cause fork pausing and DNA damage of the sort associated with human tumors.
Affiliation(s)
- Nasim Sabouri
- Department of Medical Biochemistry and Biophysics, Umeå University, Umeå, 901 87, Sweden
- John A Capra
- Department of Biological Sciences and Biomedical Informatics and Center for Human Genetics Research, Vanderbilt University, Nashville, TN, 37235, USA
- Virginia A Zakian
- Department of Molecular Biology, Princeton University, Princeton, NJ, 08544, USA
37
Siepel A, Arbiza L. Cis-regulatory elements and human evolution. Curr Opin Genet Dev 2014; 29:81-9. [PMID: 25218861 PMCID: PMC4258466 DOI: 10.1016/j.gde.2014.08.011] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 05/28/2014] [Revised: 08/17/2014] [Accepted: 08/23/2014] [Indexed: 11/20/2022]
Abstract
Modification of gene regulation has long been considered an important force in human evolution, particularly through changes to cis-regulatory elements (CREs) that function in transcriptional regulation. For decades, however, the study of cis-regulatory evolution was severely limited by the available data. New data sets describing the locations of CREs and genetic variation within and between species have now made it possible to study CRE evolution much more directly on a genome-wide scale. Here, we review recent research on the evolution of CREs in humans based on large-scale genomic data sets. We consider inferences based on primate divergence, human polymorphism, and combinations of divergence and polymorphism. We then consider 'new frontiers' in this field stemming from recent research on transcriptional regulation.
Affiliation(s)
- Adam Siepel
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Leonardo Arbiza
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA
38
Obermayer B, Levine E. Exploring the miRNA regulatory network using evolutionary correlations. PLoS Comput Biol 2014; 10:e1003860. [PMID: 25299225 PMCID: PMC4191876 DOI: 10.1371/journal.pcbi.1003860] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 05/13/2014] [Accepted: 08/18/2014] [Indexed: 01/01/2023] Open
Abstract
Post-transcriptional regulation by miRNAs is a widespread and highly conserved phenomenon in metazoans, with several hundreds to thousands of conserved binding sites for each miRNA, and up to two thirds of all genes under miRNA regulation. At the same time, the effect of miRNA regulation on mRNA and protein levels is usually quite modest and associated phenotypes are often weak or subtle. This has given rise to the notion that the highly interconnected miRNA regulatory network exerts its function less through any individual link and more via collective effects that lead to a functional interdependence of network links. We present a Bayesian framework to quantify conservation of miRNA target sites using vertebrate whole-genome alignments. The increased statistical power of our phylogenetic model allows detection of evolutionary correlation in the conservation patterns of site pairs. Such correlations could result from collective functions in the regulatory network. For instance, co-conservation of target site pairs supports a selective benefit of combinatorial regulation by multiple miRNAs. We find that some miRNA families are under pronounced co-targeting constraints, indicating a high connectivity in the regulatory network, while others appear to function in a more isolated way. By analyzing coordinated targeting of different curated gene sets, we observe distinct evolutionary signatures for protein complexes and signaling pathways that could reflect differences in control strategies. Our method is easily scalable to analyze upcoming larger data sets, and readily adaptable to detect high-level selective constraints between other genomic loci. We thus provide a proof-of-principle method to understand regulatory networks from an evolutionary perspective.
Affiliation(s)
- Benedikt Obermayer
- Systems Biology of Gene Regulatory Elements, Max-Delbrück Center for Molecular Medicine, Berlin, Germany
- Department of Physics and Center for Systems Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Erel Levine
- Systems Biology of Gene Regulatory Elements, Max-Delbrück Center for Molecular Medicine, Berlin, Germany
- Department of Physics and Center for Systems Biology, Harvard University, Cambridge, Massachusetts, United States of America
39
McCandlish DM, Stoltzfus A. Modeling evolution using the probability of fixation: history and implications. Quarterly Review of Biology 2014; 89:225-52. [PMID: 25195318 DOI: 10.1086/677571] [Citation(s) in RCA: 123] [Impact Index Per Article: 12.3] [Indexed: 11/03/2022]
Abstract
Many models of evolution calculate the rate of evolution by multiplying the rate at which new mutations originate within a population by a probability of fixation. Here we review the historical origins, contemporary applications, and evolutionary implications of these "origin-fixation" models, which are widely used in evolutionary genetics, molecular evolution, and phylogenetics. Origin-fixation models were first introduced in 1969, in association with an emerging view of "molecular" evolution. Early origin-fixation models were used to calculate an instantaneous rate of evolution across a large number of independently evolving loci; in the 1980s and 1990s, a second wave of origin-fixation models emerged to address a sequence of fixation events at a single locus. Although origin fixation models have been applied to a broad array of problems in contemporary evolutionary research, their rise in popularity has not been accompanied by an increased appreciation of their restrictive assumptions or their distinctive implications. We argue that origin-fixation models constitute a coherent theory of mutation-limited evolution that contrasts sharply with theories of evolution that rely on the presence of standing genetic variation. A major unsolved question in evolutionary biology is the degree to which these models provide an accurate approximation of evolution in natural populations.
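The product described in the first sentence of this abstract can be written out as a short sketch. The abstract does not give a specific formula, so the choices below are assumptions made for illustration: a diploid population of size `n` (taken equal to the effective size), a per-locus mutation rate `mu`, and Kimura's diffusion approximation for the fixation probability of a new mutation with selection coefficient `s`. The function names are hypothetical.

```python
import math


def fixation_probability(s: float, n: int) -> float:
    """Fixation probability of a single new mutation (assumed model).

    Uses Kimura's diffusion approximation for a diploid population of
    size n with initial frequency 1/(2n):
        pi = (1 - exp(-2s)) / (1 - exp(-4ns))
    Very large values of 4*n*|s| with negative s can overflow; this
    sketch does not guard against that.
    """
    if n <= 0:
        raise ValueError("population size must be positive")
    if abs(4.0 * n * s) < 1e-9:
        # Neutral limit: a new mutation fixes with probability 1/(2n).
        return 1.0 / (2.0 * n)
    # expm1 keeps precision when s is small.
    return math.expm1(-2.0 * s) / math.expm1(-4.0 * n * s)


def origin_fixation_rate(mu: float, s: float, n: int) -> float:
    """Substitution rate = (rate of origin of new mutations) x (fixation probability)."""
    origin_rate = 2.0 * n * mu  # new mutations entering the population per generation
    return origin_rate * fixation_probability(s, n)


if __name__ == "__main__":
    for s in (-0.01, 0.0, 0.01):
        print(s, origin_fixation_rate(mu=1e-8, s=s, n=1000))
```

Under these assumptions the neutral case reduces to a substitution rate equal to the mutation rate, because the population-size factors cancel.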
40
8.2% of the Human genome is constrained: variation in rates of turnover across functional element classes in the human lineage. PLoS Genet 2014; 10:e1004525. [PMID: 25057982 PMCID: PMC4109858 DOI: 10.1371/journal.pgen.1004525] [Citation(s) in RCA: 133] [Impact Index Per Article: 13.3] [Received: 12/04/2013] [Accepted: 06/05/2014] [Indexed: 01/27/2023] Open
Abstract
Ten years on from the finishing of the human reference genome sequence, it remains unclear what fraction of the human genome confers function, where this sequence resides, and how much is shared with other mammalian species. When addressing these questions, functional sequence has often been equated with pan-mammalian conserved sequence. However, functional elements that are short-lived, including those contributing to species-specific biology, will not leave a footprint of long-lasting negative selection. Here, we address these issues by identifying and characterising sequence that has been constrained with respect to insertions and deletions for pairs of eutherian genomes over a range of divergences. Within noncoding sequence, we find increasing amounts of mutually constrained sequence as species pairs become more closely related, indicating that noncoding constrained sequence turns over rapidly. We estimate that half of present-day noncoding constrained sequence has been gained or lost in approximately the last 130 million years (half-life in units of divergence time, d1/2 = 0.25–0.31). While enriched with ENCODE biochemical annotations, much of the short-lived constrained sequences we identify are not detected by models optimized for wider pan-mammalian conservation. Constrained DNase 1 hypersensitivity sites, promoters and untranslated regions have been more evolutionarily stable than long noncoding RNA loci which have turned over especially rapidly. By contrast, protein coding sequence has been highly stable, with an estimated half-life of over a billion years (d1/2 = 2.1–5.0). From extrapolations we estimate that 8.2% (7.1–9.2%) of the human genome is presently subject to negative selection and thus is likely to be functional, while only 2.2% has maintained constraint in both human and mouse since these species diverged. 
These results reveal that the evolutionary history of the human genome has been highly dynamic, particularly for its noncoding yet biologically functional fraction.
Nearly 99% of the human genome does not encode proteins, and while there recently has been extensive biochemical annotation of the remaining noncoding fraction, it remains unclear whether or not the bulk of these DNA sequences have important functional roles. By comparing the genome sequences of different species we identify genomic regions that have evolved unexpectedly slowly, a signature of natural selection upon functional sequence. Using a high resolution evolutionary approach to find sequence showing evolutionary signatures of functionality we estimate that a total of 8.2% (7.1–9.2%) of the human genome is presently functional, more than three times as much as is functional and shared between human and mouse. This implies that there is an abundance of sequences with short-lived lineage-specific functionality. As expected, most of the sequence involved in this functional “turnover” is noncoding, while protein coding sequence is stably preserved over longer evolutionary timescales. More generally, we find that the rate of functional turnover varies significantly across categories of functional noncoding elements. Our results provide a pan-mammalian and whole genome perspective on how rapidly different classes of sequence have gained and lost functionality down the human lineage.
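The half-life figures quoted in this abstract can be made concrete with a small sketch. It assumes simple exponential turnover, so that the fraction of constrained sequence still shared after divergence `d` is `0.5 ** (d / d_half)`; the abstract reports half-lives but does not state the underlying model, so this functional form, the function name, and the example divergence of 0.5 are illustrative assumptions only.

```python
def fraction_retained(divergence: float, half_life: float) -> float:
    """Fraction of constrained sequence retained after a given divergence.

    Assumes exponential turnover with the stated half-life, both in the
    same units of divergence time.
    """
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    if divergence < 0:
        raise ValueError("divergence must be non-negative")
    return 0.5 ** (divergence / half_life)


if __name__ == "__main__":
    d = 0.5  # hypothetical divergence, chosen only for illustration
    # Half-life ranges quoted in the abstract.
    for label, d_half in (("noncoding, low", 0.25), ("noncoding, high", 0.31),
                          ("coding, low", 2.1), ("coding, high", 5.0)):
        print(label, fraction_retained(d, d_half))
```

Under this assumption the much longer half-life quoted for protein coding sequence translates directly into a larger retained fraction at any fixed divergence.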
41
Haldane A, Manhart M, Morozov AV. Biophysical fitness landscapes for transcription factor binding sites. PLoS Comput Biol 2014; 10:e1003683. [PMID: 25010228 PMCID: PMC4091707 DOI: 10.1371/journal.pcbi.1003683] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 12/03/2013] [Accepted: 05/11/2014] [Indexed: 11/18/2022] Open
Abstract
Phenotypic states and evolutionary trajectories available to cell populations are ultimately dictated by complex interactions among DNA, RNA, proteins, and other molecular species. Here we study how evolution of gene regulation in a single-cell eukaryote S. cerevisiae is affected by interactions between transcription factors (TFs) and their cognate DNA sites. Our study is informed by a comprehensive collection of genomic binding sites and high-throughput in vitro measurements of TF-DNA binding interactions. Using an evolutionary model for monomorphic populations evolving on a fitness landscape, we infer fitness as a function of TF-DNA binding to show that the shape of the inferred fitness functions is in broad agreement with a simple functional form inspired by a thermodynamic model of two-state TF-DNA binding. However, the effective parameters of the model are not always consistent with physical values, indicating selection pressures beyond the biophysical constraints imposed by TF-DNA interactions. We find little statistical support for the fitness landscape in which each position in the binding site evolves independently, indicating that epistasis is common in the evolution of gene regulation. Finally, by correlating TF-DNA binding energies with biological properties of the sites or the genes they regulate, we are able to rule out several scenarios of site-specific selection, under which binding sites of the same TF would experience different selection pressures depending on their position in the genome. These findings support the existence of universal fitness landscapes which shape evolution of all sites for a given TF, and whose properties are determined in part by the physics of protein-DNA interactions. Specialized proteins called transcription factors turn genes on and off by binding to short stretches of DNA in their regulatory regions. 
Precise gene regulation is essential for cellular survival and proliferation, and its evolution and maintenance under mutational pressure are central issues in biology. Here we discuss how evolution of gene regulation is shaped by the need to maintain favorable binding energies between transcription factors and their genomic binding sites. We show that, surprisingly, transcription factor binding is not affected by many biological properties, such as the essentiality of the gene it regulates. Rather, all sites for a given factor appear to evolve under a universal set of constraints, which can be rationalized in terms of a simple model inspired by transcription factor – DNA binding thermodynamics.
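The "thermodynamic model of two-state TF-DNA binding" mentioned in this abstract is commonly written as a Fermi-Dirac-style occupancy curve. The sketch below uses that standard form as an assumption; the abstract does not give the exact expression or parameter values, so `mu` (an effective chemical potential), `beta` (an inverse temperature), and the function name are hypothetical.

```python
import math


def binding_probability(energy: float, mu: float, beta: float = 1.0) -> float:
    """Probability that a site with the given binding energy is occupied.

    Two-state (bound/unbound) thermodynamic model:
        p = 1 / (1 + exp(beta * (energy - mu)))
    Lower (more favorable) energies give occupancy closer to 1.
    """
    return 1.0 / (1.0 + math.exp(beta * (energy - mu)))


if __name__ == "__main__":
    mu = 0.0  # illustrative value only
    for energy in (-4.0, -2.0, 0.0, 2.0, 4.0):
        print(energy, binding_probability(energy, mu))
```

In a fitness landscape of this shape, sites with energies well below `mu` are nearly saturated, so additional improvements in binding energy change occupancy very little.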
Affiliation(s)
- Allan Haldane
- Department of Physics and Astronomy, Rutgers University, Piscataway, New Jersey, United States of America
- Michael Manhart
- Department of Physics and Astronomy, Rutgers University, Piscataway, New Jersey, United States of America
- Alexandre V. Morozov
- Department of Physics and Astronomy, Rutgers University, Piscataway, New Jersey, United States of America
- BioMaPS Institute for Quantitative Biology, Rutgers University, Piscataway, New Jersey, United States of America
42
Abstract
Transcription factor binding sites (TFBSs) on the DNA are generally accepted as the key nodes of gene control. However, the multitudes of TFBSs identified in genome-wide studies, some of them seemingly unconstrained in evolution, have prompted the view that in many cases TF binding may serve no biological function. Yet, insights from transcriptional biochemistry, population genetics and functional genomics suggest that rather than segregating into 'functional' or 'non-functional', TFBS inputs to their target genes may be generally cumulative, with varying degrees of potency and redundancy. As TFBS redundancy can be diminished by mutations and environmental stress, some of the apparently 'spurious' sites may turn out to be important for maintaining adequate transcriptional regulation under these conditions. This has significant implications for interpreting the phenotypic effects of TFBS mutations, particularly in the context of genome-wide association studies for complex traits.
|
43
|
Heger P, Wiehe T. New tools in the box: An evolutionary synopsis of chromatin insulators. Trends Genet 2014; 30:161-71. [DOI: 10.1016/j.tig.2014.03.004] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 01/30/2014] [Revised: 03/24/2014] [Accepted: 03/25/2014] [Indexed: 01/19/2023]
|
44
|
Relative specificity: all substrates are not created equal. Genomics Proteomics & Bioinformatics 2014; 12:1-7. [PMID: 24491634 PMCID: PMC4411342 DOI: 10.1016/j.gpb.2014.01.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 11/26/2013] [Revised: 12/21/2013] [Accepted: 01/07/2014] [Indexed: 11/24/2022]
Abstract
A biological molecule, e.g., an enzyme, tends to interact with its many cognate substrates, targets, or partners differentially. Such a property is termed relative specificity and has been proposed to regulate important physiological functions, even though it has not been examined explicitly in most complex biochemical systems. This essay reviews several recent large-scale studies that investigate protein folding, signal transduction, RNA binding, translation and transcription in the context of relative specificity. These results and others support a pervasive role of relative specificity in diverse biological processes. It is becoming clear that relative specificity contributes fundamentally to the diversity and complexity of biological systems, which has significant implications in disease processes as well.
|
45
|
Martinez C, Rest JS, Kim AR, Ludwig M, Kreitman M, White K, Reinitz J. Ancestral resurrection of the Drosophila S2E enhancer reveals accessible evolutionary paths through compensatory change. Mol Biol Evol 2014; 31:903-16. [PMID: 24408913 DOI: 10.1093/molbev/msu042] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Indexed: 11/12/2022] Open
Abstract
Upstream regulatory sequences that control gene expression evolve rapidly, yet the expression patterns and functions of most genes are typically conserved. To address this paradox, we have reconstructed computationally and resurrected in vivo the cis-regulatory regions of the ancestral Drosophila eve stripe 2 element and evaluated its evolution using a mathematical model of promoter function. Our feed-forward transcriptional model predicts gene expression patterns directly from enhancer sequence. We used this functional model along with phylogenetics to generate a set of possible ancestral eve stripe 2 sequences for the common ancestors of 1) D. simulans and D. sechellia; 2) D. melanogaster, D. simulans, and D. sechellia; and 3) D. erecta and D. yakuba. These ancestral sequences were synthesized and resurrected in vivo. Using a combination of quantitative and computational analysis, we find clear support for functional compensation between the binding sites for Bicoid, Giant, and Krüppel over the course of 40-60 My of Drosophila evolution. We show that this compensation is driven by a coupling interaction between Bicoid activation and repression at the anterior and posterior border necessary for proper placement of the anterior stripe 2 border. A multiplicity of mechanisms for binding site turnover exemplified by Bicoid, Giant, and Krüppel sites, explains how rapid sequence change may occur while maintaining the function of the cis-regulatory element.
Affiliation(s)
- Carlos Martinez
- Institute for Genomics and Systems Biology, University of Chicago
|
46
|
Fay JC. The molecular basis of phenotypic variation in yeast. Curr Opin Genet Dev 2013; 23:672-7. [PMID: 24269094 DOI: 10.1016/j.gde.2013.10.005] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.9] [Received: 08/15/2013] [Revised: 10/19/2013] [Accepted: 10/24/2013] [Indexed: 11/19/2022]
Abstract
The power of yeast genetics has now been extensively applied to phenotypic variation among strains of Saccharomyces cerevisiae. As a result, over 100 genes and numerous sequence variants have been identified, providing us with a general characterization of mutations underlying quantitative trait variation. Most quantitative trait alleles exert considerable phenotypic effects and alter conserved amino acid positions within protein coding sequences. When examined, quantitative trait alleles influence the expression of numerous genes, most of which are unrelated to an allele's phenotypic effect. The profile of quantitative trait alleles has proven useful to reverse quantitative genetics approaches and supports the use of systems genetics approaches to synthesize the molecular basis of trait variation across multiple strains.
Affiliation(s)
- Justin C Fay
- Department of Genetics and Center for Genome Sciences and Systems Biology, Washington University, St. Louis, MO, United States.
|
47
|
Duque T, Samee MAH, Kazemian M, Pham HN, Brodsky MH, Sinha S. Simulations of enhancer evolution provide mechanistic insights into gene regulation. Mol Biol Evol 2013; 31:184-200. [PMID: 24097306 PMCID: PMC3879441 DOI: 10.1093/molbev/mst170] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Indexed: 01/14/2023] Open
Abstract
There is growing interest in models of regulatory sequence evolution. However, existing models specifically designed for regulatory sequences consider the independent evolution of individual transcription factor (TF)-binding sites, ignoring that the function and evolution of a binding site depends on its context, typically the cis-regulatory module (CRM) in which the site is located. Moreover, existing models do not account for the gene-specific roles of TF-binding sites, primarily because their roles often are not well understood. We introduce two models of regulatory sequence evolution that address some of the shortcomings of existing models and implement simulation frameworks based on them. One model simulates the evolution of an individual binding site in the context of a CRM, while the other evolves an entire CRM. Both models use a state-of-the-art sequence-to-expression model to predict the effects of mutations on the regulatory output of the CRM and determine the strength of selection. We use the new framework to simulate the evolution of TF-binding sites in 37 well-studied CRMs belonging to the anterior-posterior patterning system in Drosophila embryos. We show that these simulations provide accurate fits to evolutionary data from 12 Drosophila genomes, which includes statistics of binding site conservation on relatively short evolutionary scales and site loss across larger divergence times. The new framework allows us, for the first time, to test hypotheses regarding the underlying cis-regulatory code by directly comparing the evolutionary implications of the hypothesis with the observed evolutionary dynamics of binding sites. Using this capability, we find that explicitly modeling self-cooperative DNA binding by the TF Caudal (CAD) provides significantly better fits than an otherwise identical evolutionary simulation that lacks this mechanistic aspect. This hypothesis is further supported by a statistical analysis of the distribution of intersite spacing between adjacent CAD sites. Experimental tests confirm direct homodimeric interaction between CAD molecules as well as self-cooperative DNA binding by CAD. We note that computational modeling of the D. melanogaster CRMs alone did not yield significant evidence to support CAD self-cooperativity. We thus demonstrate how specific mechanistic details encoded in CRMs can be revealed by modeling their evolution and fitting such models to multispecies data.
Affiliation(s)
- Thyago Duque
- Department of Computer Science, University of Illinois at Urbana-Champaign
|
48
|
Bykova NA, Favorov AV, Mironov AA. Hidden Markov models for evolution and comparative genomics analysis. PLoS One 2013; 8:e65012. [PMID: 23762278 PMCID: PMC3676395 DOI: 10.1371/journal.pone.0065012] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 08/21/2012] [Accepted: 04/23/2013] [Indexed: 12/21/2022] Open
Abstract
The problem of reconstructing ancestral states given a phylogeny and data from extant species arises in a wide range of biological studies. The continuous-time Markov model of discrete-state evolution is generally used for the reconstruction of ancestral states. We modify this model to account for the case in which the states of the extant species are uncertain. This situation arises, for example, if the states for extant species are predicted by some program and thus are known only with some level of reliability, which is common in bioinformatics. The main idea is to formulate the problem as a hidden Markov model on a tree (tree HMM, tHMM), where the basic continuous-time Markov model is expanded with the introduction of emission probabilities of observed data (e.g. prediction scores) for each underlying discrete state. Our tHMM decoding algorithm allows us to predict states at the ancestral nodes as well as to refine states at the leaves on the basis of quantitative comparative genomics. The test on simulated data shows that the tHMM approach applied to the continuous variable reflecting the probabilities of the states (i.e. the prediction score) appears to be more accurate than the reconstruction from the discrete state assignment defined by the best score threshold. We provide examples of applying our model to the evolutionary analysis of N-terminal signal peptides and transcription factor binding sites in bacteria. The program is freely available at http://bioinf.fbb.msu.ru/~nadya/tHMM and via web-service at http://bioinf.fbb.msu.ru/treehmmweb.
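The idea of a Markov model on a tree with emission probabilities at the leaves can be sketched as follows. This is a minimal illustration, not the authors' tHMM program: it assumes two hidden states (for example, site absent or present), a symmetric substitution rate, a hand-written toy tree, and made-up emission likelihoods, and it computes only the posterior at the root with a Felsenstein-style upward pass.

```python
import math

def transition(t, rate=1.0):
    """2x2 transition matrix of a symmetric two-state continuous-time Markov chain."""
    same = 0.5 + 0.5 * math.exp(-2.0 * rate * t)
    return [[same, 1.0 - same], [1.0 - same, same]]

def partial_likelihood(node):
    """Upward pass: returns [L(state=0), L(state=1)] for the subtree under node."""
    if "emission" in node:
        # Leaf: P(observed prediction score | hidden state), supplied directly.
        return list(node["emission"])
    like = [1.0, 1.0]
    for child, branch_length in node["children"]:
        child_like = partial_likelihood(child)
        p = transition(branch_length)
        for s in (0, 1):
            like[s] *= sum(p[s][c] * child_like[c] for c in (0, 1))
    return like

def root_posterior(tree, prior=(0.5, 0.5)):
    """Posterior probability of each hidden state at the root."""
    like = partial_likelihood(tree)
    joint = [prior[s] * like[s] for s in (0, 1)]
    total = sum(joint)
    return [j / total for j in joint]

# Hypothetical toy tree: ((a:0.1, b:0.1):0.2, c:0.3) with invented emission likelihoods.
leaf_a = {"emission": (0.1, 0.9)}
leaf_b = {"emission": (0.2, 0.8)}
leaf_c = {"emission": (0.6, 0.4)}
inner = {"children": [(leaf_a, 0.1), (leaf_b, 0.1)]}
root = {"children": [(inner, 0.2), (leaf_c, 0.3)]}
posterior = root_posterior(root)
```

Replacing the emission tuples with hard 0/1 assignments recovers ordinary ancestral reconstruction from thresholded states, which is the baseline the abstract compares against.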
Affiliation(s)
- Nadezda A Bykova
- A.A. Kharkevich Institute for Information Transmission Problems RAS, Moscow, Russia.
|
49
|
Kenigsberg E, Tanay A. Drosophila functional elements are embedded in structurally constrained sequences. PLoS Genet 2013; 9:e1003512. [PMID: 23750124 PMCID: PMC3671938 DOI: 10.1371/journal.pgen.1003512] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 07/31/2012] [Accepted: 03/04/2013] [Indexed: 12/22/2022] Open
Abstract
Modern functional genomics uncovered numerous functional elements in metazoan genomes. Nevertheless, only a small fraction of the typical non-exonic genome contains elements that code for function directly. On the other hand, a much larger fraction of the genome is associated with significant evolutionary constraints, suggesting that much of the non-exonic genome is weakly functional. Here we show that in flies, local (30–70 bp) conserved sequence elements that are associated with multiple regulatory functions serve as focal points to a pattern of punctuated regional increase in G/C nucleotide frequencies. We show that this pattern, which covers a region tenfold larger than the conserved elements themselves, is an evolutionary consequence of a shift in the balance between gain and loss of G/C nucleotides and that it is correlated with nucleosome occupancy across multiple classes of epigenetic state. Evidence for compensatory evolution and analysis of SNP allele frequencies show that the evolutionary regime underlying this balance shift is likely to be non-neutral. These data suggest that current gaps in our understanding of genome function and evolutionary dynamics are explicable by a model of sparse sequence elements directly encoding for function, embedded into structural sequences that help to define the local and global epigenomic context of such functional elements.
A key challenge in functional genomics is to predict evolutionary dynamics from functional annotation of the genome and vice versa. Modern epigenomic studies helped assign function to numerous new sequence elements, but left most of the genome essentially uncharacterized. Evolutionary genomics, on the other hand, consistently suggests that a much larger fraction of the un-annotated genome evolves under selective pressure. We hypothesize that this function-selection gap can be attributed to sequences that facilitate the physical organization of functional elements, such as transcription factor binding sites, within chromosomes. We exemplify this by studying in detail the sequences embedding small conserved elements (CEs) in Drosophila. We show that, while CEs have typically high AT content, high GC content levels around them are maintained by a non-neutral evolutionary balance between gain and loss of GC nucleotides. This non-uniform pattern is highly correlated with nucleosome organization around CEs, potentially imposing an evolutionary constraint on as much as one quarter of the genome. We suggest this can at least partly explain the above function-selection gap. Weak evolutionary constraints on “structural” sequences (at scales ranging from one nucleosome to recently described multi-megabase topological domains) may affect genome evolution just like structural motifs shape protein evolution.
Affiliation(s)
- Ephraim Kenigsberg
- Department of Computer Science and Applied Mathematics and Department of Biological Regulation, Weizmann Institute, Rehovot, Israel
- Amos Tanay
- Department of Computer Science and Applied Mathematics and Department of Biological Regulation, Weizmann Institute, Rehovot, Israel
|
50
|
Connelly CF, Skelly DA, Dunham MJ, Akey JM. Population genomics and transcriptional consequences of regulatory motif variation in globally diverse Saccharomyces cerevisiae strains. Mol Biol Evol 2013; 30:1605-13. [PMID: 23619145 DOI: 10.1093/molbev/mst073] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 12/13/2022] Open
Abstract
Noncoding genetic variation is known to significantly influence gene expression levels in a growing number of specific cases; however, the patterns of genome-wide noncoding variation present within populations, the evolutionary forces acting on noncoding variants, and the relative effects of regulatory polymorphisms on transcript abundance are not well characterized. Here, we address these questions by analyzing patterns of regulatory variation in motifs for 177 DNA binding proteins in 37 strains of Saccharomyces cerevisiae. Across S. cerevisiae strains, we found considerable polymorphism in regulatory motifs (mean π = 0.005) as well as diversity in regulatory motifs (mean 0.91 motif differences per regulatory region). Population genetics analyses reveal that motifs are under purifying selection, and there is considerable heterogeneity in the magnitude of selection across different motifs. Finally, we obtained RNA-Seq data in 22 strains and identified 49 polymorphic DNA sequence motifs in 30 distinct genes that are significantly associated with transcriptional differences between strains. In 22 of these genes, there was a single polymorphic motif associated with expression in the upstream region. Our results provide comprehensive insights into the evolutionary trajectory of regulatory variation in yeast and the characteristics of a compendium of regulatory alleles.
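For readers unfamiliar with the π statistic quoted in this abstract, nucleotide diversity is conventionally defined as the average number of pairwise differences per site among a set of aligned sequences. The sketch below computes it for a toy alignment; the sequences are invented for illustration and are not data from the study, and the function name is hypothetical.

```python
from itertools import combinations

def nucleotide_diversity(seqs):
    """Mean pairwise differences per site (pi) for aligned, equal-length sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions for every unordered pair of sequences.
    total_diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total_diffs / (len(pairs) * length)

# Invented motif alleles from four hypothetical strains.
toy_motif_alleles = ["TGACTCA", "TGACTCA", "TGACTAA", "TGATTCA"]
pi = nucleotide_diversity(toy_motif_alleles)
```

In this toy alignment the six pairs differ at six positions in total over seven sites, so π is 6/42, or about 0.14, far higher than the genome-scale mean reported above because the example was built to contain variation.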
|