1
Dickson ZW, Golding GB. Evolution of Transcript Abundance is Influenced by Indels in Protein Low Complexity Regions. J Mol Evol 2024; 92:153-168. [PMID: 38485789 DOI: 10.1007/s00239-024-10158-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Accepted: 01/24/2024] [Indexed: 04/02/2024]
Abstract
Protein low complexity regions (LCRs) are compositionally biased amino acid sequences, many of which have significant evolutionary impacts on the proteins which contain them. They are mutationally unstable, experiencing higher rates of indels and substitutions than higher complexity regions. LCRs also impact the expression of their proteins, likely through multiple effects along the path from gene transcription, through translation, and eventual protein degradation. It has been observed that proteins which contain LCRs are associated with elevated transcript abundance (TAb), despite having lower protein abundance. We have gathered and integrated human data to investigate the co-evolution of TAb and LCRs through ancestral reconstructions and model inference using an approximate Bayesian calculation based method. We observe that on short evolutionary timescales TAb evolution is significantly impacted by changes in LCR length, with insertions driving TAb down. In contrast, the observed data is best explained by indel rates in LCRs which are unaffected by shifts in TAb. Our work demonstrates a coupling between LCR and TAb evolution, and the utility of incorporating multiple responses into evolutionary analyses.
Affiliation(s)
- G Brian Golding
- Department of Biology, McMaster University, Hamilton, ON, Canada
2
de Boer CG, Taipale J. Hold out the genome: a roadmap to solving the cis-regulatory code. Nature 2024; 625:41-50. [PMID: 38093018 DOI: 10.1038/s41586-023-06661-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Accepted: 09/20/2023] [Indexed: 01/05/2024]
Abstract
Gene expression is regulated by transcription factors that work together to read cis-regulatory DNA sequences. The 'cis-regulatory code' - how cells interpret DNA sequences to determine when, where and how much genes should be expressed - has proven to be exceedingly complex. Recently, advances in the scale and resolution of functional genomics assays and machine learning have enabled substantial progress towards deciphering this code. However, the cis-regulatory code will probably never be solved if models are trained only on genomic sequences; regions of homology can easily lead to overestimation of predictive performance, and our genome is too short and has insufficient sequence diversity to learn all relevant parameters. Fortunately, randomly synthesized DNA sequences enable testing a far larger sequence space than exists in our genomes, and designed DNA sequences enable targeted queries to maximally improve the models. As the same biochemical principles are used to interpret DNA regardless of its source, models trained on these synthetic data can predict genomic activity, often better than genome-trained models. Here we provide an outlook on the field, and propose a roadmap towards solving the cis-regulatory code by a combination of machine learning and massively parallel assays using synthetic DNA.
Affiliation(s)
- Carl G de Boer
- School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada.
- Jussi Taipale
- Applied Tumor Genomics Research Program, Faculty of Medicine, University of Helsinki, Helsinki, Finland.
- Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden.
- Department of Biochemistry, University of Cambridge, Cambridge, UK.
3
Choudhary MNK, Quaid K, Xing X, Schmidt H, Wang T. Widespread contribution of transposable elements to the rewiring of mammalian 3D genomes. Nat Commun 2023; 14:634. [PMID: 36746940 PMCID: PMC9902604 DOI: 10.1038/s41467-023-36364-9] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 12/13/2021] [Accepted: 01/26/2023] [Indexed: 02/08/2023] Open
Abstract
Transposable elements (TEs) are major contributors of genetic material in mammalian genomes. These often include binding sites for architectural proteins, including the multifarious master protein, CTCF, which shapes the 3D genome by creating loops, domains, compartment borders, and RNA-DNA interactions. These play a role in the compact packaging of DNA and have the potential to facilitate regulatory function. In this study, we explore the widespread contribution of TEs to mammalian 3D genomes by quantifying the extent to which they give rise to loops and domain border differences across various cell types and species using several 3D genome mapping technologies. We show that specific families and subfamilies of TEs have contributed to lineage-specific 3D chromatin structures across mammalian species. In many cases, these loops may facilitate sustained interaction between distant cis-regulatory elements and target genes, and domains may segregate chromatin state to impact gene expression in a lineage-specific manner. An experimental validation of our analytical findings using CRISPR-Cas9 to delete a candidate TE resulted in disruption of species-specific 3D chromatin structure. Taken together, we comprehensively quantify and selectively validate our finding that TEs contribute to shaping 3D genome organization and may, in some cases, impact gene regulation during the course of mammalian evolution.
Affiliation(s)
- Mayank N K Choudhary
- Center for Genome Sciences & Systems Biology, Washington University in St. Louis, St. Louis, MO, 63110, USA
- Department of Genetics, Washington University in St. Louis, St. Louis, MO, 63110, USA
- Kara Quaid
- Center for Genome Sciences & Systems Biology, Washington University in St. Louis, St. Louis, MO, 63110, USA
- Department of Genetics, Washington University in St. Louis, St. Louis, MO, 63110, USA
- Xiaoyun Xing
- Center for Genome Sciences & Systems Biology, Washington University in St. Louis, St. Louis, MO, 63110, USA
- Department of Genetics, Washington University in St. Louis, St. Louis, MO, 63110, USA
- Heather Schmidt
- Center for Genome Sciences & Systems Biology, Washington University in St. Louis, St. Louis, MO, 63110, USA
- Department of Genetics, Washington University in St. Louis, St. Louis, MO, 63110, USA
- Ting Wang
- Center for Genome Sciences & Systems Biology, Washington University in St. Louis, St. Louis, MO, 63110, USA.
- Department of Genetics, Washington University in St. Louis, St. Louis, MO, 63110, USA.
4
Hilsabeck TAU, Liu-Bryan R, Guo T, Wilson KA, Bose N, Raftery D, Beck JN, Lang S, Jin K, Nelson CS, Oron T, Stoller M, Promislow D, Brem RB, Terkeltaub R, Kapahi P. A fly GWAS for purine metabolites identifies human FAM214 homolog medusa, which acts in a conserved manner to enhance hyperuricemia-driven pathologies by modulating purine metabolism and the inflammatory response. GeroScience 2022; 44:2195-2211. [PMID: 35381951 PMCID: PMC9616999 DOI: 10.1007/s11357-022-00557-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/30/2021] [Accepted: 03/25/2022] [Indexed: 01/14/2023] Open
Abstract
Elevated serum urate (hyperuricemia) promotes crystalline monosodium urate tissue deposits and gout, with associated inflammation and increased mortality. To identify modifiers of uric acid pathologies, we performed a fly Genome-Wide Association Study (GWAS) on purine metabolites using the Drosophila Genetic Reference Panel strains. We tested the candidate genes using the Drosophila melanogaster model of hyperuricemia and uric acid crystallization ("concretion formation") in the kidney-like Malpighian tubule. Medusa (mda) activity increased urate levels and inflammatory response programming. Conversely, whole-body mda knockdown decreased purine synthesis precursor phosphoribosyl pyrophosphate, uric acid, and guanosine levels; limited formation of aggregated uric acid concretions; and was sufficient to rescue lifespan reduction in the fly hyperuricemia and gout model. Levels of mda homolog FAM214A were elevated in inflammatory M1- and reduced in anti-inflammatory M2-differentiated mouse bone marrow macrophages, and influenced intracellular uric acid levels in human HepG2 transformed hepatocytes. In conclusion, mda/FAM214A acts in a conserved manner to regulate purine metabolism, promotes disease driven by hyperuricemia and associated tissue inflammation, and provides a potential novel target for uric acid-driven pathologies.
Affiliation(s)
- Tyler A U Hilsabeck
- Buck Institute for Research On Aging, 8001 Redwood Blvd., Novato, CA, 94945, USA
- Davis School of Gerontology, University of Southern California, University Park, Los Angeles, CA, 90007, USA
- Ru Liu-Bryan
- VA San Diego Healthcare System, 111K, 3350 La Jolla Village Drive, San Diego, CA, 92161, USA
- Department of Medicine, Division of Rheumatology, Allergy and Immunology, University of California San Diego, San Diego, CA, 92093, USA
- Tracy Guo
- VA San Diego Healthcare System, 111K, 3350 La Jolla Village Drive, San Diego, CA, 92161, USA
- Department of Medicine, Division of Rheumatology, Allergy and Immunology, University of California San Diego, San Diego, CA, 92093, USA
- Kenneth A Wilson
- Buck Institute for Research On Aging, 8001 Redwood Blvd., Novato, CA, 94945, USA
- Neelanjan Bose
- Buck Institute for Research On Aging, 8001 Redwood Blvd., Novato, CA, 94945, USA
- Daniel Raftery
- Northwest Metabolomics Research Center, Department of Anesthesiology and Pain Medicine, University of Washington, Seattle, WA, USA
- Jennifer N Beck
- Buck Institute for Research On Aging, 8001 Redwood Blvd., Novato, CA, 94945, USA
- Department of Urology, University of California, San Francisco, 400 Parnassus Avenue, Room A-632, San Francisco, CA, 94143, USA
- Sven Lang
- Department of Medical Biochemistry and Molecular Biology, Saarland University, Homburg, Germany
- Kelly Jin
- Allen Institute for Brain Science, Seattle, WA, 98109, USA
- Department of Pathology, University of Washington, Seattle, WA, 98195, USA
- Christopher S Nelson
- Buck Institute for Research On Aging, 8001 Redwood Blvd., Novato, CA, 94945, USA
- Tal Oron
- Buck Institute for Research On Aging, 8001 Redwood Blvd., Novato, CA, 94945, USA
- Marshall Stoller
- Department of Urology, University of California, San Francisco, 400 Parnassus Avenue, Room A-632, San Francisco, CA, 94143, USA
- Daniel Promislow
- Department of Pathology, University of Washington, Seattle, WA, 98195, USA
- Department of Biology, University of Washington, Seattle, WA, 98195, USA
- Rachel B Brem
- Buck Institute for Research On Aging, 8001 Redwood Blvd., Novato, CA, 94945, USA
- Davis School of Gerontology, University of Southern California, University Park, Los Angeles, CA, 90007, USA
- Department of Plant and Microbial Biology, University of California, Berkeley, 111 Koshland Hall, Berkeley, CA, 94720, USA
- Robert Terkeltaub
- VA San Diego Healthcare System, 111K, 3350 La Jolla Village Drive, San Diego, CA, 92161, USA
- Department of Medicine, Division of Rheumatology, Allergy and Immunology, University of California San Diego, San Diego, CA, 92093, USA
- Pankaj Kapahi
- Buck Institute for Research On Aging, 8001 Redwood Blvd., Novato, CA, 94945, USA.
- Davis School of Gerontology, University of Southern California, University Park, Los Angeles, CA, 90007, USA.
- Department of Urology, University of California, San Francisco, 400 Parnassus Avenue, Room A-632, San Francisco, CA, 94143, USA.
5
Ahsan F, Yan Z, Precup D, Blanchette M. PhyloPGM: boosting regulatory function prediction accuracy using evolutionary information. Bioinformatics 2022; 38:i299-i306. [PMID: 35758792 PMCID: PMC9235490 DOI: 10.1093/bioinformatics/btac259] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/12/2022] Open
Abstract
Motivation: The computational prediction of regulatory function associated with a genomic sequence is of great importance in -omics studies, as it facilitates our understanding of the mechanisms underpinning the vast gene regulatory network. Prominent examples in this area include the binding prediction of transcription factors in DNA regulatory regions, and predicting RNA–protein interaction in the context of post-transcriptional gene expression. However, existing computational methods have suffered from high false-positive rates and have seldom used any evolutionary information, despite the vast amount of available orthologous data across multitudes of extant and ancestral genomes, which readily present an opportunity to improve the accuracy of existing computational methods. Results: In this study, we present a novel probabilistic approach called PhyloPGM that leverages previously trained TFBS or RNA–RBP binding predictors by aggregating their predictions from various orthologous regions, in order to boost the overall prediction accuracy on human sequences. Throughout our experiments, PhyloPGM has shown significant improvement over baselines such as the sequence-based RNA–RBP binding predictor RNATracker and the sequence-based TFBS predictor FactorNet. PhyloPGM is simple in principle and easy to implement, and yet yields impressive results. Availability and implementation: The PhyloPGM package is available at https://github.com/BlanchetteLab/PhyloPGM. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Faizy Ahsan
- School of Computer Science, McGill University, Montreal H3A 0G4, Canada
- Zichao Yan
- School of Computer Science, McGill University, Montreal H3A 0G4, Canada
- Doina Precup
- School of Computer Science, McGill University, Montreal H3A 0G4, Canada
6
Krinsky BH, Arthur RK, Xia S, Sosa D, Arsala D, White KP, Long M. Rapid Cis-Trans Coevolution Driven by a Novel Gene Retroposed from a Eukaryotic Conserved CCR4-NOT Component in Drosophila. Genes (Basel) 2021; 13:57. [PMID: 35052398 PMCID: PMC8774992 DOI: 10.3390/genes13010057] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/20/2021] [Revised: 12/10/2021] [Accepted: 12/23/2021] [Indexed: 12/11/2022] Open
Abstract
Young, or newly evolved, genes arise ubiquitously across the tree of life, and they can rapidly acquire novel functions that influence a diverse array of biological processes. Previous work identified a young regulatory duplicate gene in Drosophila, Zeus, that unexpectedly diverged rapidly from its parent, Caf40, an extremely conserved component in the CCR4-NOT machinery in post-transcriptional and post-translational regulation of eukaryotic cells, and took on roles in the male reproductive system. This neofunctionalization was accompanied by differential binding of the Zeus protein to loci throughout the Drosophila melanogaster genome. However, the way in which new DNA-binding proteins acquire and coevolve with their targets in the genome is not understood. Here, by comparing Zeus ChIP-Seq data from D. melanogaster and D. simulans to the ancestral Caf40 binding events from D. yakuba, a species that diverged before the duplication event, we found a dynamic pattern in which Zeus binding rapidly coevolved with a previously unknown DNA motif, which we term Caf40 and Zeus-Associated Motif (CAZAM), under the influence of positive selection. Interestingly, while both copies of Zeus acquired targets at male-biased and testis-specific genes, D. melanogaster and D. simulans proteins have specialized binding on different chromosomes, a pattern echoed in the evolution of the associated motif. Using CRISPR-Cas9-mediated gene knockout of Zeus and RNA-Seq, we found that Zeus regulated the expression of 661 differentially expressed genes (DEGs). Our results suggest that the evolution of young regulatory genes can be coupled to substantial rewiring of the transcriptional networks into which they integrate, even over short evolutionary timescales. Our results thus uncover dynamic genome-wide evolutionary processes associated with new genes.
Affiliation(s)
- Benjamin H. Krinsky
- Committee on Evolutionary Biology, University of Chicago, Chicago, IL 60637, USA
- Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
- Robert K. Arthur
- Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
- Institute for Genomics and Systems Biology, Department of Human Genetics, University of Chicago and Argonne National Laboratory, Chicago, IL 60637, USA
- Shengqian Xia
- Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
- Dylan Sosa
- Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
- Deanna Arsala
- Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
- Kevin P. White
- Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
- Institute for Genomics and Systems Biology, Department of Human Genetics, University of Chicago and Argonne National Laboratory, Chicago, IL 60637, USA
- Manyuan Long
- Committee on Evolutionary Biology, University of Chicago, Chicago, IL 60637, USA
- Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
7
Landis GN, Hilsabeck TAU, Bell HS, Ronnen-Oron T, Wang L, Doherty DV, Tejawinata FI, Erickson K, Vu W, Promislow DEL, Kapahi P, Tower J. Mifepristone Increases Life Span of Virgin Female Drosophila on Regular and High-fat Diet Without Reducing Food Intake. Front Genet 2021; 12:751647. [PMID: 34659367 PMCID: PMC8511958 DOI: 10.3389/fgene.2021.751647] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/01/2021] [Accepted: 09/13/2021] [Indexed: 12/14/2022] Open
Abstract
Background: The synthetic steroid mifepristone is reported to have anti-obesity and anti-diabetic effects in mammals on normal and high-fat diets (HFD). We previously reported that mifepristone blocks the negative effect on life span caused by mating in female Drosophila melanogaster. Methods: Here we asked if mifepristone could protect virgin females from the life span-shortening effect of HFD. Mifepristone was assayed for effects on life span in virgin females, in repeated assays, on regular media and on media supplemented with coconut oil (HFD). The excrement quantification (EX-Q) assay was used to measure food intake of the flies after 12 days of mifepristone treatment. In addition, experiments were conducted to compare the effects of mifepristone in virgin and mated females, and to identify candidate mifepristone targets and mechanisms. Results: Mifepristone increased life span of virgin females on regular media, as well as on media supplemented with either 2.5 or 5% coconut oil. Food intake was not reduced in any assay, and was significantly increased by mifepristone in half of the assays. To ask if mifepristone might rescue virgin females from all life span-shortening stresses, the oxidative stressor paraquat was tested, and mifepristone produced little to no rescue. Analysis of extant metabolomics and transcriptomics data suggested similarities between effects of mifepristone in virgin and mated females, including reduced tryptophan breakdown and similarities to dietary restriction. Bioinformatics analysis identified candidate mifepristone targets, including transcription factors Paired and Extra-extra. In addition to shortening life span, mating also causes midgut hypertrophy and activation of the lipid metabolism regulatory factor SREBP. Mifepristone blocked the increase in midgut size caused by mating, but did not detectably affect midgut size in virgins. Finally, mating increased activity of a SREBP reporter in abdominal tissues, as expected, but reporter activity was not detectably reduced by mifepristone in either mated or virgin females. Conclusion: Mifepristone increases life span of virgin females on regular and HFD without reducing food intake. Metabolomics and transcriptomics analyses suggest some similar effects of mifepristone between virgin and mated females; however, reduced midgut size was observed only in mated females. The results are discussed regarding possible mifepristone mechanisms and targets.
Affiliation(s)
- Gary N. Landis
- Molecular and Computational Biology Section, Department of Biological Sciences, Dornsife College of Letters, Arts, and Sciences, University of Southern California, Los Angeles, CA, United States
- Tyler A. U. Hilsabeck
- Buck Institute for Research on Aging, Novato, CA, United States
- Davis School of Gerontology, University of Southern California, University Park, Los Angeles, CA, United States
- Hans S. Bell
- Molecular and Computational Biology Section, Department of Biological Sciences, Dornsife College of Letters, Arts, and Sciences, University of Southern California, Los Angeles, CA, United States
- Tal Ronnen-Oron
- Buck Institute for Research on Aging, Novato, CA, United States
- Lu Wang
- Department of Environmental and Occupational Health Sciences, University of Washington, Seattle, WA, United States
- Devon V. Doherty
- Molecular and Computational Biology Section, Department of Biological Sciences, Dornsife College of Letters, Arts, and Sciences, University of Southern California, Los Angeles, CA, United States
- Felicia I. Tejawinata
- Molecular and Computational Biology Section, Department of Biological Sciences, Dornsife College of Letters, Arts, and Sciences, University of Southern California, Los Angeles, CA, United States
- Katherine Erickson
- Molecular and Computational Biology Section, Department of Biological Sciences, Dornsife College of Letters, Arts, and Sciences, University of Southern California, Los Angeles, CA, United States
- William Vu
- Molecular and Computational Biology Section, Department of Biological Sciences, Dornsife College of Letters, Arts, and Sciences, University of Southern California, Los Angeles, CA, United States
- Daniel E. L. Promislow
- Department of Biology, University of Washington, Seattle, WA, United States
- Department of Laboratory Medicine and Pathology, University of Washington School of Medicine, Seattle, WA, United States
- Pankaj Kapahi
- Buck Institute for Research on Aging, Novato, CA, United States
- John Tower
- Molecular and Computational Biology Section, Department of Biological Sciences, Dornsife College of Letters, Arts, and Sciences, University of Southern California, Los Angeles, CA, United States
8
Kyrchanova O, Klimenko N, Postika N, Bonchuk A, Zolotarev N, Maksimenko O, Georgiev P. Drosophila architectural protein CTCF is not essential for fly survival and is able to function independently of CP190. Biochim Biophys Acta Gene Regul Mech 2021; 1864:194733. [PMID: 34311130 DOI: 10.1016/j.bbagrm.2021.194733] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/31/2021] [Revised: 07/15/2021] [Accepted: 07/15/2021] [Indexed: 12/20/2022]
Abstract
CTCF is the most likely ancestor of proteins that contain large clusters of C2H2 zinc finger domains (C2H2) and is conserved among most bilateral organisms. In mammals, CTCF functions as the main architectural protein involved in the organization of topology-associated domains (TADs). In vertebrates and Drosophila, CTCF is involved in the regulation of homeotic genes. Previously, it was found that flies carrying null mutations in the dCTCF gene died as pharate adults, which failed to eclose from their pupal case, or shortly after hatching of adults. Here, we obtained several new null dCTCF mutations and found that the effect of complete dCTCF inactivation appears to be limited mainly to phenotypic manifestations of the Abd-B gene and fertility of adult flies. Many modifiers that are not associated with an independent phenotypic manifestation can significantly enhance the expressivity of the null dCTCF mutations, indicating that other architectural proteins are able to functionally compensate for dCTCF inactivation in Drosophila. We also mapped the 715-735 aa region of dCTCF as being essential for the interaction with the BTB (Broad-Complex, Tramtrack, and Bric a brac) and microtubule-targeting (M) domains of the CP190 protein, which binds to many architectural proteins. However, the mutational analysis showed that the interaction with CP190 was not important for the functional activity of dCTCF in vivo.
Affiliation(s)
- Olga Kyrchanova
- Department of the Control of Genetic Processes, Institute of Gene Biology, Russian Academy of Sciences, 34/5 Vavilov St., Moscow 119334, Russia; Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Institute of Gene Biology, Russian Academy of Sciences, 34/5 Vavilov St., Moscow 119334, Russia
- Natalia Klimenko
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Institute of Gene Biology, Russian Academy of Sciences, 34/5 Vavilov St., Moscow 119334, Russia
- Nikolay Postika
- Department of the Control of Genetic Processes, Institute of Gene Biology, Russian Academy of Sciences, 34/5 Vavilov St., Moscow 119334, Russia
- Artem Bonchuk
- Department of the Control of Genetic Processes, Institute of Gene Biology, Russian Academy of Sciences, 34/5 Vavilov St., Moscow 119334, Russia; Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Institute of Gene Biology, Russian Academy of Sciences, 34/5 Vavilov St., Moscow 119334, Russia
- Nikolay Zolotarev
- Department of the Control of Genetic Processes, Institute of Gene Biology, Russian Academy of Sciences, 34/5 Vavilov St., Moscow 119334, Russia
- Oksana Maksimenko
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Institute of Gene Biology, Russian Academy of Sciences, 34/5 Vavilov St., Moscow 119334, Russia
- Pavel Georgiev
- Department of the Control of Genetic Processes, Institute of Gene Biology, Russian Academy of Sciences, 34/5 Vavilov St., Moscow 119334, Russia.
9
Biswas A, Narlikar L. A universal framework for detecting cis-regulatory diversity in DNA regulatory regions. Genome Res 2021; 31:1646-1662. [PMID: 34285090 PMCID: PMC8415372 DOI: 10.1101/gr.274563.120] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/22/2020] [Accepted: 07/09/2021] [Indexed: 12/02/2022]
Abstract
High-throughput sequencing-based assays measure different biochemical activities pertaining to gene regulation, genome-wide. These activities include transcription factor (TF)–DNA binding, enhancer activity, open chromatin, and more. A major goal is to understand underlying sequence components, or motifs, that can explain the measured activity. It is usually not one motif but a combination of motifs bound by cooperatively acting proteins that confers activity to such regions. Furthermore, regions can be diverse, governed by different combinations of TFs/motifs. Current approaches do not take into account this issue of combinatorial diversity. We present a new statistical framework, cisDIVERSITY, which models regions as diverse modules characterized by combinations of motifs while simultaneously learning the motifs themselves. Because cisDIVERSITY does not rely on knowledge of motifs, modules, cell type, or organism, it is general enough to be applied to regions reported by most high-throughput assays. For example, in enhancer predictions resulting from different assays—GRO-cap, STARR-seq, and those measuring chromatin structure—cisDIVERSITY discovers distinct modules and combinations of TF binding sites, some specific to the assay. From protein–DNA binding data, cisDIVERSITY identifies potential cofactors of the profiled TF, whereas from ATAC-seq data, it identifies tissue-specific regulatory modules. Finally, analysis of single-cell ATAC-seq data suggests that regions open in one cell-state encode information about future states, with certain modules staying open and others closing down in the next time point.
Affiliation(s)
- Anushua Biswas
- CSIR-National Chemical Laboratory, Academy of Scientific and Innovative Research
- Leelavati Narlikar
- CSIR-National Chemical Laboratory, Academy of Scientific and Innovative Research
10
Liao Y, Zhang X, Chakraborty M, Emerson JJ. Topologically associating domains and their role in the evolution of genome structure and function in Drosophila. Genome Res 2021; 31:397-410. [PMID: 33563719 PMCID: PMC7919452 DOI: 10.1101/gr.266130.120] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 05/17/2020] [Accepted: 12/24/2020] [Indexed: 12/18/2022]
Abstract
Topologically associating domains (TADs) were recently identified as fundamental units of three-dimensional eukaryotic genomic organization, although our knowledge of the influence of TADs on genome evolution remains preliminary. To study the molecular evolution of TADs in Drosophila species, we constructed a new reference-grade genome assembly and accompanying high-resolution TAD map for D. pseudoobscura. Comparison of D. pseudoobscura and D. melanogaster, which are separated by ∼49 million years of divergence, showed that ∼30%-40% of their genomes retain conserved TADs. Comparative genomic analysis of 17 Drosophila species revealed that chromosomal rearrangement breakpoints are enriched at TAD boundaries but depleted within TADs. Additionally, genes within conserved TADs show lower expression divergence than those located in nonconserved TADs. Furthermore, we found that a substantial proportion of long genes (>50 kbp) in D. melanogaster (42%) and D. pseudoobscura (26%) constitute their own TADs, implying transcript structure may be one of the deterministic factors for TAD formation. By using structural variants (SVs) identified from 14 D. melanogaster strains, its three closest sibling species from the D. simulans species complex, and two obscura clade species, we uncovered evidence of selection acting on SVs at TAD boundaries, but with the nature of selection differing between SV types. Deletions are depleted at TAD boundaries in both divergent and polymorphic SVs, suggesting purifying selection, whereas divergent tandem duplications are enriched at TAD boundaries relative to polymorphism, suggesting they are adaptive. Our findings highlight how important TADs are in shaping the acquisition and retention of structural mutations that fundamentally alter genome organization.
Affiliation(s)
- Yi Liao
- Department of Ecology and Evolutionary Biology, University of California, Irvine, California 92697, USA
| | - Xinwen Zhang
- Department of Ecology and Evolutionary Biology, University of California, Irvine, California 92697, USA
| | - Mahul Chakraborty
- Department of Ecology and Evolutionary Biology, University of California, Irvine, California 92697, USA
| | - J J Emerson
- Department of Ecology and Evolutionary Biology, University of California, Irvine, California 92697, USA.,Center for Complex Biological Systems, University of California, Irvine, California 92697, USA
| |
|
11
|
Liu J, Robinson-Rechavi M. Robust inference of positive selection on regulatory sequences in the human brain. SCIENCE ADVANCES 2020; 6:eabc9863. [PMID: 33246961 PMCID: PMC7695467 DOI: 10.1126/sciadv.abc9863] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 05/25/2020] [Accepted: 10/16/2020] [Indexed: 05/07/2023]
Abstract
A longstanding hypothesis is that divergence between humans and chimpanzees might have been driven more by regulatory level adaptations than by protein sequence adaptations. This has especially been suggested for regulatory adaptations in the evolution of the human brain. We present a new method to detect positive selection on transcription factor binding sites on the basis of measuring predicted affinity change with a machine learning model of binding. Unlike other methods, this approach requires neither defining a priori neutral sites nor detecting accelerated evolution, thus removing major sources of bias. We scanned the signals of positive selection for CTCF binding sites in 29 human and 11 mouse tissues or cell types. We found that human brain-related cell types have the highest proportion of positive selection. This result is consistent with the view that adaptive evolution to gene regulation has played an important role in evolution of the human brain.
Affiliation(s)
- Jialin Liu
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland.
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Marc Robinson-Rechavi
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland.
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| |
|
12
|
Mourad R. Studying 3D genome evolution using genomic sequence. Bioinformatics 2020; 36:1367-1373. [PMID: 31605131 DOI: 10.1093/bioinformatics/btz775] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/05/2019] [Revised: 10/03/2019] [Accepted: 10/08/2019] [Indexed: 01/09/2023] Open
Abstract
MOTIVATION: The three-dimensional (3D) genome is essential to numerous key processes such as the regulation of gene expression and the replication-timing program. In vertebrates, chromatin looping is often mediated by CTCF, and marked by CTCF motif pairs in convergent orientation. Comparative high-throughput sequencing technique (Hi-C) recently revealed that chromatin looping evolves across species. However, Hi-C experiments are complex and costly, which currently limits their use for evolutionary studies over a large number of species. RESULTS: Here, we propose a novel approach to study the 3D genome evolution in vertebrates using the genomic sequence only, i.e. without the need for Hi-C data. The approach is simple and relies on comparing the distances between convergent and divergent CTCF motifs by computing a ratio we named the 3D ratio or '3DR'. We show that 3DR is a powerful statistic to detect CTCF looping encoded in the human genome sequence, thus reflecting strong evolutionary constraints encoded in DNA and associated with the 3D genome. When comparing vertebrate genomes, our results reveal that 3DR, which underlies CTCF looping and topologically associating domain organization, evolves over time and suggest that ancestral character reconstruction can be used to infer 3DR in ancestral genomes. AVAILABILITY AND IMPLEMENTATION: The R code is available at https://github.com/morphos30/PhyloCTCFLooping. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Raphaël Mourad
- LBCMCP, Centre de Biologie Intégrative (CBI), Université de Toulouse, CNRS, UPS, 31062 Toulouse, France
| |
|
13
|
Wu Q, Liu P, Wang L. Many facades of CTCF unified by its coding for three-dimensional genome architecture. J Genet Genomics 2020; 47:407-424. [PMID: 33187878 DOI: 10.1016/j.jgg.2020.06.008] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 01/08/2020] [Revised: 04/15/2020] [Accepted: 06/01/2020] [Indexed: 02/06/2023]
Abstract
CCCTC-binding factor (CTCF) is a multifunctional zinc finger protein that is conserved in metazoan species. CTCF is consistently found to play an important role in many diverse biological processes. CTCF/cohesin-mediated active chromatin 'loop extrusion' architects three-dimensional (3D) genome folding. The 3D architectural role of CTCF underlies its multifarious functions, including developmental regulation of gene expression, protocadherin (Pcdh) promoter choice in the nervous system, immunoglobulin (Ig) and T-cell receptor (Tcr) V(D)J recombination in the immune system, homeobox (Hox) gene control during limb development, as well as many other aspects of biology. Here, we review the pleiotropic functions of CTCF from the perspective of its essential role in 3D genome architecture and topological promoter/enhancer selection. We envision the 3D genome as an enormous complex architecture, with tens of thousands of CTCF sites as connecting nodes and CTCF proteins as mysterious bonds that glue together genomic building parts with distinct articulation joints. In particular, we focus on the internal mechanisms by which CTCF controls higher order chromatin structures that manifest its many façades of physiological and pathological functions. We also discuss the dichotomic role of CTCF sites as intriguing 3D genome nodes for seemingly contradictory 'looping bridges' and 'topological insulators' to frame a beautiful magnificent house for a cell's nuclear home.
Affiliation(s)
- Qiang Wu
- MOE Key Lab of Systems Biomedicine, State Key Laboratory of Oncogenes and Related Genes, Center for Comparative Biomedicine, Institute of Systems Biomedicine, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University (SJTU), Shanghai, 200240, China.
| | - Peifeng Liu
- MOE Key Lab of Systems Biomedicine, State Key Laboratory of Oncogenes and Related Genes, Center for Comparative Biomedicine, Institute of Systems Biomedicine, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University (SJTU), Shanghai, 200240, China
| | - Leyang Wang
- MOE Key Lab of Systems Biomedicine, State Key Laboratory of Oncogenes and Related Genes, Center for Comparative Biomedicine, Institute of Systems Biomedicine, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University (SJTU), Shanghai, 200240, China
| |
|
14
|
Gain of transcription factor binding sites is associated to changes in the expression signature of human brain and testis and is correlated to genes with higher expression breadth. SCIENCE CHINA-LIFE SCIENCES 2019; 62:526-534. [PMID: 30919278 DOI: 10.1007/s11427-018-9454-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2018] [Accepted: 10/15/2018] [Indexed: 11/26/2022]
Abstract
The gain of transcription factor binding sites (TFBS) is believed to represent one of the major causes of biological innovation. Here we used strategies based on comparative genomics to identify 21,822 TFBS specific to the human lineage (TFBS-HS), when compared to chimpanzee and gorilla genomes. More than 40% (9,206) of these TFBS-HS are in the vicinity of 1,283 genes. A comparison of the expression pattern of these genes and the corresponding orthologs in chimpanzee and gorilla identified genes differentially expressed in human tissues. These genes show a more divergent expression pattern in the human testis and brain, suggesting a role for positive selection in the fixation of TFBS gains. Genes associated with TFBS-HS were enriched in gene ontology categories related to transcriptional regulation, signaling, differentiation/development and nervous system. Furthermore, genes associated with TFBS-HS present a higher expression breadth when compared to genes in general. This biased distribution is due to a preferential gain of TFBS in genes with higher expression breadth rather than a shift in the expression pattern after the gain of TFBS.
|
15
|
Werner MS, Sieriebriennikov B, Prabh N, Loschko T, Lanz C, Sommer RJ. Young genes have distinct gene structure, epigenetic profiles, and transcriptional regulation. Genome Res 2018; 28:1675-1687. [PMID: 30232198 PMCID: PMC6211652 DOI: 10.1101/gr.234872.118] [Citation(s) in RCA: 43] [Impact Index Per Article: 7.2] [Received: 01/24/2018] [Accepted: 09/05/2018] [Indexed: 12/22/2022]
Abstract
Species-specific, new, or "orphan" genes account for 10%-30% of eukaryotic genomes. Although initially considered to have limited function, an increasing number of orphan genes have been shown to provide important phenotypic innovation. How new genes acquire regulatory sequences for proper temporal and spatial expression is unknown. Orphan gene regulation may rely in part on origination in open chromatin adjacent to preexisting promoters, although this has not yet been assessed by genome-wide analysis of chromatin states. Here, we combine taxon-rich nematode phylogenies with Iso-Seq, RNA-seq, ChIP-seq, and ATAC-seq to identify the gene structure and epigenetic signature of orphan genes in the satellite model nematode Pristionchus pacificus. Consistent with previous findings, we find young genes are shorter, contain fewer exons, and are on average less strongly expressed than older genes. However, the subset of orphan genes that are expressed exhibit distinct chromatin states from similarly expressed conserved genes. Orphan gene transcription is determined by a lack of repressive histone modifications, confirming long-held hypotheses that open chromatin is important for new gene formation. Yet orphan gene start sites more closely resemble enhancers defined by H3K4me1, H3K27ac, and ATAC-seq peaks, in contrast to conserved genes that exhibit traditional promoters defined by H3K4me3 and H3K27ac. Although the majority of orphan genes are located on chromosome arms that contain high recombination rates and repressive histone marks, strongly expressed orphan genes are more randomly distributed. Our results support a model of new gene origination by rare integration into open chromatin near enhancers.
Affiliation(s)
- Michael S Werner
- Department of Evolutionary Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
| | - Bogdan Sieriebriennikov
- Department of Evolutionary Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
| | - Neel Prabh
- Department of Evolutionary Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
| | - Tobias Loschko
- Department of Evolutionary Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
| | - Christa Lanz
- Department of Evolutionary Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
| | - Ralf J Sommer
- Department of Evolutionary Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
| |
|
16
|
Mitra S, Biswas A, Narlikar L. DIVERSITY in binding, regulation, and evolution revealed from high-throughput ChIP. PLoS Comput Biol 2018; 14:e1006090. [PMID: 29684008 PMCID: PMC5933800 DOI: 10.1371/journal.pcbi.1006090] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 09/07/2017] [Revised: 05/03/2018] [Accepted: 03/14/2018] [Indexed: 12/27/2022] Open
Abstract
Genome-wide in vivo protein-DNA interactions are routinely mapped using high-throughput chromatin immunoprecipitation (ChIP). ChIP-reported regions are typically investigated for enriched sequence-motifs, which are likely to model the DNA-binding specificity of the profiled protein and/or of co-occurring proteins. However, simple enrichment analyses can miss insights into the binding-activity of the protein. Note that ChIP reports regions making direct contact with the protein as well as those binding through intermediaries. For example, consider a ChIP experiment targeting protein X, which binds DNA at its cognate sites, but simultaneously interacts with four other proteins. Each of these proteins also binds to its own specific cognate sites along distant parts of the genome, a scenario consistent with the current view of transcriptional hubs and chromatin loops. Since ChIP will pull down all X-associated regions, the final reported data will be a union of five distinct sets of regions, each containing binding sites of one of the five proteins, respectively. Characterizing all five different motifs and the corresponding sets is important to interpret the ChIP experiment and ultimately, the role of X in regulation. We present DIVERSITY, which attempts exactly this: it partitions the data so that each partition can be characterized with its own de novo motif. DIVERSITY uses a Bayesian approach to identify the optimal number of motifs and the associated partitions, which together explain the entire dataset. This is in contrast to standard motif finders, which report motifs individually enriched in the data, but do not necessarily explain all reported regions. We show that the different motifs and associated regions identified by DIVERSITY give insights into the various complexes that may be forming along the chromatin, something that has so far not been attempted from ChIP data. Webserver at http://diversity.ncl.res.in/; standalone (Mac OS X/Linux) from https://github.com/NarlikarLab/DIVERSITY/releases/tag/v1.0.0.
A high-throughput chromatin immunoprecipitation (ChIP) experiment identifies genomic regions bound by a protein in vivo. Current motif-discovery approaches seek an enriched motif signature in the reported regions, which they can attribute to the protein’s binding preferences. However, DIVERSITY models the fact that since a ChIP experiment pulls down regions participating in all complexes involving the profiled protein, the reported regions are, in all likelihood, a collection of different types of protein-DNA contacts. DIVERSITY asks a different question: what sequence component caused a specific region to be reported in a ChIP experiment? The answer, in combination with additional data such as sequence conservation, SNPs, chromatin structure, downstream gene-expression, etc. can yield insights into the diverse regulatory mechanisms at play. The added benefits of a webserver and a standalone parallel version make DIVERSITY a practical tool for discovering new biology from ChIP experiments.
Affiliation(s)
- Sneha Mitra
- Department of Chemical Engineering, CSIR-National Chemical Laboratory, Pune, India
| | - Anushua Biswas
- Department of Chemical Engineering, CSIR-National Chemical Laboratory, Pune, India
| | - Leelavati Narlikar
- Department of Chemical Engineering, CSIR-National Chemical Laboratory, Pune, India
| |
|
17
|
Marinov GK, Kundaje A. ChIP-ping the branches of the tree: functional genomics and the evolution of eukaryotic gene regulation. Brief Funct Genomics 2018; 17:116-137. [PMID: 29529131 PMCID: PMC5889016 DOI: 10.1093/bfgp/ely004] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 12/23/2022] Open
Abstract
Advances in the methods for detecting protein-DNA interactions have played a key role in determining the directions of research into the mechanisms of transcriptional regulation. The most recent major technological transformation happened a decade ago, with the move from using tiling arrays [chromatin immunoprecipitation (ChIP)-on-Chip] to high-throughput sequencing (ChIP-seq) as a readout for ChIP assays. In addition to the numerous other ways in which it is superior to arrays, by eliminating the need to design and manufacture them, sequencing also opened the door to carrying out comparative analyses of genome-wide transcription factor occupancy across species and studying chromatin biology in previously less accessible model and nonmodel organisms, thus allowing us to understand the evolution and diversity of regulatory mechanisms in unprecedented detail. Here, we review the biological insights obtained from such studies in recent years and discuss anticipated future developments in the field.
Affiliation(s)
- Georgi K Marinov
- Corresponding author: Georgi K. Marinov, Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | | |
|
18
|
Khoueiry P, Girardot C, Ciglar L, Peng PC, Gustafson EH, Sinha S, Furlong EE. Uncoupling evolutionary changes in DNA sequence, transcription factor occupancy and enhancer activity. eLife 2017; 6. [PMID: 28792889 PMCID: PMC5550276 DOI: 10.7554/elife.28440] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 05/08/2017] [Accepted: 07/21/2017] [Indexed: 12/15/2022] Open
Abstract
Sequence variation within enhancers plays a major role in both evolution and disease, yet its functional impact on transcription factor (TF) occupancy and enhancer activity remains poorly understood. Here, we assayed the binding of five essential TFs over multiple stages of embryogenesis in two distant Drosophila species (with 1.4 substitutions per neutral site), identifying thousands of orthologous enhancers with conserved or diverged combinatorial occupancy. We used these binding signatures to dissect two properties of developmental enhancers: (1) potential TF cooperativity, using signatures of co-associations and co-divergence in TF occupancy. This revealed conserved combinatorial binding despite sequence divergence, suggesting protein-protein interactions sustain conserved collective occupancy. (2) Enhancer in-vivo activity, revealing orthologous enhancers with conserved activity despite divergence in TF occupancy. Taken together, we identify enhancers with diverged motifs yet conserved occupancy and others with diverged occupancy yet conserved activity, emphasising the need to functionally measure the effect of divergence on enhancer activity.
Affiliation(s)
- Pierre Khoueiry
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
| | - Charles Girardot
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
| | - Lucia Ciglar
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
| | - Pei-Chen Peng
- Carl R. Woese Institute of Genomic Biology, University of Illinois, Champaign, United States
| | - E Hilary Gustafson
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
| | - Saurabh Sinha
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany.,Carl R. Woese Institute of Genomic Biology, University of Illinois, Champaign, United States
| | - Eileen Em Furlong
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
| |
|
19
|
Gerland TA, Sun B, Smialowski P, Lukacs A, Thomae AW, Imhof A. The Drosophila speciation factor HMR localizes to genomic insulator sites. PLoS One 2017; 12:e0171798. [PMID: 28207793 PMCID: PMC5312933 DOI: 10.1371/journal.pone.0171798] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 12/09/2016] [Accepted: 01/26/2017] [Indexed: 12/22/2022] Open
Abstract
Hybrid incompatibility between Drosophila melanogaster and D. simulans is caused by a lethal interaction of the proteins encoded by the Hmr and Lhr genes. In D. melanogaster the loss of HMR results in mitotic defects, an increase in transcription of transposable elements and a deregulation of heterochromatic genes. To better understand the molecular mechanisms that mediate HMR’s function, we measured genome-wide localization of HMR in D. melanogaster tissue culture cells by chromatin immunoprecipitation. Interestingly, we find HMR localizing to genomic insulator sites that can be classified into two groups. One group belongs to gypsy insulators and another one borders HP1a bound regions at active genes. The transcription of the latter group genes is strongly affected in larvae and ovaries of Hmr mutant flies. Our data suggest a novel link between HMR and insulator proteins, a finding that implicates a potential role for genome organization in the formation of species.
Affiliation(s)
- Thomas Andreas Gerland
- Biomedical Center, Histone Modifications Group, Department of Molecular Biology, Ludwig-Maximilians-Universität München, Planegg-Martinsried, Germany
- Center for Integrated Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Munich, Germany
| | - Bo Sun
- Biomedical Center, Histone Modifications Group, Department of Molecular Biology, Ludwig-Maximilians-Universität München, Planegg-Martinsried, Germany
| | - Pawel Smialowski
- Biomedical Center, Histone Modifications Group, Department of Molecular Biology, Ludwig-Maximilians-Universität München, Planegg-Martinsried, Germany
- Biomedical Center, Core Facility Computational Biology, Ludwig-Maximilians-Universität München, Planegg-Martinsried, Germany
| | - Andrea Lukacs
- Biomedical Center, Histone Modifications Group, Department of Molecular Biology, Ludwig-Maximilians-Universität München, Planegg-Martinsried, Germany
| | - Andreas Walter Thomae
- Biomedical Center, Histone Modifications Group, Department of Molecular Biology, Ludwig-Maximilians-Universität München, Planegg-Martinsried, Germany
- Biomedical Center, Core Facility Bioimaging, Ludwig-Maximilians-Universität München, Planegg-Martinsried, Germany
| | - Axel Imhof
- Biomedical Center, Histone Modifications Group, Department of Molecular Biology, Ludwig-Maximilians-Universität München, Planegg-Martinsried, Germany
- Center for Integrated Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Munich, Germany
| |
|
20
|
Ma L, Zhao B, Chen K, Thomas A, Tuteja JH, He X, He C, White KP. Evolution of transcript modification by N6-methyladenosine in primates. Genome Res 2017; 27:385-392. [PMID: 28052920 PMCID: PMC5340966 DOI: 10.1101/gr.212563.116] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Received: 07/08/2016] [Accepted: 12/19/2016] [Indexed: 11/24/2022]
Abstract
Phenotypic differences within populations and between closely related species are often driven by variation and evolution of gene expression. However, most analyses have focused on the effects of genomic variation at cis-regulatory elements such as promoters and enhancers that control transcriptional activity, and little is understood about the influence of post-transcriptional processes on transcript evolution. Post-transcriptional modification of RNA by N6-methyladenosine (m6A) has been shown to be widespread throughout the transcriptome, and this reversible mark can affect transcript stability and translation dynamics. Here we analyze m6A mRNA modifications in lymphoblastoid cell lines (LCLs) from human, chimpanzee and rhesus, and we identify patterns of m6A evolution among species. We find that m6A evolution occurs in parallel with evolution of consensus RNA sequence motifs known to be associated with the enzymatic complexes that regulate m6A dynamics, and expression evolution of m6A-modified genes occurs in parallel with m6A evolution.
Affiliation(s)
- Lijia Ma
- Institute for Genomics and Systems Biology, The University of Chicago, Chicago, Illinois 60637, USA.,Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
| | - Boxuan Zhao
- Department of Chemistry, The University of Chicago, Chicago, Illinois 60637, USA.,Department of Biochemistry and Molecular Biology and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, USA.,Howard Hughes Medical Institute, The University of Chicago, Chicago, Illinois 60637, USA
| | - Kai Chen
- Department of Chemistry, The University of Chicago, Chicago, Illinois 60637, USA.,Department of Biochemistry and Molecular Biology and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, USA
| | - Amber Thomas
- Institute for Genomics and Systems Biology, The University of Chicago, Chicago, Illinois 60637, USA.,Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
| | - Jigyasa H Tuteja
- Institute for Genomics and Systems Biology, The University of Chicago, Chicago, Illinois 60637, USA.,Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
| | - Xin He
- Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
| | - Chuan He
- Department of Chemistry, The University of Chicago, Chicago, Illinois 60637, USA.,Department of Biochemistry and Molecular Biology and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, USA.,Howard Hughes Medical Institute, The University of Chicago, Chicago, Illinois 60637, USA
| | - Kevin P White
- Institute for Genomics and Systems Biology, The University of Chicago, Chicago, Illinois 60637, USA.,Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA.,Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois 60637, USA.,Tempus Health, Incorporated, Chicago, Illinois 60654, USA
| |
|
21
|
|
22
|
Abstract
As genes originate at different evolutionary times, they harbor distinctive genomic signatures of evolutionary ages. Although previous studies have investigated different gene age-related signatures, what signatures dominantly associate with gene age remains unresolved. Here we address this question via a combined approach of comprehensive assignment of gene ages, gene family identification, and multivariate analyses. We first provide a comprehensive and improved gene age assignment by combining homolog clustering with phylogeny inference and categorize human genes into 26 age classes spanning the whole tree of life. We then explore the dominant age-related signatures based on a collection of 10 potential signatures (including gene composition, gene length, selection pressure, expression level, connectivity in protein–protein interaction network and DNA methylation). Our results show that GC content and connectivity in protein–protein interaction network (PPIN) associate dominantly with gene age. Furthermore, we investigate the heterogeneity of dominant signatures in duplicates and singletons. We find that GC content is a consistent primary factor of gene age in duplicates and singletons, whereas PPIN is more strongly associated with gene age in singletons than in duplicates. Taken together, GC content and PPIN are two dominant signatures in close association with gene age, exhibiting heterogeneity in duplicates and singletons and presumably reflecting complex differential interplays between natural selection and mutation.
Affiliation(s)
- Hongyan Yin
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China
- BIG Data Center, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Guangyu Wang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China
- BIG Data Center, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Lina Ma
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China
- BIG Data Center, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China
| | - Soojin V Yi
- School of Biology, Georgia Institute of Technology, Atlanta
| | - Zhang Zhang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China
- BIG Data Center, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| |
|
23
|
Laarits T, Bordalo P, Lemos B. Genes under weaker stabilizing selection increase network evolvability and rapid regulatory adaptation to an environmental shift. J Evol Biol 2016; 29:1602-16. [DOI: 10.1111/jeb.12897] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 01/22/2016] [Revised: 05/03/2016] [Accepted: 05/13/2016] [Indexed: 11/28/2022]
Affiliation(s)
| | - P. Bordalo
- Department of Systems Biology; Harvard Medical School; Boston MA USA
| | - B. Lemos
- Program in Molecular and Integrative Physiological Sciences; Department of Environmental Health; Harvard T. H. Chan School of Public Health; Boston MA USA
| |
|
24
|
Carvunis AR, Wang T, Skola D, Yu A, Chen J, Kreisberg JF, Ideker T. Evidence for a common evolutionary rate in metazoan transcriptional networks. eLife 2015; 4. [PMID: 26682651 PMCID: PMC4764585 DOI: 10.7554/elife.11615] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 09/15/2015] [Accepted: 12/17/2015] [Indexed: 12/13/2022] Open
Abstract
Genome sequences diverge more rapidly in mammals than in other animal lineages, such as birds or insects. However, the effect of this rapid divergence on transcriptional evolution remains unclear. Recent reports have indicated a faster divergence of transcription factor binding in mammals than in insects, but others found the reverse for mRNA expression. Here, we show that these conflicting interpretations resulted from differing methodologies. We performed an integrated analysis of transcriptional network evolution by examining mRNA expression, transcription factor binding and cis-regulatory motifs across >25 animal species, including mammals, birds and insects. Strikingly, we found that transcriptional networks evolve at a common rate across the three animal lineages. Furthermore, differences in rates of genome divergence were greatly reduced when restricting comparisons to chromatin-accessible sequences. The evolution of transcription is thus decoupled from the global rate of genome sequence evolution, suggesting that a small fraction of the genome regulates transcription. DOI: http://dx.doi.org/10.7554/eLife.11615.001
The genetic information that makes each individual unique is encoded in DNA molecules. Cells read this molecular instruction manual by a process called transcription, in which proteins called transcription factors bind to DNA in specific places and regulate which sections of the DNA will be expressed. These 'transcripts' are active molecules that determine the cell’s – and ultimately the individual’s – characteristics. However, it is not well understood how alterations in the DNA of different individuals or species can lead to changes in where the transcription factors bind, and in which transcripts are expressed. Carvunis, Wang, Skola et al. set out to determine if there is a relationship between how often DNA changes and how often transcription changes during the evolution of animals. The experiments examined the abundance of transcripts in the cells of a variety of animal species with close or distant evolutionary relationships. For example, the house mouse was compared to a close relative called the Algerian mouse, to another species of rodent (rat) and to humans. The experiments show that the changes in transcript abundances are happening at similar rates in mammals, birds and insects, even though DNA changes at very different rates in these groups of animals. This similarity was also observed for other aspects of transcription, such as in changes to where transcription factors bind to DNA. The next challenges are to find out what makes transcription evolve at such similar rates in these groups of animals, and whether these findings extend to other species and to other processes in cells. DOI: http://dx.doi.org/10.7554/eLife.11615.002
Affiliation(s)
| | - Tina Wang
- Department of Medicine, University of California, San Diego, La Jolla, United States
| | - Dylan Skola
- Department of Medicine, University of California, San Diego, La Jolla, United States
| | - Alice Yu
- Department of Medicine, University of California, San Diego, La Jolla, United States
| | - Jonathan Chen
- Department of Medicine, University of California, San Diego, La Jolla, United States
| | - Jason F Kreisberg
- Department of Medicine, University of California, San Diego, La Jolla, United States
| | - Trey Ideker
- Department of Medicine, University of California, San Diego, La Jolla, United States
|
25
|
Muiño JM, de Bruijn S, Pajoro A, Geuten K, Vingron M, Angenent GC, Kaufmann K. Evolution of DNA-Binding Sites of a Floral Master Regulatory Transcription Factor. Mol Biol Evol 2015; 33:185-200. [PMID: 26429922 PMCID: PMC4693976 DOI: 10.1093/molbev/msv210] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Indexed: 12/11/2022] Open
Abstract
Flower development is controlled by the action of key regulatory transcription factors of the MADS-domain family. The function of these factors appears to be highly conserved among species based on mutant phenotypes. However, the conservation of their downstream processes is much less well understood, mostly because the evolutionary turnover and variation of their DNA-binding sites (BSs) among plant species have not yet been experimentally determined. Here, we performed comparative ChIP (chromatin immunoprecipitation)-seq experiments of the MADS-domain transcription factor SEPALLATA3 (SEP3) in two closely related Arabidopsis species: Arabidopsis thaliana and A. lyrata, which have very similar floral organ morphology. We found that BS conservation is associated with DNA sequence conservation, the presence of the CArG-box BS motif, and the position of the BS relative to its potential target gene. Differences in genome size and structure can explain why SEP3 BSs in A. lyrata can be located farther from their potential target genes than their counterparts in A. thaliana. In A. lyrata, we identified transposition as a mechanism to generate novel SEP3 binding locations in the genome. Comparative gene expression analysis shows that the loss/gain of BSs is associated with a change in gene expression. In summary, this study investigates the evolutionary dynamics of DNA BSs of a floral key-regulatory transcription factor and explores factors affecting this phenomenon.
Affiliation(s)
- Jose M Muiño
- Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany; Laboratory of Bioinformatics, Wageningen University, Wageningen, The Netherlands
| | - Suzanne de Bruijn
- Institute for Biochemistry and Biology, Potsdam University, Potsdam, Germany; Laboratory of Molecular Biology, Wageningen University, Wageningen, The Netherlands
| | - Alice Pajoro
- Laboratory of Molecular Biology, Wageningen University, Wageningen, The Netherlands
| | - Koen Geuten
- Laboratory of Molecular Plant Biology, Department of Biology, University of Leuven (KU Leuven), Leuven, Belgium
| | - Martin Vingron
- Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
| | - Gerco C Angenent
- Laboratory of Molecular Biology, Wageningen University, Wageningen, The Netherlands; Bioscience, Plant Research International, Wageningen, The Netherlands
| | - Kerstin Kaufmann
- Institute for Biochemistry and Biology, Potsdam University, Potsdam, Germany
|
26
|
Ribeiro-dos-Santos AM, da Silva VL, de Souza JES, de Souza SJ. Populational landscape of INDELs affecting transcription factor-binding sites in humans. BMC Genomics 2015; 16:536. [PMID: 26194008 PMCID: PMC4509691 DOI: 10.1186/s12864-015-1744-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 02/21/2015] [Accepted: 07/02/2015] [Indexed: 01/31/2023] Open
Abstract
BACKGROUND: Differences in gene expression have a significant role in the diversity of phenotypes in humans. Here we integrated human public data from ENCODE, 1000 Genomes and Geuvadis to explore the populational landscape of INDELs affecting transcription factor-binding sites (TFBS). A significant fraction of TFBS close to the transcription start site of known genes is affected by INDELs, with a consequent effect on the expression of the associated gene. RESULTS: Hundreds of TFBS-affecting INDELs (TFBS-ID) show a differential frequency between human populations, suggesting a role of natural selection in the spread of such variant INDELs. A comparison with a dataset of known human genomic regions under natural selection allowed us to identify several cases of TFBS-ID likely involved in populational adaptations. Ontology analyses on the differential TFBS-ID further indicated several biological processes under natural selection in different populations. CONCLUSION: Together, our results strongly suggest that INDELs have an important role in modulating gene expression patterns in humans. The dataset we make available, together with other data reporting variability at both regulatory and coding regions of genes, represents a powerful tool for studies aiming to better understand the evolution of gene regulatory networks in humans.
Affiliation(s)
| | - Vandeclécio L da Silva
- PhD Program in Genetics and Molecular Biology, UFPA, Belém, PA, Brazil.
- Instituto de Bioinformática e Biotecnologia, Natal, RN, Brazil.
| | - Jorge E S de Souza
- Instituto de Bioinformática e Biotecnologia, Natal, RN, Brazil.
- Instituto Metrópole Digital, UFRN, Natal, RN, Brazil.
| | - Sandro J de Souza
- Brain Institute, UFRN, Av. Nascimento de Castro, 2155 - 59056-450, Natal, RN, Brazil.
|
27
|
Abstract
The modENCODE (Model Organism Encyclopedia of DNA Elements) Consortium aimed to map functional elements-including transcripts, chromatin marks, regulatory factor binding sites, and origins of DNA replication-in the model organisms Drosophila melanogaster and Caenorhabditis elegans. During its five-year span, the consortium conducted more than 2,000 genome-wide assays in developmentally staged animals, dissected tissues, and homogeneous cell lines. Analysis of these data sets provided foundational insights into genome, epigenome, and transcriptome structure and the evolutionary turnover of regulatory pathways. These studies facilitated a comparative analysis with similar data types produced by the ENCODE Consortium for human cells. Genome organization differs drastically in these distant species, and yet quantitative relationships among chromatin state, transcription, and cotranscriptional RNA processing are deeply conserved. Of the many biological discoveries of the modENCODE Consortium, we highlight insights that emerged from integrative studies. We focus on operational and scientific lessons that may aid future projects of similar scale or aims in other, emerging model systems.
Affiliation(s)
- James B Brown
- Department of Statistics, University of California, Berkeley, California 94720
|
28
|
Shen W, Wang D, Ye B, Shi M, Zhang Y, Zhao Z. A possible role of Drosophila CTCF in mitotic bookmarking and maintaining chromatin domains during the cell cycle. Biol Res 2015; 48:27. [PMID: 26013116 PMCID: PMC4485355 DOI: 10.1186/s40659-015-0019-6] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 01/20/2015] [Accepted: 05/20/2015] [Indexed: 11/10/2022] Open
Abstract
Background: The CCCTC-binding factor (CTCF) is a highly conserved insulator protein that plays various roles in many cellular processes. CTCF is one of the main architecture proteins in higher eukaryotes, and in combination with other architecture proteins and regulators, also shapes the three-dimensional organization of a genome. Experiments show CTCF partially remains associated with chromatin during mitosis. However, the role of CTCF in the maintenance and propagation of genome architectures throughout the cell cycle remains elusive. Results: We performed a comprehensive bioinformatics analysis on public datasets of Drosophila CTCF (dCTCF). We characterized dCTCF-binding sites according to their occupancy status during the cell cycle, and identified three classes: interphase-mitosis-common (IM), interphase-only (IO) and mitosis-only (MO) sites. Integrated function analysis showed dCTCF-binding sites of different classes might be involved in different biological processes, and IM sites were more conserved and more intensely bound. dCTCF-binding sites of the same class preferentially localized closer to each other, and were highly enriched at chromatin syntenic and topologically associating domains boundaries. Conclusions: Our results revealed different functions of dCTCF during the cell cycle and suggested that dCTCF might contribute to the establishment of the three-dimensional architecture of the Drosophila genome by maintaining local chromatin compartments throughout the whole cell cycle.
Affiliation(s)
- Wenlong Shen
- Beijing Institute of Biotechnology, No. 20, Dongdajie Street, Beijing, Fengtai District, 100071, China.
| | - Dong Wang
- Beijing Institute of Biotechnology, No. 20, Dongdajie Street, Beijing, Fengtai District, 100071, China.
| | - Bingyu Ye
- Beijing Institute of Biotechnology, No. 20, Dongdajie Street, Beijing, Fengtai District, 100071, China; College of Life Science, Capital Normal University, 105 Xisihuanbei Road, Beijing, Haidian District, 100048, China.
| | - Minglei Shi
- Beijing Institute of Biotechnology, No. 20, Dongdajie Street, Beijing, Fengtai District, 100071, China.
| | - Yan Zhang
- Beijing Institute of Biotechnology, No. 20, Dongdajie Street, Beijing, Fengtai District, 100071, China.
| | - Zhihu Zhao
- Beijing Institute of Biotechnology, No. 20, Dongdajie Street, Beijing, Fengtai District, 100071, China.
|
29
|
Arthur RK, Ma L, Slattery M, Spokony RF, Ostapenko A, Nègre N, White KP. Evolution of H3K27me3-marked chromatin is linked to gene expression evolution and to patterns of gene duplication and diversification. Genome Res 2015; 24:1115-24. [PMID: 24985914 PMCID: PMC4079967 DOI: 10.1101/gr.162008.113] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Indexed: 12/12/2022]
Abstract
Histone modifications are critical for the regulation of gene expression, cell type specification, and differentiation. However, evolutionary patterns of key modifications that regulate gene expression in differentiating organisms have not been examined. Here we mapped the genomic locations of the repressive mark histone 3 lysine 27 trimethylation (H3K27me3) in four species of Drosophila, and compared these patterns to those in C. elegans. We found that patterns of H3K27me3 are highly conserved across species, but conservation is substantially weaker among duplicated genes. We further discovered that retropositions are associated with greater evolutionary changes in H3K27me3 and gene expression than tandem duplications, indicating that local chromatin constraints influence duplicated gene evolution. These changes are also associated with concomitant evolution of gene expression. Our findings reveal the strong conservation of genomic architecture governed by an epigenetic mark across distantly related species and the importance of gene duplication in generating novel H3K27me3 profiles.
Affiliation(s)
- Robert K Arthur
- Department of Ecology and Evolution, University of Chicago, Chicago, Illinois 60637, USA; Institute for Genomics and Systems Biology, University of Chicago and Argonne National Laboratory, Chicago, Illinois 60637, USA
| | - Lijia Ma
- Institute for Genomics and Systems Biology, University of Chicago and Argonne National Laboratory, Chicago, Illinois 60637, USA; Department of Human Genetics, University of Chicago, Chicago, Illinois 60637, USA
| | - Matthew Slattery
- Institute for Genomics and Systems Biology, University of Chicago and Argonne National Laboratory, Chicago, Illinois 60637, USA; Department of Human Genetics, University of Chicago, Chicago, Illinois 60637, USA; Department of Biomedical Sciences, University of Minnesota Medical School, Duluth, Minnesota 55455, USA
| | - Rebecca F Spokony
- Institute for Genomics and Systems Biology, University of Chicago and Argonne National Laboratory, Chicago, Illinois 60637, USA; Department of Human Genetics, University of Chicago, Chicago, Illinois 60637, USA; Department of Natural Sciences, Baruch College, City University of New York, New York 10010, USA
| | - Alexander Ostapenko
- Institute for Genomics and Systems Biology, University of Chicago and Argonne National Laboratory, Chicago, Illinois 60637, USA; Department of Human Genetics, University of Chicago, Chicago, Illinois 60637, USA
| | - Nicolas Nègre
- Institute for Genomics and Systems Biology, University of Chicago and Argonne National Laboratory, Chicago, Illinois 60637, USA; Department of Human Genetics, University of Chicago, Chicago, Illinois 60637, USA; Université de Montpellier 2 and INRA, UMR1333 DGIMI, F-34095 Montpellier, France
| | - Kevin P White
- Department of Ecology and Evolution, University of Chicago, Chicago, Illinois 60637, USA; Institute for Genomics and Systems Biology, University of Chicago and Argonne National Laboratory, Chicago, Illinois 60637, USA; Department of Human Genetics, University of Chicago, Chicago, Illinois 60637, USA
|
30
|
Magbanua JP, Runneburger E, Russell S, White R. A variably occupied CTCF binding site in the ultrabithorax gene in the Drosophila bithorax complex. Mol Cell Biol 2015; 35:318-30. [PMID: 25368383 PMCID: PMC4295388 DOI: 10.1128/mcb.01061-14] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 08/15/2014] [Revised: 09/10/2014] [Accepted: 10/25/2014] [Indexed: 11/20/2022] Open
Abstract
Although the majority of genomic binding sites for the insulator protein CCCTC-binding factor (CTCF) are constitutively occupied, a subset show variable occupancy. Such variable sites provide an opportunity to assess context-specific CTCF functions in gene regulation. Here, we have identified a variably occupied CTCF site in the Drosophila Ultrabithorax (Ubx) gene. This site is occupied in tissues where Ubx is active (third thoracic leg imaginal disc) but is not bound in tissues where the Ubx gene is repressed (first thoracic leg imaginal disc). Using chromatin conformation capture, we show that this site preferentially interacts with the Ubx promoter region in the active state. The site lies close to Ubx enhancer elements and is also close to the locations of several gypsy transposon insertions that disrupt Ubx expression, leading to the bx mutant phenotype. gypsy insertions carry the Su(Hw)-dependent gypsy insulator and were found to affect both CTCF binding at the variable site and the chromatin topology. This suggests that insertion of the gypsy insulator in this region interferes with CTCF function and supports a model for the normal function of the variable CTCF site as a chromatin loop facilitator, promoting interaction between Ubx enhancers and the Ubx transcription start site.
Affiliation(s)
- Jose Paolo Magbanua
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, United Kingdom
| | - Estelle Runneburger
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, United Kingdom
| | - Steven Russell
- Department of Genetics, University of Cambridge, Cambridge, United Kingdom
| | - Robert White
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, United Kingdom
|
31
|
Beadell AV, Haag ES. Evolutionary Dynamics of GLD-1-mRNA complexes in Caenorhabditis nematodes. Genome Biol Evol 2014; 7:314-35. [PMID: 25502909 PMCID: PMC4316625 DOI: 10.1093/gbe/evu272] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Accepted: 12/04/2014] [Indexed: 12/17/2022] Open
Abstract
Given the large number of RNA-binding proteins and regulatory RNAs within genomes, posttranscriptional regulation may be an underappreciated aspect of cis-regulatory evolution. Here, we focus on nematode germ cells, which are known to rely heavily upon translational control to regulate meiosis and gametogenesis. GLD-1 belongs to the STAR-domain family of RNA-binding proteins, conserved throughout eukaryotes, and functions in Caenorhabditis elegans as a germline-specific translational repressor. A phylogenetic analysis across opisthokonts shows that GLD-1 is most closely related to Drosophila How and deuterostome Quaking, both implicated in alternative splicing. We identify messenger RNAs associated with C. briggsae GLD-1 on a genome-wide scale and provide evidence that many participate in aspects of germline development. By comparing our results with published C. elegans GLD-1 targets, we detect nearly 100 that are conserved between the two species. We also detected several hundred Cbr-GLD-1 targets whose homologs have not been reported to be associated with C. elegans GLD-1 in either of two independent studies. Low expression in C. elegans may explain the failure to detect most of them, but a highly expressed subset are strong candidates for Cbr-GLD-1-specific targets. We examine GLD-1-binding motifs among targets conserved in C. elegans and C. briggsae and find that most, but not all, display evidence of shared ancestral binding sites. Our work illustrates both the conservative and the dynamic character of evolution at the posttranscriptional level of gene regulation, even between congeners.
Affiliation(s)
- Alana V Beadell
- Program in Behavior, Evolution, Ecology, and Systematics, University of Maryland, College Park; present address: Department of Organismal Biology and Anatomy, University of Chicago, Chicago, IL
| | - Eric S Haag
- Program in Behavior, Evolution, Ecology, and Systematics, University of Maryland, College Park; Department of Biology, University of Maryland, College Park
|
32
|
Kratochwil CF, Meyer A. Closing the genotype-phenotype gap: emerging technologies for evolutionary genetics in ecological model vertebrate systems. Bioessays 2014; 37:213-26. [PMID: 25380076 DOI: 10.1002/bies.201400142] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.4] [Indexed: 12/19/2022]
Abstract
The analysis of genetic and epigenetic mechanisms of the genotype-phenotype connection has, so far, only been possible in a handful of genetic model systems. Recent technological advances, including next-generation sequencing methods such as RNA-seq, ChIP-seq and RAD-seq, and genome-editing approaches including CRISPR-Cas, now make it possible to address these fundamental questions of biology in organisms that have been studied in their natural habitats as well. We provide an overview of the benefits and drawbacks of these novel techniques and experimental approaches that can now be applied to ecological and evolutionary vertebrate models such as sticklebacks and cichlid fish. We can anticipate that these new methods will increase the understanding of the genetic and epigenetic factors influencing adaptations and phenotypic variation in ecological settings. These new arrows in the methodological quiver of ecologists will drastically increase the understanding of the genetic basis of adaptive traits - leading to a further closing of the genotype-phenotype gap.
Affiliation(s)
- Claudius F Kratochwil
- Chair in Zoology and Evolutionary Biology, Department of Biology, University of Konstanz, Konstanz, Germany; Zukunftskolleg, University of Konstanz, Konstanz, Germany
|
33
|
Maksimenko O, Bartkuhn M, Stakhov V, Herold M, Zolotarev N, Jox T, Buxa MK, Kirsch R, Bonchuk A, Fedotova A, Kyrchanova O, Renkawitz R, Georgiev P. Two new insulator proteins, Pita and ZIPIC, target CP190 to chromatin. Genome Res 2014; 25:89-99. [PMID: 25342723 PMCID: PMC4317163 DOI: 10.1101/gr.174169.114] [Citation(s) in RCA: 89] [Impact Index Per Article: 8.9] [Indexed: 11/24/2022]
Abstract
Insulators are multiprotein-DNA complexes that regulate the nuclear architecture. The Drosophila CP190 protein is a cofactor for the DNA-binding insulator proteins Su(Hw), CTCF, and BEAF-32. The fact that CP190 has been found at genomic sites devoid of either of the known insulator factors has until now been unexplained. We have identified two DNA-binding zinc-finger proteins, Pita, and a new factor named ZIPIC, that interact with CP190 in vivo and in vitro at specific interaction domains. Genomic binding sites for these proteins are clustered with CP190 as well as with CTCF and BEAF-32. Model binding sites for Pita or ZIPIC demonstrate a partial enhancer-blocking activity and protect gene expression from PRE-mediated silencing. The function of the CTCF-bound MCP insulator sequence requires binding of Pita. These results identify two new insulator proteins and emphasize the unifying function of CP190, which can be recruited by many DNA-binding insulator proteins.
Affiliation(s)
- Oksana Maksimenko
- Laboratory of Gene Expression Regulation in Development, Institute of Gene Biology, Russian Academy of Sciences, Moscow 119334, Russia
| | - Marek Bartkuhn
- Institute for Genetics, Justus-Liebig-University Giessen, Heinrich-Buff-Ring, D-35392 Giessen, Germany
| | - Viacheslav Stakhov
- Laboratory of Gene Expression Regulation in Development, Institute of Gene Biology, Russian Academy of Sciences, Moscow 119334, Russia
| | - Martin Herold
- Institute for Genetics, Justus-Liebig-University Giessen, Heinrich-Buff-Ring, D-35392 Giessen, Germany
| | - Nickolay Zolotarev
- Laboratory of Gene Expression Regulation in Development, Institute of Gene Biology, Russian Academy of Sciences, Moscow 119334, Russia
| | - Theresa Jox
- Institute for Genetics, Justus-Liebig-University Giessen, Heinrich-Buff-Ring, D-35392 Giessen, Germany
| | - Melanie K Buxa
- Institute for Genetics, Justus-Liebig-University Giessen, Heinrich-Buff-Ring, D-35392 Giessen, Germany
| | - Ramona Kirsch
- Institute for Genetics, Justus-Liebig-University Giessen, Heinrich-Buff-Ring, D-35392 Giessen, Germany
| | - Artem Bonchuk
- Group of Transcriptional Regulation, Institute of Gene Biology, Russian Academy of Sciences, Moscow 119334, Russia
| | - Anna Fedotova
- Laboratory of Gene Expression Regulation in Development, Institute of Gene Biology, Russian Academy of Sciences, Moscow 119334, Russia
| | - Olga Kyrchanova
- Group of Transcriptional Regulation, Institute of Gene Biology, Russian Academy of Sciences, Moscow 119334, Russia
| | - Rainer Renkawitz
- Institute for Genetics, Justus-Liebig-University Giessen, Heinrich-Buff-Ring, D-35392 Giessen, Germany
| | - Pavel Georgiev
- Department of the Control of Genetic Processes, Institute of Gene Biology, Russian Academy of Sciences, Moscow 119334, Russia
|
34
|
Tagu D, Colbourne JK, Nègre N. Genomic data integration for ecological and evolutionary traits in non-model organisms. BMC Genomics 2014; 15:490. [PMID: 25047861 PMCID: PMC4108784 DOI: 10.1186/1471-2164-15-490] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Received: 11/29/2012] [Accepted: 06/17/2014] [Indexed: 02/02/2023] Open
Abstract
Why is it necessary to develop systems biology initiatives such as ENCODE for non-model organisms?
Affiliation(s)
- Denis Tagu
- INRA Rennes, UMR 1349 IGEPP, BP 35327, 35657 Le Rheu Cedex, France
| | - John K Colbourne
- School of Bioscience, University of Birmingham, Birmingham, West Midlands, England
| | - Nicolas Nègre
- Université Montpellier 2, UMR1333 DGIMI, F-34095 Montpellier, France
- INRA, UMR1333 DGIMI, F-34095 Montpellier, France
|
35
|
Wei KHC, Clark AG, Barbash DA. Limited gene misregulation is exacerbated by allele-specific upregulation in lethal hybrids between Drosophila melanogaster and Drosophila simulans. Mol Biol Evol 2014; 31:1767-78. [PMID: 24723419 PMCID: PMC4069615 DOI: 10.1093/molbev/msu127] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 12/22/2022] Open
Abstract
Misregulation of gene expression is often observed in interspecific hybrids and is generally attributed to regulatory incompatibilities caused by divergence between the two genomes. However, it has been challenging to distinguish effects of regulatory divergence from secondary effects including developmental and physiological defects common to hybrids. Here, we use RNA-Seq to profile gene expression in F1 hybrid male larvae from crosses of Drosophila melanogaster to its sibling species D. simulans. We analyze lethal and viable hybrid males, the latter produced using a mutation in the X-linked D. melanogaster Hybrid male rescue (Hmr) gene and compare them with their parental species and to public data sets of gene expression across development. We find that Hmr has drastically different effects on the parental and hybrid genomes, demonstrating that hybrid incompatibility genes can exhibit novel properties in the hybrid genetic background. Additionally, we find that D. melanogaster alleles are preferentially affected between lethal and viable hybrids. We further determine that many of the differences between the hybrids result from developmental delay in the Hmr(+) hybrids. Finally, we find surprisingly modest expression differences in hybrids when compared with the parents, with only 9% and 4% of genes deviating from additivity or expressed outside of the parental range, respectively. Most of these differences can be attributed to developmental delay and differences in tissue types. Overall, our study suggests that hybrid gene misexpression is prone to overestimation and that even between species separated by approximately 2.5 Ma, regulatory incompatibilities are not widespread in hybrids.
Affiliation(s)
- Kevin H-C Wei
- Department of Molecular Biology and Genetics, Cornell University
| | - Andrew G Clark
- Department of Molecular Biology and Genetics, Cornell University
| | - Daniel A Barbash
- Department of Molecular Biology and Genetics, Cornell University
|
36
|
Heger P, Wiehe T. New tools in the box: An evolutionary synopsis of chromatin insulators. Trends Genet 2014; 30:161-71. [DOI: 10.1016/j.tig.2014.03.004] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 01/30/2014] [Revised: 03/24/2014] [Accepted: 03/25/2014] [Indexed: 01/19/2023]
|
37
|
Naturally occurring deletions of hunchback binding sites in the even-skipped stripe 3+7 enhancer. PLoS One 2014; 9:e91924. [PMID: 24786295 PMCID: PMC4006794 DOI: 10.1371/journal.pone.0091924] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 09/19/2013] [Accepted: 02/18/2014] [Indexed: 11/23/2022] Open
Abstract
Changes in regulatory DNA contribute to phenotypic differences within and between taxa. Comparative studies show that many transcription factor binding sites (TFBS) are conserved between species whereas functional studies reveal that some mutations segregating within species alter TFBS function. Consistently, in this analysis of 13 regulatory elements in Drosophila melanogaster populations, single base and insertion/deletion polymorphism are rare in characterized regulatory elements. Experimentally defined TFBS are nearly devoid of segregating mutations and, as has been shown before, are quite conserved. For instance 8 of 11 Hunchback binding sites in the stripe 3+7 enhancer of even-skipped are conserved between D. melanogaster and Drosophila virilis. Oddly, we found a 72 bp deletion that removes one of these binding sites (Hb8), segregating within D. melanogaster. Furthermore, a 45 bp deletion polymorphism in the spacer between the stripe 3+7 and stripe 2 enhancers, removes another predicted Hunchback site. These two deletions are separated by ∼250 bp, sit on distinct haplotypes, and segregate at appreciable frequency. The Hb8Δ is at 5 to 35% frequency in the new world, but also shows cosmopolitan distribution. There is depletion of sequence variation on the Hb8Δ-carrying haplotype. Quantitative genetic tests indicate that Hb8Δ affects developmental time, but not viability of offspring. The Eve expression pattern differs between inbred lines, but the stripe 3 and 7 boundaries seem unaffected by Hb8Δ. The data reveal segregating variation in regulatory elements, which may reflect evolutionary turnover of characterized TFBS due to drift or co-evolution.
|
38
|
Cis-regulatory variation: significance in biomedicine and evolution. Cell Tissue Res 2014; 356:495-505. [PMID: 24744265 DOI: 10.1007/s00441-014-1855-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 01/08/2014] [Accepted: 02/19/2014] [Indexed: 12/29/2022]
Abstract
Cis-regulatory regions (CRR) control gene expression and chromatin modifications. Genetic variation at CRR in individuals across a population contributes to phenotypic differences of biomedical relevance. This standing variation is important for personalized genomic medicine as well as for adaptive evolution and speciation. This review focuses on genetic variation at CRR, its influence on chromatin, gene expression, and ultimately disease phenotypes. In addition, we summarize our understanding of how this variation may contribute to evolution. Recent technological and computational advances have accelerated research in the direction of personalized medicine, combining strengths of molecular biology and genomics. This will pave new ways to understand how CRR variation affects phenotypes and chart out possible avenues of intervention.
|
39
|
Villar D, Flicek P, Odom DT. Evolution of transcription factor binding in metazoans - mechanisms and functional implications. Nat Rev Genet 2014; 15:221-33. [PMID: 24590227 PMCID: PMC4175440 DOI: 10.1038/nrg3481] [Citation(s) in RCA: 166] [Impact Index Per Article: 16.6] [Indexed: 12/16/2022]
Abstract
Differences in transcription factor binding can contribute to organismal evolution by altering downstream gene expression programmes. Genome-wide studies in Drosophila melanogaster and mammals have revealed common quantitative and combinatorial properties of in vivo DNA binding, as well as marked differences in the rate and mechanisms of evolution of transcription factor binding in metazoans. Here, we review the recently discovered rapid 're-wiring' of in vivo transcription factor binding between related metazoan species and summarize general principles underlying the observed patterns of evolution. We then consider what might explain the differences in genome evolution between metazoan phyla and outline the conceptual and technological challenges facing this research field.
Affiliation(s)
- Diego Villar
- University of Cambridge, Cancer Research UK Cambridge Institute, Li Ka Shing Centre, Robinson Way, Cambridge CB2 0RE, UK
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Duncan T Odom
- University of Cambridge, Cancer Research UK Cambridge Institute, Li Ka Shing Centre, Robinson Way, Cambridge CB2 0RE, UK
|
40
|
Liao BY, Chang A. Accumulation of CTCF-binding sites drives expression divergence between tandemly duplicated genes in humans. BMC Genomics 2014; 15 Suppl 1:S8. [PMID: 24564680 PMCID: PMC4046690 DOI: 10.1186/1471-2164-15-s1-s8] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 11/19/2022] Open
Abstract
Background: During eukaryotic genome evolution, tandem gene duplication is the most frequent event giving rise to clustered gene families. However, how expression divergence between tandemly duplicated genes has emerged and been maintained remains unclear. In particular, it is unknown whether epigenetic regulators have been involved in the process. Results: We demonstrate that CCCTC-binding factor (CTCF), the master epigenetic regulator and the only known insulator protein in humans, has played a predominant role in generating divergence in both expression profiles and expression levels between adjacent paralogs in the human genome. This phenomenon was not observed for non-paralogous adjacent genes. After tandem duplication events, CTCF-binding sites gradually accumulate between paralogs. This trend was more prominent for genes involved in particular functions. Conclusions: The accumulation of CTCF-binding sites drives expression divergence of tandemly duplicated genes. This process is likely targeted by natural selection. Our study reveals the importance of CTCF to the evolution of animal diversity and complexity.
|
41
|
Schwalie PC, Ward MC, Cain CE, Faure AJ, Gilad Y, Odom DT, Flicek P. Co-binding by YY1 identifies the transcriptionally active, highly conserved set of CTCF-bound regions in primate genomes. Genome Biol 2013; 14:R148. [PMID: 24380390 PMCID: PMC4056453 DOI: 10.1186/gb-2013-14-12-r148] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.3] [Received: 11/13/2013] [Accepted: 12/31/2013] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The genomic binding of CTCF is highly conserved across mammals, but the mechanisms that underlie its stability are poorly understood. One transcription factor known to functionally interact with CTCF in the context of X-chromosome inactivation is the ubiquitously expressed YY1. Because combinatorial transcription factor binding can contribute to the evolutionary stabilization of regulatory regions, we tested whether YY1 and CTCF co-binding could in part account for conservation of CTCF binding. RESULTS: Combined analysis of CTCF and YY1 binding in lymphoblastoid cell lines from seven primates, as well as in mouse and human livers, reveals extensive genome-wide co-localization specifically at evolutionarily stable CTCF-bound regions. CTCF-YY1 co-bound regions resemble regions bound by YY1 alone, as they enrich for active histone marks, RNA polymerase II and transcription factor binding. Although these highly conserved, transcriptionally active CTCF-YY1 co-bound regions are often promoter-proximal, gene-distal regions show similar molecular features. CONCLUSIONS: Our results reveal that these two ubiquitously expressed, multi-functional zinc-finger proteins collaborate in functionally active regions to stabilize one another's genome-wide binding across primate evolution.
Affiliation(s)
- Petra C Schwalie
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Current address: Laboratory of Systems Biology and Genetics, Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne CH-1015, Switzerland
| | - Michelle C Ward
- University of Cambridge, Cancer Research UK-Cambridge Institute, Robinson Way, Cambridge CB2 0RE, UK
- Current address: Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
| | - Carolyn E Cain
- Current address: Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
| | - Andre J Faure
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Yoav Gilad
- Current address: Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
| | - Duncan T Odom
- University of Cambridge, Cancer Research UK-Cambridge Institute, Robinson Way, Cambridge CB2 0RE, UK
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
|
42
|
Jiang P, Singh M. CCAT: Combinatorial Code Analysis Tool for transcriptional regulation. Nucleic Acids Res 2013; 42:2833-47. [PMID: 24366875 PMCID: PMC3950699 DOI: 10.1093/nar/gkt1302] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 01/08/2023] Open
Abstract
Combinatorial interplay among transcription factors (TFs) is an important mechanism by which transcriptional regulatory specificity is achieved. However, despite the increasing number of TFs for which either binding specificities or genome-wide occupancy data are known, knowledge about cooperativity between TFs remains limited. To address this, we developed a computational framework for predicting genome-wide co-binding between TFs (CCAT, Combinatorial Code Analysis Tool), and applied it to Drosophila melanogaster to uncover cooperativity among TFs during embryo development. Using publicly available TF binding specificity data and DNaseI chromatin accessibility data, we first predicted genome-wide binding sites for 324 TFs across five stages of D. melanogaster embryo development. We then applied CCAT in each of these developmental stages, and identified from 19 to 58 pairs of TFs in each stage whose predicted binding sites are significantly co-localized. We found that nearby binding sites for pairs of TFs predicted to cooperate were enriched in regions bound in relevant ChIP experiments, and were more evolutionarily conserved than other pairs. Further, we found that TFs tend to be co-localized with other TFs in a dynamic manner across developmental stages. All generated data as well as source code for our front-to-end pipeline are available at http://cat.princeton.edu.
Affiliation(s)
- Peng Jiang
- Department of Computer Science, Princeton University, Princeton, NJ 08540, USA and Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
|
43
|
Abstract
During the course of evolution, genomes acquire novel genetic elements as sources of functional and phenotypic diversity, including new genes that originated in recent evolution. In the past few years, substantial progress has been made in understanding the evolution and phenotypic effects of new genes. In particular, an emerging picture is that new genes, despite being present in the genomes of only a subset of species, can rapidly evolve indispensable roles in fundamental biological processes, including development, reproduction, brain function and behaviour. The molecular underpinnings of how new genes can develop these roles are starting to be characterized. These recent discoveries yield fresh insights into our broad understanding of biological diversity at refined resolution.
|
44
|
Ouboussad L, Kreuz S, Lefevre PF. CTCF depletion alters chromatin structure and transcription of myeloid-specific factors. J Mol Cell Biol 2013; 5:308-22. [PMID: 23933634 DOI: 10.1093/jmcb/mjt023] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 12/16/2023] Open
Abstract
Differentiation is a multistep process tightly regulated and controlled by complex transcription factor networks. Here, we show that the rate of differentiation of common myeloid precursor cells increases after depletion of CTCF, a protein emerging as a potential key factor regulating higher-order chromatin structure. We identified CTCF binding in the vicinity of important transcription factors regulating myeloid differentiation and showed that CTCF depletion impacts on the expression of these genes in concordance with the observed acceleration of the myeloid commitment. Furthermore, we observed a loss of the histone variant H2A.Z within the selected promoter regions and an increase in non-coding RNA transcription upstream of these genes. Both abnormalities suggest a global chromatin structure destabilization and an associated increase of non-productive transcription in response to CTCF depletion but do not drive the CTCF-mediated transcription alterations of the neighbouring genes. Finally, we detected a transient eviction of CTCF at the Egr1 locus in correlation with Egr1 peak of expression in response to lipopolysaccharide (LPS) treatment in macrophages. This eviction is also correlated with the expression of an antisense non-coding RNA transcribing through the CTCF-binding region indicating that non-coding RNA transcription could be the cause and the consequence of CTCF eviction.
Affiliation(s)
- Lylia Ouboussad
- Section of Experimental Haematology, Leeds Institute of Cancer Studies and Pathology, University of Leeds, Wellcome Trust Brenner Building, St. James's University Hospital, Leeds LS9 7TF, UK
|
45
|
Abstract
Genes are perpetually added to and deleted from genomes during evolution. Thus, it is important to understand how new genes are formed and how they evolve to be critical components of the genetic systems that determine the biological diversity of life. Two decades of effort have shed light on the process of new gene origination and have contributed to an emerging comprehensive picture of how new genes are added to genomes, ranging from the mechanisms that generate new gene structures to the presence of new genes in different organisms to the rates and patterns of new gene origination and the roles of new genes in phenotypic evolution. We review each of these aspects of new gene evolution, summarizing the main evidence for the origination and importance of new genes in evolution. We highlight findings showing that new genes rapidly change existing genetic systems that govern various molecular, cellular, and phenotypic functions.
Affiliation(s)
- Manyuan Long
- Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois 60637
|
46
|
Paris M, Kaplan T, Li XY, Villalta JE, Lott SE, Eisen MB. Extensive divergence of transcription factor binding in Drosophila embryos with highly conserved gene expression. PLoS Genet 2013; 9:e1003748. [PMID: 24068946 PMCID: PMC3772039 DOI: 10.1371/journal.pgen.1003748] [Citation(s) in RCA: 83] [Impact Index Per Article: 7.5] [Received: 03/04/2013] [Accepted: 07/10/2013] [Indexed: 11/19/2022] Open
Abstract
To better characterize how variation in regulatory sequences drives divergence in gene expression, we undertook a systematic study of transcription factor binding and gene expression in blastoderm embryos of four species, which sample much of the diversity in the 40-million-year-old genus Drosophila: D. melanogaster, D. yakuba, D. pseudoobscura and D. virilis. We compared gene expression, measured by mRNA-seq, to the genome-wide binding, measured by ChIP-seq, of four transcription factors involved in early anterior-posterior patterning. We found that mRNA levels are much better conserved than individual transcription factor binding events, and that changes in a gene's expression were poorly explained by changes in adjacent transcription factor binding. However, highly bound sites, sites in regions bound by multiple factors, and sites near genes are conserved more frequently than other binding sites, suggesting that a considerable amount of transcription factor binding is weakly functional or non-functional and not subject to purifying selection.
Affiliation(s)
- Mathilde Paris
- Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, California, United States of America
| | - Tommy Kaplan
- Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, California, United States of America
- School of Computer Science and Engineering, The Hebrew University, Jerusalem, Israel
| | - Xiao Yong Li
- Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, California, United States of America
- Howard Hughes Medical Institute, University of California Berkeley, Berkeley, California, United States of America
| | | | - Susan E. Lott
- Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, California, United States of America
- Department of Evolution and Ecology, University of California, Davis, California, United States of America
| | - Michael B. Eisen
- Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, California, United States of America
- School of Computer Science and Engineering, The Hebrew University, Jerusalem, Israel
- Howard Hughes Medical Institute, University of California Berkeley, Berkeley, California, United States of America
|
47
|
Heger P, George R, Wiehe T. Successive gain of insulator proteins in arthropod evolution. Evolution 2013; 67:2945-56. [PMID: 24094345 PMCID: PMC4208683 DOI: 10.1111/evo.12155] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 07/22/2012] [Accepted: 04/08/2013] [Indexed: 01/23/2023]
Abstract
Alteration of regulatory DNA elements or their binding proteins may have drastic consequences for morphological evolution. Chromatin insulators are one example of such proteins and play a fundamental role in organizing gene expression. While a single insulator protein, CTCF (CCCTC-binding factor), is known in vertebrates, Drosophila melanogaster utilizes six additional factors. We studied the evolution of these proteins and show here that—in contrast to the bilaterian-wide distribution of CTCF—all other D. melanogaster insulators are restricted to arthropods. The full set is present exclusively in the genus Drosophila whereas only two insulators, Su(Hw) and CTCF, existed at the base of the arthropod clade and all additional factors have been acquired successively at later stages. Secondary loss of factors in some lineages further led to the presence of different insulator subsets in arthropods. Thus, the evolution of insulator proteins within arthropods is an ongoing and dynamic process that reshapes and supplements the ancient CTCF-based system common to bilaterians. Expansion of insulator systems may therefore be a general strategy to increase an organism’s gene regulatory repertoire and its potential for morphological plasticity.
Affiliation(s)
- Peter Heger
- Cologne Biocenter, Institute for Genetics, University of Cologne, Zülpicher Straße 47a, 50674 Köln, Germany.
|
48
|
Evolution: Positively selecting CTCF binding. Nat Rev Genet 2012. [PMID: 23183708 DOI: 10.1038/nrg3392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
|