1
Lambourne L, Mattioli K, Santoso C, Sheynkman G, Inukai S, Kaundal B, Berenson A, Spirohn-Fitzgerald K, Bhattacharjee A, Rothman E, Shrestha S, Laval F, Yang Z, Bisht D, Sewell JA, Li G, Prasad A, Phanor S, Lane R, Campbell DM, Hunt T, Balcha D, Gebbia M, Twizere JC, Hao T, Frankish A, Riback JA, Salomonis N, Calderwood MA, Hill DE, Sahni N, Vidal M, Bulyk ML, Fuxman Bass JI. Widespread variation in molecular interactions and regulatory properties among transcription factor isoforms. bioRxiv 2024:2024.03.12.584681. [PMID: 38617209 PMCID: PMC11014633 DOI: 10.1101/2024.03.12.584681] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/16/2024]
Abstract
Most human transcription factor (TF) genes encode multiple protein isoforms differing in DNA binding domains, effector domains, or other protein regions. The global extent to which this results in functional differences between isoforms remains unknown. Here, we systematically compared 693 isoforms of 246 TF genes, assessing DNA binding, protein binding, transcriptional activation, subcellular localization, and condensate formation. Relative to reference isoforms, two-thirds of alternative TF isoforms exhibit differences in one or more molecular activities, which often could not be predicted from sequence. We observed two primary categories of alternative TF isoforms: "rewirers" and "negative regulators", both of which were associated with differentiation and cancer. Our results support a model wherein the relative expression levels of, and interactions involving, TF isoforms add an understudied layer of complexity to gene regulatory networks, demonstrating the importance of isoform-aware characterization of TF functions and providing a rich resource for further studies.
Affiliation(s)
- Luke Lambourne
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Kaia Mattioli
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Clarissa Santoso
  - Department of Biology, Boston University, Boston, MA, USA
  - Bioinformatics Program, Boston University, Boston, MA, USA
- Gloria Sheynkman
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Sachi Inukai
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Babita Kaundal
  - Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Anna Berenson
  - Molecular Biology, Cell Biology & Biochemistry Program, Boston University, Boston, MA, USA
- Kerstin Spirohn-Fitzgerald
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Anukana Bhattacharjee
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
  - Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Elisabeth Rothman
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Florent Laval
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
  - TERRA Teaching and Research Centre, University of Liège, Gembloux, Belgium
  - Laboratory of Viral Interactomes, GIGA Institute, University of Liège, Liège, Belgium
- Zhipeng Yang
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Deepa Bisht
  - Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Jared A Sewell
  - Department of Biology, Boston University, Boston, MA, USA
- Guangyuan Li
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
  - Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Anisa Prasad
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Harvard College, Cambridge MA, USA
- Sabrina Phanor
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Ryan Lane
  - Department of Biology, Boston University, Boston, MA, USA
- Toby Hunt
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Dawit Balcha
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Marinella Gebbia
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
  - Lunenfeld-Tanenbaum Research Institute (LTRI), Sinai Health System, Toronto, Ontario, Canada
- Jean-Claude Twizere
  - TERRA Teaching and Research Centre, University of Liège, Gembloux, Belgium
  - Laboratory of Viral Interactomes, GIGA Institute, University of Liège, Liège, Belgium
- Tong Hao
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Adam Frankish
  - Laboratory of Viral Interactomes, GIGA Institute, University of Liège, Liège, Belgium
- Josh A Riback
  - Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX, USA
- Nathan Salomonis
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
  - Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Michael A Calderwood
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- David E Hill
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Nidhi Sahni
  - Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Marc Vidal
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Martha L Bulyk
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Juan I Fuxman Bass
  - Department of Biology, Boston University, Boston, MA, USA
  - Bioinformatics Program, Boston University, Boston, MA, USA
  - Molecular Biology, Cell Biology & Biochemistry Program, Boston University, Boston, MA, USA
2
Kock KH, Kimes PK, Gisselbrecht SS, Inukai S, Phanor SK, Anderson JT, Ramakrishnan G, Lipper CH, Song D, Kurland JV, Rogers JM, Jeong R, Blacklow SC, Irizarry RA, Bulyk ML. DNA binding analysis of rare variants in homeodomains reveals homeodomain specificity-determining residues. Nat Commun 2024; 15:3110. [PMID: 38600112 PMCID: PMC11006913 DOI: 10.1038/s41467-024-47396-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Accepted: 03/29/2024] [Indexed: 04/12/2024] Open
Abstract
Homeodomains (HDs) are the second largest class of DNA binding domains (DBDs) among eukaryotic sequence-specific transcription factors (TFs) and are the TF structural class with the largest number of disease-associated mutations in the Human Gene Mutation Database (HGMD). Despite numerous structural studies and large-scale analyses of HD DNA binding specificity, HD-DNA recognition is still not fully understood. Here, we analyze 92 human HD mutants, including disease-associated variants and variants of uncertain significance (VUS), for their effects on DNA binding activity. Many of the variants alter DNA binding affinity and/or specificity. Detailed biochemical analysis and structural modeling identifies 14 previously unknown specificity-determining positions, 5 of which do not contact DNA. The same missense substitution at analogous positions within different HDs often exhibits different effects on DNA binding activity. Variant effect prediction tools perform moderately well in distinguishing variants with altered DNA binding affinity, but poorly in identifying those with altered binding specificity. Our results highlight the need for biochemical assays of TF coding variants and prioritize dozens of variants for further investigations into their pathogenicity and the development of clinical diagnostics and precision therapies.
Affiliation(s)
- Kian Hong Kock
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, USA
  - Program in Biological and Biomedical Sciences, Harvard University, Cambridge, MA, USA
- Patrick K Kimes
  - Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Stephen S Gisselbrecht
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, USA
- Sachi Inukai
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, USA
- Sabrina K Phanor
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, USA
- James T Anderson
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, USA
- Gayatri Ramakrishnan
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, USA
  - Boston Bangalore Biosciences Beginnings Program, Harvard University, Cambridge, MA, USA
- Colin H Lipper
  - Department of Biological Chemistry and Molecular Pharmacology, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana Farber Cancer Institute, Boston, MA, USA
- Dongyuan Song
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Jesse V Kurland
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, USA
- Julia M Rogers
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, USA
  - Committee on Higher Degrees in Biophysics, Harvard University, Cambridge, MA, USA
- Raehoon Jeong
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, USA
  - Bioinformatics and Integrative Genomics Graduate Program, Harvard University, Cambridge, MA, USA
- Stephen C Blacklow
  - Program in Biological and Biomedical Sciences, Harvard University, Cambridge, MA, USA
  - Department of Biological Chemistry and Molecular Pharmacology, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana Farber Cancer Institute, Boston, MA, USA
  - Committee on Higher Degrees in Biophysics, Harvard University, Cambridge, MA, USA
- Rafael A Irizarry
  - Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
- Martha L Bulyk
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, USA
  - Program in Biological and Biomedical Sciences, Harvard University, Cambridge, MA, USA
  - Committee on Higher Degrees in Biophysics, Harvard University, Cambridge, MA, USA
  - Bioinformatics and Integrative Genomics Graduate Program, Harvard University, Cambridge, MA, USA
  - Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
3
Duy DL, Kim N. Yeast transcription factor Msn2 binds to G4 DNA. Nucleic Acids Res 2023; 51:9643-9657. [PMID: 37615577 PMCID: PMC10570036 DOI: 10.1093/nar/gkad684] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/13/2023] [Revised: 08/03/2023] [Accepted: 08/15/2023] [Indexed: 08/25/2023] Open
Abstract
Sequences capable of forming quadruplex or G4 DNA are prevalent in the promoter regions. The transformation from canonical to non-canonical secondary structure apparently regulates transcription of a number of human genes. In the budding yeast Saccharomyces cerevisiae, we identified 37 genes with a G4 motif in the promoters including 20 genes that contain stress response element (STRE) overlapping a G4 motif. STRE is the binding site of stress response regulators Msn2 and Msn4, transcription factors belonging to the C2H2 zinc-finger protein family. We show here that Msn2 binds directly to the G4 DNA structure through its zinc-finger domain with a dissociation constant similar to that of STRE-binding and that, in a stress condition, Msn2 is enriched at G4 DNA-forming loci in the yeast genome. For a large fraction of genes with G4/STRE-containing promoters, treating with G4-ligands led to significant elevations in transcription levels. Such transcriptional elevation was greatly diminished in a msn2Δ msn4Δ background and was partly muted when the G4 motif was disrupted. Taken together, our data suggest that G4 DNA could be an alternative binding site of Msn2 in addition to STRE, and that G4 DNA formation could be an important element of transcriptional regulation in yeast.
Affiliation(s)
- Duong Long Duy
  - Department of Microbiology and Molecular Genetics, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Nayun Kim
  - Department of Microbiology and Molecular Genetics, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
  - MD Anderson Cancer Center UT Health Graduate School of Biomedical Sciences, Houston, TX 77030, USA
4
Horton CA, Alexandari AM, Hayes MGB, Marklund E, Schaepe JM, Aditham AK, Shah N, Suzuki PH, Shrikumar A, Afek A, Greenleaf WJ, Gordân R, Zeitlinger J, Kundaje A, Fordyce PM. Short tandem repeats bind transcription factors to tune eukaryotic gene expression. Science 2023; 381:eadd1250. [PMID: 37733848 DOI: 10.1126/science.add1250] [Citation(s) in RCA: 29] [Impact Index Per Article: 29.0] [Received: 05/24/2022] [Accepted: 07/26/2023] [Indexed: 09/23/2023]
Abstract
Short tandem repeats (STRs) are enriched in eukaryotic cis-regulatory elements and alter gene expression, yet how they regulate transcription remains unknown. We found that STRs modulate transcription factor (TF)-DNA affinities and apparent on-rates by about 70-fold by directly binding TF DNA-binding domains, with energetic impacts exceeding many consensus motif mutations. STRs maximize the number of weakly preferred microstates near target sites, thereby increasing TF density, with impacts well predicted by statistical mechanics. Confirming that STRs also affect TF binding in cells, neural networks trained only on in vivo occupancies predicted effects identical to those observed in vitro. Approximately 90% of TFs preferentially bound STRs that need not resemble known motifs, providing a cis-regulatory mechanism to target TFs to genomic sites.
Affiliation(s)
- Connor A Horton
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Amr M Alexandari
  - Department of Computer Science, Stanford University, Stanford, CA 94305, USA
- Michael G B Hayes
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Emil Marklund
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Julia M Schaepe
  - Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Arjun K Aditham
  - Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
  - ChEM-H Institute, Stanford University, Stanford, CA 94305, USA
- Nilay Shah
  - Stowers Institute for Medical Research, Kansas City, MO 64110, USA
- Peter H Suzuki
  - Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Avanti Shrikumar
  - Department of Computer Science, Stanford University, Stanford, CA 94305, USA
- Ariel Afek
  - Center for Genomic and Computational Biology, Duke University School of Medicine, Durham, NC 27710, USA
  - Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC 27710, USA
  - Department of Chemical and Structural Biology, Weizmann Institute of Science, Rehovot 7610001, Israel
- Raluca Gordân
  - Center for Genomic and Computational Biology, Duke University School of Medicine, Durham, NC 27710, USA
  - Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC 27710, USA
  - Department of Computer Science, Duke University, Durham, NC 27708, USA
  - Department of Molecular Genetics and Microbiology, Duke University School of Medicine, Durham, NC 27710, USA
- Julia Zeitlinger
  - Stowers Institute for Medical Research, Kansas City, MO 64110, USA
  - The University of Kansas Medical Center, Kansas City, KS 66103, USA
- Anshul Kundaje
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
  - Department of Computer Science, Stanford University, Stanford, CA 94305, USA
- Polly M Fordyce
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
  - Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
  - ChEM-H Institute, Stanford University, Stanford, CA 94305, USA
  - Chan Zuckerberg Biohub, San Francisco, CA 94110, USA
5
Chen SY, Liu FC. The Fgf9-Nolz1-Wnt2 axis regulates morphogenesis of the lung. Development 2023; 150:dev201827. [PMID: 37497597 DOI: 10.1242/dev.201827] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2023] [Accepted: 07/19/2023] [Indexed: 07/28/2023]
Abstract
Morphological development of the lung requires complex signal crosstalk between the mesenchymal and epithelial progenitors. Elucidating the genetic cascades underlying signal crosstalk is essential to understanding lung morphogenesis. Here, we identified Nolz1 as a mesenchymal lineage-specific transcriptional regulator that plays a key role in lung morphogenesis. Nolz1 null mutation resulted in a severe hypoplasia phenotype, including a decreased proliferation of mesenchymal cells, aberrant differentiation of epithelial cells and defective growth of epithelial branches. Nolz1 deletion also downregulated Wnt2, Lef1, Fgf10, Gli3 and Bmp4 mRNAs. Mechanistically, Nolz1 regulates lung morphogenesis primarily through Wnt2 signaling. Loss-of-function and overexpression studies demonstrated that Nolz1 transcriptionally activated Wnt2 and downstream β-catenin signaling to control mesenchymal cell proliferation and epithelial branching. Exogenous Wnt2 could rescue defective proliferation and epithelial branching in Nolz1 knockout lungs. Finally, we identified Fgf9 as an upstream regulator of Nolz1. Collectively, Fgf9-Nolz1-Wnt2 signaling represents a novel axis in the control of lung morphogenesis. These findings are relevant to lung tumorigenesis, in which a pathological function of Nolz1 is implicated.
Affiliation(s)
- Shih-Yun Chen
  - Institute of Neuroscience, National Yang Ming Chiao Tung University, Taipei 112304, Taiwan
- Fu-Chin Liu
  - Institute of Neuroscience, National Yang Ming Chiao Tung University, Taipei 112304, Taiwan
6
Alexandari AM, Horton CA, Shrikumar A, Shah N, Li E, Weilert M, Pufall MA, Zeitlinger J, Fordyce PM, Kundaje A. De novo distillation of thermodynamic affinity from deep learning regulatory sequence models of in vivo protein-DNA binding. bioRxiv 2023:2023.05.11.540401. [PMID: 37214836 PMCID: PMC10197627 DOI: 10.1101/2023.05.11.540401] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
Transcription factors (TF) are proteins that bind DNA in a sequence-specific manner to regulate gene transcription. Despite their unique intrinsic sequence preferences, in vivo genomic occupancy profiles of TFs differ across cellular contexts. Hence, deciphering the sequence determinants of TF binding, both intrinsic and context-specific, is essential to understand gene regulation and the impact of regulatory, non-coding genetic variation. Biophysical models trained on in vitro TF binding assays can estimate intrinsic affinity landscapes and predict occupancy based on TF concentration and affinity. However, these models cannot adequately explain context-specific, in vivo binding profiles. Conversely, deep learning models, trained on in vivo TF binding assays, effectively predict and explain genomic occupancy profiles as a function of complex regulatory sequence syntax, albeit without a clear biophysical interpretation. To reconcile these complementary models of in vitro and in vivo TF binding, we developed Affinity Distillation (AD), a method that extracts thermodynamic affinities de-novo from deep learning models of TF chromatin immunoprecipitation (ChIP) experiments by marginalizing away the influence of genomic sequence context. Applied to neural networks modeling diverse classes of yeast and mammalian TFs, AD predicts energetic impacts of sequence variation within and surrounding motifs on TF binding as measured by diverse in vitro assays with superior dynamic range and accuracy compared to motif-based methods. Furthermore, AD can accurately discern affinities of TF paralogs. Our results highlight thermodynamic affinity as a key determinant of in vivo binding, suggest that deep learning models of in vivo binding implicitly learn high-resolution affinity landscapes, and show that these affinities can be successfully distilled using AD. 
This new biophysical interpretation of deep learning models enables high-throughput in silico experiments to explore the influence of sequence context and variation on both intrinsic affinity and in vivo occupancy.
Affiliation(s)
- Amr M. Alexandari
  - Department of Computer Science, Stanford University, Stanford, CA 94305
- Avanti Shrikumar
  - Department of Earth System Science, Stanford University, Stanford, CA 94305
- Nilay Shah
  - Stowers Institute for Medical Research, Kansas City, MO, USA
- Eileen Li
  - Department of Genetics, Stanford University, Stanford, CA 94305
- Melanie Weilert
  - Stowers Institute for Medical Research, Kansas City, MO, USA
- Miles A. Pufall
  - Department of Biochemistry, Carver College of Medicine, University of Iowa, Iowa City, Iowa 52242, USA
- Julia Zeitlinger
  - Stowers Institute for Medical Research, Kansas City, MO, USA
  - The University of Kansas Medical Center, Kansas City, KS, USA
- Polly M. Fordyce
  - Department of Genetics, Stanford University, Stanford, CA 94305
  - Department of Bioengineering, Stanford University, Stanford, CA 94305
  - ChEM-H Institute, Stanford University, Stanford, CA 94305
  - Chan Zuckerberg Biohub, San Francisco, CA 94110
- Anshul Kundaje
  - Department of Computer Science, Stanford University, Stanford, CA 94305
  - Department of Genetics, Stanford University, Stanford, CA 94305
7
Luan Y, Tang Z, He Y, Xie Z. Intra-Domain Residue Coevolution in Transcription Factors Contributes to DNA Binding Specificity. Microbiol Spectr 2023; 11:e0365122. [PMID: 36943132 PMCID: PMC10100741 DOI: 10.1128/spectrum.03651-22] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/09/2022] [Accepted: 02/22/2023] [Indexed: 03/23/2023] Open
Abstract
Understanding the basis of the DNA-binding specificity of transcription factors (TFs) has been of long-standing interest. Despite extensive efforts to map millions of putative TF binding sequences, identifying the critical determinants for DNA binding specificity remains a major challenge. The coevolution of residues in proteins occurs due to a shared evolutionary history. However, it is unclear how coevolving residues in TFs contribute to DNA binding specificity. Here, we systematically collected publicly available data sets from multiple large-scale high-throughput TF-DNA interaction screening experiments for the major TF families with large numbers of TF members. These families included the Homeobox, HLH, bZIP_1, Ets, HMG_box, ZF-C4, and Zn_clus TFs. We detected TF subclass-determining sites (TSDSs) and showed that the TSDSs were more likely to coevolve with other TSDSs than with non-TSDSs, particularly for the Homeobox, HLH, Ets, bZIP_1, and HMG_box TF families. By in silico modeling, we showed that mutation of the highly coevolving residues could significantly reduce the stability of the TF-DNA complex. The distant residues from the DNA interface also contributed to TF-DNA binding activity. Overall, our study gave evidence that coevolved residues relate to transcriptional regulation and provided insights into the potential application of engineered DNA-binding domains and proteins. IMPORTANCE While unraveling DNA-binding specificity of TFs is the key to understanding the basis and molecular mechanism of gene expression regulation, identifying the critical determinants that contribute to DNA binding specificity remains a major challenge. In this study, we provided evidence showing that coevolving residues in TF domains contributed to DNA binding specificity. We demonstrated that the TSDSs were more likely to coevolve with other TSDSs than with non-TSDSs. 
Mutation of the coevolving residue pairs (CRPs) could significantly reduce the stability of the TF-DNA complex, and even the distant residues from the DNA interface contribute to TF-DNA binding activity. Collectively, our study expands our knowledge of the interactions among coevolved residues in TFs, tertiary contacting, and functional importance in refined transcriptional regulation. Understanding the impact of coevolving residues in TFs will help understand the details of transcription of gene regulation and advance the application of engineered DNA-binding domains and protein.
Affiliation(s)
- Yizhao Luan
  - State Key Laboratory of Ophthalmology, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Zehua Tang
  - State Key Laboratory of Ophthalmology, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Yao He
  - State Key Laboratory of Ophthalmology, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Zhi Xie
  - State Key Laboratory of Ophthalmology, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
8
Structural insights into the recognition of telomeric variant repeat TTGGGG by broad-complex, tramtrack and bric-à-brac - zinc finger protein ZBTB10. J Biol Chem 2023; 299:102918. [PMID: 36657642 PMCID: PMC9958480 DOI: 10.1016/j.jbc.2023.102918] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/30/2022] [Revised: 01/12/2023] [Accepted: 01/13/2023] [Indexed: 01/17/2023] Open
Abstract
Multiple proteins bind to telomeric DNA and are important for the role of telomeres in genome stability. A recent study established a broad-complex, tramtrack and bric-à-brac - zinc finger (BTB-ZF) protein, ZBTB10 (zinc finger and BTB domain-containing protein 10), as a telomeric variant repeat-binding protein at telomeres that use an alternative method for lengthening telomeres. ZBTB10 specifically interacts with the double-stranded telomeric variant repeat sequence TTGGGG by employing its tandem C2H2 zinc fingers (ZF1-2). Here, we solved the crystal structure of human ZBTB10 ZF1-2 in complex with a double-stranded DNA duplex containing the sequence TTGGGG to assess the molecular details of this interaction. Combined with calorimetric analysis, we identified the vital residues in TTGGGG recognition and determined the specific recognition mechanisms that are different from those of TZAP (telomere zinc finger-associated protein), a recently defined telomeric DNA-binding protein. Following these studies, we further identified a single amino-acid mutant (Arg767Gln) of ZBTB10 ZF1-2 that shows a preference for the telomeric DNA repeat TTAGGG sequence. We solved the cocrystal structure, providing a structural basis for telomeric DNA recognition by C2H2 ZF proteins.
9
TOR complex 2 is a master regulator of plasma membrane homeostasis. Biochem J 2022; 479:1917-1940. [PMID: 36149412 PMCID: PMC9555796 DOI: 10.1042/bcj20220388] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 07/22/2022] [Revised: 08/30/2022] [Accepted: 09/01/2022] [Indexed: 11/17/2022]
Abstract
As first demonstrated in budding yeast (Saccharomyces cerevisiae), all eukaryotic cells contain two, distinct multi-component protein kinase complexes that each harbor the TOR (Target Of Rapamycin) polypeptide as the catalytic subunit. These ensembles, dubbed TORC1 and TORC2, function as universal, centrally important sensors, integrators, and controllers of eukaryotic cell growth and homeostasis. TORC1, activated on the cytosolic surface of the lysosome (or, in yeast, on the cytosolic surface of the vacuole), has emerged as a primary nutrient sensor that promotes cellular biosynthesis and suppresses autophagy. TORC2, located primarily at the plasma membrane, plays a major role in maintaining the proper levels and bilayer distribution of all plasma membrane components (sphingolipids, glycerophospholipids, sterols, and integral membrane proteins). This article surveys what we have learned about signaling via the TORC2 complex, largely through studies conducted in S. cerevisiae. In this yeast, conditions that challenge plasma membrane integrity can, depending on the nature of the stress, stimulate or inhibit TORC2, resulting in, respectively, up-regulation or down-regulation of the phosphorylation and thus the activity of its essential downstream effector the AGC family protein kinase Ypk1. Through the ensuing effect on the efficiency with which Ypk1 phosphorylates multiple substrates that control diverse processes, membrane homeostasis is maintained. Thus, the major focus here is on TORC2, Ypk1, and the multifarious targets of Ypk1 and how the functions of these substrates are regulated by their Ypk1-mediated phosphorylation, with emphasis on recent advances in our understanding of these processes.
|
10
|
Marchal C, Defossez PA, Miotto B. Context-dependent CpG methylation directs cell-specific binding of transcription factor ZBTB38. Epigenetics 2022; 17:2122-2143. [PMID: 36000449 DOI: 10.1080/15592294.2022.2111135] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/03/2022] Open
Abstract
DNA methylation on CpGs regulates transcription in mammals, both by decreasing the binding of methylation-repelled factors and by increasing the binding of methylation-attracted factors. Among the latter, zinc finger proteins have the potential to bind methylated CpGs in a sequence-specific context. The protein ZBTB38 is unique in that it has two independent sets of zinc fingers, which recognize two different methylated consensus sequences in vitro. Here, we identify the binding sites of ZBTB38 in a human cell line, and show that they contain the two methylated consensus sequences identified in vitro. In addition, we show that the distribution of ZBTB38 sites is highly unusual: while 10% of the ZBTB38 sites are also bound by CTCF, the other 90% of sites reside in closed chromatin and are not bound by any of the other factors mapped in our model cell line. Finally, a third of ZBTB38 sites are found upstream of long and active CpG islands. Our work therefore validates ZBTB38 as a methyl-DNA binder in vivo and identifies its unique distribution in the genome.
Affiliation(s)
- Claire Marchal
- Université Paris Cité, Institut Cochin, INSERM, CNRS, Paris, France
- Benoit Miotto
- Université Paris Cité, Institut Cochin, INSERM, CNRS, Paris, France
|
11
|
Yang X, Sun G, Xia T, Cha M, Zhang L, Pang B, Tang Q, Dou H, Zhang H. Transcriptome analysis provides new insights into cold adaptation of corsac fox (Vulpes Corsac). Ecol Evol 2022; 12:e8866. [PMID: 35462974 PMCID: PMC9019142 DOI: 10.1002/ece3.8866] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2021] [Revised: 12/10/2021] [Accepted: 04/06/2022] [Indexed: 11/11/2022] Open
Abstract
Vulpes are widely distributed throughout the world and have undergone drastic physiological and phenotypic changes in response to their environment. However, little is known about the underlying genetic causes of these traits, especially in Vulpes corsac. In this study, RNA‐Seq was used to obtain a comprehensive dataset for multiple pooled tissues of corsac fox, and selection analysis of orthologous genes was performed to identify the genes that may be influenced by the low‐temperature environment. More than 6.32 Gb clean reads were obtained and assembled into a total of 173,353 unigenes with an average length of 557 bp for corsac fox. Selective pressure analysis showed that 16 positively selected genes (PSGs) were identified in corsac fox, red fox, and arctic fox. Enrichment analysis of PSGs showed that the LRP11 gene was enriched in several pathways related to the low‐temperature response and might play a key role in response to environmental stimuli of foxes. In addition, several positively selected genes were related to DNA damage repair (ELP2 and CHAF1A), innate immunity (ARRDC4 and S100A12), and the respiratory chain (NDUFA5), and these positively selected genes might play a role in adaptation to harsh wild fox environments. The results of common orthologous gene analysis showed that gene flow or convergent evolution might be an important factor in promoting regional differentiation of foxes. Our study provides a valuable transcriptomic resource for the evolutionary history of the corsac fox and the adaptations to the extreme environments.
Affiliation(s)
- Xiufeng Yang
- College of Life Science, Qufu Normal University, Qufu, China
- Guolei Sun
- College of Life Science, Qufu Normal University, Qufu, China
- Tian Xia
- College of Life Science, Qufu Normal University, Qufu, China
- Muha Cha
- Hulunbuir Academy of Inland Lakes in Northern Cold & Arid Areas, Hulunbuir, China
- Lei Zhang
- College of Life Science, Qufu Normal University, Qufu, China
- Bo Pang
- Hulunbuir Academy of Inland Lakes in Northern Cold & Arid Areas, Hulunbuir, China
- Qingming Tang
- Hulun Buir Forestry and Grassland Business Development Center, Hulunbuir, China
- Huashan Dou
- Hulunbuir Academy of Inland Lakes in Northern Cold & Arid Areas, Hulunbuir, China
- Honghai Zhang
- College of Life Science, Qufu Normal University, Qufu, China
|
12
|
Gera T, Jonas F, More R, Barkai N. Evolution of binding preferences among whole-genome duplicated transcription factors. eLife 2022; 11:73225. [PMID: 35404235 PMCID: PMC9000951 DOI: 10.7554/elife.73225] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 08/22/2021] [Accepted: 01/20/2022] [Indexed: 01/10/2023] Open
Abstract
Throughout evolution, new transcription factors (TFs) emerge by gene duplication, promoting growth and rewiring of transcriptional networks. How TF duplicates diverge was studied in a few cases only. To provide a genome-scale view, we considered the set of budding yeast TFs classified as whole-genome duplication (WGD)-retained paralogs (~35% of all specific TFs). Using high-resolution profiling, we find that ~60% of paralogs evolved differential binding preferences. We show that this divergence results primarily from variations outside the DNA-binding domains (DBDs), while DBD preferences remain largely conserved. Analysis of non-WGD orthologs revealed uneven splitting of ancestral preferences between duplicates, and the preferential acquiring of new targets by the least conserved paralog (biased neo/sub-functionalization). Interactions between paralogs were rare, and, when present, occurred through weak competition for DNA-binding or dependency between dimer-forming paralogs. We discuss the implications of our findings for the evolutionary design of transcriptional networks.
Affiliation(s)
- Tamar Gera
- Department of Molecular Genetics, Weizmann Institute of Science
- Felix Jonas
- Department of Molecular Genetics, Weizmann Institute of Science
- Roye More
- Department of Molecular Genetics, Weizmann Institute of Science
- Naama Barkai
- Department of Molecular Genetics, Weizmann Institute of Science
|
13
|
Transcription Factors in the Fungus Aspergillus nidulans: Markers of Genetic Innovation, Network Rewiring and Conflict between Genomics and Transcriptomics. J Fungi (Basel) 2021; 7:jof7080600. [PMID: 34436139 PMCID: PMC8396895 DOI: 10.3390/jof7080600] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/24/2021] [Revised: 07/16/2021] [Accepted: 07/23/2021] [Indexed: 12/20/2022] Open
Abstract
Gene regulatory networks (GRNs) are shaped by the democratic/hierarchical relationships among transcription factors (TFs) and associated proteins, together with the cis-regulatory sequences (CRSs) bound by these TFs at target promoters. GRNs control all cellular processes, including metabolism, stress response, growth and development. Due to the ability to modify morphogenetic and developmental patterns, there is the consensus view that the reorganization of GRNs is a driving force of species evolution and differentiation. GRNs are rewired through events including the duplication of TF-coding genes, their divergent sequence evolution and the gain/loss/modification of CRSs. Fungi (mainly Saccharomycotina) have served as a reference kingdom for the study of GRN evolution. Here, I studied the genes predicted to encode TFs in the fungus Aspergillus nidulans (Pezizomycotina). The analysis of the expansion of different families of TFs suggests that the duplication of TFs impacts the species level, and that the expansion in Zn2Cys6 TFs is mainly due to dispersed duplication events. Comparison of genomic annotation and transcriptomic data suggest that a significant percentage of genes should be re-annotated, while many others remain silent. Finally, a new regulator of growth and development is identified and characterized. Overall, this study establishes a novel theoretical framework in synthetic biology, as the overexpression of silent TF forms would provide additional tools to assess how GRNs are rewired.
|
14
|
Jana T, Brodsky S, Barkai N. Speed-Specificity Trade-Offs in the Transcription Factors Search for Their Genomic Binding Sites. Trends Genet 2021; 37:421-432. [PMID: 33414013 DOI: 10.1016/j.tig.2020.12.001] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 10/11/2020] [Revised: 12/04/2020] [Accepted: 12/07/2020] [Indexed: 12/17/2022]
Abstract
Transcription factors (TFs) regulate gene expression by binding DNA sequences recognized by their DNA-binding domains (DBDs). DBD-recognized motifs are short and highly abundant in genomes. The ability of TFs to bind a specific subset of motif-containing sites, and to do so rapidly upon activation, is fundamental for gene expression in all eukaryotes. Despite extensive interest, our understanding of the TF-target search process is fragmented; although binding specificity and detection speed are two facets of this same process, trade-offs between them are rarely addressed. In this opinion article, we discuss potential speed-specificity trade-offs in the context of existing models. We further discuss the recently described 'distributed specificity' paradigm, suggesting that intrinsically disordered regions (IDRs) promote specificity while reducing the TF-target search time.
Affiliation(s)
- Tamar Jana
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Sagie Brodsky
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Naama Barkai
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
|
15
|
Structure-function relationships explain CTCF zinc finger mutation phenotypes in cancer. Cell Mol Life Sci 2021; 78:7519-7536. [PMID: 34657170 PMCID: PMC8629902 DOI: 10.1007/s00018-021-03946-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/10/2021] [Revised: 07/29/2021] [Accepted: 09/17/2021] [Indexed: 12/12/2022]
Abstract
CCCTC-binding factor (CTCF) plays fundamental roles in transcriptional regulation and chromatin architecture maintenance. CTCF is also a tumour suppressor frequently mutated in cancer; however, the structural and functional impacts of these mutations have not been examined. We performed molecular and structural characterisation of five cancer-specific CTCF missense zinc finger (ZF) mutations occurring within key intra- and inter-ZF residues. Functional characterisation of CTCF ZF mutations revealed a complete (L309P, R339W, R377H) or intermediate (R339Q) abrogation as well as an enhancement (G420D) of the anti-proliferative effects of CTCF. DNA binding at select sites was disrupted and transcriptional regulatory activities abrogated. Molecular docking and molecular dynamics confirmed that mutations in residues specifically contacting DNA bases or backbone exhibited loss of DNA binding. However, R339Q and G420D were stabilised by the formation of new primary DNA bonds, contributing to gain-of-function. Our data confirm that a spectrum of loss-, change- and gain-of-function impacts on CTCF zinc fingers are observed in cell growth regulation and gene regulatory activities. Hence, diverse cellular phenotypes of mutant CTCF are clearly explained by examining structure-function relationships.
|
16
|
Brodsky S, Jana T, Mittelman K, Chapal M, Kumar DK, Carmi M, Barkai N. Intrinsically Disordered Regions Direct Transcription Factor In Vivo Binding Specificity. Mol Cell 2020; 79:459-471.e4. [DOI: 10.1016/j.molcel.2020.05.032] [Citation(s) in RCA: 99] [Impact Index Per Article: 24.8] [Received: 10/29/2019] [Revised: 03/10/2020] [Accepted: 05/21/2020] [Indexed: 11/25/2022]
|
17
|
Chapal M, Mintzer S, Brodsky S, Carmi M, Barkai N. Resolving noise-control conflict by gene duplication. PLoS Biol 2019; 17:e3000289. [PMID: 31756183 PMCID: PMC6874299 DOI: 10.1371/journal.pbio.3000289] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 04/28/2019] [Accepted: 10/21/2019] [Indexed: 12/15/2022] Open
Abstract
Gene duplication promotes adaptive evolution in two main ways: allowing one duplicate to evolve a new function and splitting ancestral functions between the duplicates. The second scenario may resolve adaptive conflicts that can arise when one gene performs different functions. In an apparent departure from both scenarios, low-expressing transcription factor (TF) duplicates commonly bind to the same DNA motifs and act in overlapping conditions. To examine possible benefits of this apparent redundancy, we examined the Msn2 and Msn4 duplicates in budding yeast. We show that Msn2,4 function as one unit by inducing the same set of target genes in overlapping conditions. Yet, the two-factor composition allows this unit's expression to be both environmentally responsive and with low noise, resolving an adaptive conflict that limits expression of single genes. We propose that duplication can provide adaptive benefit through cooperation rather than functional divergence, allowing two-factor dynamics with beneficial properties that cannot be achieved by a single gene.
Affiliation(s)
- Michal Chapal
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Sefi Mintzer
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Sagie Brodsky
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Miri Carmi
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Naama Barkai
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
|
18
|
Rogers JM, Waters CT, Seegar TCM, Jarrett SM, Hallworth AN, Blacklow SC, Bulyk ML. Bispecific Forkhead Transcription Factor FoxN3 Recognizes Two Distinct Motifs with Different DNA Shapes. Mol Cell 2019; 74:245-253.e6. [PMID: 30826165 PMCID: PMC6474805 DOI: 10.1016/j.molcel.2019.01.019] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 08/16/2018] [Revised: 12/17/2018] [Accepted: 01/11/2019] [Indexed: 12/13/2022]
Abstract
Transcription factors (TFs) control gene expression by binding DNA recognition sites in genomic regulatory regions. Although most forkhead TFs recognize a canonical forkhead (FKH) motif, RYAAAYA, some forkheads recognize a completely different (FHL) motif, GACGC. Bispecific forkhead proteins recognize both motifs, but the molecular basis for bispecific DNA recognition is not understood. We present co-crystal structures of the FoxN3 DNA binding domain bound to the FKH and FHL sites, respectively. FoxN3 adopts a similar conformation to recognize both motifs, making contacts with different DNA bases using the same amino acids. However, the DNA structure is different in the two complexes. These structures reveal how a single TF binds two unrelated DNA sequences and the importance of DNA shape in the mechanism of bispecific recognition.
Affiliation(s)
- Julia M Rogers
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Committee on Higher Degrees in Biophysics, Harvard University, Cambridge, MA 02138, USA
- Colin T Waters
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Program in Biological and Biomedical Sciences, Harvard University, Cambridge, MA 02138, USA
- Tom C M Seegar
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115, USA
- Sanchez M Jarrett
- Program in Biological and Biomedical Sciences, Harvard University, Cambridge, MA 02138, USA; Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115, USA
- Amelia N Hallworth
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Stephen C Blacklow
- Committee on Higher Degrees in Biophysics, Harvard University, Cambridge, MA 02138, USA; Program in Biological and Biomedical Sciences, Harvard University, Cambridge, MA 02138, USA; Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115, USA; Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Department of Cancer Biology, Dana Farber Cancer Institute, Boston, MA 02215, USA
- Martha L Bulyk
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Committee on Higher Degrees in Biophysics, Harvard University, Cambridge, MA 02138, USA; Program in Biological and Biomedical Sciences, Harvard University, Cambridge, MA 02138, USA; Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
|
19
|
Ray S, Ufot A, Assad N, Singh J, Durell SR, Porollo A, Tillo D, Vinson C. The bZIP mutant CEBPB (V285A) has sequence specific DNA binding propensities similar to CREB1. BIOCHIMICA ET BIOPHYSICA ACTA-GENE REGULATORY MECHANISMS 2019; 1862:486-492. [PMID: 30825655 DOI: 10.1016/j.bbagrm.2019.02.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/03/2018] [Revised: 01/09/2019] [Accepted: 02/05/2019] [Indexed: 12/25/2022]
Abstract
The bZIP homodimers CEBPB and CREB1 bind DNA containing methylated cytosines differently. CREB1 binds more strongly to the C/EBP half-site GCAA when the cytosine is methylated. For CEBPB, methylation of the same cytosine does not affect DNA binding. The X-ray structure of CREB1 binding the half-site GTCA identifies an alanine in the DNA binding region interacting with the methyl group of T, structurally analogous to the methyl group of methylated C. This alanine is replaced with a valine in CEBPB. To explore the contribution of this amino acid to binding with methylated cytosine of the GCAA half-site, we made the reciprocal mutants CEBPB(V285A) and CREB1(A297V) and used protein binding microarrays (PBM) to examine binding to four types of double-stranded DNA (dsDNA): 1) DNA with cytosine in both strands (DNA(C|C)), 2) DNA with 5-methylcytosine (M) in one strand and cytosine in the second strand (DNA(M|C)), 3) DNA with 5-hydroxymethylcytosine (H) in one strand and cytosine in the second strand (DNA(H|C)), and 4) DNA with both cytosines in all CG dinucleotides containing 5-methylcytosine (DNA(5mCG)). When binding to DNA(C|C), CEBPB(V285A) preferentially binds the CRE consensus motif (TGACGTCA), similar to CREB1. The reciprocal mutant, CREB1(A297V), binds DNA with some similarity to CEBPB, with strongest binding to the methylated PAR site 8-mer TTACGTAA. These data demonstrate that the V285 residue inhibits CEBPB binding to methylated cytosine of the GCAA half-site.
Affiliation(s)
- Sreejana Ray
- Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, United States of America
- Aniekanabasi Ufot
- Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, United States of America
- Nima Assad
- Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, United States of America
- Jocelyn Singh
- Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, United States of America
- Stewart R Durell
- Laboratory of Cell Biology, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, United States of America
- Aleksey Porollo
- Center for Autoimmune Genomics and Etiology, Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, United States of America; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH 45267, United States of America
- Desiree Tillo
- Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, United States of America
- Charles Vinson
- Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, United States of America
|
20
|
Rogers JM, Bulyk ML. Diversification of transcription factor-DNA interactions and the evolution of gene regulatory networks. WILEY INTERDISCIPLINARY REVIEWS. SYSTEMS BIOLOGY AND MEDICINE 2018; 10:e1423. [PMID: 29694718 PMCID: PMC6202284 DOI: 10.1002/wsbm.1423] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 12/08/2017] [Revised: 02/23/2018] [Accepted: 03/11/2018] [Indexed: 01/17/2023]
Abstract
Sequence-specific transcription factors (TFs) bind short DNA sequences in the genome to regulate the expression of target genes. In the last decade, numerous technical advances have enabled the determination of the DNA-binding specificities of many of these factors. Large-scale screens of many TFs enabled the creation of databases of TF DNA-binding specificities, typically represented as position weight matrices (PWMs). Although great progress has been made in determining and predicting binding specificities systematically, there are still many surprises to be found when studying a particular TF's interactions with DNA in detail. Paralogous TFs' binding specificities can differ in subtle ways, in a manner that is not immediately apparent from looking at their PWMs. These differences affect gene regulatory outputs and enable TFs to rewire transcriptional networks over evolutionary time. This review discusses recent observations made in the study of TF-DNA interactions that highlight the importance of continued in-depth analysis of TF-DNA interactions and their inherent complexity. This article is categorized under: Biological Mechanisms > Regulatory Biology.
Affiliation(s)
- Julia M. Rogers
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, 02115, USA; Committee on Higher Degrees in Biophysics, Harvard University, Cambridge, MA, 02138, USA
- Martha L. Bulyk
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, 02115, USA; Committee on Higher Degrees in Biophysics, Harvard University, Cambridge, MA, 02138, USA; Department of Pathology, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, 02115, USA
|
21
|
Transcription Factors Controlling Primary and Secondary Metabolism in Filamentous Fungi: The β-Lactam Paradigm. FERMENTATION-BASEL 2018. [DOI: 10.3390/fermentation4020047] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Indexed: 11/17/2022]
|
22
|
Gross T, Broholm S, Becker A. CRABS CLAW Acts as a Bifunctional Transcription Factor in Flower Development. FRONTIERS IN PLANT SCIENCE 2018; 9:835. [PMID: 29973943 PMCID: PMC6019494 DOI: 10.3389/fpls.2018.00835] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 02/13/2018] [Accepted: 05/29/2018] [Indexed: 05/06/2023]
Abstract
One of the crucial steps in the life cycle of angiosperms is the development of carpels. They are the most complex plant organs, harbor the seeds, and, after fertilization, develop into fruits and are thus an important ecological and economic trait. CRABS CLAW (CRC), a YABBY protein and putative transcription factor, is one of the major carpel developmental regulators in A. thaliana that includes a C2C2 zinc finger and a domain with similarities to an HMG box. CRC is involved in the regulation of processes such as carpel fusion and growth, floral meristem termination, and nectary formation. While its genetic interactions with other carpel development regulators are well described, its biochemical properties and molecular way of action remain unclear. We combined Bimolecular Fluorescence Complementation, Yeast Two-Hybrid, and Yeast One-Hybrid analyses to shed light on the molecular biology of CRC. Our results showed that CRC dimerizes, also with other YABBY proteins, via the YABBY domain, and that its DNA binding is mainly cooperative and is mediated by the YABBY domain. Further, we identified that CRC is involved in floral meristem termination via transcriptional repression while it acts as a transcriptional activator in nectary development and carpel fusion and growth control. This work increases our understanding of how YABBY transcription factors interact with other proteins and how they regulate their targets.
Affiliation(s)
- Thomas Gross
- Department of Biology, Institute of Botany, Justus Liebig University Giessen, Giessen, Germany
- *Correspondence: Thomas Gross
- Suvi Broholm
- Biosciences and Environment Research Unit, Academy of Finland, Helsinki, Finland
- Annette Becker
- Department of Biology, Institute of Botany, Justus Liebig University Giessen, Giessen, Germany
|
23
|
Najafabadi HS, Garton M, Weirauch MT, Mnaimneh S, Yang A, Kim PM, Hughes TR. Non-base-contacting residues enable kaleidoscopic evolution of metazoan C2H2 zinc finger DNA binding. Genome Biol 2017; 18:167. [PMID: 28877740 PMCID: PMC5588721 DOI: 10.1186/s13059-017-1287-y] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 03/15/2017] [Accepted: 07/14/2017] [Indexed: 02/07/2023] Open
Abstract
Background: The C2H2 zinc finger (C2H2-ZF) is the most numerous protein domain in many metazoans, but is not as frequent or diverse in other eukaryotes. The biochemical and evolutionary mechanisms that underlie the diversity of this DNA-binding domain exclusively in metazoans are, however, mostly unknown. Results: Here, we show that the C2H2-ZF expansion in metazoans is facilitated by contribution of non-base-contacting residues to DNA binding energy, allowing base-contacting specificity residues to mutate without catastrophic loss of DNA binding. In contrast, C2H2-ZF DNA binding in fungi, plants, and other lineages is constrained by reliance on base-contacting residues for DNA-binding functionality. Reconstructions indicate that virtually every DNA triplet was recognized by at least one C2H2-ZF domain in the common progenitor of placental mammals, but that extant C2H2-ZF domains typically bind different sequences from these ancestral domains, with changes facilitated by non-base-contacting residues. Conclusions: Our results suggest that the evolution of C2H2-ZFs in metazoans was expedited by the interaction of non-base-contacting residues with the DNA backbone. We term this phenomenon “kaleidoscopic evolution,” to reflect the diversity of both binding motifs and binding motif transitions and the facilitation of their diversification.
Affiliation(s)
- Hamed S Najafabadi
- Department of Human Genetics, McGill University, Montreal, QC, Canada; McGill University and Genome Quebec Innovation Centre, Montreal, QC, Canada; Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
- Michael Garton
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
- Matthew T Weirauch
- Center for Autoimmune Genomics and Etiology, and Divisions of Biomedical Informatics and Developmental Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; Canadian Institute for Advanced Research, Toronto, ON, Canada
- Sanie Mnaimneh
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
- Ally Yang
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
- Philip M Kim
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada; Department of Computer Science, University of Toronto, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Timothy R Hughes
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada; Canadian Institute for Advanced Research, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
|
24
|
Inukai S, Kock KH, Bulyk ML. Transcription factor-DNA binding: beyond binding site motifs. Curr Opin Genet Dev 2017; 43:110-119. [PMID: 28359978 PMCID: PMC5447501 DOI: 10.1016/j.gde.2017.02.007] [Citation(s) in RCA: 189] [Impact Index Per Article: 27.0] [Received: 09/28/2016] [Revised: 02/02/2017] [Accepted: 02/07/2017] [Indexed: 12/12/2022]
Abstract
Sequence-specific transcription factors (TFs) regulate gene expression by binding to cis-regulatory elements in promoter and enhancer DNA. While studies of TF-DNA binding have focused on TFs' intrinsic preferences for primary nucleotide sequence motifs, recent studies have elucidated additional layers of complexity that modulate TF-DNA binding. In this review, we discuss technological developments for identifying TF binding preferences and highlight recent discoveries that elaborate how TF interactions, local DNA structure, and genomic features influence TF-DNA binding. We highlight novel approaches for characterizing functional binding site motifs that promise to inform our understanding of how TF binding controls gene expression and ultimately contributes to phenotype.
Affiliation(s)
- Sachi Inukai
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Kian Hong Kock
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Program in Biological and Biomedical Sciences, Harvard University, Cambridge, MA 02138, USA
- Martha L Bulyk
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Program in Biological and Biomedical Sciences, Harvard University, Cambridge, MA 02138, USA; Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
|
25
|
Gaspar VM, Cruz C, Queiroz JA, Pichon C, Correia IJ, Sousa F. Highly selective capture of minicircle DNA biopharmaceuticals by a novel zinc-histidine peptide conjugate. Sep Purif Technol 2017. [DOI: 10.1016/j.seppur.2016.10.054] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 01/04/2023]
26
Garton M, Najafabadi HS, Schmitges FW, Radovani E, Hughes TR, Kim PM. A structural approach reveals how neighbouring C2H2 zinc fingers influence DNA binding specificity. Nucleic Acids Res 2015; 43:9147-57. [PMID: 26384429 PMCID: PMC4627083 DOI: 10.1093/nar/gkv919] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 04/28/2015] [Accepted: 09/05/2015] [Indexed: 12/28/2022]
Abstract
Development of an accurate protein–DNA recognition code that can predict DNA specificity from protein sequence is a central problem in biology. C2H2 zinc fingers constitute by far the largest family of DNA binding domains and their binding specificity has been studied intensively. However, despite decades of research, accurate prediction of DNA specificity remains elusive. A major obstacle is thought to be the inability of current methods to account for the influence of neighbouring domains. Here we show that this problem can be addressed using a structural approach: we build structural models for all C2H2-ZF–DNA complexes with known binding motifs and find six distinct binding modes. Each mode changes the orientation of specificity residues with respect to the DNA, thereby modulating base preference. Most importantly, the structural analysis shows that residues at the domain interface strongly and predictably influence the binding mode, and hence specificity. Accounting for predicted binding mode significantly improves prediction accuracy of predicted motifs. This new insight into the fundamental behaviour of C2H2-ZFs has implications for both improving the prediction of natural zinc finger-binding sites, and for prioritizing further experiments to complete the code. It also provides a new design feature for zinc finger engineering.
Affiliation(s)
- Michael Garton
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto M5S 3E1, Canada
- Hamed S Najafabadi
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto M5S 3E1, Canada
- Frank W Schmitges
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto M5S 3E1, Canada
- Ernest Radovani
- Department of Molecular Genetics, University of Toronto, Toronto M5S 1A8, Canada
- Timothy R Hughes
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto M5S 1A8, Canada
- Philip M Kim
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto M5S 1A8, Canada; Department of Computer Science, University of Toronto, Toronto M5S 2E4, Canada
27
Menke C, Cionni M, Siggers T, Bulyk ML, Beier DR, Stottmann RW. Grhl2 is required in nonneural tissues for neural progenitor survival and forebrain development. Genesis 2015; 53:573-582. [PMID: 26177923 PMCID: PMC4713386 DOI: 10.1002/dvg.22875] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 06/18/2015] [Revised: 07/06/2015] [Accepted: 07/07/2015] [Indexed: 11/06/2022]
Abstract
Grainyhead-like genes are part of a highly conserved gene family that play a number of roles in ectoderm development and maintenance in mammals. Here we identify a novel allele of Grhl2, cleft-face 3 (clft3), in a mouse line recovered from an ENU mutagenesis screen for organogenesis defects. Homozygous clft3 mutants have a number of phenotypes in common with other alleles of Grhl2. We note a significant effect of genetic background on the clft3 phenotype. One of these phenotypes is a reduction in size of the telencephalon, where we find abnormal patterns of neural progenitor mitosis and apoptosis in mutant brains. Interestingly, Grhl2 is not expressed in the developing forebrain, suggesting this is a survival factor for neural progenitors exerting a paracrine effect on the neural tissue from the overlying ectoderm where Grhl2 is highly expressed.
Affiliation(s)
- Chelsea Menke
- Division of Human Genetics, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH
- Megan Cionni
- Division of Human Genetics, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH
- Trevor Siggers
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA
- Department of Biology, Boston University, Boston, MA
- Martha L. Bulyk
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA
- Department of Pathology, Brigham & Women’s Hospital and Harvard Medical School, Boston, MA
- David R. Beier
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA
- Center for Developmental Biology and Regenerative Medicine, Seattle Children’s Hospital, Seattle, WA
- Rolf W. Stottmann
- Division of Human Genetics, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA
- Division of Developmental Biology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH
28
Hansen AS, O'Shea EK. cis Determinants of Promoter Threshold and Activation Timescale. Cell Rep 2015; 12:1226-33. [PMID: 26279577 DOI: 10.1016/j.celrep.2015.07.035] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 05/24/2015] [Revised: 07/08/2015] [Accepted: 07/15/2015] [Indexed: 11/16/2022]
Abstract
Although the relationship between DNA cis-regulatory sequences and gene expression has been extensively studied at steady state, how cis-regulatory sequences affect the dynamics of gene induction is not known. The dynamics of gene induction can be described by the promoter activation timescale (AcTime) and amplitude threshold (AmpThr). Combining high-throughput microfluidics with quantitative time-lapse microscopy, we control the activation dynamics of the budding yeast transcription factor, Msn2, and reveal how cis-regulatory motifs in 20 promoter variants of the Msn2-target-gene SIP18 affect AcTime and AmpThr. By modulating Msn2 binding sites, we can decouple AmpThr from AcTime and switch the SIP18 promoter class from high AmpThr and slow AcTime to low AmpThr and either fast or slow AcTime. We present a model that quantitatively explains gene-induction dynamics on the basis of the Msn2-binding-site number, TATA box location, and promoter nucleosome organization. Overall, we elucidate the cis-regulatory logic underlying promoter decoding of TF dynamics.
Affiliation(s)
- Anders S Hansen
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138, USA; Howard Hughes Medical Institute, Harvard University, Northwest Laboratory, 52 Oxford Street, Cambridge, MA 02138, USA; Faculty of Arts and Sciences Center for Systems Biology, Harvard University, Northwest Laboratory, 52 Oxford Street, Cambridge, MA 02138, USA
- Erin K O'Shea
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138, USA; Howard Hughes Medical Institute, Harvard University, Northwest Laboratory, 52 Oxford Street, Cambridge, MA 02138, USA; Faculty of Arts and Sciences Center for Systems Biology, Harvard University, Northwest Laboratory, 52 Oxford Street, Cambridge, MA 02138, USA; Department of Molecular and Cellular Biology, Harvard University, Northwest Laboratory, 52 Oxford Street, Cambridge, MA 02138, USA
29
Yu JS, Lim MC, Huynh DTN, Kim HJ, Kim HM, Kim YR, Kim KB. Identifying the Location of a Single Protein along the DNA Strand Using Solid-State Nanopores. ACS Nano 2015; 9:5289-98. [PMID: 25938865 DOI: 10.1021/acsnano.5b00784] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Indexed: 05/24/2023]
Abstract
Solid-state nanopore has been widely studied as an effective tool to detect and analyze small biomolecules, such as DNA, RNA, and proteins, at a single molecule level. In this study, we demonstrate a rapid identification of the location of zinc finger protein (ZFP), which is bound to a specific locus along the length of a double-stranded DNA (dsDNA) to a single protein resolution using a low noise solid-state nanopore. When ZFP labeled DNAs were driven through a nanopore by an externally applied electric field, characteristic ionic current signals arising from the passage of the DNA/ZFP complex and bare DNA were detected, which enabled us to identify the locations of ZFP binding site. We examined two DNAs with ZFP binding sites at different positions and found that the location of the additional current drop derived from the DNA/ZFP complex is well-matched with a theoretical one along the length of the DNA molecule. These results suggest that the protein binding site on DNA can be mapped or that genetic information can be read at a single molecule level using solid-state nanopores.
Affiliation(s)
- Jae-Seok Yu
- Department of Materials Science and Engineering, Seoul National University, Seoul 151-742, Korea
- Min-Cheol Lim
- Graduate School of Biotechnology and Department of Food Science and Biotechnology, Kyung Hee University, Yongin 446-701, Korea
- Duyen Thi Ngoc Huynh
- Graduate School of Biotechnology and Department of Food Science and Biotechnology, Kyung Hee University, Yongin 446-701, Korea
- Hyung-Jun Kim
- Department of Materials Science and Engineering, Seoul National University, Seoul 151-742, Korea
- Hyun-Mi Kim
- Department of Materials Science and Engineering, Seoul National University, Seoul 151-742, Korea
- Young-Rok Kim
- Graduate School of Biotechnology and Department of Food Science and Biotechnology, Kyung Hee University, Yongin 446-701, Korea
- Ki-Bum Kim
- Department of Materials Science and Engineering, Seoul National University, Seoul 151-742, Korea
30
|
Narasimhan K, Lambert SA, Yang AWH, Riddell J, Mnaimneh S, Zheng H, Albu M, Najafabadi HS, Reece-Hoyes JS, Fuxman Bass JI, Walhout AJM, Weirauch MT, Hughes TR. Mapping and analysis of Caenorhabditis elegans transcription factor sequence specificities. eLife 2015; 4. [PMID: 25905672 PMCID: PMC4434323 DOI: 10.7554/elife.06967] [Citation(s) in RCA: 67] [Impact Index Per Article: 7.4] [Received: 02/13/2015] [Accepted: 04/22/2015] [Indexed: 12/13/2022]
Abstract
Caenorhabditis elegans is a powerful model for studying gene regulation, as it has a compact genome and a wealth of genomic tools. However, identification of regulatory elements has been limited, as DNA-binding motifs are known for only 71 of the estimated 763 sequence-specific transcription factors (TFs). To address this problem, we performed protein binding microarray experiments on representatives of canonical TF families in C. elegans, obtaining motifs for 129 TFs. Additionally, we predict motifs for many TFs that have DNA-binding domains similar to those already characterized, increasing coverage of binding specificities to 292 C. elegans TFs (∼40%). These data highlight the diversification of binding motifs for the nuclear hormone receptor and C2H2 zinc finger families and reveal unexpected diversity of motifs for T-box and DM families. Motif enrichment in promoters of functionally related genes is consistent with known biology and also identifies putative regulatory roles for unstudied TFs. DOI: http://dx.doi.org/10.7554/eLife.06967.001
Many scientists use ‘model’ species—such as the fruit fly or a nematode worm called Caenorhabditis elegans—in their research because these organisms have useful features that make it easier to carry out many experiments. For example, C. elegans has a smaller genome compared to many other animals, which is useful for studying the roles of individual genes or stretches of DNA. Transcription factors are a type of protein that can bind to specific stretches of DNA and help to switch certain genes on or off. These ‘motifs’ may be close to the gene or further away in the genome, and therefore, must stand out amongst the rest of the DNA, like lights on a landing strip. However, the motifs for only 10% of the estimated 763 transcription factors in C. elegans have been identified so far. In this study, Narasimhan, Lambert, Yang et al. used a technique called a ‘protein binding microarray’ to identify the motifs for many more of the C. elegans transcription factors. These findings were then used to predict motifs for other transcription factors. Together, these methods increased the proportion of C. elegans transcription factors with known DNA-binding motifs from 10% to around 40%. Now that more DNA motifs have been identified, it is possible to look for similarities and differences between them. For example, Narasimhan, Lambert, Yang et al. found that transcription factors with similar sequences can bind to very varied motifs. On the other hand, some transcription factors that are very different are able to recognize very similar motifs. The experiments also indicate that motifs found very close to genes—in sequences known as ‘promoters’—may be able to interact with many proteins to influence the activity of genes. Narasimhan, Lambert, Yang et al.'s findings increase the number of C. elegans transcription factors with a motif, bringing the knowledge of these proteins more in line with the better-studied transcription factors of humans and fruit flies. The next challenge is to identify DNA motifs for the remaining 60% of transcription factors. DOI: http://dx.doi.org/10.7554/eLife.06967.002
Affiliation(s)
- Kamesh Narasimhan
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Canada
- Samuel A Lambert
- Department of Molecular Genetics, University of Toronto, Toronto, Canada
- Ally W H Yang
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Canada
- Jeremy Riddell
- Department of Molecular and Cellular Physiology, Systems Biology and Physiology Program, University of Cincinnati, Cincinnati, United States
- Sanie Mnaimneh
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Canada
- Hong Zheng
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Canada
- Mihai Albu
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Canada
- Hamed S Najafabadi
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Canada
- John S Reece-Hoyes
- Program in Systems Biology, University of Massachusetts Medical School, Worcester, United States
- Juan I Fuxman Bass
- Program in Systems Biology, University of Massachusetts Medical School, Worcester, United States
- Albertha J M Walhout
- Program in Systems Biology, University of Massachusetts Medical School, Worcester, United States
- Matthew T Weirauch
- Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, United States
- Timothy R Hughes
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Canada
31
Nadimpalli S, Persikov AV, Singh M. Pervasive variation of transcription factor orthologs contributes to regulatory network evolution. PLoS Genet 2015; 11:e1005011. [PMID: 25748510 PMCID: PMC4351887 DOI: 10.1371/journal.pgen.1005011] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 06/13/2014] [Accepted: 01/18/2015] [Indexed: 01/17/2023]
Abstract
Differences in transcriptional regulatory networks underlie much of the phenotypic variation observed across organisms. Changes to cis-regulatory elements are widely believed to be the predominant means by which regulatory networks evolve, yet examples of regulatory network divergence due to transcription factor (TF) variation have also been observed. To systematically ascertain the extent to which TFs contribute to regulatory divergence, we analyzed the evolution of the largest class of metazoan TFs, Cys2-His2 zinc finger (C2H2-ZF) TFs, across 12 Drosophila species spanning ~45 million years of evolution. Remarkably, we uncovered that a significant fraction of all C2H2-ZF 1-to-1 orthologs in flies exhibit variations that can affect their DNA-binding specificities. In addition to loss and recruitment of C2H2-ZF domains, we found diverging DNA-contacting residues in ~44% of domains shared between D. melanogaster and the other fly species. These diverging DNA-contacting residues, found in ~70% of the D. melanogaster C2H2-ZF genes in our analysis and corresponding to ~26% of all annotated D. melanogaster TFs, show evidence of functional constraint: they tend to be conserved across phylogenetic clades and evolve slower than other diverging residues. These same variations were rarely found as polymorphisms within a population of D. melanogaster flies, indicating their rapid fixation. The predicted specificities of these dynamic domains gradually change across phylogenetic distances, suggesting stepwise evolutionary trajectories for TF divergence. Further, whereas proteins with conserved C2H2-ZF domains are enriched in developmental functions, those with varying domains exhibit no functional enrichments. Our work suggests that a subset of highly dynamic and largely unstudied TFs are a likely source of regulatory variation in Drosophila and other metazoans.
Affiliation(s)
- Shilpa Nadimpalli
- Department of Computer Science, Princeton University, Princeton, New Jersey, United States of America
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
- Anton V. Persikov
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
- Mona Singh
- Department of Computer Science, Princeton University, Princeton, New Jersey, United States of America
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
32
How do regulatory networks evolve and expand throughout evolution? Curr Opin Biotechnol 2015; 34:180-8. [PMID: 25723843 DOI: 10.1016/j.copbio.2015.02.001] [Citation(s) in RCA: 69] [Impact Index Per Article: 7.7] [Received: 09/03/2014] [Revised: 02/04/2015] [Accepted: 02/04/2015] [Indexed: 11/23/2022]
Abstract
Throughout evolution, regulatory networks need to expand and adapt to accommodate novel genes and gene functions. However, the molecular details explaining how gene networks evolve remain largely unknown. Recent studies demonstrate that changes in transcription factors contribute to the evolution of regulatory networks. In particular, duplication of transcription factors followed by specific mutations in their DNA-binding or interaction domains propels the divergence and emergence of new networks. The innate promiscuity and modularity of regulatory networks contributes to their evolvability: duplicated promiscuous regulators and their target promoters can acquire mutations that lead to gradual increases in specificity, allowing neofunctionalization or subfunctionalization.
33
Cheatle Jarvela AM, Hinman VF. Evolution of transcription factor function as a mechanism for changing metazoan developmental gene regulatory networks. EvoDevo 2015; 6:3. [PMID: 25685316 PMCID: PMC4327956 DOI: 10.1186/2041-9139-6-3] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Received: 11/05/2014] [Accepted: 12/18/2014] [Indexed: 11/10/2022]
Abstract
The form that an animal takes during development is directed by gene regulatory networks (GRNs). Developmental GRNs interpret maternally deposited molecules and externally supplied signals to direct cell-fate decisions, which ultimately leads to the arrangements of organs and tissues in the organism. Genetically encoded modifications to these networks have generated the wide range of metazoan diversity that exists today. Most studies of GRN evolution focus on changes to cis-regulatory DNA, and it was historically theorized that changes to the transcription factors that bind to these cis-regulatory modules (CRMs) contribute to this process only rarely. A growing body of evidence suggests that changes to the coding regions of transcription factors play a much larger role in the evolution of developmental gene regulatory networks than originally imagined. Just as cis-regulatory changes make use of modular binding site composition and tissue-specific modules to avoid pleiotropy, transcription factor coding regions also predominantly evolve in ways that limit the context of functional effects. Here, we review the recent works that have led to this unexpected change in the field of Evolution and Development (Evo-Devo) and consider the implications these studies have had on our understanding of the evolution of developmental processes.
Affiliation(s)
- Alys M Cheatle Jarvela
- Department of Biological Sciences, Carnegie Mellon University, 4400 5th Ave, Pittsburgh, PA 15213 USA
- Veronica F Hinman
- Department of Biological Sciences, Carnegie Mellon University, 4400 5th Ave, Pittsburgh, PA 15213 USA
34
Abstract
Understanding how sequence-specific protein-DNA interactions direct cellular function is of great interest to the research community. High-throughput methods have been developed to determine DNA-binding specificities; one such technique, the bacterial one-hybrid (B1H) system, confers advantages including ease of use, sensitivity and throughput. In this review, we describe the evolution of the B1H system as a tool capable of screening large DNA libraries to investigate protein-DNA interactions of interest. We discuss how DNA-binding specificities produced by the B1H system have been used to predict regulatory targets. Additionally, we examine how this approach has been applied to characterize two common DNA-binding domain families, homeodomains and Cys2His2 zinc fingers, both in organism-wide studies and with synthetic approaches. In the case of the former, the B1H system has produced large catalogs of protein specificity and nuanced information about previously recovered DNA targets, thereby improving our understanding of these proteins' functions in vivo and increasing our capacity to predict similar interactions in other species. In the latter, synthetic screens of the same DNA-binding domains have further refined our models of specificity, through analyzing comprehensive libraries to uncover all proteins able to bind a complete set of targets, and, for instance, exploring how context (in the form of domain position within the parent protein) may affect specificity. Finally, we recognize the limitations of the B1H system and discuss its potential for use in the production of designer proteins and in studies of protein-protein interactions.
35
Andrilenas KK, Penvose A, Siggers T. Using protein-binding microarrays to study transcription factor specificity: homologs, isoforms and complexes. Brief Funct Genomics 2014; 14:17-29. [PMID: 25431149 DOI: 10.1093/bfgp/elu046] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Indexed: 01/14/2023]
Abstract
Protein-DNA binding is central to specificity in gene regulation, and methods for characterizing transcription factor (TF)-DNA binding remain crucial to studies of regulatory specificity. High-throughput (HT) technologies have revolutionized our ability to characterize protein-DNA binding by significantly increasing the number of binding measurements that can be performed. Protein-binding microarrays (PBMs) are a robust and powerful HT platform for studying DNA-binding specificity of TFs. Analysis of PBM-determined DNA-binding profiles has provided new insight into the scope and mechanisms of TF binding diversity. In this review, we focus specifically on the PBM technique and discuss its application to the study of TF specificity, in particular, the binding diversity of TF homologs and multi-protein complexes.
36
Hume MA, Barrera LA, Gisselbrecht SS, Bulyk ML. UniPROBE, update 2015: new tools and content for the online database of protein-binding microarray data on protein-DNA interactions. Nucleic Acids Res 2014; 43:D117-22. [PMID: 25378322 PMCID: PMC4383892 DOI: 10.1093/nar/gku1045] [Citation(s) in RCA: 209] [Impact Index Per Article: 20.9] [Indexed: 01/07/2023]
Abstract
The Universal PBM Resource for Oligonucleotide Binding Evaluation (UniPROBE) serves as a convenient source of information on published data generated using universal protein-binding microarray (PBM) technology, which provides in vitro data about the relative DNA-binding preferences of transcription factors for all possible sequence variants of a length k (‘k-mers’). The database displays important information about the proteins and displays their DNA-binding specificity data in terms of k-mers, position weight matrices and graphical sequence logos. This update to the database documents the growth of UniPROBE since the last update 4 years ago, and introduces a variety of new features and tools, including a new streamlined pipeline that facilitates data deposition by universal PBM data generators in the research community, a tool that generates putative nonbinding (i.e. negative control) DNA sequences for one or more proteins and novel motifs obtained by analyzing the PBM data using the BEEML-PBM algorithm for motif inference. The UniPROBE database is available at http://uniprobe.org.
Affiliation(s)
- Maxwell A Hume
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Bioinformatics Graduate Program, Northeastern University, Boston, MA 02115, USA
- Luis A Barrera
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Committee on Higher Degrees in Biophysics, Harvard University, Cambridge, MA 02138, USA; Bioinformatics and Integrative Genomics Graduate Program, Harvard-MIT Division of Health Sciences and Technology, Harvard Medical School, Boston, MA 02115, USA
- Stephen S Gisselbrecht
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Martha L Bulyk
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Committee on Higher Degrees in Biophysics, Harvard University, Cambridge, MA 02138, USA; Bioinformatics and Integrative Genomics Graduate Program, Harvard-MIT Division of Health Sciences and Technology, Harvard Medical School, Boston, MA 02115, USA; Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA