1
Bonnell V, Zhang Y, Brown A, Horton J, Josling G, Chiu TP, Rohs R, Mahony S, Gordân R, Llinás M. DNA sequence and chromatin differentiate sequence-specific transcription factor binding in the human malaria parasite Plasmodium falciparum. Nucleic Acids Res 2024; 52:10161-10179. [PMID: 38966997 PMCID: PMC11417369 DOI: 10.1093/nar/gkae585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Revised: 05/30/2024] [Accepted: 06/27/2024] [Indexed: 07/06/2024] Open
Abstract
Development of the malaria parasite, Plasmodium falciparum, is regulated by a limited number of sequence-specific transcription factors (TFs). However, the mechanisms by which these TFs recognize genome-wide binding sites are largely unknown. To address TF specificity, we investigated the binding of two TF subsets that bind either CACACA or GTGCAC DNA sequence motifs and further characterized two additional ApiAP2 TFs, PfAP2-G and PfAP2-EXP, which bind unique DNA motifs (GTAC and TGCATGCA). We also interrogated the impact of DNA sequence and chromatin context on P. falciparum TF binding by integrating high-throughput in vitro and in vivo binding assays, DNA shape predictions, epigenetic post-translational modifications, and chromatin accessibility. We found that DNA sequence context minimally impacts binding site selection for paralogous CACACA-binding TFs, while chromatin accessibility, epigenetic patterns, co-factor recruitment, and dimerization correlate with differential binding. In contrast, GTGCAC-binding TFs prefer different DNA sequence context in addition to chromatin dynamics. Finally, we determined that TFs that preferentially bind divergent DNA motifs may bind overlapping genomic regions due to low-affinity binding to other sequence motifs. Our results demonstrate that TF binding site selection relies on a combination of DNA sequence and chromatin features, thereby contributing to the complexity of P. falciparum gene regulatory mechanisms.
Affiliation(s)
- Victoria A Bonnell
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
- Huck Institutes Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
- Huck Institutes Center for Malaria Research, The Pennsylvania State University, University Park, PA 16802, USA
- Yuning Zhang
- Center for Genomic and Computational Biology, Duke University, Durham, NC 27708, USA
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27708, USA
- Program in Computational Biology and Bioinformatics, Duke University, Durham, NC 27708, USA
- Alan S Brown
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
- Huck Institutes Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
- Huck Institutes Center for Malaria Research, The Pennsylvania State University, University Park, PA 16802, USA
- John Horton
- Center for Genomic and Computational Biology, Duke University, Durham, NC 27708, USA
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27708, USA
- Gabrielle A Josling
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
- Huck Institutes Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
- Huck Institutes Center for Malaria Research, The Pennsylvania State University, University Park, PA 16802, USA
- Tsu-Pei Chiu
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Remo Rohs
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Department of Chemistry, University of Southern California, Los Angeles, CA 90089, USA
- Department of Physics and Astronomy, University of Southern California, Los Angeles, CA 90089, USA
- Thomas Lord Department of Computer Science, University of Southern California, Los Angeles, CA 90089, USA
- Shaun Mahony
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
- Huck Institutes Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
- Raluca Gordân
- Center for Genomic and Computational Biology, Duke University, Durham, NC 27708, USA
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27708, USA
- Department of Computer Science, Duke University, Durham, NC 27708, USA
- Department of Molecular Genetics and Microbiology, Duke University, Durham, NC 27708, USA
- Manuel Llinás
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
- Huck Institutes Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
- Huck Institutes Center for Malaria Research, The Pennsylvania State University, University Park, PA 16802, USA
- Department of Chemistry, The Pennsylvania State University, University Park, PA 16802, USA
2
Choi Y, Koh J, Cha SS, Roe JH. Activation of zinc uptake regulator by zinc binding to three regulatory sites. Nucleic Acids Res 2024; 52:4185-4197. [PMID: 38349033 PMCID: PMC11077047 DOI: 10.1093/nar/gkae079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 01/23/2024] [Accepted: 02/05/2024] [Indexed: 05/09/2024] Open
Abstract
Zur is a Fur-family metalloregulator that is widely used to control zinc homeostasis in bacteria. In Streptomyces coelicolor, Zur (ScZur) acts as both a repressor of the zinc uptake gene (znuA) and an activator of the zinc exporter gene (zitB). Previous structural studies revealed three zinc ions specifically bound per ScZur monomer: a structural one that allows dimeric architecture and two regulatory ones for DNA-binding activity. In this study, we present evidence that Zur contains a fourth specific zinc-binding site with a key histidine residue (H36), widely conserved among actinobacteria, for regulatory function. Biochemical, genetic, and calorimetric data revealed that H36 is critical for hexameric binding of Zur to the zitB zurbox and further binding to its upstream region required for full activation. A comprehensive thermodynamic model demonstrated that the DNA-binding affinity of Zur to both znuA and zitB zurboxes is remarkably enhanced upon saturation of all three regulatory zinc sites. The model also predicts that the strong coupling between zinc binding and DNA binding equilibria of Zur drives a biphasic activation of the zitB gene in response to a wide concentration change of zinc. Similar mechanisms may be pertinent to other metalloproteins, expanding their response spectrum through binding multiple regulatory metals.
Affiliation(s)
- Yunchan Choi
- Laboratory of Molecular Microbiology, School of Biological Sciences, College of Natural Science, Seoul National University, Seoul 08826, Republic of Korea
- Junseock Koh
- Laboratory of Biophysical Chemistry, School of Biological Sciences, College of Natural Science, Seoul National University, Seoul 08826, Republic of Korea
- Sun-Shin Cha
- Protein Research Laboratory, Department of Chemistry and Nanoscience, Ewha Womans University, Seoul 03760, Republic of Korea
- Jung-Hye Roe
- Laboratory of Molecular Microbiology, School of Biological Sciences, College of Natural Science, Seoul National University, Seoul 08826, Republic of Korea
3
Li J, Chiu TP, Rohs R. Predicting DNA structure using a deep learning method. Nat Commun 2024; 15:1243. [PMID: 38336958 PMCID: PMC10858265 DOI: 10.1038/s41467-024-45191-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Accepted: 01/17/2024] [Indexed: 02/12/2024] Open
Abstract
Understanding the mechanisms of protein-DNA binding is critical in comprehending gene regulation. Three-dimensional DNA structure, also described as DNA shape, plays a key role in these mechanisms. In this study, we present a deep learning-based method, Deep DNAshape, that fundamentally changes the current k-mer based high-throughput prediction of DNA shape features by accurately accounting for the influence of extended flanking regions, without the need for extensive molecular simulations or structural biology experiments. By using the Deep DNAshape method, DNA structural features can be predicted for any length and number of DNA sequences in a high-throughput manner, providing an understanding of the effects of flanking regions on DNA structure in a target region of a sequence. The Deep DNAshape method provides access to the influence of distant flanking regions on a region of interest. Our findings reveal that DNA shape readout mechanisms of a core target are quantitatively affected by flanking regions, including extended flanking regions, providing valuable insights into the detailed structural readout mechanisms of protein-DNA binding. Furthermore, when incorporated in machine learning models, the features generated by Deep DNAshape improve the model prediction accuracy. Collectively, Deep DNAshape can serve as a versatile and powerful tool for diverse DNA structure-related studies.
Affiliation(s)
- Jinsen Li
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, 90089, USA
- Tsu-Pei Chiu
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, 90089, USA
- Remo Rohs
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, 90089, USA.
- Department of Chemistry, University of Southern California, Los Angeles, CA, 90089, USA.
- Department of Physics and Astronomy, University of Southern California, Los Angeles, CA, 90089, USA.
- Thomas Lord Department of Computer Science, University of Southern California, Los Angeles, CA, 90089, USA.
4
Cirakli E, Basu A. A method for assaying DNA flexibility. Methods 2023; 219:68-72. [PMID: 37769928 DOI: 10.1016/j.ymeth.2023.09.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2023] [Revised: 09/05/2023] [Accepted: 09/21/2023] [Indexed: 10/03/2023] Open
Abstract
The transcription, replication, packaging, and repair of genetic information ubiquitously involve DNA:protein interactions and other biological processes that require local mechanical distortions of DNA. The energetics of such DNA-deforming processes are thus dependent on the local mechanical properties of DNA such as bendability or torsional rigidity. Such properties, in turn, depend on sequence, making it possible for sequence to regulate diverse biological processes by controlling the local mechanical properties of DNA. A deeper understanding of how such a "mechanical code" can encode broad regulatory information has historically been hampered by the absence of technology to measure in high throughput how local DNA mechanics varies with sequence along large regions of the genome. This was overcome in a recently developed technique called loop-seq. Here we describe a variant of the loop-seq protocol that permits rapid flexibility measurements in low throughput, without the need for next-generation sequencing. We use our method to validate a previous prediction about how the binding site for the bacterial transcription factor Integration Host Factor (IHF) might serve as a rigid roadblock, preventing efficient enhancer-promoter contacts in IHF site-containing promoters in E. coli, which can be relieved by IHF binding.
Affiliation(s)
- Eliz Cirakli
- Department of Chemistry, Durham University, Durham, UK; Department of Biosciences, Durham University, Durham, UK
- Aakash Basu
- Department of Biosciences, Durham University, Durham, UK.
5
Li J, Chiu TP, Rohs R. Deep DNAshape: Predicting DNA shape considering extended flanking regions using a deep learning method. bioRxiv 2023:2023.10.22.563383. [PMID: 37961633 PMCID: PMC10634709 DOI: 10.1101/2023.10.22.563383] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Understanding the mechanisms of protein-DNA binding is critical in comprehending gene regulation. Three-dimensional DNA shape plays a key role in these mechanisms. In this study, we present a deep learning-based method, Deep DNAshape, that fundamentally changes the current k-mer based high-throughput prediction of DNA shape features by accurately accounting for the influence of extended flanking regions, without the need for extensive molecular simulations or structural biology experiments. By using the Deep DNAshape method, refined DNA shape features can be predicted for any length and number of DNA sequences in a high-throughput manner, providing a deeper understanding of the effects of flanking regions on DNA shape in a target region of a sequence. The Deep DNAshape method provides access to the influence of distant flanking regions on a region of interest. Our findings reveal that DNA shape readout mechanisms of a core target are quantitatively affected by flanking regions, including extended flanking regions, providing valuable insights into the detailed structural readout mechanisms of protein-DNA binding. Furthermore, when incorporated in machine learning models, the features generated by Deep DNAshape improve the model prediction accuracy. Collectively, Deep DNAshape can serve as a versatile and powerful tool for diverse DNA structure-related studies.
6
Basu A, Bobrovnikov DG, Cieza B, Arcon JP, Qureshi Z, Orozco M, Ha T. Deciphering the mechanical code of the genome and epigenome. Nat Struct Mol Biol 2022; 29:1178-1187. [PMID: 36471057 PMCID: PMC10142808 DOI: 10.1038/s41594-022-00877-6] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 07/31/2021] [Accepted: 10/18/2022] [Indexed: 12/12/2022]
Abstract
Diverse DNA-deforming processes are impacted by the local mechanical and structural properties of DNA, which in turn depend on local sequence and epigenetic modifications. Deciphering this mechanical code (that is, this dependence) has been challenging due to the lack of high-throughput experimental methods. Here we present a comprehensive characterization of the mechanical code. Utilizing high-throughput measurements of DNA bendability via loop-seq, we quantitatively established how the occurrence and spatial distribution of dinucleotides, tetranucleotides and methylated CpG impact DNA bendability. We used our measurements to develop a physical model for the sequence and methylation dependence of DNA bendability. We validated the model by performing loop-seq on mouse genomic sequences around transcription start sites and CTCF-binding sites. We applied our model to test the predictions of all-atom molecular dynamics simulations and to demonstrate that sequence and epigenetic modifications can mechanically encode regulatory information in diverse contexts.
Affiliation(s)
- Aakash Basu
- Department of Biosciences, Durham University, Durham, UK; Department of Biophysics and Biophysical Chemistry, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Dmitriy G Bobrovnikov
- Department of Biophysics and Biophysical Chemistry, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Basilio Cieza
- Department of Biophysics, Johns Hopkins University, Baltimore, MD, USA
- Juan Pablo Arcon
- Institute for Research in Biomedicine (IRB Barcelona), Barcelona Institute of Science and Technology, Barcelona, Spain
- Zan Qureshi
- Department of Biophysics, Johns Hopkins University, Baltimore, MD, USA
- Modesto Orozco
- Institute for Research in Biomedicine (IRB Barcelona), Barcelona Institute of Science and Technology, Barcelona, Spain; Department of Biochemistry and Biomedicine, Universitat de Barcelona, Barcelona, Spain
- Taekjip Ha
- Department of Biophysics and Biophysical Chemistry, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Department of Biophysics, Johns Hopkins University, Baltimore, MD, USA; Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA; Howard Hughes Medical Institute, Baltimore, MD, USA
7
Towards a better understanding of TF-DNA binding prediction from genomic features. Comput Biol Med 2022; 149:105993. [DOI: 10.1016/j.compbiomed.2022.105993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Revised: 07/12/2022] [Accepted: 08/14/2022] [Indexed: 11/17/2022]
8
Singh RK, Mukherjee A. Molecular Mechanism of the Intercalation of the SOX-4 Protein into DNA Inducing Bends and Kinks. J Phys Chem B 2021; 125:3752-3762. [PMID: 33848164 DOI: 10.1021/acs.jpcb.0c11496] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/25/2023]
Abstract
DNA-protein interactions regulate several biophysical functions, yet the mechanism of only a few has been investigated in molecular detail. An important example is the intercalation of transcription factor proteins into DNA, which produces bent and kinked DNA. Here, we have studied the molecular mechanism of the intercalation of the transcription factor SOX4 into DNA, with the goal of understanding the sequence of molecular events that precede the bending and kinking of the DNA. Our long well-tempered metadynamics and molecular dynamics (MD) simulations show that the protein primarily binds to the backbone of DNA and rotates around it to form an intercalative native state. We show that although there are multiple pathways for intercalation, the deintercalation pathway matches the most probable intercalation pathway. In both cases, bending and kinking happen simultaneously, driven by the onset of the intercalation of the amino acid.
Affiliation(s)
- Reman Kumar Singh
- Department of Chemistry, Indian Institute of Science Education and Research, Pune 411008, India
- Arnab Mukherjee
- Department of Chemistry, Indian Institute of Science Education and Research, Pune 411008, India
9
He W, Chen YL, Pollack L, Kirmizialtin S. The structural plasticity of nucleic acid duplexes revealed by WAXS and MD. Science Advances 2021; 7:eabf6106. [PMID: 33893104 PMCID: PMC8064643 DOI: 10.1126/sciadv.abf6106] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 11/06/2020] [Accepted: 03/05/2021] [Indexed: 05/06/2023]
Abstract
Double-stranded DNA (dsDNA) and RNA (dsRNA) helices display an unusual structural diversity. Some structural variations are linked to sequence and may serve as signaling units for protein-binding partners. Therefore, elucidating the mechanisms and factors that modulate these variations is of fundamental importance. While the structural diversity of dsDNA has been extensively studied, similar studies have not been performed for dsRNA. Because of the increasing awareness of RNA's diverse biological roles, such studies are timely and increasingly important. We integrate solution x-ray scattering at wide angles (WAXS) with all-atom molecular dynamics simulations to explore the conformational ensemble of duplex topologies for different sequences and salt conditions. These tightly coordinated studies identify robust correlations between features in the WAXS profiles and duplex geometry and enable atomic-level insights into the structural diversity of DNA and RNA duplexes. Notably, dsRNA displays a marked sensitivity to the valence and identity of its associated cations.
Affiliation(s)
- Weiwei He
- Chemistry Program, Science Division, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
- Department of Chemistry, New York University, New York, NY, USA
- Yen-Lin Chen
- School of Applied and Engineering Physics, Cornell University, Ithaca, NY, USA
- Lois Pollack
- School of Applied and Engineering Physics, Cornell University, Ithaca, NY, USA.
- Serdal Kirmizialtin
- Chemistry Program, Science Division, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates.
10
Walker CR, Scally A, De Maio N, Goldman N. Short-range template switching in great ape genomes explored using pair hidden Markov models. PLoS Genet 2021; 17:e1009221. [PMID: 33651813 PMCID: PMC7954356 DOI: 10.1371/journal.pgen.1009221] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 11/12/2020] [Revised: 03/12/2021] [Accepted: 02/10/2021] [Indexed: 12/14/2022] Open
Abstract
Many complex genomic rearrangements arise through template switch errors, which occur in DNA replication when there is a transient polymerase switch to an alternate template nearby in three-dimensional space. While typically investigated at kilobase-to-megabase scales, the genomic and evolutionary consequences of this mutational process are not well characterised at smaller scales, where they are often interpreted as clusters of independent substitutions, insertions and deletions. Here we present an improved statistical approach using pair hidden Markov models, and use it to detect and describe short-range template switches underlying clusters of mutations in the multi-way alignment of hominid genomes. Using robust statistics derived from evolutionary genomic simulations, we show that template switch events have been widespread in the evolution of the great apes’ genomes and provide a parsimonious explanation for the presence of many complex mutation clusters in their phylogenetic context. Larger-scale mechanisms of genome rearrangement are typically associated with structural features around breakpoints, and accordingly we show that atypical patterns of secondary structure formation and DNA bending are present at the initial template switch loci. Our methods improve on previous non-probabilistic approaches for computational detection of template switch mutations, allowing the statistical significance of events to be assessed. By specifying realistic evolutionary parameters based on the genomes and taxa involved, our methods can be readily adapted to other intra- or inter-species comparisons.
DNA replication is an imperfect process which causes the mutations that give rise to genetic diversity during the evolution of genomes. While many mutations are independent, single-nucleotide substitutions or small insertions and deletions, some mutations arise as nonindependent clusters of substitutions and larger scale chromosomal rearrangements. Large-scale rearrangements (also called structural variants) in particular can have a profound impact on genome evolution and contribute to both germline and somatic disease in humans. The replication-based mechanisms underlying structural variation typically involve a polymerase switch event in which a large segment of DNA is copied using a template from an alternate location in the genome. Methods for identifying these template switch mutations lack the power to detect smaller scale rearrangements which can arise through the same replication-based pathways. Here we outline a model which can detect and assess the statistical significance of such small-scale template switches within their evolutionary context. We show that these events are widespread in the evolution of great apes and that the genomic features associated with these small-scale rearrangements are similar to those of large-scale structural variants.
Affiliation(s)
- Conor R. Walker
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom
- Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Aylwyn Scally
- Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Nicola De Maio
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom
- Nick Goldman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom
11
Lara-Gonzalez S, Dantas Machado AC, Rao S, Napoli AA, Birktoft J, Di Felice R, Rohs R, Lawson CL. The RNA Polymerase α Subunit Recognizes the DNA Shape of the Upstream Promoter Element. Biochemistry 2020; 59:4523-4532. [PMID: 33205945 DOI: 10.1021/acs.biochem.0c00571] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 12/18/2022]
Abstract
We demonstrate here that the α subunit C-terminal domain of Escherichia coli RNA polymerase (αCTD) recognizes the upstream promoter (UP) DNA element via its characteristic minor groove shape and electrostatic potential. In two compositionally distinct crystallized assemblies, a pair of αCTD subunits bind in tandem to the UP element consensus A-tract that is 6 bp in length (A6-tract), each with their arginine 265 guanidinium group inserted into the minor groove. The A6-tract minor groove is significantly narrowed in these crystal structures, as well as in computationally predicted structures of free and bound DNA duplexes derived by Monte Carlo and molecular dynamics simulations, respectively. The negative electrostatic potential of free A6-tract DNA is substantially enhanced compared to that of generic DNA. Shortening the A-tract by 1 bp is shown to "knock out" binding of the second αCTD through widening of the minor groove. Furthermore, in computationally derived structures with arginine 265 mutated to alanine in either αCTD, either with or without the "knockout" DNA mutation, contact with the DNA is perturbed, highlighting the importance of arginine 265 in achieving αCTD-DNA binding. These results demonstrate that the importance of the DNA shape in sequence-dependent recognition of DNA by RNA polymerase is comparable to that of certain transcription factors.
Affiliation(s)
- Samuel Lara-Gonzalez
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, 610 Taylor Road, Piscataway, New Jersey 08854, United States
- Ana Carolina Dantas Machado
- Quantitative and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, California 90089, United States
- Satyanarayan Rao
- Quantitative and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, California 90089, United States
- Andrew A Napoli
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, 610 Taylor Road, Piscataway, New Jersey 08854, United States
- Jens Birktoft
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, 610 Taylor Road, Piscataway, New Jersey 08854, United States
- Rosa Di Felice
- Quantitative and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, California 90089, United States; Department of Physics and Astronomy, University of Southern California, Los Angeles, California 90089, United States; CNR-NANO Modena, Via Campi 213/A, 41125 Modena, Italy
- Remo Rohs
- Quantitative and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, California 90089, United States; Department of Physics and Astronomy, University of Southern California, Los Angeles, California 90089, United States; Department of Chemistry, University of Southern California, Los Angeles, California 90089, United States; Department of Computer Science, University of Southern California, Los Angeles, California 90089, United States
- Catherine L Lawson
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, 610 Taylor Road, Piscataway, New Jersey 08854, United States; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, New Jersey 08854, United States
12
Epigenetic competition reveals density-dependent regulation and target site plasticity of phosphorothioate epigenetics in bacteria. Proc Natl Acad Sci U S A 2020; 117:14322-14330. [PMID: 32518115 DOI: 10.1073/pnas.2002933117] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 01/16/2023] Open
Abstract
Phosphorothioate (PT) DNA modifications, in which a nonbonding phosphate oxygen is replaced with sulfur, represent a widespread, horizontally transferred epigenetic system in prokaryotes and have a highly unusual property of occupying only a small fraction of available consensus sequences in a genome. Using Salmonella enterica as a model, we asked a question of fundamental importance: How do the PT-modifying DndA-E proteins select their GPSAAC/GPSTTC targets? Here, we applied innovative analytical, sequencing, and computational tools to discover a novel behavior for DNA-binding proteins: The Dnd proteins are "parked" at the G6mATC Dam methyltransferase consensus sequence instead of the expected GAAC/GTTC motif, with removal of the 6mA permitting extensive PT modification of GATC sites. This shift in modification sites further revealed a surprising constancy in the density of PT modifications across the genome. Computational analysis showed that GAAC, GTTC, and GATC share common features of DNA shape, which suggests that PT epigenetics are regulated in a density-dependent manner partly by DNA shape-driven target selection in the genome.
13
Chiu TP, Xin B, Markarian N, Wang Y, Rohs R. TFBSshape: an expanded motif database for DNA shape features of transcription factor binding sites. Nucleic Acids Res 2020; 48:D246-D255. [PMID: 31665425 PMCID: PMC7145579 DOI: 10.1093/nar/gkz970] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 09/15/2019] [Revised: 10/08/2019] [Accepted: 10/11/2019] [Indexed: 12/31/2022] Open
Abstract
TFBSshape (https://tfbsshape.usc.edu) is a motif database for analyzing structural profiles of transcription factor binding sites (TFBSs). The main rationale for this database is to be able to derive mechanistic insights in protein-DNA readout modes from sequencing data without available structures. We extended the quantity and dimensionality of TFBSshape, from mostly in vitro to in vivo binding and from unmethylated to methylated DNA. This new release of TFBSshape improves its functionality and launches a responsive and user-friendly web interface for easy access to the data. The current expansion includes new entries from the most recent collections of transcription factors (TFs) from the JASPAR and UniPROBE databases, methylated TFBSs derived from in vitro high-throughput EpiSELEX-seq binding assays and in vivo methylated TFBSs from the MeDReaders database. TFBSshape content has increased to 2428 structural profiles for 1900 TFs from 39 different species. The structural profiles for each TFBS entry now include 13 shape features and minor groove electrostatic potential for standard DNA and four shape features for methylated DNA. We improved the flexibility and accuracy for the shape-based alignment of TFBSs and designed new tools to compare methylated and unmethylated structural profiles of TFs and methods to derive DNA shape-preserving nucleotide mutations in TFBSs.
Affiliation(s)
- Tsu-Pei Chiu
- Quantitative and Computational Biology, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
- Beibei Xin
- Quantitative and Computational Biology, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
- Nicholas Markarian
- Quantitative and Computational Biology, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
- Yingfei Wang
- Quantitative and Computational Biology, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
- Remo Rohs
- Quantitative and Computational Biology, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
14
Ranganathan S, Cheung J, Cassidy M, Ginter C, Pata JD, McDonough KA. Novel structural features drive DNA binding properties of Cmr, a CRP family protein in TB complex mycobacteria. Nucleic Acids Res 2018; 46:403-420. [PMID: 29165665 PMCID: PMC5758884 DOI: 10.1093/nar/gkx1148] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 07/01/2017] [Accepted: 11/13/2017] [Indexed: 11/16/2022] Open
Abstract
Mycobacterium tuberculosis (Mtb) encodes two CRP/FNR family transcription factors (TF) that contribute to virulence, Cmr (Rv1675c) and CRPMt (Rv3676). Prior studies identified distinct chromosomal binding profiles for each TF despite their recognizing overlapping DNA motifs. The present study shows that Cmr binding specificity is determined by discriminator nucleotides at motif positions 4 and 13. X-ray crystallography and targeted mutational analyses identified an arginine-rich loop that expands Cmr’s DNA interactions beyond the classical helix-turn-helix contacts common to all CRP/FNR family members and facilitates binding to imperfect DNA sequences. Cmr binding to DNA results in a pronounced asymmetric bending of the DNA and its high level of cooperativity is consistent with DNA-facilitated dimerization. A unique N-terminal extension inserts between the DNA binding and dimerization domains, partially occluding the site where the canonical cAMP binding pocket is found. However, an unstructured region of this N-terminus may help modulate Cmr activity in response to cellular signals. Cmr’s multiple levels of DNA interaction likely enhance its ability to integrate diverse gene regulatory signals, while its novel structural features establish Cmr as an atypical CRP/FNR family member.
Affiliation(s)
- Sridevi Ranganathan
- Department of Biomedical Sciences, School of Public Health, University at Albany, SUNY, Albany, NY 12201, USA
- Jonah Cheung
- New York Structural Biology Center, New York, NY 10027, USA
- Janice D Pata
- Department of Biomedical Sciences, School of Public Health, University at Albany, SUNY, Albany, NY 12201, USA; Wadsworth Center, New York State Department of Health, 120 New Scotland Avenue, PO Box 22002, Albany, NY 12201-2002, USA
- Kathleen A McDonough
- Department of Biomedical Sciences, School of Public Health, University at Albany, SUNY, Albany, NY 12201, USA; Wadsworth Center, New York State Department of Health, 120 New Scotland Avenue, PO Box 22002, Albany, NY 12201-2002, USA
15
Malkowska M, Zubek J, Plewczynski D, Wyrwicz LS. ShapeGTB: the role of local DNA shape in prioritization of functional variants in human promoters with machine learning. PeerJ 2018; 6:e5742. [PMID: 30519505 PMCID: PMC6275119 DOI: 10.7717/peerj.5742] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/10/2017] [Accepted: 09/13/2018] [Indexed: 02/01/2023] Open
Abstract
Motivation: The identification of functional sequence variations in regulatory DNA regions is one of the major challenges of modern genetics. Here, we report results of a combined multifactor analysis of properties characterizing functional sequence variants located in promoter regions of genes. Results: We demonstrate that the GC-content of local sequence fragments and local DNA shape features play a significant role in the prioritization of functional variants and outscore features related to histone modifications, transcription factor binding sites, or evolutionary conservation descriptors. These observations allowed us to build a specialized machine learning classifier, ShapeGTB, that identifies functional single nucleotide polymorphisms within promoter regions. We compared our method with more general tools predicting pathogenicity of all non-coding variants. ShapeGTB outperformed them by a wide margin (average precision 0.93 vs. 0.47–0.55). On the external validation set based on the ClinVar database it displayed worse performance but was still competitive with other methods (average precision 0.47 vs. 0.23–0.42). Such results suggest unique characteristics of mutations located within promoter regions and are a promising signal for the development of more accurate variant prioritization tools in the future.
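The average-precision figures quoted above can be read against the usual definition of the metric. The following is a minimal sketch of non-interpolated average precision over a ranked list of labels. The abstract does not state which variant the authors computed, and the function name and toy labels are assumptions made purely for illustration.

```python
from typing import Sequence


def average_precision(labels_ranked: Sequence[int]) -> float:
    """Non-interpolated average precision.

    labels_ranked holds 1 (functional) or 0 (non-functional) labels,
    already sorted by decreasing classifier score.
    """
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(labels_ranked, start=1):
        if label:
            hits += 1
            # Precision at this rank, counted only where a positive occurs.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Hypothetical ranking: positives at ranks 1, 3 and 4.
toy_ap = average_precision([1, 0, 1, 1, 0])
```

A value near 1 means that positives are concentrated at the top of the ranking, so a gap such as 0.93 versus 0.47–0.55 reflects a substantially better ordering of candidate variants.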
Affiliation(s)
- Maja Malkowska
- Laboratory of Bioinformatics and Biostatistics, Maria Sklodowska-Curie Memorial Cancer Centre and Institute of Oncology, Warsaw, Poland
- Julian Zubek
- Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Warsaw, Poland
- Dariusz Plewczynski
- Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Warsaw, Poland; Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
- Lucjan S Wyrwicz
- Laboratory of Bioinformatics and Biostatistics, Maria Sklodowska-Curie Memorial Cancer Centre and Institute of Oncology, Warsaw, Poland
16
Morgunova E, Yin Y, Das PK, Jolma A, Zhu F, Popov A, Xu Y, Nilsson L, Taipale J. Two distinct DNA sequences recognized by transcription factors represent enthalpy and entropy optima. eLife 2018; 7:32963. [PMID: 29638214 PMCID: PMC5896879 DOI: 10.7554/elife.32963] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 10/20/2017] [Accepted: 02/12/2018] [Indexed: 11/17/2022] Open
Abstract
Most transcription factors (TFs) can bind to a population of sequences closely related to a single optimal site. However, some TFs can bind to two distinct sequences that represent two local optima in the Gibbs free energy of binding (ΔG). To determine the molecular mechanism behind this effect, we solved the structures of human HOXB13 and CDX2 bound to their two optimal DNA sequences, CAATAAA and TCGTAAA. Thermodynamic analyses by isothermal titration calorimetry revealed that both sites were bound with similar ΔG. However, the interaction with the CAA sequence was driven by change in enthalpy (ΔH), whereas the TCG site was bound with similar affinity due to smaller loss of entropy (ΔS). This thermodynamic mechanism that leads to at least two local optima likely affects many macromolecular interactions, as ΔG depends on two partially independent variables ΔH and ΔS according to the central equation of thermodynamics, ΔG = ΔH - TΔS.

Genes are sections of DNA that carry the instructions needed to build other molecules including all the proteins that the cell needs to fulfill its role. The information in the DNA is stored as a code consisting of four chemical bases, often referred to simply as “A”, “C”, “G” and “T”. The order or sequence of these bases determines the role of a protein. Many organisms – including humans – are built of many different types of cells that perform unique roles. Almost all cells carry the same genetic information, but proteins called transcription factors can regulate the activity of genes so that only a relevant subset of genes is switched on at a particular time. Transcription factors glide along DNA and bind to short DNA sequences by attaching to the DNA bases directly or through bridges made up of water molecules. Two physical concepts known as enthalpy and entropy determine the strength of the connection.
Enthalpy relates to how strong the chemical bonds that form between the transcription factors and the DNA bases are, compared to a situation where the transcription factor and DNA do not form a complex and bind to water molecules around them. Entropy measures the disorder of the system – the more disordered the solvent and protein-DNA complex are compared to solvent-containing free DNA and protein, the stronger the binding. A water molecule that bridges a DNA base with an amino-acid of a protein contributes to enthalpy, but results in loss of entropy, because the system becomes more ordered since the water molecule can no longer move freely. Most transcription factors can only bind to DNA sequences that are very similar to each other, but some transcription factors can recognize several different kinds of sequences, and until now it was not clear how they could do this. Morgunova et al. studied four different human transcription factors that can each bind to two distinct DNA sequences. The results showed that the transcription factors bound to both DNA sequences with similar strength, but via different mechanisms. For one DNA sequence, an enthalpy-based mechanism essentially ‘froze’ the transcription factor to the DNA through rigid water bridges. The other DNA sequence was bound equally strongly but through moving water molecules, because this increased the entropy of the system. It is possible that these mechanisms could also apply to many other molecules that interact with each other through water-molecule bridges. A better knowledge of the chemical bonds between transcription factors and DNA bases may in future help efforts to develop new treatments that depend on molecules being able to bind to other molecules. In addition, these findings may one day help scientists to predict how strongly two molecules will interact simply by knowing the structures of the molecules involved.
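As a worked illustration of how two different sites can reach a similar ΔG, consider the relation ΔG = ΔH − TΔS with hypothetical numbers. These values are chosen only to make the arithmetic clear and are not taken from the calorimetry data in the paper.

```latex
\begin{align*}
\text{enthalpy-driven site:} \quad
  \Delta G &= \Delta H - T\Delta S = -60 - (-20) = -40~\text{kJ/mol} \\
\text{entropy-favoured site:} \quad
  \Delta G &= \Delta H - T\Delta S = -45 - (-5) = -40~\text{kJ/mol}
\end{align*}
```

In this toy case both sites lose entropy on binding (TΔS < 0), but the second loses less, which offsets its weaker enthalpy and yields the same ΔG.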
Affiliation(s)
- Ekaterina Morgunova
- Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden
- Yimeng Yin
- Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden
- Pratyush K Das
- Genome-Scale Biology Research Program, University of Helsinki, Helsinki, Finland
- Arttu Jolma
- Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden
- Fangjie Zhu
- Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden
- You Xu
- Department of Bioscience and Nutrition, Karolinska Institutet, Huddinge, Sweden
- Lennart Nilsson
- Department of Bioscience and Nutrition, Karolinska Institutet, Huddinge, Sweden
- Jussi Taipale
- Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden; Genome-Scale Biology Research Program, University of Helsinki, Helsinki, Finland; Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
17
Rao S, Chiu TP, Kribelbauer JF, Mann RS, Bussemaker HJ, Rohs R. Systematic prediction of DNA shape changes due to CpG methylation explains epigenetic effects on protein-DNA binding. Epigenetics Chromatin 2018; 11:6. [PMID: 29409522 PMCID: PMC5800008 DOI: 10.1186/s13072-018-0174-4] [Citation(s) in RCA: 54] [Impact Index Per Article: 9.0] [Received: 07/29/2017] [Accepted: 01/15/2018] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND: DNA shape analysis has demonstrated the potential to reveal structure-based mechanisms of protein-DNA binding. However, information about the influence of chemical modification of DNA is limited. Cytosine methylation, the most frequent modification, represents the addition of a methyl group at the major groove edge of the cytosine base. In mammalian genomes, cytosine methylation most frequently occurs at CpG dinucleotides. In addition to changing the chemical signature of C/G base pairs, cytosine methylation can affect DNA structure. Since the original discovery of DNA methylation, major efforts have been made to understand its effect from a sequence perspective. Compared to unmethylated DNA, however, little structural information is available for methylated DNA, due to the limited number of experimentally determined structures. To achieve a better mechanistic understanding of the effect of CpG methylation on local DNA structure, we developed a high-throughput method, methyl-DNAshape, for predicting the effect of cytosine methylation on DNA shape. RESULTS: Using our new method, we found that CpG methylation significantly altered local DNA shape. Four DNA shape features (helix twist, minor groove width, propeller twist, and roll) were considered in this analysis. Distinct distributions of effect size were observed for different features. Roll and propeller twist were the DNA shape features most strongly affected by CpG methylation with an effect size depending on the local sequence context. Methylation-induced changes in DNA shape were predictive of the measured rate of cleavage by DNase I and suggest a possible mechanism for some of the methylation sensitivities that were recently observed for human Pbx-Hox complexes. CONCLUSIONS: CpG methylation is an important epigenetic mark in the mammalian genome. Understanding its role in protein-DNA recognition can further our knowledge of gene regulation. Our high-throughput methyl-DNAshape method can be used to predict the effect of cytosine methylation on DNA shape and its subsequent influence on protein-DNA interactions. This approach overcomes the limited availability of experimental DNA structures that contain 5-methylcytosine.
Affiliation(s)
- Satyanarayan Rao
- Computational Biology and Bioinformatics Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA, 90089, USA
- Tsu-Pei Chiu
- Computational Biology and Bioinformatics Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA, 90089, USA
- Judith F Kribelbauer
- Department of Biological Sciences, Columbia University, New York, NY, 10027, USA; Department of Systems Biology, Columbia University, New York, NY, 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, 10032, USA
- Richard S Mann
- Department of Systems Biology, Columbia University, New York, NY, 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, 10032, USA; Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, 10027, USA; Department of Neuroscience, Columbia University, New York, NY, 10027, USA
- Harmen J Bussemaker
- Department of Biological Sciences, Columbia University, New York, NY, 10027, USA; Department of Systems Biology, Columbia University, New York, NY, 10032, USA
- Remo Rohs
- Computational Biology and Bioinformatics Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA, 90089, USA; Department of Chemistry, University of Southern California, Los Angeles, CA, 90089, USA; Department of Physics & Astronomy, University of Southern California, Los Angeles, CA, 90089, USA; Department of Computer Science, University of Southern California, Los Angeles, CA, 90089, USA
18
Li J, Sagendorf JM, Chiu TP, Pasi M, Perez A, Rohs R. Expanding the repertoire of DNA shape features for genome-scale studies of transcription factor binding. Nucleic Acids Res 2017; 45:12877-12887. [PMID: 29165643 PMCID: PMC5728407 DOI: 10.1093/nar/gkx1145] [Citation(s) in RCA: 62] [Impact Index Per Article: 10.3] [Received: 03/21/2017] [Accepted: 10/30/2017] [Indexed: 12/18/2022] Open
Abstract
Uncovering the mechanisms that affect the binding specificity of transcription factors (TFs) is critical for understanding the principles of gene regulation. Although sequence-based models have been used successfully to predict TF binding specificities, we found that including DNA shape information in these models improved their accuracy and interpretability. Previously, we developed a method for modeling DNA binding specificities based on DNA shape features extracted from Monte Carlo (MC) simulations. Prediction accuracies of our models, however, have not yet been compared to accuracies of models incorporating DNA shape information extracted from X-ray crystallography (XRC) data or Molecular Dynamics (MD) simulations. Here, we integrated DNA shape information extracted from MC or MD simulations and XRC data into predictive models of TF binding and compared their performance. Models that incorporated structural information consistently showed improved performance over sequence-based models regardless of data source. Furthermore, we derived and validated nine additional DNA shape features beyond our original set of four features. The expanded repertoire of 13 distinct DNA shape features, including six intra-base pair and six inter-base pair parameters and minor groove width, is available in our R/Bioconductor package DNAshapeR and enables a comprehensive structural description of the double helix on a genome-wide scale.
Affiliation(s)
- Jinsen Li
- Computational Biology and Bioinformatics Program, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
- Jared M Sagendorf
- Computational Biology and Bioinformatics Program, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
- Tsu-Pei Chiu
- Computational Biology and Bioinformatics Program, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
- Marco Pasi
- Centre for Biomolecular Sciences and School of Pharmacy, University of Nottingham, Nottingham NG7 2RD, UK
- Alberto Perez
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794, USA
- Remo Rohs
- Computational Biology and Bioinformatics Program, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
19
Chiu TP, Rao S, Mann RS, Honig B, Rohs R. Genome-wide prediction of minor-groove electrostatic potential enables biophysical modeling of protein-DNA binding. Nucleic Acids Res 2017; 45:12565-12576. [PMID: 29040720 PMCID: PMC5716191 DOI: 10.1093/nar/gkx915] [Citation(s) in RCA: 51] [Impact Index Per Article: 7.3] [Received: 03/01/2017] [Accepted: 09/28/2017] [Indexed: 12/16/2022] Open
Abstract
Protein–DNA binding is a fundamental component of gene regulatory processes, but it is still not completely understood how proteins recognize their target sites in the genome. Besides hydrogen bonding in the major groove (base readout), proteins recognize minor-groove geometry using positively charged amino acids (shape readout). The underlying mechanism of DNA shape readout involves the correlation between minor-groove width and electrostatic potential (EP). To probe this biophysical effect directly, rather than using minor-groove width as an indirect measure for shape readout, we developed a methodology, DNAphi, for predicting EP in the minor groove and confirmed the direct role of EP in protein–DNA binding using massive sequencing data. The DNAphi method uses a sliding-window approach to mine results from non-linear Poisson–Boltzmann (NLPB) calculations on DNA structures derived from all-atom Monte Carlo simulations. We validated this approach, which only requires nucleotide sequence as input, based on direct comparison with NLPB calculations for available crystal structures. Using statistical machine-learning approaches, we showed that adding EP as a biophysical feature can improve the predictive power of quantitative binding specificity models across 27 transcription factor families. High-throughput prediction of EP offers a novel way to integrate biophysical and genomic studies of protein–DNA binding.
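The sliding-window idea mentioned above can be sketched in a few lines of Python. This is a generic illustration, not the DNAphi implementation: the window length of five, the function name, and the table values are all assumptions, and a real lookup table would have to come from the NLPB calculations described in the abstract.

```python
from typing import Dict, List, Optional


def sliding_window_lookup(
    seq: str, table: Dict[str, float], k: int = 5
) -> List[Optional[float]]:
    """Assign to each position the table value of the k-mer centred on it.

    k is assumed to be odd. Positions without a full window, and k-mers
    missing from the table, are reported as None.
    """
    seq = seq.upper()
    half = k // 2
    values: List[Optional[float]] = [None] * len(seq)
    for i in range(half, len(seq) - half):
        kmer = seq[i - half : i + half + 1]
        values[i] = table.get(kmer)
    return values


# Hypothetical placeholder values, NOT real electrostatic potentials.
toy_table = {"AATTG": -1.0, "ATTGC": -2.0}
profile = sliding_window_lookup("AATTGC", toy_table)
```

Because the function needs only a nucleotide string and a precomputed table, it scales linearly with sequence length, which is what makes this style of prediction practical genome-wide.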
Affiliation(s)
- Tsu-Pei Chiu
- Computational Biology and Bioinformatics Program, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
- Satyanarayan Rao
- Computational Biology and Bioinformatics Program, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
- Richard S Mann
- Departments of Systems Biology and Biochemistry & Molecular Biophysics, Mortimer B. Zuckerman Institute, Columbia University, New York, NY 10032, USA
- Barry Honig
- Departments of Systems Biology and Biochemistry & Molecular Biophysics, Mortimer B. Zuckerman Institute, Columbia University, New York, NY 10032, USA; Howard Hughes Medical Institute, New York, NY 10032, USA
- Remo Rohs
- Computational Biology and Bioinformatics Program, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
20
Li J, Dantas Machado AC, Guo M, Sagendorf JM, Zhou Z, Jiang L, Chen X, Wu D, Qu L, Chen Z, Chen L, Rohs R, Chen Y. Structure of the Forkhead Domain of FOXA2 Bound to a Complete DNA Consensus Site. Biochemistry 2017. [PMID: 28644006 DOI: 10.1021/acs.biochem.7b00211] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Indexed: 11/28/2022]
Abstract
FOXA2, a member of the forkhead family of transcription factors, plays essential roles in liver development and bile acid homeostasis. In this study, we report a 2.8 Å co-crystal structure of the FOXA2 DNA-binding domain (FOXA2-DBD) bound to a DNA duplex containing a forkhead consensus binding site (GTAAACA). The FOXA2-DBD adopts the canonical winged-helix fold, with helix H3 and wing 1 regions mainly mediating the DNA recognition. Although the wing 2 region was not defined in the structure, isothermal titration calorimetry assays suggested that this region was required for optimal DNA binding. Structure comparison with the FOXA3-DBD bound to DNA revealed more major groove contacts and fewer minor groove contacts in the FOXA2 structure than in the FOXA3 structure. Structure comparison with the FOXO1-DBD bound to DNA showed that different forkhead proteins could induce different DNA conformations upon binding to identical DNA sequences. Our findings provide the structural basis for FOXA2 protein binding to a consensus forkhead site and elucidate how members of the forkhead protein family bind different DNA sites.
Affiliation(s)
- Jun Li
- Key Laboratory of Cancer Proteomics of Chinese Ministry of Health and Laboratory of Structural Biology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China; State Key Laboratory of Medical Genetics and College of Life Science, Central South University, Changsha, Hunan 410008, China
- Ana Carolina Dantas Machado
- Molecular and Computational Biology Program, Department of Biological Sciences and Department of Chemistry, University of Southern California, Los Angeles, California 90089, United States; Department of Physics and Astronomy and Department of Computer Science, University of Southern California, Los Angeles, California 90089, United States
- Ming Guo
- Key Laboratory of Cancer Proteomics of Chinese Ministry of Health and Laboratory of Structural Biology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Jared M Sagendorf
- Molecular and Computational Biology Program, Department of Biological Sciences and Department of Chemistry, University of Southern California, Los Angeles, California 90089, United States; Department of Physics and Astronomy and Department of Computer Science, University of Southern California, Los Angeles, California 90089, United States
- Zhan Zhou
- Key Laboratory of Cancer Proteomics of Chinese Ministry of Health and Laboratory of Structural Biology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Longying Jiang
- Key Laboratory of Cancer Proteomics of Chinese Ministry of Health and Laboratory of Structural Biology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Xiaojuan Chen
- Key Laboratory of Cancer Proteomics of Chinese Ministry of Health and Laboratory of Structural Biology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China; State Key Laboratory of Medical Genetics and College of Life Science, Central South University, Changsha, Hunan 410008, China
- Daichao Wu
- Key Laboratory of Cancer Proteomics of Chinese Ministry of Health and Laboratory of Structural Biology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Lingzhi Qu
- Key Laboratory of Cancer Proteomics of Chinese Ministry of Health and Laboratory of Structural Biology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Zhuchu Chen
- Key Laboratory of Cancer Proteomics of Chinese Ministry of Health and Laboratory of Structural Biology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Lin Chen
- Key Laboratory of Cancer Proteomics of Chinese Ministry of Health and Laboratory of Structural Biology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China; Molecular and Computational Biology Program, Department of Biological Sciences and Department of Chemistry, University of Southern California, Los Angeles, California 90089, United States
- Remo Rohs
- Molecular and Computational Biology Program, Department of Biological Sciences and Department of Chemistry, University of Southern California, Los Angeles, California 90089, United States; Department of Physics and Astronomy and Department of Computer Science, University of Southern California, Los Angeles, California 90089, United States
- Yongheng Chen
- Key Laboratory of Cancer Proteomics of Chinese Ministry of Health and Laboratory of Structural Biology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China; State Key Laboratory of Medical Genetics and College of Life Science, Central South University, Changsha, Hunan 410008, China; Collaborative Innovation Center for Cancer Medicine, Guangzhou, Guangdong 510060, China
21
Chen C, Pettitt BM. DNA Shape versus Sequence Variations in the Protein Binding Process. Biophys J 2016; 110:534-544. [PMID: 26840719 DOI: 10.1016/j.bpj.2015.11.3527] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 08/24/2015] [Revised: 10/15/2015] [Accepted: 11/02/2015] [Indexed: 01/02/2023] Open
Abstract
The binding process of a protein with DNA involves three stages: approach, encounter, and association. It has been known that the complexation of protein and DNA involves mutual conformational changes, especially for a specific sequence association. However, it is still unclear how the conformation and the information in the DNA sequences affect the binding process. What is the extent to which the DNA structure adopted in the complex is induced by protein binding, or is instead intrinsic to the DNA sequence? In this study, we used the multiscale simulation method to explore the binding process of a protein with DNA in terms of DNA sequence, conformation, and interactions. We found that in the approach stage the protein can bind both the major and minor groove of the DNA, but uses different features to locate the binding site. The intrinsic conformational properties of the DNA play a significant role in this binding stage. By comparing the specific DNA with the nonspecific in unbound, intermediate, and associated states, we found that for a specific DNA sequence, ∼40% of the bending in the association forms is intrinsic and that ∼60% is induced by the protein. The protein does not induce appreciable bending of nonspecific DNA. In addition, we proposed that the DNA shape variations induced by protein binding are required in the early stage of the binding process, so that the protein is able to approach, encounter, and form an intermediate at the correct site on DNA.
Affiliation(s)
- Chuanying Chen
- Department of Biochemistry and Molecular Biology, Sealy Center for Structural Biology and Molecular Biophysics, University of Texas Medical Branch, Galveston, Texas
- B Montgomery Pettitt
- Department of Biochemistry and Molecular Biology, Sealy Center for Structural Biology and Molecular Biophysics, University of Texas Medical Branch, Galveston, Texas
22
Kachhap S, Priyadarshini P, Singh B. Molecular dynamics simulations show altered secondary structure of clawless in binary complex with DNA providing insights into aristaless-clawless-DNA ternary complex formation. J Biomol Struct Dyn 2016; 35:1153-1167. [PMID: 27058822 DOI: 10.1080/07391102.2016.1175967] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
Abstract
Aristaless (Al) and clawless (Cll) homeodomains that are involved in leg development in Drosophila melanogaster are known to bind cooperatively to the 5'-(T/C)TAATTAA(T/A)(T/A)G-3' DNA sequence, but the mechanism of their binding to DNA is unknown. Molecular dynamics (MD) studies have been carried out on binary, ternary, and reconstructed protein-DNA complexes involving Al, Cll, and DNA along with binding free energy analysis of these complexes. Analysis of MD trajectories of the Cll-3A01 binary complex reveals that the C-terminal end of helix III of Cll unwinds in the absence of Al and remains so in the reconstructed ternary complex, Cll-3A01-Al. In addition, this change in the secondary structure of Cll does not allow it to form protein-protein interactions with Al in the reconstructed ternary complex. However, the secondary structure of Cll and its interactions are maintained in the other reconstructed ternary complex, Al-3A01-Cll, in which Cll binds to the Al-3A01 binary complex to form the ternary complex. These interactions, as observed during MD simulations, compare well with those observed in the ternary crystal structure. Thus, this study highlights the role of helix III of Cll and of protein-protein interactions while proposing a likely mechanism of recognition in the ternary complex, Al-Cll-DNA.
Affiliation(s)
- Sangita Kachhap
- Bioinformatics Centre, Council of Scientific & Industrial Research - Institute of Microbial Technology, Sector 39A, Chandigarh, India
- Pragya Priyadarshini
- Bioinformatics Centre, Council of Scientific & Industrial Research - Institute of Microbial Technology, Sector 39A, Chandigarh, India
- Balvinder Singh
- Bioinformatics Centre, Council of Scientific & Industrial Research - Institute of Microbial Technology, Sector 39A, Chandigarh, India
23
|
Gu C, Zhang J, Yang YI, Chen X, Ge H, Sun Y, Su X, Yang L, Xie S, Gao YQ. DNA Structural Correlation in Short and Long Ranges. J Phys Chem B 2015; 119:13980-90. [PMID: 26439165 DOI: 10.1021/acs.jpcb.5b06217] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Indexed: 01/09/2023]
Abstract
Recent single-molecule measurements have revealed DNA allostery in protein/DNA binding. MD simulations showed that this allosteric effect is associated with the deformation properties of DNA. In this study, we used MD simulations to further investigate the mechanism of DNA structural correlation, its dependence on DNA sequence, and the chemical modification of the bases. Besides a random sequence, poly d(AT) and poly d(GC) are also used as simpler model systems, which show different bending and twisting flexibilities. The base-stacking interactions and the methyl group on the 5-carbon site of thymine cause the local structures and flexibility to be very different for the two model systems, which in turn leads to markedly different tendencies of conformational deformation, including the long-range allosteric effects.
Affiliation(s)
- Sunney Xie
- Department of Chemistry and Chemical Biology, Harvard University , Cambridge, Massachusetts 02138, United States
24
|
Dantas Machado AC, Zhou T, Rao S, Goel P, Rastogi C, Lazarovici A, Bussemaker HJ, Rohs R. Evolving insights on how cytosine methylation affects protein-DNA binding. Brief Funct Genomics 2015; 14:61-73. [PMID: 25319759 PMCID: PMC4303714 DOI: 10.1093/bfgp/elu040] [Citation(s) in RCA: 79] [Impact Index Per Article: 8.8] [Indexed: 01/07/2023]
Abstract
Many anecdotal observations exist of a regulatory effect of DNA methylation on gene expression. However, in general, the underlying mechanisms of this effect are poorly understood. In this review, we summarize what is currently known about how this important, but mysterious, epigenetic mark impacts cellular functions. Cytosine methylation can abrogate or enhance interactions with DNA-binding proteins, or it may have no effect, depending on the context. Despite being only a small chemical change, the addition of a methyl group to cytosine can affect base readout via hydrophobic contacts in the major groove and shape readout via electrostatic contacts in the minor groove. We discuss the recent discovery that CpG methylation increases DNase I cleavage at adjacent positions by an order of magnitude through altering the local 3D DNA shape and the possible implications of this structural insight for understanding the methylation sensitivity of transcription factors (TFs). Additionally, 5-methylcytosines change the stability of nucleosomes and, thus, affect the local chromatin structure and access of TFs to genomic DNA. Given these complexities, it seems unlikely that the influence of DNA methylation on protein-DNA binding can be captured in a small set of general rules. Hence, data-driven approaches may be essential to gain a better understanding of these mechanisms.
25
|
Harris LA, Williams LD, Koudelka GB. Specific minor groove solvation is a crucial determinant of DNA binding site recognition. Nucleic Acids Res 2014; 42:14053-9. [PMID: 25429976 PMCID: PMC4267663 DOI: 10.1093/nar/gku1259] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 12/18/2022]
Abstract
The DNA sequence preferences of nearly all sequence specific DNA binding proteins are influenced by the identities of bases that are not directly contacted by protein. Discrimination between non-contacted base sequences is commonly based on the differential abilities of DNA sequences to allow narrowing of the DNA minor groove. However, the factors that govern the propensity of minor groove narrowing are not completely understood. Here we show that the differential abilities of various DNA sequences to support formation of a highly ordered and stable minor groove solvation network are a key determinant of non-contacted base recognition by a sequence-specific binding protein. In addition, disrupting the solvent network in the non-contacted region of the binding site alters the protein's ability to recognize contacted base sequences at positions 5–6 bases away. This observation suggests that DNA solvent interactions link contacted and non-contacted base recognition by the protein.
Affiliation(s)
- Lydia-Ann Harris
- Department of Biological Sciences, 607 Cooke Hall, University at Buffalo, Buffalo, NY 14260, USA
- Loren Dean Williams
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA 30332-0400, USA
- Gerald B Koudelka
- Department of Biological Sciences, 607 Cooke Hall, University at Buffalo, Buffalo, NY 14260, USA
26
|
Chiu TP, Yang L, Zhou T, Main BJ, Parker SCJ, Nuzhdin SV, Tullius TD, Rohs R. GBshape: a genome browser database for DNA shape annotations. Nucleic Acids Res 2014; 43:D103-9. [PMID: 25326329 PMCID: PMC4384032 DOI: 10.1093/nar/gku977] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Indexed: 01/11/2023]
Abstract
Many regulatory mechanisms require a high degree of specificity in protein-DNA binding. Nucleotide sequence does not provide an answer to the question of why a protein binds only to a small subset of the many putative binding sites in the genome that share the same core motif. Whereas higher-order effects, such as chromatin accessibility, cooperativity and cofactors, have been described, DNA shape recently gained attention as another feature that fine-tunes the DNA binding specificities of some transcription factor families. Our Genome Browser for DNA shape annotations (GBshape; freely available at http://rohslab.cmb.usc.edu/GBshape/) provides minor groove width, propeller twist, roll, helix twist and hydroxyl radical cleavage predictions for the entire genomes of 94 organisms. Additional genomes can easily be added using the GBshape framework. GBshape can be used to visualize DNA shape annotations qualitatively in a genome browser track format, and to download quantitative values of DNA shape features as a function of genomic position at nucleotide resolution. As biological applications, we illustrate the periodicity of DNA shape features that are present in nucleosome-occupied sequences from human, fly and worm, and we demonstrate structural similarities between transcription start sites in the genomes of four Drosophila species.
Affiliation(s)
- Tsu-Pei Chiu
- Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
- Lin Yang
- Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
- Tianyin Zhou
- Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
- Bradley J Main
- Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
- Stephen C J Parker
- Departments of Computational Medicine and Bioinformatics and Human Genetics, University of Michigan, Ann Arbor, MI 48109, USA
- Sergey V Nuzhdin
- Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
- Thomas D Tullius
- Department of Chemistry and Program in Bioinformatics, Boston University, Boston, MA 02215, USA
- Remo Rohs
- Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA; Departments of Chemistry, Physics, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
27
|
Slattery M, Zhou T, Yang L, Dantas Machado AC, Gordân R, Rohs R. Absence of a simple code: how transcription factors read the genome. Trends Biochem Sci 2014; 39:381-99. [PMID: 25129887 DOI: 10.1016/j.tibs.2014.07.002] [Citation(s) in RCA: 352] [Impact Index Per Article: 35.2] [Received: 06/03/2014] [Revised: 07/11/2014] [Accepted: 07/15/2014] [Indexed: 12/21/2022]
Abstract
Transcription factors (TFs) influence cell fate by interpreting the regulatory DNA within a genome. TFs recognize DNA in a specific manner; the mechanisms underlying this specificity have been identified for many TFs based on 3D structures of protein-DNA complexes. More recently, structural views have been complemented with data from high-throughput in vitro and in vivo explorations of the DNA-binding preferences of many TFs. Together, these approaches have greatly expanded our understanding of TF-DNA interactions. However, the mechanisms by which TFs select in vivo binding sites and alter gene expression remain unclear. Recent work has highlighted the many variables that influence TF-DNA binding, while demonstrating that a biophysical understanding of these many factors will be central to understanding TF function.
Affiliation(s)
- Matthew Slattery
- Department of Biomedical Sciences, University of Minnesota Medical School, Duluth, MN 55812, USA; Developmental Biology Center, University of Minnesota, Minneapolis, MN 55455, USA.
- Tianyin Zhou
- Molecular and Computational Biology Program, Departments of Biological Sciences, Chemistry, Physics, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
- Lin Yang
- Molecular and Computational Biology Program, Departments of Biological Sciences, Chemistry, Physics, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
- Ana Carolina Dantas Machado
- Molecular and Computational Biology Program, Departments of Biological Sciences, Chemistry, Physics, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
- Raluca Gordân
- Center for Genomic and Computational Biology, Departments of Biostatistics and Bioinformatics, Computer Science, and Molecular Genetics and Microbiology, Duke University, Durham, NC 27708, USA.
- Remo Rohs
- Molecular and Computational Biology Program, Departments of Biological Sciences, Chemistry, Physics, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA.
28
|
Abstract
Nucleosomes alter gene expression by preventing transcription factors from occupying binding sites along DNA. DNA methylation can affect nucleosome positioning and so alter gene expression epigenetically (without changing DNA sequence). Conventional methods to predict nucleosome occupancy are trained on observed DNA sequence patterns or known DNA oligonucleotide structures. They are statistical and lack the physics needed to predict subtle epigenetic changes due to DNA methylation. The training-free method presented here uses physical principles and state-of-the-art all-atom force fields to predict both nucleosome occupancy along genomic sequences as well as binding to known positioning sequences. Our method calculates the energy of both nucleosomal and linear DNA of the given sequence. Based on the DNA deformation energy, we accurately predict the in vitro occupancy profile observed experimentally for a 20,000-bp genomic region as well as the experimental locations of nucleosomes along 13 well-established positioning sequence elements. DNA with all C bases methylated at the 5 position shows less variation of nucleosome binding: Strong binding is weakened and weak binding is strengthened compared with normal DNA. Methylation also alters the preference of nucleosomes for some positioning sequences but not others.
29
|
Wüstner D, Sklenar H. Atomistic Monte Carlo simulation of lipid membranes. Int J Mol Sci 2014; 15:1767-803. [PMID: 24469314 PMCID: PMC3958820 DOI: 10.3390/ijms15021767] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/28/2013] [Revised: 12/06/2013] [Accepted: 01/09/2014] [Indexed: 02/07/2023]
Abstract
Biological membranes are complex assemblies of many different molecules whose analysis demands a variety of experimental and computational approaches. In this article, we explain the challenges and advantages of atomistic Monte Carlo (MC) simulation of lipid membranes. We provide an introduction to the various move sets that are implemented in current MC methods for efficient conformational sampling of lipids and other molecules. In the second part, we demonstrate, for a concrete example, how an atomistic local-move set can be implemented for MC simulations of phospholipid monomers and bilayer patches. We use our recently devised chain breakage/closure (CBC) local move set in the bond-/torsion angle space with the constant-bond-length approximation (CBLA) for the phospholipid dipalmitoylphosphatidylcholine (DPPC). We demonstrate rapid conformational equilibration for a single DPPC molecule, as assessed by calculation of molecular energies and entropies. We also show the transition from a crystalline-like to a fluid DPPC bilayer by the CBC local-move MC method, as indicated by the electron density profile, head group orientation, area per lipid, and whole-lipid displacements. We discuss the potential of local-move MC methods in combination with molecular dynamics simulations, for example, for studying multi-component lipid membranes containing cholesterol.
Affiliation(s)
- Daniel Wüstner
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense M DK-5230, Denmark.
- Heinz Sklenar
- Theoretical Biophysics Group, Max Delbrück Center for Molecular Medicine, Robert-Rössle-Str. 10, Berlin D-13125, Germany.
30
|
Sapienza PJ, Niu T, Kurpiewski MR, Grigorescu A, Jen-Jacobson L. Thermodynamic and structural basis for relaxation of specificity in protein-DNA recognition. J Mol Biol 2014; 426:84-104. [PMID: 24041571 PMCID: PMC3928799 DOI: 10.1016/j.jmb.2013.09.005] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 06/13/2013] [Revised: 09/03/2013] [Accepted: 09/08/2013] [Indexed: 11/26/2022]
Abstract
As a novel approach to the structural and functional properties that give rise to extremely stringent sequence specificity in protein-DNA interactions, we have exploited "promiscuous" mutants of EcoRI endonuclease to study the detailed mechanism by which changes in a protein can relax specificity. The A138T promiscuous mutant protein binds more tightly to the cognate GAATTC site than does wild-type EcoRI yet displays relaxed specificity deriving from tighter binding and faster cleavage at EcoRI* sites (one incorrect base pair). AAATTC EcoRI* sites are cleaved by A138T up to 170-fold faster than by wild-type enzyme if the site is abutted by a 5'-purine-pyrimidine (5'-RY) motif. When wild-type protein binds to an EcoRI* site, it forms structurally adapted complexes with thermodynamic parameters of binding that differ markedly from those of specific complexes. By contrast, we show that A138T complexes with 5'-RY-flanked AAATTC sites are virtually indistinguishable from wild-type-specific complexes with respect to the heat capacity change upon binding (∆C°P), the change in excluded macromolecular volume upon association, and contacts to the phosphate backbone. While the preference for the 5'-RY motif implicates contacts to flanking bases as important for relaxed specificity, local effects are not sufficient to explain the large differences in ∆C°P and excluded volume, as these parameters report on global features of the complex. Our findings therefore support the view that specificity does not derive from the additive effects of individual interactions but rather from a set of cooperative events that are uniquely associated with specific recognition.
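The site definitions in this abstract (the cognate GAATTC site, EcoRI* sites with one incorrect base pair, and a 5'-RY flank) can be illustrated with a minimal sketch. The function names are hypothetical, and the sketch assumes that "abutted by a 5'-RY motif" means a purine followed by a pyrimidine immediately upstream of the site on the strand being scanned; the abstract does not spell out the exact flank geometry.

```python
COGNATE = "GAATTC"        # cognate EcoRI site, as given in the abstract
PURINES = set("AG")       # R
PYRIMIDINES = set("CT")   # Y


def mismatches(site, reference=COGNATE):
    """Count positions at which two equal-length sites differ."""
    return sum(a != b for a, b in zip(site, reference))


def star_sites(sequence):
    """Return (start, site, ry_flanked) for each 6-mer with exactly one mismatch.

    ry_flanked is True when the two bases immediately 5' of the site are a
    purine followed by a pyrimidine (an assumption about the flank position).
    """
    k = len(COGNATE)
    hits = []
    for i in range(len(sequence) - k + 1):
        site = sequence[i:i + k]
        if mismatches(site) != 1:
            continue
        flank = sequence[i - 2:i] if i >= 2 else ""
        ry_flanked = (
            len(flank) == 2
            and flank[0] in PURINES
            and flank[1] in PYRIMIDINES
        )
        hits.append((i, site, ry_flanked))
    return hits
```

Calling `star_sites` on a sequence lists candidate EcoRI* sites on the given strand only; scanning the reverse complement as well would be needed for a full search.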
Affiliation(s)
- Paul J Sapienza
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA
- Tianyi Niu
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA
- Michael R Kurpiewski
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA
- Arabela Grigorescu
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA
- Linda Jen-Jacobson
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA.
31
|
Zhang X, Dantas Machado AC, Ding Y, Chen Y, Lu Y, Duan Y, Tham KW, Chen L, Rohs R, Qin PZ. Conformations of p53 response elements in solution deduced using site-directed spin labeling and Monte Carlo sampling. Nucleic Acids Res 2013; 42:2789-97. [PMID: 24293651 PMCID: PMC3936745 DOI: 10.1093/nar/gkt1219] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Indexed: 01/16/2023]
Abstract
The tumor suppressor protein p53 regulates numerous signaling pathways by specifically recognizing diverse p53 response elements (REs). Understanding the mechanisms of p53-DNA interaction requires structural information on p53 REs. However, such information is limited as a 3D structure of any RE in the unbound form is not available yet. Here, site-directed spin labeling was used to probe the solution structures of REs involved in p53 regulation of the p21 and Bax genes. Multiple nanometer distances in the p21-RE and BAX-RE, measured using a nucleotide-independent nitroxide probe and double-electron-electron-resonance spectroscopy, were used to derive molecular models of unbound REs from pools of all-atom structures generated by Monte-Carlo simulations, thus enabling analyses to reveal sequence-dependent DNA shape features of unbound REs in solution. The data revealed distinct RE conformational changes on binding to the p53 core domain, and support the hypothesis that sequence-dependent properties encoded in REs are exploited by p53 to achieve the energetically most favorable mode of deformation, consequently enhancing binding specificity. This work reveals mechanisms of p53-DNA recognition, and establishes a new experimental/computational approach for studying DNA shape in solution that has far-reaching implications for studying protein-DNA interactions.
Affiliation(s)
- Xiaojun Zhang
- Department of Chemistry, University of Southern California, Los Angeles, CA 90089, USA and Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
32
|
Siggers T, Gordân R. Protein-DNA binding: complexities and multi-protein codes. Nucleic Acids Res 2013; 42:2099-111. [PMID: 24243859 PMCID: PMC3936734 DOI: 10.1093/nar/gkt1112] [Citation(s) in RCA: 153] [Impact Index Per Article: 13.9] [Indexed: 01/02/2023]
Abstract
Binding of proteins to particular DNA sites across the genome is a primary determinant of specificity in genome maintenance and gene regulation. DNA-binding specificity is encoded at multiple levels, from the detailed biophysical interactions between proteins and DNA, to the assembly of multi-protein complexes. At each level, variation in the mechanisms used to achieve specificity has led to difficulties in constructing and applying simple models of DNA binding. We review the complexities in protein–DNA binding found at multiple levels and discuss how they confound the idea of simple recognition codes. We discuss the impact of new high-throughput technologies for the characterization of protein–DNA binding, and how these technologies are uncovering new complexities in protein–DNA recognition. Finally, we review the concept of multi-protein recognition codes in which new DNA-binding specificities are achieved by the assembly of multi-protein complexes.
Affiliation(s)
- Trevor Siggers
- Department of Biology, Boston University, Boston, MA 02215, USA, Departments of Biostatistics and Bioinformatics, Computer Science, and Molecular Genetics and Microbiology, Institute for Genome Sciences and Policy, Duke University, Durham, NC 27708, USA
33
|
Wan H, Hu JP, Li KS, Tian XH, Chang S. Molecular dynamics simulations of DNA-free and DNA-bound TAL effectors. PLoS One 2013; 8:e76045. [PMID: 24130757 PMCID: PMC3794935 DOI: 10.1371/journal.pone.0076045] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Received: 06/16/2013] [Accepted: 08/22/2013] [Indexed: 12/05/2022]
Abstract
TAL (transcriptional activator-like) effectors (TALEs) are DNA-binding proteins, containing a modular central domain that recognizes specific DNA sequences. Recently, the crystallographic studies of TALEs revealed the structure of DNA-recognition domain. In this article, molecular dynamics (MD) simulations are employed to study two crystal structures of an 11.5-repeat TALE, in the presence and absence of DNA, respectively. The simulated results indicate that the specific binding of RVDs (repeat-variable diresidues) with DNA leads to the markedly reduced fluctuations of tandem repeats, especially at the two ends. In the DNA-bound TALE system, the base-specific interaction is formed mainly by the residue at position 13 within a TAL repeat. Tandem repeats with weak RVDs are unfavorable for the TALE-DNA binding. These observations are consistent with experimental studies. By using principal component analysis (PCA), the dominant motions are open-close movements between the two ends of the superhelical structure in both DNA-free and DNA-bound TALE systems. The open-close movements are found to be critical for the recognition and binding of TALE-DNA based on the analysis of free energy landscape (FEL). The conformational analysis of DNA indicates that the 5′ end of DNA target sequence has more remarkable structural deformability than the other sites. Meanwhile, the conformational change of DNA is likely associated with the specific interaction of TALE-DNA. We further suggest that the arrangement of N-terminal repeats with strong RVDs may help in the design of efficient TALEs. This study provides some new insights into the understanding of the TALE-DNA recognition mechanism.
Affiliation(s)
- Hua Wan
- College of Informatics, South China Agricultural University, Guangzhou, China
- Jian-ping Hu
- College of Chemistry, Leshan Normal University, Leshan, China
- Kang-shun Li
- College of Informatics, South China Agricultural University, Guangzhou, China
- Xu-hong Tian
- College of Informatics, South China Agricultural University, Guangzhou, China
- Shan Chang
- College of Informatics, South China Agricultural University, Guangzhou, China
34
|
Dror I, Zhou T, Mandel-Gutfreund Y, Rohs R. Covariation between homeodomain transcription factors and the shape of their DNA binding sites. Nucleic Acids Res 2013; 42:430-41. [PMID: 24078250 PMCID: PMC3874178 DOI: 10.1093/nar/gkt862] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Indexed: 01/28/2023]
Abstract
Protein–DNA recognition is a critical component of gene regulatory processes but the underlying molecular mechanisms are not yet completely understood. Whereas the DNA binding preferences of transcription factors (TFs) are commonly described using nucleotide sequences, the 3D DNA structure is recognized by proteins and is crucial for achieving binding specificity. However, the ability to analyze DNA shape in a high-throughput manner made it only recently feasible to integrate structural information into studies of protein–DNA binding. Here we focused on the homeodomain family of TFs and analyzed the DNA shape of thousands of their DNA binding sites, investigating the covariation between the protein sequence and the sequence and shape of their DNA targets. We found distinct homeodomain regions that were more correlated with either the nucleotide sequence or the DNA shape of their preferred binding sites, demonstrating different readout mechanisms through which homeodomains attain DNA binding specificity. We identified specific homeodomain residues that likely play key roles in DNA recognition via shape readout. Finally, we showed that adding DNA shape information when characterizing binding sites improved the prediction accuracy of homeodomain binding specificities. Taken together, our findings indicate that DNA shape information can generally provide new mechanistic insights into TF binding.
Affiliation(s)
- Iris Dror
- Molecular and Computational Biology Program, University of Southern California, Los Angeles, CA 90089, USA and Department of Biology, Technion - Israel Institute of Technology, Technion City, Haifa 32000, Israel
35
|
Yao MD, Ohtsuka J, Nagata K, Miyazono KI, Zhi Y, Ohnishi Y, Tanokura M. Complex structure of the DNA-binding domain of AdpA, the global transcription factor in Streptomyces griseus, and a target duplex DNA reveals the structural basis of its tolerant DNA sequence specificity. J Biol Chem 2013; 288:31019-29. [PMID: 24019524 DOI: 10.1074/jbc.m113.473611] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 11/06/2022]
Abstract
AdpA serves as the global transcription factor in the A-factor regulatory cascade, controlling the secondary metabolism and morphological differentiation of the filamentous bacterium Streptomyces griseus. AdpA binds to over 500 operator regions with the consensus sequence 5'-TGGCSNGWWY-3' (where S is G or C, W is A or T, Y is T or C, and N is any nucleotide). However, it is still obscure how AdpA can control hundreds of genes. To elucidate the structural basis of this tolerant DNA recognition by AdpA, we focused on the interaction between the DNA-binding domain of AdpA (AdpA-DBD), which consists of two helix-turn-helix motifs, and a target duplex DNA containing the consensus sequence 5'-TGGCGGGTTC-3'. The crystal structure of the AdpA-DBD-DNA complex and the mutant analysis of AdpA-DBD revealed its unique manner of DNA recognition, whereby only two arginine residues directly recognize the consensus sequence, explaining the strict recognition of G and C at positions 2 and 4, respectively, and the tolerant recognition of other positions of the consensus sequence. AdpA-DBD confers tolerant DNA sequence specificity to AdpA, allowing it to control hundreds of genes as a global transcription factor.
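The degenerate consensus 5'-TGGCSNGWWY-3' can be made concrete by expanding the ambiguity codes defined in the abstract (S is G or C, W is A or T, Y is T or C, N is any nucleotide) into a regular expression. The following is a minimal sketch; the helper names are hypothetical, and it scans only the given strand.

```python
import re

# Ambiguity codes exactly as defined in the abstract, plus the four plain bases.
CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "S": "[GC]",    # G or C
    "W": "[AT]",    # A or T
    "Y": "[CT]",    # T or C
    "N": "[ACGT]",  # any nucleotide
}

ADPA_CONSENSUS = "TGGCSNGWWY"


def consensus_to_regex(consensus):
    """Expand a degenerate consensus string into a regex pattern string."""
    return "".join(CODES[base] for base in consensus)


def find_sites(sequence, consensus=ADPA_CONSENSUS):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # A lookahead lets the scan report matches that overlap each other.
    lookahead = re.compile("(?=" + consensus_to_regex(consensus) + ")")
    return [m.start() for m in lookahead.finditer(sequence)]
```

The target duplex sequence quoted in the abstract, 5'-TGGCGGGTTC-3', matches this pattern, while changing the G at position 2 or the C at position 4 breaks the match, in line with the strict recognition described at those positions.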
Affiliation(s)
- Ming Dong Yao
- From the Departments of Applied Biological Chemistry and
36
|
Porrúa O, López-Sánchez A, Platero AI, Santero E, Shingler V, Govantes F. An A-tract at the AtzR binding site assists DNA binding, inducer-dependent repositioning and transcriptional activation of the PatzDEF promoter. Mol Microbiol 2013; 90:72-87. [PMID: 23906008 DOI: 10.1111/mmi.12346] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Accepted: 07/24/2013] [Indexed: 11/29/2022]
Abstract
The LysR-type regulator AtzR activates the Pseudomonas sp. ADP atzDEF operon in response to nitrogen limitation and cyanuric acid. Activation involves repositioning of the AtzR tetramer on the PatzDEF promoter and relaxation of an AtzR-induced DNA bend. Here we examine the in vivo and in vitro contribution of an A5-tract present at the PatzDEF promoter region to AtzR binding and transcriptional activation. Substitution of the A-tract for the sequence ACTCA prevented PatzDEF activation and high-affinity AtzR binding, impaired AtzR contacts with the activator binding site and shifted the position of the AtzR-induced DNA bend. Analysis of a collection of mutants bearing different alterations in the A-tract sequence showed that the extent of AtzR-dependent activation does not correlate with the magnitude or orientation of the spontaneous DNA bend generated at this site. Our results support the notion that indirect readout of the A-tract-associated narrow minor groove is essential for the AtzR-DNA complex to achieve a conformation competent for activation of the PatzDEF promoter. Conservation of this motif in several binding sites of LysR-type regulators suggests that this mechanism may be shared by other proteins in this family.
Affiliation(s)
- Odil Porrúa
- Centro Andaluz de Biología del Desarrollo, Universidad Pablo de Olavide/Consejo Superior de Investigaciones Científicas/Junta de Andalucía, Carretera de Utrera, Km. 1, 41013, Sevilla, Spain; Departamento de Biología Molecular e Ingeniería Bioquímica, Universidad Pablo de Olavide, Carretera de Utrera, Km. 1, 41013, Sevilla, Spain
37
|
Zhou T, Yang L, Lu Y, Dror I, Dantas Machado AC, Ghane T, Di Felice R, Rohs R. DNAshape: a method for the high-throughput prediction of DNA structural features on a genomic scale. Nucleic Acids Res 2013; 41:W56-62. [PMID: 23703209 PMCID: PMC3692085 DOI: 10.1093/nar/gkt437] [Citation(s) in RCA: 211] [Impact Index Per Article: 19.2] [Indexed: 11/14/2022]
Abstract
We present a method and web server for predicting DNA structural features in a high-throughput (HT) manner for massive sequence data. This approach provides the framework for the integration of DNA sequence and shape analyses in genome-wide studies. The HT methodology uses a sliding-window approach to mine DNA structural information obtained from Monte Carlo simulations. It requires only nucleotide sequence as input and instantly predicts multiple structural features of DNA (minor groove width, roll, propeller twist and helix twist). The results of rigorous validations of the HT predictions based on DNA structures solved by X-ray crystallography and NMR spectroscopy, hydroxyl radical cleavage data, statistical analysis and cross-validation, and molecular dynamics simulations provide strong confidence in this approach. The DNAshape web server is freely available at http://rohslab.cmb.usc.edu/DNAshape/.
Affiliation(s)
- Tianyin Zhou
- Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
|
38
|
Genomic regions flanking E-box binding sites influence DNA binding specificity of bHLH transcription factors through DNA shape. Cell Rep 2013; 3:1093-104. [PMID: 23562153 DOI: 10.1016/j.celrep.2013.03.014] [Citation(s) in RCA: 222] [Impact Index Per Article: 20.2] [Received: 12/18/2012] [Revised: 02/12/2013] [Accepted: 03/12/2013] [Indexed: 01/07/2023] Open
Abstract
DNA sequence is a major determinant of the binding specificity of transcription factors (TFs) for their genomic targets. However, eukaryotic cells often express, at the same time, TFs with highly similar DNA binding motifs but distinct in vivo targets. Currently, it is not well understood how TFs with seemingly identical DNA motifs achieve unique specificities in vivo. Here, we used custom protein-binding microarrays to analyze TF specificity for putative binding sites in their genomic sequence context. Using yeast TFs Cbf1 and Tye7 as our case studies, we found that binding sites of these bHLH TFs (i.e., E-boxes) are bound differently in vitro and in vivo, depending on their genomic context. Computational analyses suggest that nucleotides outside E-box binding sites contribute to specificity by influencing the three-dimensional structure of DNA binding sites. Thus, the local shape of target sites might play a widespread role in achieving regulatory specificity within TF families.
|
39
|
Chang YP, Xu M, Machado ACD, Yu XJ, Rohs R, Chen XS. Mechanism of origin DNA recognition and assembly of an initiator-helicase complex by SV40 large tumor antigen. Cell Rep 2013; 3:1117-27. [PMID: 23545501 DOI: 10.1016/j.celrep.2013.03.002] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.6] [Received: 10/06/2012] [Revised: 01/10/2013] [Accepted: 03/01/2013] [Indexed: 10/27/2022] Open
Abstract
The DNA tumor virus Simian virus 40 (SV40) is a model system for studying eukaryotic replication. SV40 large tumor antigen (LTag) is the initiator/helicase that is essential for genome replication. LTag recognizes and assembles at the viral replication origin. We determined the structure of two multidomain LTag subunits bound to origin DNA. The structure reveals that the origin binding domains (OBDs) and Zn and AAA+ domains are involved in origin recognition and assembly. Notably, the OBDs recognize the origin in an unexpected manner. The histidine residues of the AAA+ domains insert into a narrow minor groove region with enhanced negative electrostatic potential. Computational analysis indicates that this region is intrinsically narrow, demonstrating the role of DNA shape readout in origin recognition. Our results provide important insights into the assembly of the LTag initiator/helicase at the replication origin and suggest that histidine contacts with the minor groove serve as a mechanism of DNA shape readout.
Affiliation(s)
- Y Paul Chang
- Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
|
40
|
Liu LA, Bradley P. Atomistic modeling of protein-DNA interaction specificity: progress and applications. Curr Opin Struct Biol 2012; 22:397-405. [PMID: 22796087 DOI: 10.1016/j.sbi.2012.06.002] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Received: 03/19/2012] [Accepted: 06/20/2012] [Indexed: 12/22/2022]
Abstract
An accurate, predictive understanding of protein-DNA binding specificity is crucial for the successful design and engineering of novel protein-DNA binding complexes. In this review, we summarize recent studies that use atomistic representations of interfaces to predict protein-DNA binding specificity computationally. Although methods with limited structural flexibility have proven successful at recapitulating consensus binding sequences from wild-type complex structures, conformational flexibility is likely important for design and template-based modeling, where non-native conformations need to be sampled and accurately scored. A successful application of such computational modeling techniques in the construction of the TAL-DNA complex structure is discussed. With continued improvements in energy functions, solvation models, and conformational sampling, we are optimistic that reliable and large-scale protein-DNA binding prediction and engineering is a goal within reach.
|
41
|
Pryor EE, Waligora EA, Xu B, Dellos-Nolan S, Wozniak DJ, Hollis T. The transcription factor AmrZ utilizes multiple DNA binding modes to recognize activator and repressor sequences of Pseudomonas aeruginosa virulence genes. PLoS Pathog 2012; 8:e1002648. [PMID: 22511872 PMCID: PMC3325190 DOI: 10.1371/journal.ppat.1002648] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Received: 07/07/2011] [Accepted: 03/02/2012] [Indexed: 01/07/2023] Open
Abstract
AmrZ, a member of the Ribbon-Helix-Helix family of DNA binding proteins, functions as both a transcriptional activator and repressor of multiple genes encoding Pseudomonas aeruginosa virulence factors. The expression of these virulence factors leads to chronic and sustained infections associated with worsening prognosis. In this study, we present the X-ray crystal structure of AmrZ in complex with DNA containing the repressor site, amrZ1. Binding of AmrZ to this site leads to auto-repression. AmrZ binds this DNA sequence as a dimer-of-dimers, and makes specific base contacts to two half sites, separated by a five base pair linker region. Analysis of the linker region shows a narrowing of the minor groove, causing significant distortions. AmrZ binding assays utilizing sequences containing variations in this linker region reveal that the secondary structure of the DNA, conferred by the sequence of this region, is an important determinant of binding affinity. The results from these experiments allow for the creation of a model where both intrinsic structure of the DNA and specific nucleotide recognition are absolutely necessary for binding of the protein. We also examined AmrZ binding to the algD promoter, which results in activation of the alginate exopolysaccharide biosynthetic operon, and found the protein utilizes different interactions with this site. Finally, we tested the in vivo effects of this differential binding by switching the AmrZ binding site at algD, where it acts as an activator, for a repressor binding sequence and show that differences in binding alone do not affect transcriptional regulation.
Affiliation(s)
- Edward E. Pryor
- Department of Biochemistry and Center for Structural Biology, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America
| | - Elizabeth A. Waligora
- Department of Microbiology and Immunology, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America
| | - Binjie Xu
- Departments of Microbiology and Microbial Infection and Immunity, Center for Microbial Interface Biology, The Ohio State University, Columbus, Ohio, United States of America
| | - Sheri Dellos-Nolan
- Departments of Microbiology and Microbial Infection and Immunity, Center for Microbial Interface Biology, The Ohio State University, Columbus, Ohio, United States of America
| | - Daniel J. Wozniak
- Department of Microbiology and Immunology, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America
- Departments of Microbiology and Microbial Infection and Immunity, Center for Microbial Interface Biology, The Ohio State University, Columbus, Ohio, United States of America
| | - Thomas Hollis
- Department of Biochemistry and Center for Structural Biology, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America
|
42
|
Slattery M, Riley T, Liu P, Abe N, Gomez-Alcala P, Dror I, Zhou T, Rohs R, Honig B, Bussemaker HJ, Mann RS. Cofactor binding evokes latent differences in DNA binding specificity between Hox proteins. Cell 2012; 147:1270-82. [PMID: 22153072 DOI: 10.1016/j.cell.2011.10.053] [Citation(s) in RCA: 374] [Impact Index Per Article: 31.2] [Received: 06/09/2011] [Revised: 08/19/2011] [Accepted: 10/06/2011] [Indexed: 11/30/2022]
Abstract
Members of transcription factor families typically have similar DNA binding specificities yet execute unique functions in vivo. Transcription factors often bind DNA as multiprotein complexes, raising the possibility that complex formation might modify their DNA binding specificities. To test this hypothesis, we developed an experimental and computational platform, SELEX-seq, that can be used to determine the relative affinities to any DNA sequence for any transcription factor complex. Applying this method to all eight Drosophila Hox proteins, we show that they obtain novel recognition properties when they bind DNA with the dimeric cofactor Extradenticle-Homothorax (Exd). Exd-Hox specificities group into three main classes that obey Hox gene collinearity rules, and DNA structure predictions suggest that anterior and posterior Hox proteins prefer DNA sequences with distinct minor groove topographies. Together, these data suggest that emergent DNA recognition properties revealed by interactions with cofactors contribute to transcription factor specificities in vivo.
Affiliation(s)
- Matthew Slattery
- Department of Biochemistry and Molecular Biophysics, Columbia University, 701 West 168th Street, HHSC 1104, New York, NY 10032, USA
|
43
|
Bishop EP, Rohs R, Parker SCJ, West SM, Liu P, Mann RS, Honig B, Tullius TD. A map of minor groove shape and electrostatic potential from hydroxyl radical cleavage patterns of DNA. ACS Chem Biol 2011; 6:1314-20. [PMID: 21967305 DOI: 10.1021/cb200155t] [Citation(s) in RCA: 71] [Impact Index Per Article: 5.5] [Indexed: 01/04/2023]
Abstract
DNA shape variation and the associated variation in minor groove electrostatic potential are widely exploited by proteins for DNA recognition. Here we show that the hydroxyl radical cleavage pattern is a quantitative measure of DNA backbone solvent accessibility, minor groove width, and minor groove electrostatic potential, at single nucleotide resolution. We introduce maps of DNA shape and electrostatic potential as tools for understanding how proteins recognize binding sites in a genome. These maps reveal periodic structural signals in yeast and Drosophila genomic DNA sequences that are associated with positioned nucleosomes.
Affiliation(s)
| | - Remo Rohs
- Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, California 90089, United States
| | - Stephen C. J. Parker
- National Human Genome Research Institute, National Institutes of Health, Rockville, Maryland 20852, United States
|
44
|
Minary P, Levitt M. Conformational optimization with natural degrees of freedom: a novel stochastic chain closure algorithm. J Comput Biol 2010; 17:993-1010. [PMID: 20726792 DOI: 10.1089/cmb.2010.0016] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Indexed: 11/13/2022] Open
Abstract
The present article introduces a set of novel methods that facilitate the use of "natural moves" or arbitrary degrees of freedom that can give rise to collective rearrangements in the structure of biological macromolecules. While such "natural moves" may spoil the stereochemistry and even break the bonded chain at multiple locations, our new method restores the correct chain geometry by adjusting bond and torsion angles in an arbitrary defined molten zone. This is done by successive stages of partial closure that propagate the location of the chain break backwards along the chain. At the end of these stages, the size of the chain break is generally reduced so much that it can be repaired by adjusting the position of a single atom. Our chain closure method is efficient with a computational complexity of O(N_d), where N_d is the number of degrees of freedom used to repair the chain break. The new method facilitates the use of arbitrary degrees of freedom including the "natural" degrees of freedom inferred from analyzing experimental (X-ray crystallography and nuclear magnetic resonance [NMR]) structures of nucleic acids and proteins. In terms of its ability to generate large conformational moves and its effectiveness in locating low energy states, the new method is robust and computationally efficient.
Affiliation(s)
- Peter Minary
- Department of Structural Biology, Stanford University School of Medicine, Stanford, California 94305, USA.
|
45
|
Rohs R, Jin X, West SM, Joshi R, Honig B, Mann RS. Origins of specificity in protein-DNA recognition. Annu Rev Biochem 2010; 79:233-69. [PMID: 20334529 DOI: 10.1146/annurev-biochem-060408-091030] [Citation(s) in RCA: 672] [Impact Index Per Article: 48.0] [Indexed: 01/15/2023]
Abstract
Specific interactions between proteins and DNA are fundamental to many biological processes. In this review, we provide a revised view of protein-DNA interactions that emphasizes the importance of the three-dimensional structures of both macromolecules. We divide protein-DNA interactions into two categories: those when the protein recognizes the unique chemical signatures of the DNA bases (base readout) and those when the protein recognizes a sequence-dependent DNA shape (shape readout). We further divide base readout into those interactions that occur in the major groove from those that occur in the minor groove. Analogously, the readout of the DNA shape is subdivided into global shape recognition (for example, when the DNA helix exhibits an overall bend) and local shape recognition (for example, when a base pair step is kinked or a region of the minor groove is narrow). Based on the >1500 structures of protein-DNA complexes now available in the Protein Data Bank, we argue that individual DNA-binding proteins combine multiple readout mechanisms to achieve DNA-binding specificity. Specificity that distinguishes between families frequently involves base readout in the major groove, whereas shape readout is often exploited for higher resolution specificity, to distinguish between members within the same DNA-binding protein family.
Affiliation(s)
- Remo Rohs
- Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics, Columbia University, New York, NY 10032, USA
|
46
|
West SM, Rohs R, Mann RS, Honig B. Electrostatic interactions between arginines and the minor groove in the nucleosome. J Biomol Struct Dyn 2010; 27:861-6. [PMID: 20232938 PMCID: PMC2946858 DOI: 10.1080/07391102.2010.10508587] [Citation(s) in RCA: 62] [Impact Index Per Article: 4.4] [Indexed: 10/28/2022]
Abstract
Proteins rely on a variety of readout mechanisms to preferentially bind specific DNA sequences. The nucleosome offers a prominent example of a shape readout mechanism where arginines insert into narrow minor groove regions that face the histone core. Here we compare DNA shape and arginine recognition of three nucleosome core particle structures, expanding on our previous study by characterizing two additional structures, one with a different protein sequence and one with a different DNA sequence. The electrostatic potential in the minor groove is shown to be largely independent of the underlying sequence but is, however, dominated by groove geometry. Our results extend and generalize our previous observation that the interaction of arginines with narrow minor grooves plays an important role in stabilizing the deformed DNA in the nucleosome.
Affiliation(s)
- Sean M. West
- Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biophysics, Columbia University, 1130 St. Nicholas Avenue, New York, NY 10032, USA
| | - Remo Rohs
- Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biophysics, Columbia University, 1130 St. Nicholas Avenue, New York, NY 10032, USA
| | - Richard S. Mann
- Department of Biochemistry and Molecular Biophysics, Columbia University, 701 West 168 Street, HHSC 1104, New York, NY 10032, USA
| | - Barry Honig
- Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biophysics, Columbia University, 1130 St. Nicholas Avenue, New York, NY 10032, USA
|
47
|
Abstract
The recognition of specific DNA sequences by proteins is thought to depend on two types of mechanisms: one that involves the formation of hydrogen bonds with specific bases, primarily in the major groove, and one involving sequence-dependent deformations of the DNA helix. By comprehensively analyzing the three dimensional structures of protein-DNA complexes, we show that the binding of arginines to narrow minor grooves is a widely used mode for protein-DNA recognition. This readout mechanism exploits the phenomenon that narrow minor grooves strongly enhance the negative electrostatic potential of the DNA. The nucleosome core particle offers a striking example of this effect. Minor groove narrowing is often associated with the presence of A-tracts, AT-rich sequences that exclude the flexible TpA step. These findings suggest that the ability to detect local variations in DNA shape and electrostatic potential is a general mechanism that enables proteins to use information in the minor groove, which otherwise offers few opportunities for the formation of base-specific hydrogen bonds, to achieve DNA binding specificity.
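The sequence criterion mentioned above (AT-rich runs that exclude the TpA step) can be sketched as a simple scan. The function name `find_a_tracts` and the minimum length of 4 bp are assumptions made for illustration; the abstract itself does not specify a length threshold.

```python
import re
from typing import List, Tuple


def find_a_tracts(seq: str, min_len: int = 4) -> List[Tuple[int, int, str]]:
    """Return (start, end, subsequence) for runs of A/T bases containing no TpA step.

    Coordinates are 0-based and end-exclusive. `min_len` is an assumed threshold.
    """
    seq = seq.upper()
    tracts: List[Tuple[int, int, str]] = []
    for match in re.finditer(r"[AT]+", seq):
        run, offset = match.group(), match.start()
        start = 0
        for i in range(1, len(run) + 1):
            # Close the current segment at the end of the run or at a TpA step.
            if i == len(run) or (run[i - 1] == "T" and run[i] == "A"):
                if i - start >= min_len:
                    tracts.append((offset + start, offset + i, run[start:i]))
                start = i
    return tracts


tracts = find_a_tracts("GCAAAATTTGCTATAGC")
```

In this example the run `AAAATTT` qualifies, whereas `TATA` is split at each TpA step into segments that are too short to be reported.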
|
48
|
Abstract
Short runs of adenines are a ubiquitous DNA element in regulatory regions of many organisms. When runs of 4–6 adenine base pairs (‘A-tracts’) are repeated with the helical periodicity, they give rise to global curvature of the DNA double helix, which can be macroscopically characterized by anomalously slow migration on polyacrylamide gels. The molecular structure of these DNA tracts is unusual and distinct from that of canonical B-DNA. We review here our current knowledge about the molecular details of A-tract structure and its interaction with the sequences flanking them on either side and with the environment. Various molecular models were proposed to describe A-tract structure and how it causes global deflection of the DNA helical axis. We review old and recent findings that enable us to amalgamate the various findings into one model that conforms to the experimental data. Sequences containing phased repeats of A-tracts have from the very beginning been synonymous with global intrinsic DNA bending. In this review, we show that very often it is the unique structure of A-tracts that is at the basis of their widespread occurrence in regulatory regions of many organisms. Thus, the biological importance of A-tracts may often reside in their distinct structure rather than in the global curvature that they induce on sequences containing them.
|
49
|
Rohs R, West SM, Liu P, Honig B. Nuance in the double-helix and its role in protein-DNA recognition. Curr Opin Struct Biol 2009; 19:171-7. [PMID: 19362815 PMCID: PMC2701566 DOI: 10.1016/j.sbi.2009.03.002] [Citation(s) in RCA: 79] [Impact Index Per Article: 5.3] [Received: 02/02/2009] [Revised: 02/25/2009] [Accepted: 03/03/2009] [Indexed: 10/20/2022]
Abstract
It has been known for some time that the double-helix is not a uniform structure but rather exhibits sequence-specific variations that, combined with base-specific intermolecular interactions, offer the possibility of numerous modes of protein-DNA recognition. All-atom simulations have revealed mechanistic insights into the structural and energetic basis of various recognition mechanisms for a number of protein-DNA complexes while coarser grained simulations have begun to provide an understanding of the function of larger assemblies. Molecular simulations have also been applied to the prediction of transcription factor binding sites, while empirical approaches have been developed to predict nucleosome positioning. Studies that combine and integrate experimental, statistical and computational data offer the promise of rapid advances in our understanding of protein-DNA recognition mechanisms.
Affiliation(s)
- Remo Rohs
- Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics, Columbia University, 1130 St. Nicholas Avenue, New York, NY 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, 630 West 168 Street, New York, NY 10032, USA
| | - Sean M. West
- Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics, Columbia University, 1130 St. Nicholas Avenue, New York, NY 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, 630 West 168 Street, New York, NY 10032, USA
| | - Peng Liu
- Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics, Columbia University, 1130 St. Nicholas Avenue, New York, NY 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, 630 West 168 Street, New York, NY 10032, USA
| | - Barry Honig
- Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics, Columbia University, 1130 St. Nicholas Avenue, New York, NY 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, 630 West 168 Street, New York, NY 10032, USA
|
50
|
Protein Sliding along DNA: Dynamics and Structural Characterization. J Mol Biol 2009; 385:1087-97. [DOI: 10.1016/j.jmb.2008.11.016] [Citation(s) in RCA: 170] [Impact Index Per Article: 11.3] [Received: 07/24/2008] [Revised: 10/07/2008] [Accepted: 11/11/2008] [Indexed: 10/21/2022]
|