1
Bonnell V, Zhang Y, Brown A, Horton J, Josling G, Chiu TP, Rohs R, Mahony S, Gordân R, Llinás M. DNA sequence and chromatin differentiate sequence-specific transcription factor binding in the human malaria parasite Plasmodium falciparum. Nucleic Acids Res 2024; 52:10161-10179. [PMID: 38966997 PMCID: PMC11417369 DOI: 10.1093/nar/gkae585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Revised: 05/30/2024] [Accepted: 06/27/2024] [Indexed: 07/06/2024] Open
Abstract
Development of the malaria parasite, Plasmodium falciparum, is regulated by a limited number of sequence-specific transcription factors (TFs). However, the mechanisms by which these TFs recognize genome-wide binding sites are largely unknown. To address TF specificity, we investigated the binding of two TF subsets that either bind CACACA or GTGCAC DNA sequence motifs and further characterized two additional ApiAP2 TFs, PfAP2-G and PfAP2-EXP, which bind unique DNA motifs (GTAC and TGCATGCA). We also interrogated the impact of DNA sequence and chromatin context on P. falciparum TF binding by integrating high-throughput in vitro and in vivo binding assays, DNA shape predictions, epigenetic post-translational modifications, and chromatin accessibility. We found that DNA sequence context minimally impacts binding site selection for paralogous CACACA-binding TFs, while chromatin accessibility, epigenetic patterns, co-factor recruitment, and dimerization correlate with differential binding. In contrast, GTGCAC-binding TFs prefer different DNA sequence context in addition to chromatin dynamics. Finally, we determined that TFs that preferentially bind divergent DNA motifs may bind overlapping genomic regions due to low-affinity binding to other sequence motifs. Our results demonstrate that TF binding site selection relies on a combination of DNA sequence and chromatin features, thereby contributing to the complexity of P. falciparum gene regulatory mechanisms.
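As an illustration of what a sequence motif occurrence means in the abstract above, the following minimal sketch lists exact matches to a motif such as CACACA or GTGCAC on both strands of a DNA string. It is not the authors' pipeline; the function names and the toy sequence are assumptions made only for this example, and exact string matching ignores the affinity, DNA shape, and chromatin effects the study examines.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def find_motif_sites(sequence: str, motif: str):
    """Return sorted (start, strand) pairs for exact motif matches.

    Hypothetical helper for illustration only. Positions are 0-based and
    refer to the forward strand. A palindromic motif (for example GTGCAC)
    is reported once, on the '+' strand.
    """
    sequence = sequence.upper()
    motif = motif.upper()
    rc = reverse_complement(motif)
    hits = set()
    for i in range(len(sequence) - len(motif) + 1):
        window = sequence[i:i + len(motif)]
        if window == motif:
            hits.add((i, "+"))
        if window == rc and rc != motif:
            hits.add((i, "-"))
    return sorted(hits)


# Toy sequence invented for this example (not real P. falciparum DNA).
toy = "TTCACACATTGTGCACAATGTGTGAA"
cacaca_sites = find_motif_sites(toy, "CACACA")
gtgcac_sites = find_motif_sites(toy, "GTGCAC")
```

Scanning both strands matters because a TF bound to double-stranded DNA can recognize its motif in either orientation; in the toy sequence, CACACA is found once directly and once as its reverse complement TGTGTG.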
Affiliation(s)
- Victoria A Bonnell
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
  - Huck Institutes Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
  - Huck Institutes Center for Malaria Research, The Pennsylvania State University, University Park, PA 16802, USA
- Yuning Zhang
  - Center for Genomic and Computational Biology, Duke University, Durham, NC 27708, USA
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27708, USA
  - Program in Computational Biology and Bioinformatics, Duke University, Durham, NC 27708, USA
- Alan S Brown
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
  - Huck Institutes Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
  - Huck Institutes Center for Malaria Research, The Pennsylvania State University, University Park, PA 16802, USA
- John Horton
  - Center for Genomic and Computational Biology, Duke University, Durham, NC 27708, USA
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27708, USA
- Gabrielle A Josling
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
  - Huck Institutes Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
  - Huck Institutes Center for Malaria Research, The Pennsylvania State University, University Park, PA 16802, USA
- Tsu-Pei Chiu
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Remo Rohs
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
  - Department of Chemistry, University of Southern California, Los Angeles, CA 90089, USA
  - Department of Physics and Astronomy, University of Southern California, Los Angeles, CA 90089, USA
  - Thomas Lord Department of Computer Science, University of Southern California, Los Angeles, CA 90089, USA
- Shaun Mahony
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
  - Huck Institutes Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
- Raluca Gordân
  - Center for Genomic and Computational Biology, Duke University, Durham, NC 27708, USA
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27708, USA
  - Department of Computer Science, Duke University, Durham, NC 27708, USA
  - Department of Molecular Genetics and Microbiology, Duke University, Durham, NC 27708, USA
- Manuel Llinás
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
  - Huck Institutes Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
  - Huck Institutes Center for Malaria Research, The Pennsylvania State University, University Park, PA 16802, USA
  - Department of Chemistry, The Pennsylvania State University, University Park, PA 16802, USA
2
Xu J, Gao Y, Lu Q, Zhang R, Gui J, Liu X, Yue Z. RiceSNP-BST: a deep learning framework for predicting biotic stress-associated SNPs in rice. Brief Bioinform 2024; 25:bbae599. [PMID: 39562160 PMCID: PMC11576077 DOI: 10.1093/bib/bbae599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2024] [Revised: 10/07/2024] [Accepted: 11/04/2024] [Indexed: 11/21/2024] Open
Abstract
Rice consistently faces significant threats from biotic stresses, such as fungi, bacteria, pests, and viruses. Consequently, accurately and rapidly identifying previously unknown single-nucleotide polymorphisms (SNPs) in the rice genome is a critical challenge for rice research and the development of resistant varieties. However, the limited availability of high-quality rice genotype data has hindered this research. Deep learning has transformed biological research by facilitating the prediction and analysis of SNPs in biological sequence data. Convolutional neural networks are especially effective in extracting structural and local features from DNA sequences, leading to significant advancements in genomics. Nevertheless, the expanding catalog of genome-wide association studies provides valuable biological insights for rice research. Expanding on this idea, we introduce RiceSNP-BST, an automatic architecture search framework designed to predict SNPs associated with rice biotic stress traits (BST-associated SNPs) by integrating multidimensional features. Notably, the model successfully innovates the datasets, offering more precision than state-of-the-art methods while demonstrating good performance on an independent test set and cross-species datasets. Additionally, we extracted features from the original DNA sequences and employed causal inference to enhance the biological interpretability of the model. This study highlights the potential of RiceSNP-BST in advancing genome prediction in rice. Furthermore, a user-friendly web server for RiceSNP-BST (http://rice-snp-bst.aielab.cc) has been developed to support broader genome research.
Affiliation(s)
- Jiajun Xu
  - School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Anhui Agricultural University, 130, Changjiang West Road, Hefei, Anhui Province 230036, China
- Yujia Gao
  - School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Anhui Agricultural University, 130, Changjiang West Road, Hefei, Anhui Province 230036, China
- Quan Lu
  - School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Anhui Agricultural University, 130, Changjiang West Road, Hefei, Anhui Province 230036, China
- Renyi Zhang
  - School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Anhui Agricultural University, 130, Changjiang West Road, Hefei, Anhui Province 230036, China
- Jianfeng Gui
  - School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Anhui Agricultural University, 130, Changjiang West Road, Hefei, Anhui Province 230036, China
- Xiaoshuang Liu
  - Research Center for Biological Breeding Technology, Advance Academy, Anhui Agricultural University, 130, Changjiang West Road, Hefei, Anhui Province 230036, China
- Zhenyu Yue
  - School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Anhui Agricultural University, 130, Changjiang West Road, Hefei, Anhui Province 230036, China
  - Research Center for Biological Breeding Technology, Advance Academy, Anhui Agricultural University, 130, Changjiang West Road, Hefei, Anhui Province 230036, China
3
Reuter LM, Khadayate SP, Mossler A, Liebl K, Faull SV, Karimi MM, Speck C. MCM2-7 loading-dependent ORC release ensures genome-wide origin licensing. Nat Commun 2024; 15:7306. [PMID: 39181881 PMCID: PMC11344781 DOI: 10.1038/s41467-024-51538-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 08/09/2024] [Indexed: 08/27/2024] Open
Abstract
Origin recognition complex (ORC)-dependent loading of the replicative helicase MCM2-7 onto replication origins in G1-phase forms the basis of replication fork establishment in S-phase. However, how ORC and MCM2-7 facilitate genome-wide DNA licensing is not fully understood. Mapping the molecular footprints of budding yeast ORC and MCM2-7 genome-wide, we discovered that MCM2-7 loading is associated with ORC release from origins and redistribution to non-origin sites. Our bioinformatic analysis revealed that origins are compact units, where a single MCM2-7 double hexamer blocks repetitive loading through steric ORC binding site occlusion. Analyses of A-elements and an improved B2-element consensus motif uncovered that DNA shape, DNA flexibility, and the correct, face-to-face spacing of the two DNA elements are hallmarks of ORC-binding and efficient helicase loading sites. Thus, our work identified fundamental principles for MCM2-7 helicase loading that explain how origin licensing is realised across the genome.
Affiliation(s)
- L Maximilian Reuter
  - DNA Replication Group, Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, United Kingdom
  - Institute of Molecular Biology (IMB) gGmbH, Ackermannweg 4, Mainz, Germany
- Audrey Mossler
  - DNA Replication Group, Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, United Kingdom
- Korbinian Liebl
  - Department of Chemistry, Chicago Center for Theoretical Chemistry, Institute for Biophysical Dynamics, and James Franck Institute, The University of Chicago, Chicago, IL, USA
- Sarah V Faull
  - DNA Replication Group, Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, United Kingdom
- Mohammad M Karimi
  - MRC London Institute of Medical Sciences (LMS), London, United Kingdom
  - Comprehensive Cancer Centre, School of Cancer & Pharmaceutical Sciences, Faculty of Life Sciences & Medicine, King's College London, London, United Kingdom
- Christian Speck
  - DNA Replication Group, Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, United Kingdom
  - MRC London Institute of Medical Sciences (LMS), London, United Kingdom
4
Xu C, Kleinschmidt H, Yang J, Leith EM, Johnson J, Tan S, Mahony S, Bai L. Systematic dissection of sequence features affecting binding specificity of a pioneer factor reveals binding synergy between FOXA1 and AP-1. Mol Cell 2024; 84:2838-2855.e10. [PMID: 39019045 PMCID: PMC11334613 DOI: 10.1016/j.molcel.2024.06.022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2024] [Revised: 04/23/2024] [Accepted: 06/21/2024] [Indexed: 07/19/2024]
Abstract
Despite the unique ability of pioneer factors (PFs) to target nucleosomal sites in closed chromatin, they only bind a small fraction of their genomic motifs. The underlying mechanism of this selectivity is not well understood. Here, we design a high-throughput assay called chromatin immunoprecipitation with integrated synthetic oligonucleotides (ChIP-ISO) to systematically dissect sequence features affecting the binding specificity of a classic PF, FOXA1, in human A549 cells. Combining ChIP-ISO with in vitro and neural network analyses, we find that (1) FOXA1 binding is strongly affected by co-binding transcription factors (TFs) AP-1 and CEBPB; (2) FOXA1 and AP-1 show binding cooperativity in vitro; (3) FOXA1's binding is determined more by local sequences than chromatin context, including eu-/heterochromatin; and (4) AP-1 is partially responsible for differential binding of FOXA1 in different cell types. Our study presents a framework for elucidating genetic rules underlying PF binding specificity and reveals a mechanism for context-specific regulation of its binding.
Affiliation(s)
- Cheng Xu
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA; Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
- Holly Kleinschmidt
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA; Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
- Jianyu Yang
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA; Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
- Erik M Leith
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA; Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
- Jenna Johnson
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
- Song Tan
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA; Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
- Shaun Mahony
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA; Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
- Lu Bai
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA; Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA; Department of Physics, The Pennsylvania State University, University Park, PA 16802, USA
5
Li J, Rohs R. Deep DNAshape webserver: prediction and real-time visualization of DNA shape considering extended k-mers. Nucleic Acids Res 2024; 52:W7-W12. [PMID: 38801070 PMCID: PMC11223853 DOI: 10.1093/nar/gkae433] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2024] [Revised: 04/30/2024] [Accepted: 05/08/2024] [Indexed: 05/29/2024] Open
Abstract
Sequence-dependent DNA shape plays an important role in understanding protein-DNA binding mechanisms. High-throughput prediction of DNA shape features has become a valuable tool in the field of protein-DNA recognition, transcription factor-DNA binding specificity, and gene regulation. However, our widely used webserver, DNAshape, relies on statistically summarized pentamer query tables to query DNA shape features. These query tables do not consider flanking regions longer than two base pairs, and acquiring a query table for hexamers or higher-order k-mers is currently still unrealistic due to limitations in achieving sufficient statistical coverage in molecular simulations or structural biology experiments. A recent deep-learning method, Deep DNAshape, can predict DNA shape features at the core of a DNA fragment considering flanking regions of up to seven base pairs, trained on limited simulation data. However, Deep DNAshape is rather complicated to install and, unlike the pentamer-based DNAshape webserver, it must be run locally, creating a barrier for users. Here, we present the Deep DNAshape webserver, which has the benefits of both methods while being accurate, fast, and accessible to all users. Additional improvements of the webserver include the detection of user input in real time, interactive visualization tools, and different modes of analysis. URL: https://deepdnashape.usc.edu.
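The pentamer query-table idea mentioned in the abstract above can be sketched as a sliding-window lookup: each position takes the value stored for the 5-mer centered on it, so only two flanking base pairs on each side influence the result. This is a minimal illustration and not the DNAshape or Deep DNAshape code; the function name, the table contents, and the numeric values are hypothetical placeholders rather than real shape parameters.

```python
from typing import Dict, List, Optional


def predict_shape(sequence: str, table: Dict[str, float]) -> List[Optional[float]]:
    """Assign each position the table value of the pentamer centered on it.

    Hypothetical helper for illustration only. The first and last two
    positions have no complete pentamer and stay None, as does any
    position whose pentamer is missing from the table.
    """
    seq = sequence.upper()
    values: List[Optional[float]] = [None] * len(seq)
    for i in range(2, len(seq) - 2):
        pentamer = seq[i - 2:i + 3]  # center base plus two flanks per side
        values[i] = table.get(pentamer)
    return values


# Made-up numbers, NOT real minor groove widths or other shape features.
toy_table = {"AAAAA": 3.0, "AAAAC": 4.0}
profile = predict_shape("AAAAAC", toy_table)
```

Because the lookup key is always exactly five bases long, changing a base three or more positions away from the center cannot alter the predicted value, which is the limitation the abstract attributes to pentamer tables.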
Affiliation(s)
- Jinsen Li
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Remo Rohs
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
  - Department of Chemistry, University of Southern California, Los Angeles, CA 90089, USA
  - Department of Physics and Astronomy, University of Southern California, Los Angeles, CA 90089, USA
  - Thomas Lord Department of Computer Science, University of Southern California, Los Angeles, CA 90089, USA
6
Chen N, Yu J, Liu Z, Meng L, Li X, Wong KC. Discovering DNA shape motifs with multiple DNA shape features: generalization, methods, and validation. Nucleic Acids Res 2024; 52:4137-4150. [PMID: 38572749 PMCID: PMC11077088 DOI: 10.1093/nar/gkae210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Revised: 03/06/2024] [Accepted: 03/12/2024] [Indexed: 04/05/2024] Open
Abstract
DNA motifs are crucial patterns in gene regulation. DNA-binding proteins (DBPs), including transcription factors, can bind to specific DNA motifs to regulate gene expression and other cellular activities. Past studies suggest that DNA shape features could be subtly involved in DNA-DBP interactions. Therefore, shape motif annotations based on intrinsic DNA topology can deepen the understanding of DNA-DBP binding. Nevertheless, high-throughput tools for DNA shape motif discovery that incorporate multiple features together remain insufficient. To address this gap, we propose a series of methods to discover non-redundant DNA shape motifs, generalized to multiple motifs in multiple shape features. Specifically, an existing Gibbs sampling method is generalized to multiple DNA motif discovery with multiple shape features. Meanwhile, an expectation-maximization (EM) method and a hybrid method coupling EM with Gibbs sampling are proposed and developed with promising performance, convergence capability, and efficiency. The discovered DNA shape motif instances reveal insights into low-signal ChIP-seq peak summits, complementing existing sequence motif discovery work. Additionally, our modelling captures the potential interplay across multiple DNA shape features. We provide a valuable platform of tools for DNA shape motif discovery. An R package is built for open accessibility and long-lasting impact: https://zenodo.org/doi/10.5281/zenodo.10558980.
Affiliation(s)
- Nanjun Chen
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
- Jixiang Yu
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
- Zhe Liu
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
- Lingkuan Meng
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
- Xiangtao Li
  - School of Artificial Intelligence, Jilin University, Changchun City, Jilin Province, China
- Ka-Chun Wong
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
  - Hong Kong Institute of Data Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
  - Shenzhen Research Institute, City University of Hong Kong, Shenzhen, China
7
Francis A, Campbell C, Gaunt TR. DrivR-Base: a feature extraction toolkit for variant effect prediction model construction. Bioinformatics 2024; 40:btae197. [PMID: 38603611 PMCID: PMC11057939 DOI: 10.1093/bioinformatics/btae197] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Revised: 03/01/2024] [Accepted: 04/09/2024] [Indexed: 04/13/2024] Open
Abstract
MOTIVATION Recent advancements in sequencing technologies have led to the discovery of numerous variants in the human genome. However, understanding their precise roles in diseases remains challenging due to their complex functional mechanisms. Various methodologies have emerged to predict the pathogenic significance of these genetic variants. Typically, these methods employ an integrative approach, leveraging diverse data sources that provide important insights into genomic function. Despite the abundance of publicly available data sources and databases, the process of navigating, extracting, and pre-processing features for machine learning models can be highly challenging and time-consuming. Furthermore, researchers often invest substantial effort in feature extraction, only to later discover that these features lack informativeness. RESULTS In this article, we introduce DrivR-Base, an innovative resource that efficiently extracts and integrates molecular information (features) related to single nucleotide variants. These features encompass information about the genomic positions and the associated protein positions of a variant. They are derived from a wide array of databases and tools, including structural properties obtained from AlphaFold, regulatory information sourced from ENCODE, and predicted variant consequences from Variant Effect Predictor. DrivR-Base is easily deployable via a Docker container to ensure reproducibility and ease of access across diverse computational environments. The resulting features can be used as input for machine learning models designed to predict the pathogenic impact of human genome variants in disease. Moreover, these feature sets have applications beyond this, including haploinsufficiency prediction and the development of drug repurposing tools. We describe the resource's development, practical applications, and potential for future expansion and enhancement. 
AVAILABILITY AND IMPLEMENTATION DrivR-Base source code is available at https://github.com/amyfrancis97/DrivR-Base.
Affiliation(s)
- Amy Francis
  - MRC Integrative Epidemiology Unit, Bristol Medical School (PHS), University of Bristol, Bristol BS8 2BN, United Kingdom
- Colin Campbell
  - Intelligent Systems Laboratory, University of Bristol, Bristol BS1 5DD, United Kingdom
- Tom R Gaunt
  - MRC Integrative Epidemiology Unit, Bristol Medical School (PHS), University of Bristol, Bristol BS8 2BN, United Kingdom
8
Lee AJ, Rackers JA, Pathak S, Bricker WP. Building an ab initio solvated DNA model using Euclidean neural networks. PLoS One 2024; 19:e0297502. [PMID: 38358990 PMCID: PMC10868815 DOI: 10.1371/journal.pone.0297502] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2023] [Accepted: 01/06/2024] [Indexed: 02/17/2024] Open
Abstract
Accurately modeling large biomolecules such as DNA from first principles is fundamentally challenging due to the steep computational scaling of ab initio quantum chemistry methods. This limitation becomes even more prominent when modeling biomolecules in solution due to the need to include large numbers of solvent molecules. We present a machine-learned electron density model based on a Euclidean neural network framework that includes a built-in understanding of equivariance to model explicitly solvated double-stranded DNA. By training the machine learning model using molecular fragments that sample the key DNA and solvent interactions, we show that the model predicts electron densities of arbitrary systems of solvated DNA accurately, resolves polarization effects that are neglected by classical force fields, and captures the physics of the DNA-solvent interaction at the ab initio level.
Affiliation(s)
- Alex J. Lee
  - Department of Chemical and Biological Engineering, University of New Mexico, Albuquerque, NM, United States of America
- Joshua A. Rackers
  - Center for Computing Research, Sandia National Laboratories, Albuquerque, NM, United States of America
- Shivesh Pathak
  - Center for Computing Research, Sandia National Laboratories, Albuquerque, NM, United States of America
- William P. Bricker
  - Department of Chemical and Biological Engineering, University of New Mexico, Albuquerque, NM, United States of America
9
Li J, Chiu TP, Rohs R. Predicting DNA structure using a deep learning method. Nat Commun 2024; 15:1243. [PMID: 38336958 PMCID: PMC10858265 DOI: 10.1038/s41467-024-45191-5] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 04/25/2023] [Accepted: 01/17/2024] [Indexed: 02/12/2024] Open
Abstract
Understanding the mechanisms of protein-DNA binding is critical in comprehending gene regulation. Three-dimensional DNA structure, also described as DNA shape, plays a key role in these mechanisms. In this study, we present a deep learning-based method, Deep DNAshape, that fundamentally changes the current k-mer based high-throughput prediction of DNA shape features by accurately accounting for the influence of extended flanking regions, without the need for extensive molecular simulations or structural biology experiments. By using the Deep DNAshape method, DNA structural features can be predicted for any length and number of DNA sequences in a high-throughput manner, providing an understanding of the effects of flanking regions on DNA structure in a target region of a sequence. The Deep DNAshape method provides access to the influence of distant flanking regions on a region of interest. Our findings reveal that DNA shape readout mechanisms of a core target are quantitatively affected by flanking regions, including extended flanking regions, providing valuable insights into the detailed structural readout mechanisms of protein-DNA binding. Furthermore, when incorporated in machine learning models, the features generated by Deep DNAshape improve the model prediction accuracy. Collectively, Deep DNAshape can serve as a versatile and powerful tool for diverse DNA structure-related studies.
Affiliation(s)
- Jinsen Li
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, 90089, USA
- Tsu-Pei Chiu
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, 90089, USA
- Remo Rohs
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, 90089, USA
  - Department of Chemistry, University of Southern California, Los Angeles, CA, 90089, USA
  - Department of Physics and Astronomy, University of Southern California, Los Angeles, CA, 90089, USA
  - Thomas Lord Department of Computer Science, University of Southern California, Los Angeles, CA, 90089, USA
10
Jiang Y, Chiu TP, Mitra R, Rohs R. Probing the role of the protonation state of a minor groove-linker histidine in Exd-Hox-DNA binding. Biophys J 2024; 123:248-259. [PMID: 38130056 PMCID: PMC10808038 DOI: 10.1016/j.bpj.2023.12.013] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/29/2023] [Revised: 09/22/2023] [Accepted: 12/13/2023] [Indexed: 12/23/2023] Open
Abstract
DNA recognition and targeting by transcription factors (TFs) through specific binding are fundamental in biological processes. Furthermore, the histidine protonation state at the TF-DNA binding interface can significantly influence the binding mechanism of TF-DNA complexes. Nevertheless, the role of histidine in TF-DNA complexes remains underexplored. Here, we employed all-atom molecular dynamics simulations using AlphaFold2-modeled complexes based on previously solved co-crystal structures to probe the role of the His-12 residue in the Extradenticle (Exd)-Sex combs reduced (Scr)-DNA complex when binding to Scr and Ultrabithorax (Ubx) target sites. Our results demonstrate that the protonation state of histidine notably affected the DNA minor-groove width profile and binding free energy. Examining flanking sequences of various binding affinities derived from SELEX-seq experiments, we analyzed the relationship between binding affinity and specificity. We uncovered how histidine protonation leads to increased binding affinity but can lower specificity. Our findings provide new mechanistic insights into the role of histidine in modulating TF-DNA binding.
Affiliation(s)
- Yibei Jiang
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, California
- Tsu-Pei Chiu
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, California
- Raktim Mitra
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, California
- Remo Rohs
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, California; Department of Chemistry, University of Southern California, Los Angeles, California; Department of Physics and Astronomy, University of Southern California, Los Angeles, California; Thomas Lord Department of Computer Science, University of Southern California, Los Angeles, California
11
Vernon TN, Terrell JR, Albrecht AV, Germann MW, Wilson WD, Poon GMK. Dissection of integrated readout reveals the structural thermodynamics of DNA selection by transcription factors. Structure 2024; 32:83-96.e4. [PMID: 38042148 DOI: 10.1016/j.str.2023.11.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Revised: 10/12/2023] [Accepted: 11/07/2023] [Indexed: 12/04/2023]
Abstract
Nucleobases such as inosine have been extensively utilized to map direct contacts by proteins in the DNA groove. Their deployment as targeted probes of dynamics and hydration, which are dominant thermodynamic drivers of affinity and specificity, has been limited by a paucity of suitable experimental models. We report a joint crystallographic, thermodynamic, and computational study of the bidentate complex of the arginine side chain with a Watson-Crick guanine (Arg×GC), a highly specific configuration adopted by major transcription factors throughout the eukaryotic branches in the Tree of Life. Using the ETS-family factor PU.1 as a high-resolution structural framework, inosine substitution for guanine resulted in a sharp dissection of conformational dynamics and hydration and elucidated their role in the DNA specificity of PU.1. Our work suggests an under-exploited utility of modified nucleobases in untangling the structural thermodynamics of interactions, such as the Arg×GC motif, where direct and indirect readout are tightly integrated.
Affiliation(s)
- Tyler N Vernon
  - Department of Chemistry, Georgia State University, Atlanta, GA 30302, USA
- J Ross Terrell
  - Department of Chemistry, Georgia State University, Atlanta, GA 30302, USA
- Amanda V Albrecht
  - Department of Chemistry, Georgia State University, Atlanta, GA 30302, USA
- Markus W Germann
  - Department of Chemistry, Georgia State University, Atlanta, GA 30302, USA; Department of Biology, Georgia State University, Atlanta, GA 30302, USA.
- W David Wilson
  - Department of Chemistry, Georgia State University, Atlanta, GA 30302, USA; Center for Diagnostics and Therapeutics, Georgia State University, Atlanta, GA 30302, USA.
- Gregory M K Poon
  - Department of Chemistry, Georgia State University, Atlanta, GA 30302, USA; Center for Diagnostics and Therapeutics, Georgia State University, Atlanta, GA 30302, USA.
|
12
|
Xu C, Kleinschmidt H, Yang J, Leith E, Johnson J, Tan S, Mahony S, Bai L. Systematic Dissection of Sequence Features Affecting the Binding Specificity of a Pioneer Factor Reveals Binding Synergy Between FOXA1 and AP-1. bioRxiv 2023:2023.11.08.566246. [PMID: 37986839 PMCID: PMC10659273 DOI: 10.1101/2023.11.08.566246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2023]
Abstract
Despite the unique ability of pioneer transcription factors (PFs) to target nucleosomal sites in closed chromatin, they only bind a small fraction of their genomic motifs. The underlying mechanism of this selectivity is not well understood. Here, we design a high-throughput assay called ChIP-ISO to systematically dissect sequence features affecting the binding specificity of a classic PF, FOXA1. Combining ChIP-ISO with in vitro and neural network analyses, we find that 1) FOXA1 binding is strongly affected by co-binding TFs AP-1 and CEBPB, 2) FOXA1 and AP-1 show binding cooperativity in vitro, 3) FOXA1's binding is determined more by local sequences than chromatin context, including eu-/heterochromatin, and 4) AP-1 is partially responsible for differential binding of FOXA1 in different cell types. Our study presents a framework for elucidating genetic rules underlying PF binding specificity and reveals a mechanism for context-specific regulation of its binding.
Affiliation(s)
- Cheng Xu
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
- Holly Kleinschmidt
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
- Jianyu Yang
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
- Erik Leith
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
- Jenna Johnson
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
- Song Tan
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
- Shaun Mahony
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
- Lu Bai
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
  - Department of Physics, The Pennsylvania State University, University Park, PA 16802, USA
|
13
|
Li J, Chiu TP, Rohs R. Deep DNAshape: Predicting DNA shape considering extended flanking regions using a deep learning method. bioRxiv 2023:2023.10.22.563383. [PMID: 37961633 PMCID: PMC10634709 DOI: 10.1101/2023.10.22.563383] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Understanding the mechanisms of protein-DNA binding is critical in comprehending gene regulation. Three-dimensional DNA shape plays a key role in these mechanisms. In this study, we present a deep learning-based method, Deep DNAshape, that fundamentally changes the current k-mer-based high-throughput prediction of DNA shape features by accurately accounting for the influence of extended flanking regions, without the need for extensive molecular simulations or structural biology experiments. By using the Deep DNAshape method, refined DNA shape features can be predicted for any length and number of DNA sequences in a high-throughput manner, providing a deeper understanding of the effects of flanking regions on DNA shape in a target region of a sequence. The Deep DNAshape method provides access to the influence of distant flanking regions on a region of interest. Our findings reveal that DNA shape readout mechanisms of a core target are quantitatively affected by flanking regions, including extended flanking regions, providing valuable insights into the detailed structural readout mechanisms of protein-DNA binding. Furthermore, when incorporated in machine learning models, the features generated by Deep DNAshape improve the model prediction accuracy. Collectively, Deep DNAshape can serve as a versatile and powerful tool for diverse DNA structure-related studies.
|
14
|
Freda I, Exertier C, Barile A, Chaves-Sanjuan A, Vega M, Isupov M, Harmer N, Gugole E, Swuec P, Bolognesi M, Scipioni A, Savino C, Di Salvo M, Contestabile R, Vallone B, Tramonti A, Montemiglio L. Structural insights into the DNA recognition mechanism by the bacterial transcription factor PdxR. Nucleic Acids Res 2023; 51:8237-8254. [PMID: 37378428 PMCID: PMC10450172 DOI: 10.1093/nar/gkad552] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/24/2022] [Revised: 06/08/2023] [Accepted: 06/22/2023] [Indexed: 06/29/2023] Open
Abstract
Specificity in protein-DNA recognition arises from the synergy of several factors that stem from the structural and chemical signatures encoded within the targeted DNA molecule. Here, we deciphered the nature of the interactions driving DNA recognition and binding by the bacterial transcription factor PdxR, a member of the MocR family responsible for the regulation of pyridoxal 5'-phosphate (PLP) biosynthesis. Single particle cryo-EM performed on the PLP-PdxR bound to its target DNA enabled the isolation of three conformers of the complex, which may be considered as snapshots of the binding process. Moreover, the resolution of an apo-PdxR crystallographic structure provided a detailed description of the transition of the effector domain to the holo-PdxR form triggered by the binding of the PLP effector molecule. Binding analyses of mutated DNA sequences using both wild type and PdxR variants revealed a central role of electrostatic interactions and of the intrinsic asymmetric bending of the DNA in allosterically guiding the holo-PdxR-DNA recognition process, from the first encounter through the fully bound state. Our results detail the structure and dynamics of the PdxR-DNA complex, clarifying the mechanism governing the DNA-binding mode of the holo-PdxR and the regulation features of the MocR family of transcription factors.
Affiliation(s)
- Ida Freda
  - Department of Biochemical Sciences “A. Rossi Fanelli”, Sapienza, University of Rome, Rome 00185, Italy
- Cécile Exertier
  - Institute of Molecular Biology and Pathology, National Research Council, Rome 00185, Italy
- Anna Barile
  - Institute of Molecular Biology and Pathology, National Research Council, Rome 00185, Italy
- Antonio Chaves-Sanjuan
  - Department of Biosciences, Pediatric Clinical Research Center Romeo ed Enrica Invernizzi and NOLIMITS, University of Milano, Milano 20133, Italy
- Mirella Vivoli Vega
  - School of Biochemistry, University of Bristol, University Walk, BS8 1TD Bristol, UK
- Michail N Isupov
  - Geoffrey Pope Building, University of Exeter, Stocker Road, Exeter EX4 4QD, UK
- Nicholas J Harmer
  - Living Systems Institute, University of Exeter, Stocker Road, Exeter EX4 4QD, UK
- Elena Gugole
  - Institute of Molecular Biology and Pathology, National Research Council, Rome 00185, Italy
- Paolo Swuec
  - Cryo-Electron Microscopy Core Facility, Human Technopole, Milano 20157, Italy
- Martino Bolognesi
  - Department of Biosciences, Pediatric Clinical Research Center Romeo ed Enrica Invernizzi and NOLIMITS, University of Milano, Milano 20133, Italy
- Anita Scipioni
  - Department of Chemistry, Sapienza, University of Rome, Rome 00185, Italy
- Carmelinda Savino
  - Institute of Molecular Biology and Pathology, National Research Council, Rome 00185, Italy
- Martino Luigi Di Salvo
  - Department of Biochemical Sciences “A. Rossi Fanelli”, Sapienza, University of Rome, Rome 00185, Italy
- Roberto Contestabile
  - Department of Biochemical Sciences “A. Rossi Fanelli”, Sapienza, University of Rome, Rome 00185, Italy
  - Istituto Pasteur-Fondazione Cenci Bolognetti, Sapienza, University of Rome, Rome 00185, Italy
- Beatrice Vallone
  - Department of Biochemical Sciences “A. Rossi Fanelli”, Sapienza, University of Rome, Rome 00185, Italy
  - Institute of Molecular Biology and Pathology, National Research Council, Rome 00185, Italy
- Angela Tramonti
  - Institute of Molecular Biology and Pathology, National Research Council, Rome 00185, Italy
|
15
|
Liu Z, Samee M. Structural underpinnings of mutation rate variations in the human genome. Nucleic Acids Res 2023; 51:7184-7197. [PMID: 37395403 PMCID: PMC10415140 DOI: 10.1093/nar/gkad551] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 04/19/2022] [Revised: 06/06/2023] [Accepted: 06/15/2023] [Indexed: 07/04/2023] Open
Abstract
Single nucleotide mutation rates have critical implications for human evolution and genetic diseases. Importantly, the rates vary substantially across the genome and the principles underlying such variations remain poorly understood. A recent model explained much of this variation by considering higher-order nucleotide interactions in the 7-mer sequence context around mutated nucleotides. This model's success implicates a connection between DNA shape and mutation rates. DNA shape, i.e. structural properties like helical twist and tilt, is known to capture interactions between nucleotides within a local context. Thus, we hypothesized that changes in DNA shape features at and around mutated positions can explain mutation rate variations in the human genome. Indeed, DNA shape-based models of mutation rates showed similar or improved performance over current nucleotide sequence-based models. These models accurately characterized mutation hotspots in the human genome and revealed the shape features whose interactions underlie mutation rate variations. DNA shape also impacts mutation rates within putative functional regions like transcription factor binding sites where we find a strong association between DNA shape and position-specific mutation rates. This work demonstrates the structural underpinnings of nucleotide mutations in the human genome and lays the groundwork for future models of genetic variations to incorporate DNA shape.
Affiliation(s)
- Zian Liu
  - Department of Integrative Physiology, Baylor College of Medicine, Houston, TX 77030, USA
- Md Abul Hassan Samee
  - Department of Integrative Physiology, Baylor College of Medicine, Houston, TX 77030, USA
|
16
|
Xu AM, Chour W, DeLucia DC, Su Y, Pavlovitch-Bedzyk AJ, Ng R, Rasheed Y, Davis MM, Lee JK, Heath JR. Entropic analysis of antigen-specific CDR3 domains identifies essential binding motifs shared by CDR3s with different antigen specificities. Cell Syst 2023; 14:273-284.e5. [PMID: 37001518 PMCID: PMC10355346 DOI: 10.1016/j.cels.2023.03.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/15/2021] [Revised: 09/01/2022] [Accepted: 03/01/2023] [Indexed: 04/22/2023]
Abstract
Antigen-specific T cell receptor (TCR) sequences can have prognostic, predictive, and therapeutic value, but decoding the specificity of TCR recognition remains challenging. Unlike DNA strands that base pair, TCRs bind to their targets with different orientations and different lengths, which complicates comparisons. We present scanning parametrized by normalized TCR length (SPAN-TCR) to analyze antigen-specific TCR CDR3 sequences and identify patterns driving TCR-pMHC specificity. Using entropic analysis, SPAN-TCR identifies 2-mer motifs that decrease the diversity (entropy) of CDR3s. These motifs are the most common patterns that can predict CDR3 composition, and we identify "essential" motifs that decrease entropy in the same CDR3 α or β chain containing the 2-mer, and "super-essential" motifs that decrease entropy in both chains. Molecular dynamics analysis further suggests that these motifs may play important roles in binding. We then employ SPAN-TCR to resolve similarities in TCR repertoires against different antigens using public databases of TCR sequences.
Affiliation(s)
- Alexander M Xu
  - Institute for Systems Biology, Seattle, WA 98109, USA; Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125, USA; Department of Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA; Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA.
- William Chour
  - Institute for Systems Biology, Seattle, WA 98109, USA; Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA; Keck School of Medicine, University of Southern California, Los Angeles, CA 91125, USA
- Diana C DeLucia
  - Division of Human Biology, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Yapeng Su
  - Institute for Systems Biology, Seattle, WA 98109, USA; Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Rachel Ng
  - Institute for Systems Biology, Seattle, WA 98109, USA
- Yusuf Rasheed
  - Institute for Systems Biology, Seattle, WA 98109, USA
- Mark M Davis
  - Computational and Systems Immunology Program, Stanford University School of Medicine, Stanford, CA 94305, USA; Institute for Immunity, Transplantation and Infection, Stanford University School of Medicine, Stanford, CA 94305, USA; Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, CA 94305, USA; Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, CA 94305, USA
- John K Lee
  - Division of Human Biology, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA; Division of Medical Oncology, Department of Medicine, University of Washington, Seattle, WA 98195, USA
- James R Heath
  - Institute for Systems Biology, Seattle, WA 98109, USA.
|
17
|
Lee AJ, Rackers JA, Bricker WP. Predicting accurate ab initio DNA electron densities with equivariant neural networks. Biophys J 2022; 121:3883-3895. [PMID: 36057785 PMCID: PMC9674991 DOI: 10.1016/j.bpj.2022.08.045] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/01/2022] [Revised: 07/22/2022] [Accepted: 08/29/2022] [Indexed: 11/19/2022] Open
Abstract
One of the fundamental limitations of accurately modeling biomolecules like DNA is the inability to perform quantum chemistry calculations on large molecular structures. We present a machine learning model based on an equivariant Euclidean neural network framework to obtain accurate ab initio electron densities for arbitrary DNA structures that are much too large for conventional quantum methods. The model is trained on representative B-DNA basepair steps that capture both base pairing and base stacking interactions. The model produces accurate electron densities for arbitrary B-DNA structures with typical errors of less than 1%. Crucially, the error does not increase with system size, which suggests that the model can extrapolate to large DNA structures with negligible loss of accuracy. The model also generalizes reasonably to other DNA structural motifs such as the A- and Z-DNA forms, despite being trained on only B-DNA configurations. The model is used to calculate electron densities of several large-scale DNA structures, and we show that the computational scaling for this model is essentially linear. We also show that this machine learning electron density model can be used to calculate accurate electrostatic potentials for DNA. These electrostatic potentials produce more accurate results compared with classical force fields and do not show the usual deficiencies at short range.
Affiliation(s)
- Alex J Lee
  - Department of Chemical and Biological Engineering, University of New Mexico, Albuquerque, New Mexico
- Joshua A Rackers
  - Center for Computing Research, Sandia National Laboratories, Albuquerque, New Mexico.
- William P Bricker
  - Department of Chemical and Biological Engineering, University of New Mexico, Albuquerque, New Mexico.
|
18
|
Towards a better understanding of TF-DNA binding prediction from genomic features. Comput Biol Med 2022; 149:105993. [DOI: 10.1016/j.compbiomed.2022.105993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Revised: 07/12/2022] [Accepted: 08/14/2022] [Indexed: 11/17/2022]
|
19
|
Jain A, Mittal S, Tripathi LP, Nussinov R, Ahmad S. Host-pathogen protein-nucleic acid interactions: A comprehensive review. Comput Struct Biotechnol J 2022; 20:4415-4436. [PMID: 36051878 PMCID: PMC9420432 DOI: 10.1016/j.csbj.2022.08.001] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/30/2022] [Revised: 08/01/2022] [Accepted: 08/01/2022] [Indexed: 12/02/2022] Open
Abstract
Recognition of pathogen-derived nucleic acids by host cells is an effective host strategy to detect pathogenic invasion and trigger immune responses. In the context of pathogen-specific pharmacology, there is a growing interest in mapping the interactions between pathogen-derived nucleic acids and host proteins. Insight into the principles of the structural and immunological mechanisms underlying such interactions and their roles in host defense is necessary to guide therapeutic intervention. Here, we discuss the newest advances in studies of molecular interactions involving pathogen nucleic acids and host factors, including their drug design, molecular structure and specific patterns. We observed that two groups of nucleic acid recognizing molecules, Toll-like receptors (TLRs) and the cytoplasmic retinoic acid-inducible gene (RIG)-I-like receptors (RLRs) form the backbone of host responses to pathogen nucleic acids, with additional support provided by absent in melanoma 2 (AIM2) and DNA-dependent activator of Interferons (IFNs)-regulatory factors (DAI) like cytosolic activity. We review the structural, immunological, and other biological aspects of these representative groups of molecules, especially in terms of their target specificity and affinity and challenges in leveraging host-pathogen protein-nucleic acid interactions (HP-PNI) in drug discovery.
Affiliation(s)
- Anuja Jain
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India
- Shikha Mittal
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India
  - Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Waknaghat, Solan, Himachal Pradesh, 173234, India
- Lokesh P. Tripathi
  - National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka, Japan
  - Riken Center for Integrative Medical Sciences, Tsurumi, Yokohama, Kanagawa, Japan
- Ruth Nussinov
  - Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD 21702, USA
  - Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Israel
- Shandar Ahmad
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India
|
20
|
Bacterial H-NS contacts DNA at the same irregularly spaced sites in both bridged and hemi-sequestered linear filaments. iScience 2022; 25:104429. [PMID: 35669520 PMCID: PMC9162952 DOI: 10.1016/j.isci.2022.104429] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 01/13/2022] [Revised: 04/01/2022] [Accepted: 05/13/2022] [Indexed: 11/22/2022] Open
Abstract
Gene silencing in bacteria is mediated by chromatin proteins, of which Escherichia coli H-NS is a paradigmatic example. H-NS forms nucleoprotein filaments with either one or two DNA duplexes. However, the structures, arrangements of DNA-binding domains (DBDs), and positions of DBD-DNA contacts in linear and bridged filaments are uncertain. To characterize the H-NS DBD contacts that silence transcription by RNA polymerase, we combined ·OH footprinting, molecular dynamics, statistical modeling, and DBD mapping using a chemical nuclease (Fe2+-EDTA) tethered to the DBDs (TEN-map). We find that H-NS DBDs contact DNA at indistinguishable locations in bridged or linear filaments and that the DBDs vary in orientation and position with ∼10-bp average spacing. Our results support a hemi-sequestration model of linear-to-bridged H-NS switching. Linear filaments able to inhibit only transcription initiation switch to bridged filaments able to inhibit both initiation and elongation using the same irregularly spaced DNA contacts.
|
21
|
Malik FK, Guo JT. Insights into protein-DNA interactions from hydrogen bond energy-based comparative protein-ligand analyses. Proteins 2022; 90:1303-1314. [PMID: 35122321 PMCID: PMC9018545 DOI: 10.1002/prot.26313] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/22/2021] [Revised: 01/17/2022] [Accepted: 01/31/2022] [Indexed: 01/18/2023]
Abstract
Hydrogen bonds play important roles in protein folding and protein-ligand interactions, particularly in specific protein-DNA recognition. However, the distributions of hydrogen bonds, especially of hydrogen bond energy (HBE), in different types of protein-ligand complexes are unknown. Here we performed a comparative analysis of hydrogen bonds among three non-redundant datasets of protein-protein, protein-peptide, and protein-DNA complexes. Besides comparing the number of hydrogen bonds in terms of types and locations, we investigated the distributions of HBE. Our results indicate that while there is no significant difference in hydrogen bonds within protein chains among the three types of complexes, interfacial hydrogen bonds are significantly more prevalent in protein-DNA complexes. More importantly, the interfacial hydrogen bonds in protein-DNA complexes displayed a unique energy distribution of strong and weak hydrogen bonds, whereas the majority of the interfacial hydrogen bonds in protein-protein and protein-peptide complexes are of high strength with low energy. Moreover, there is a significant difference in the energy distributions of minor groove hydrogen bonds between protein-DNA complexes with different binding specificity. Highly specific protein-DNA complexes contain more strong hydrogen bonds in the minor groove than multi-specific complexes, suggesting an important role for the minor groove in specific protein-DNA recognition. These results can help better understand protein-DNA interactions and have important implications for improving quality assessments of protein-DNA complex models.
Affiliation(s)
- Fareeha K Malik
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, North Carolina, USA; Research Center of Modeling and Simulation, National University of Science and Technology, Islamabad, Pakistan
- Jun-Tao Guo
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, North Carolina, USA
|
22
|
Human Cytomegalovirus IE2 Both Activates and Represses Initiation and Modulates Elongation in a Context-Dependent Manner. mBio 2022; 13:e0033722. [PMID: 35579393 PMCID: PMC9239164 DOI: 10.1128/mbio.00337-22] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/17/2022] Open
Abstract
Human cytomegalovirus (HCMV) immediate-early 2 (IE2) protein is a multifunctional transcription factor that is essential for lytic HCMV infection. IE2 functions as an activator of viral early genes, negatively regulates its own promoter, and is required for viral replication. The mechanisms by which IE2 executes these distinct functions are incompletely understood. Using PRO-Seq, which profiles nascent transcripts, and a recently developed DFF-chromatin immunoprecipitation (DFF-ChIP; employs chromatin digestion by the endonuclease DNA fragmentation factor prior to IP) approach that resolves occupancy and local chromatin environment, we show that IE2 controls viral gene transcription in three distinct capacities during late HCMV infection and reveal mechanisms that involve direct binding of IE2 to viral DNA. IE2 represses a subset of viral promoters by binding within their core promoter regions and blocking the assembly of preinitiation complexes (PICs). Remarkably, IE2 forms a repressive complex at the major immediate-early promoter region involving direct association of IE2 with nucleosomes and TBP. IE2 stimulates transcription by binding nearby, but not within, core promoter regions. In addition, IE2 functions as a direct roadblock to transcription elongation. At one locus, this function of IE2 appears to be important for the synthesis of a spliced viral RNA. Consistent with the minimal observed effects of IE2 depletion on host gene transcription, IE2 does not functionally engage the host genome. Our results reveal mechanisms of transcriptional control by IE2, uncover a previously unknown function of IE2 as a Pol II elongation modulator, and demonstrate that DFF-ChIP is a useful tool for probing transcription factor occupancy and interactions between transcription factors and nucleosomes at high resolution.
|
23
|
Huber EM, Hortschansky P, Scheven MT, Misslinger M, Haas H, Brakhage AA, Groll M. Structural insights into cooperative DNA recognition by the CCAAT-binding complex and its bZIP transcription factor HapX. Structure 2022; 30:934-946.e4. [PMID: 35472306 DOI: 10.1016/j.str.2022.04.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2022] [Revised: 03/21/2022] [Accepted: 03/31/2022] [Indexed: 11/25/2022]
Abstract
The heterotrimeric CCAAT-binding complex (CBC) is a fundamental eukaryotic transcription factor recognizing the CCAAT box. In certain fungi, like Aspergilli, the CBC cooperates with the basic leucine zipper HapX to control iron metabolism. HapX functionally depends on the CBC, and the stable interaction of both requires DNA. To study this cooperative effect, X-ray structures of the CBC-HapX-DNA complex were determined. Downstream of the CCAAT box, occupied by the CBC, a HapX dimer binds to the major groove. The leash-like N terminus of the distal HapX subunit contacts the CBC, and via a flexible polyproline type II helix mediates minor groove interactions that stimulate sequence promiscuity. In vitro and in vivo mutagenesis suggest that the structural and functional plasticity of HapX results from local asymmetry and its ability to target major and minor grooves simultaneously. The latter feature may also apply to related transcription factors such as yeast Hap4 and distinct Yap family members.
Collapse
Affiliation(s)
- Eva M Huber
- Chair of Biochemistry, Center for Protein Assemblies, Technical University of Munich, Ernst-Otto-Fischer-Straße 8, 85748 Garching, Germany
| | - Peter Hortschansky
- Department of Molecular and Applied Microbiology, Leibniz Institute for Natural Product Research and Infection Biology, Hans Knöll Institute (Leibniz-HKI), Adolf-Reichwein-Straße 23, 07745 Jena, Germany
| | - Mareike T Scheven
- Department of Molecular and Applied Microbiology, Leibniz Institute for Natural Product Research and Infection Biology, Hans Knöll Institute (Leibniz-HKI), Adolf-Reichwein-Straße 23, 07745 Jena, Germany
| | - Matthias Misslinger
- Institute of Molecular Biology, Biocenter, Medical University of Innsbruck, Innrain 80/82, 6020 Innsbruck, Austria
| | - Hubertus Haas
- Institute of Molecular Biology, Biocenter, Medical University of Innsbruck, Innrain 80/82, 6020 Innsbruck, Austria
| | - Axel A Brakhage
- Department of Molecular and Applied Microbiology, Leibniz Institute for Natural Product Research and Infection Biology, Hans Knöll Institute (Leibniz-HKI), Adolf-Reichwein-Straße 23, 07745 Jena, Germany; Institute for Microbiology, Friedrich Schiller University Jena, Neugasse 25, 07743 Jena, Germany.
| | - Michael Groll
- Chair of Biochemistry, Center for Protein Assemblies, Technical University of Munich, Ernst-Otto-Fischer-Straße 8, 85748 Garching, Germany.
|
24
|
Zuñiga-Martínez BS, Domínguez-Avila JA, Wall-Medrano A, Ayala-Zavala JF, Hernández-Paredes J, Salazar-López NJ, Villegas-Ochoa MA, González-Aguilar GA. Avocado paste from industrial byproducts as an unconventional source of bioactive compounds: characterization, in vitro digestion and in silico interactions of its main phenolics with cholesterol. Journal of Food Measurement and Characterization 2021. [DOI: 10.1007/s11694-021-01117-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/20/2022]
|
25
|
Sarkar S, Dey U, Khohliwe TB, Yella VR, Kumar A. Analysis of nucleoid-associated protein-binding regions reveals DNA structural features influencing genome organization in Mycobacterium tuberculosis. FEBS Lett 2021; 595:2504-2521. [PMID: 34387867 DOI: 10.1002/1873-3468.14178] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2021] [Revised: 08/01/2021] [Accepted: 08/11/2021] [Indexed: 11/10/2022]
Abstract
Nucleoid-associated proteins (NAPs) maintain bacterial nucleoid configuration through their architectural properties of DNA bending, wrapping, and bridging. However, the contribution of DNA structural alterations to DNA-NAP recognition at the genomic scale remains unresolved. The present work dissects the DNA sequence, shape and altered structural preferences at a genomic scale for six NAPs in Mycobacterium tuberculosis. The results suggest that narrower minor groove width (MGW) and higher DNA rigidity mark the binding sites of EspR and Lsr2, while mIHF, MtHU and NapM have heterogeneous DNA structural predilections. In contrast, WhiB4-DNA-binding sites were characterized by wider MGW and highly deformable, less curved DNA. This work provides systematic insight into NAP-mediated genome organization as a function of DNA structural features.
Affiliation(s)
- Sharmilee Sarkar
- Department of Molecular Biology and Biotechnology, Tezpur University, India
| | - Upalabdha Dey
- Department of Molecular Biology and Biotechnology, Tezpur University, India
| | | | - Venkata Rajesh Yella
- Department of Biotechnology, Koneru Lakshmaiah Education Foundation, Guntur, India
| | - Aditya Kumar
- Department of Molecular Biology and Biotechnology, Tezpur University, India
|
26
|
Marcos-Torres FJ, Maurer D, Juniar L, Griese JJ. The bacterial iron sensor IdeR recognizes its DNA targets by indirect readout. Nucleic Acids Res 2021; 49:10120-10135. [PMID: 34417623 PMCID: PMC8464063 DOI: 10.1093/nar/gkab711] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/07/2021] [Revised: 07/19/2021] [Accepted: 08/02/2021] [Indexed: 01/11/2023] Open
Abstract
The iron-dependent regulator IdeR is the main transcriptional regulator controlling iron homeostasis genes in Actinobacteria, including species from the Corynebacterium, Mycobacterium and Streptomyces genera, as well as the erythromycin-producing bacterium Saccharopolyspora erythraea. Despite being a well-studied transcription factor since the identification of the Diphtheria toxin repressor DtxR three decades ago, the details of how IdeR proteins recognize their highly conserved 19-bp DNA target remain to be elucidated. IdeR makes few direct contacts with DNA bases in its target sequence, and we show here that these contacts are not required for target recognition. The results of our structural and mutational studies support a model wherein IdeR mainly uses an indirect readout mechanism, identifying its targets via the sequence-dependent DNA backbone structure rather than through specific contacts with the DNA bases. Furthermore, we show that IdeR efficiently recognizes a shorter palindromic sequence corresponding to a half binding site as compared to the full 19-bp target previously reported, expanding the number of potential target genes controlled by IdeR proteins.
Affiliation(s)
| | - Dirk Maurer
- Department of Cell and Molecular Biology, Uppsala University, SE-751 24 Uppsala, Sweden
| | - Linda Juniar
- Department of Cell and Molecular Biology, Uppsala University, SE-751 24 Uppsala, Sweden
| | - Julia J Griese
- Department of Cell and Molecular Biology, Uppsala University, SE-751 24 Uppsala, Sweden
|
27
|
Mozo-Villarías A, Cedano J, Querol E. The importance of hydrophobic interactions in the structure of transcription systems. European Biophysics Journal: EBJ 2021; 50:951-961. [PMID: 34131772 DOI: 10.1007/s00249-021-01557-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/17/2020] [Revised: 06/08/2021] [Accepted: 06/09/2021] [Indexed: 12/01/2022]
Abstract
Hydrophobic forces play a crucial role in both the stability of B DNA and its interactions with proteins. In the present study, we postulate that the hydrophobic effect is an essential component in establishing specificity in the interaction of transcription factor proteins with their consensus DNA sequence partners. The PDB coordinates of more than 50 transcription systems have been used to analyze the hydrophobic attraction of proteins towards their DNA consensus. This analysis includes computing the hydrophobic energy of the interacting molecules by means of their hydrophobic moments. Hydrophobic moments have successfully been used in previous studies involving self-assembly protein systems. In the present case, in spite of some variability, we found specificity in transcription factors when interacting with their respective consensus DNA sequences. By applying our model of biological membrane pattern for hydrophobic interactions, we postulate that hydrophobic forces constitute the necessary intermediate interaction between the unspecific electrostatic attraction for DNA phosphate groups and the very short-range interaction promoting hydrogen bonds. We conclude that hydrophobic interactions serve as the intermediate force guiding transcription factors towards the proper hydrogen bonds to their DNAs.
Affiliation(s)
- Angel Mozo-Villarías
- Departament de Bioquímica i Biologia Molecular, Campus de Bellaterra, Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, 08193, Bellaterra, Barcelona, Spain.
| | - Juan Cedano
- Departament de Bioquímica i Biologia Molecular, Campus de Bellaterra, Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, 08193, Bellaterra, Barcelona, Spain
| | - Enrique Querol
- Departament de Bioquímica i Biologia Molecular, Campus de Bellaterra, Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, 08193, Bellaterra, Barcelona, Spain
|
28
|
de Almeida LC, Calil FA, Machado-Neto JA, Costa-Lotufo LV. DNA damaging agents and DNA repair: From carcinogenesis to cancer therapy. Cancer Genet 2021; 252-253:6-24. [DOI: 10.1016/j.cancergen.2020.12.002] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 10/31/2020] [Revised: 11/30/2020] [Accepted: 12/02/2020] [Indexed: 02/09/2023]
|
29
|
Rabdano SO, Shannon MD, Izmailov SA, Gonzalez Salguero N, Zandian M, Purusottam RN, Poirier MG, Skrynnikov NR, Jaroniec CP. Histone H4 Tails in Nucleosomes: a Fuzzy Interaction with DNA. Angew Chem Int Ed Engl 2021. [DOI: 10.1002/ange.202012046] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/15/2022]
Affiliation(s)
- Sevastyan O. Rabdano
- Laboratory of Biomolecular NMR St. Petersburg State University St. Petersburg 199034 Russian Federation
| | - Matthew D. Shannon
- Department of Chemistry and Biochemistry The Ohio State University Columbus OH 43210 USA
| | - Sergei A. Izmailov
- Laboratory of Biomolecular NMR St. Petersburg State University St. Petersburg 199034 Russian Federation
| | | | - Mohamad Zandian
- Department of Chemistry and Biochemistry The Ohio State University Columbus OH 43210 USA
| | - Rudra N. Purusottam
- Department of Chemistry and Biochemistry The Ohio State University Columbus OH 43210 USA
| | | | - Nikolai R. Skrynnikov
- Laboratory of Biomolecular NMR St. Petersburg State University St. Petersburg 199034 Russian Federation
- Department of Chemistry Purdue University West Lafayette IN 47906 USA
|
30
|
Rabdano SO, Shannon MD, Izmailov SA, Gonzalez Salguero N, Zandian M, Purusottam RN, Poirier MG, Skrynnikov NR, Jaroniec CP. Histone H4 Tails in Nucleosomes: a Fuzzy Interaction with DNA. Angew Chem Int Ed Engl 2021; 60:6480-6487. [PMID: 33522067 DOI: 10.1002/anie.202012046] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 09/04/2020] [Revised: 12/15/2020] [Indexed: 12/21/2022]
Abstract
The interaction of positively charged N-terminal histone tails with nucleosomal DNA plays an important role in chromatin assembly and regulation, modulating their susceptibility to post-translational modifications and recognition by chromatin-binding proteins. Here, we report residue-specific 15N NMR relaxation rates for histone H4 tails in reconstituted nucleosomes. These data indicate that H4 tails are strongly dynamically disordered, albeit with reduced conformational flexibility compared to a free peptide with the same sequence. Remarkably, the NMR observables were successfully reproduced in a 2-μs MD trajectory of the nucleosome. This is an important step toward resolving an apparent inconsistency where prior simulations were generally at odds with experimental evidence on conformational dynamics of histone tails. Our findings indicate that histone H4 tails engage in a fuzzy interaction with nucleosomal DNA, underpinned by a variable pattern of short-lived salt bridges and hydrogen bonds, which persists at low ionic strength (0-100 mM NaCl).
Affiliation(s)
- Sevastyan O Rabdano
- Laboratory of Biomolecular NMR, St. Petersburg State University, St. Petersburg, 199034, Russian Federation
| | - Matthew D Shannon
- Department of Chemistry and Biochemistry, The Ohio State University, Columbus, OH, 43210, USA
| | - Sergei A Izmailov
- Laboratory of Biomolecular NMR, St. Petersburg State University, St. Petersburg, 199034, Russian Federation
| | | | - Mohamad Zandian
- Department of Chemistry and Biochemistry, The Ohio State University, Columbus, OH, 43210, USA
| | - Rudra N Purusottam
- Department of Chemistry and Biochemistry, The Ohio State University, Columbus, OH, 43210, USA
| | - Michael G Poirier
- Department of Physics, The Ohio State University, Columbus, OH, 43210, USA
| | - Nikolai R Skrynnikov
- Laboratory of Biomolecular NMR, St. Petersburg State University, St. Petersburg, 199034, Russian Federation.,Department of Chemistry, Purdue University, West Lafayette, IN, 47906, USA
| | - Christopher P Jaroniec
- Department of Chemistry and Biochemistry, The Ohio State University, Columbus, OH, 43210, USA
|
31
|
Farkas M, Hashimoto H, Bi Y, Davuluri RV, Resnick-Silverman L, Manfredi JJ, Debler EW, McMahon SB. Distinct mechanisms control genome recognition by p53 at its target genes linked to different cell fates. Nat Commun 2021; 12:484. [PMID: 33473123 PMCID: PMC7817693 DOI: 10.1038/s41467-020-20783-z] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 07/14/2020] [Accepted: 12/15/2020] [Indexed: 12/21/2022] Open
Abstract
The tumor suppressor p53 integrates stress response pathways by selectively engaging one of several potential transcriptomes, thereby triggering cell fate decisions (e.g., cell cycle arrest, apoptosis). Foundational to this process is the binding of tetrameric p53 to 20-bp response elements (REs) in the genome (RRRCWWGYYY-N(0-13)-RRRCWWGYYY). In general, REs at cell cycle arrest targets (e.g., p21) are of higher affinity than those at apoptosis targets (e.g., BAX). However, the RE sequence code underlying selectivity remains undeciphered. Here, we identify molecular mechanisms mediating p53 binding to high- and low-affinity REs by showing that key determinants of the code are embedded in the DNA shape. We further demonstrate that differences in minor/major groove widths, encoded by G/C or A/T bp content at positions 3, 8, 13, and 18 in the RE, determine distinct p53 DNA-binding modes by inducing different Arg248 and Lys120 conformations and interactions. The predictive capacity of this code was confirmed in vivo using genome editing at the BAX RE to interconvert the DNA-binding modes, transcription pattern, and cell fate outcome.
Affiliation(s)
- Marina Farkas
- Department of Biochemistry and Molecular Biology, Thomas Jefferson University, Philadelphia, PA, USA
| | - Hideharu Hashimoto
- Department of Biochemistry and Molecular Biology, Thomas Jefferson University, Philadelphia, PA, USA
| | - Yingtao Bi
- Northwestern University Feinberg School of Medicine, Chicago, IL, USA
| | - Ramana V Davuluri
- Northwestern University Feinberg School of Medicine, Chicago, IL, USA
| | | | | | - Erik W Debler
- Department of Biochemistry and Molecular Biology, Thomas Jefferson University, Philadelphia, PA, USA
| | - Steven B McMahon
- Department of Biochemistry and Molecular Biology, Thomas Jefferson University, Philadelphia, PA, USA.
|
32
|
Abstract
The Origin Recognition Complex (ORC) is an evolutionarily conserved six-subunit protein complex that binds specific sites at many locations to coordinately replicate the entire eukaryote genome. Though highly conserved in structure, ORC’s selectivity for replication origins has diverged tremendously between yeasts and humans to adapt to vastly different life cycles. In this work, we demonstrate that the selectivity determinant of ORC for DNA binding lies in a 19-amino acid insertion helix in the Orc4 subunit, which is present in yeast but absent in human. Removal of this motif from Orc4 transforms the yeast ORC, which selects origins based on base-specific binding at defined locations, into one whose selectivity is dictated by chromatin landscape and afforded with plasticity, as reported for human. Notably, the altered yeast ORC has acquired an affinity for regions near transcriptional start sites (TSSs), which the human ORC also favors.
In most model yeast species the Origin Recognition Complex (ORC) binds defined and species-specific base sequences, while in humans what determines the binding appears to be more complex. Here the authors reveal that the yeast ORC's binding specificity depends on a 19-amino acid insertion helix in the Orc4 subunit, which is lost in human.
|
33
|
Schnepf M, von Reutern M, Ludwig C, Jung C, Gaul U. Transcription Factor Binding Affinities and DNA Shape Readout. iScience 2020; 23:101694. [PMID: 33163946 PMCID: PMC7607496 DOI: 10.1016/j.isci.2020.101694] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 05/29/2020] [Revised: 09/30/2020] [Accepted: 10/13/2020] [Indexed: 12/16/2022] Open
Abstract
An essential event in gene regulation is the binding of a transcription factor (TF) to its target DNA. Models considering the interactions between the TF and the DNA geometry proved to be successful approaches to describe this binding event, while conserving data interpretability. However, a direct characterization of the DNA shape contribution to binding is still missing due to the lack of accurate and large-scale binding affinity data. Here, we use a binding assay we recently established to measure with high sensitivity the binding specificities of 13 Drosophila TFs, including dinucleotide dependencies to capture non-independent amino acid-base interactions. Correlating the binding affinities with all DNA shape features, we find that shape readout is widely used by these factors. A shape readout/TF-DNA complex structure analysis validates our approach while providing biological insights such as positively charged or highly polar amino acids often contact nucleotides that exhibit strong shape readout. The DNA shape contribution to Drosophila TFs-DNA binding is directly characterized. Zeroth- and first-order TF-DNA binding specificities are measured with high accuracy. DNA shape readout is widely used by these TFs. A shape readout/structural correlation analysis provides biological insights.
Affiliation(s)
- Max Schnepf
- Gene Center and Department of Biochemistry, Center for Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 München, Germany
| | - Marc von Reutern
- Gene Center and Department of Biochemistry, Center for Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 München, Germany
| | - Claudia Ludwig
- Gene Center and Department of Biochemistry, Center for Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 München, Germany
| | - Christophe Jung
- Gene Center and Department of Biochemistry, Center for Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 München, Germany
| | - Ulrike Gaul
- Gene Center and Department of Biochemistry, Center for Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 München, Germany
|
34
|
Dantas Machado AC, Cooper BH, Lei X, Di Felice R, Chen L, Rohs R. Landscape of DNA binding signatures of myocyte enhancer factor-2B reveals a unique interplay of base and shape readout. Nucleic Acids Res 2020; 48:8529-8544. [PMID: 32738045 PMCID: PMC7470950 DOI: 10.1093/nar/gkaa642] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 05/13/2020] [Revised: 07/16/2020] [Accepted: 07/22/2020] [Indexed: 01/08/2023] Open
Abstract
Myocyte enhancer factor-2B (MEF2B) has the unique capability of binding to its DNA target sites with a degenerate motif, while still functioning as a gene-specific transcriptional regulator. Identifying its DNA targets is crucial given regulatory roles exerted by members of the MEF2 family and MEF2B's involvement in B-cell lymphoma. Analyzing structural data and SELEX-seq experimental results, we deduced the DNA sequence and shape determinants of MEF2B target sites on a high-throughput basis in vitro for wild-type and mutant proteins. Quantitative modeling of MEF2B binding affinities and computational simulations exposed the DNA readout mechanisms of MEF2B. The resulting binding signature of MEF2B revealed distinct intricacies of DNA recognition compared to other transcription factors. MEF2B uses base readout at its half-sites combined with shape readout at the center of its degenerate motif, where A-tract polarity dictates nuances of binding. The predominant role of shape readout at the center of the core motif, with most contacts formed in the minor groove, differs from previously observed protein-DNA readout modes. MEF2B, therefore, represents a unique protein for studies of the role of DNA shape in achieving binding specificity. MEF2B-DNA recognition mechanisms are likely representative for other members of the MEF2 family.
Affiliation(s)
- Ana Carolina Dantas Machado
- Quantitative and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
| | - Brendon H Cooper
- Quantitative and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
| | - Xiao Lei
- Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
| | - Rosa Di Felice
- Quantitative and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
- Department of Physics & Astronomy, University of Southern California, Los Angeles, CA 90089, USA
| | - Lin Chen
- Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
- Department of Chemistry, University of Southern California, Los Angeles, CA 90089, USA
- Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, CA 90033, USA
| | - Remo Rohs
- Quantitative and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
- Department of Physics & Astronomy, University of Southern California, Los Angeles, CA 90089, USA
- Department of Chemistry, University of Southern California, Los Angeles, CA 90089, USA
- Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, CA 90033, USA
- Department of Computer Science, University of Southern California, Los Angeles, CA 90089, USA
|
35
|
Epigenetic competition reveals density-dependent regulation and target site plasticity of phosphorothioate epigenetics in bacteria. Proc Natl Acad Sci U S A 2020; 117:14322-14330. [PMID: 32518115 DOI: 10.1073/pnas.2002933117] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 01/16/2023] Open
Abstract
Phosphorothioate (PT) DNA modifications, in which a nonbonding phosphate oxygen is replaced with sulfur, represent a widespread, horizontally transferred epigenetic system in prokaryotes and have a highly unusual property of occupying only a small fraction of available consensus sequences in a genome. Using Salmonella enterica as a model, we asked a question of fundamental importance: How do the PT-modifying DndA-E proteins select their G(PS)AAC/G(PS)TTC targets? Here, we applied innovative analytical, sequencing, and computational tools to discover a novel behavior for DNA-binding proteins: The Dnd proteins are "parked" at the G(6mA)TC Dam methyltransferase consensus sequence instead of the expected GAAC/GTTC motif, with removal of the 6mA permitting extensive PT modification of GATC sites. This shift in modification sites further revealed a surprising constancy in the density of PT modifications across the genome. Computational analysis showed that GAAC, GTTC, and GATC share common features of DNA shape, which suggests that PT epigenetics are regulated in a density-dependent manner partly by DNA shape-driven target selection in the genome.
|
36
|
Chiu TP, Xin B, Markarian N, Wang Y, Rohs R. TFBSshape: an expanded motif database for DNA shape features of transcription factor binding sites. Nucleic Acids Res 2020; 48:D246-D255. [PMID: 31665425 PMCID: PMC7145579 DOI: 10.1093/nar/gkz970] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 09/15/2019] [Revised: 10/08/2019] [Accepted: 10/11/2019] [Indexed: 12/31/2022] Open
Abstract
TFBSshape (https://tfbsshape.usc.edu) is a motif database for analyzing structural profiles of transcription factor binding sites (TFBSs). The main rationale for this database is to be able to derive mechanistic insights in protein-DNA readout modes from sequencing data without available structures. We extended the quantity and dimensionality of TFBSshape, from mostly in vitro to in vivo binding and from unmethylated to methylated DNA. This new release of TFBSshape improves its functionality and launches a responsive and user-friendly web interface for easy access to the data. The current expansion includes new entries from the most recent collections of transcription factors (TFs) from the JASPAR and UniPROBE databases, methylated TFBSs derived from in vitro high-throughput EpiSELEX-seq binding assays and in vivo methylated TFBSs from the MeDReaders database. TFBSshape content has increased to 2428 structural profiles for 1900 TFs from 39 different species. The structural profiles for each TFBS entry now include 13 shape features and minor groove electrostatic potential for standard DNA and four shape features for methylated DNA. We improved the flexibility and accuracy for the shape-based alignment of TFBSs and designed new tools to compare methylated and unmethylated structural profiles of TFs and methods to derive DNA shape-preserving nucleotide mutations in TFBSs.
Affiliation(s)
- Tsu-Pei Chiu
- Quantitative and Computational Biology, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
| | - Beibei Xin
- Quantitative and Computational Biology, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
| | - Nicholas Markarian
- Quantitative and Computational Biology, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
| | - Yingfei Wang
- Quantitative and Computational Biology, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
| | - Remo Rohs
- Quantitative and Computational Biology, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
|
37
|
Kolchina N, Khavinson V, Linkova N, Yakimov A, Baitin D, Afanasyeva A, Petukhov M. Systematic search for structural motifs of peptide binding to double-stranded DNA. Nucleic Acids Res 2020; 47:10553-10563. [PMID: 31598715 PMCID: PMC6847403 DOI: 10.1093/nar/gkz850] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 05/14/2019] [Revised: 09/17/2019] [Accepted: 09/29/2019] [Indexed: 01/06/2023] Open
Abstract
A large variety of short biologically active peptides possess antioxidant, antibacterial, antitumour, anti-ageing and anti-inflammatory activity and are involved in the regulation of neuro-immuno-endocrine system functions, cell apoptosis, proliferation and differentiation. Therefore, the mechanisms of their biological activity are attracting increasing attention not only in modern molecular biology, biochemistry and biophysics, but also in pharmacology and medicine. In this work, we systematically analysed the ability of dipeptides (all possible combinations of the 20 standard amino acids) to bind all possible combinations of tetra-nucleotides in the central part of dsDNA in the classic B-form using molecular docking and molecular dynamics. The vast majority of the dipeptides were found to be unable to bind dsDNA. However, we were able to identify 57 low-energy dipeptide-dsDNA complexes with high selectivity for DNA binding. The analysis of the dsDNA complexes with dipeptides with free and blocked N- and C-termini showed that selective peptide binding to dsDNA can increase dramatically with the peptide length.
Affiliation(s)
- Nina Kolchina
- Petersburg Nuclear Physics Institute named after B.P. Konstantinov, NRC "Kurchatov Institute", Gatchina, Russia.,Peter the Great St. Petersburg Polytechnic University, St. Petersburg, Russia.,Russian Scientific Center of Radiology and Surgical Technologies named after A.M. Granov, St. Petersburg, Russia
| | - Vladimir Khavinson
- Saint Petersburg Institute of Bioregulation and Gerontology, St. Petersburg, Russia.,Pavlov Institute of Physiology of RAS, St. Petersburg, Russia.,North-Western State Medical University named after I.I. Mechnikov, St. Petersburg, Russia
| | - Natalia Linkova
- Saint Petersburg Institute of Bioregulation and Gerontology, St. Petersburg, Russia.,Academy of postgraduate education under FSBU FSCC of FMBA of Russia, Moscow, Russia
| | - Alexander Yakimov
- Petersburg Nuclear Physics Institute named after B.P. Konstantinov, NRC "Kurchatov Institute", Gatchina, Russia.,Peter the Great St. Petersburg Polytechnic University, St. Petersburg, Russia
| | - Dmitry Baitin
- Petersburg Nuclear Physics Institute named after B.P. Konstantinov, NRC "Kurchatov Institute", Gatchina, Russia
| | - Arina Afanasyeva
- Petersburg Nuclear Physics Institute named after B.P. Konstantinov, NRC "Kurchatov Institute", Gatchina, Russia.,Peter the Great St. Petersburg Polytechnic University, St. Petersburg, Russia.,National Institutes of Biomedical Innovation, Health and Nutrition, Osaka, Japan
| | - Michael Petukhov
- Petersburg Nuclear Physics Institute named after B.P. Konstantinov, NRC "Kurchatov Institute", Gatchina, Russia.,Peter the Great St. Petersburg Polytechnic University, St. Petersburg, Russia.,Russian Scientific Center of Radiology and Surgical Technologies named after A.M. Granov, St. Petersburg, Russia
|
38
|
Tian Z, Li X, Li M, Wu W, Zhang M, Tang C, Li Z, Liu Y, Chen Z, Yang M, Ma L, Caba C, Tong Y, Lam HM, Dai S, Chen Z. Crystal structures of REF6 and its complex with DNA reveal diverse recognition mechanisms. Cell Discov 2020; 6:17. [PMID: 32257379 PMCID: PMC7105484 DOI: 10.1038/s41421-020-0150-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 10/16/2019] [Accepted: 02/03/2020] [Indexed: 12/12/2022] Open
Abstract
Relative of Early Flowering 6 (REF6) is a DNA-sequence-specific H3K27me3/2 demethylase that contains four zinc finger (ZnF) domains and targets several thousand genes in Arabidopsis thaliana. The ZnF domains are essential for binding target genes, but the structural basis remains unclear. Here, we determined crystal structures of the ZnF domains and REF6-DNA complex, revealing a unique REF6-family-specific half-cross-braced ZnF (RCZ) domain and two C2H2-type ZnFs. DNA binding induces a profound conformational change in the hinge region of REF6. Each REF6 recognizes six bases, and DNA methylation reduces the binding affinity. Both the acidic region and basic region are important for the self-association of REF6. The REF6 DNA-binding affinity is determined by the sequence-dependent conformations of DNA and also the cooperativity in different target motifs. The conformational plasticity enables REF6 to function as a global transcriptional regulator that directly binds to many diverse genes, revealing the structural basis for the epigenetic modification recognition.
Affiliation(s)
- Zizi Tian
- State Key Laboratory of Agrobiotechnology and Beijing Advanced Innovation Center for Food Nutrition and Human Health, College of Biological Sciences, China Agricultural University, 100193 Beijing, China
| | - Xiaorong Li
- State Key Laboratory of Agrobiotechnology and Beijing Advanced Innovation Center for Food Nutrition and Human Health, College of Biological Sciences, China Agricultural University, 100193 Beijing, China
| | - Min Li
- State Key Laboratory of Agrobiotechnology and Beijing Advanced Innovation Center for Food Nutrition and Human Health, College of Biological Sciences, China Agricultural University, 100193 Beijing, China
| | - Wei Wu
- State Key Laboratory of Agrobiotechnology and Beijing Advanced Innovation Center for Food Nutrition and Human Health, College of Biological Sciences, China Agricultural University, 100193 Beijing, China
| | - Manfeng Zhang
- State Key Laboratory of Agrobiotechnology and Beijing Advanced Innovation Center for Food Nutrition and Human Health, College of Biological Sciences, China Agricultural University, 100193 Beijing, China
| | - Chenjun Tang
- State Key Laboratory of Agrobiotechnology and Beijing Advanced Innovation Center for Food Nutrition and Human Health, College of Biological Sciences, China Agricultural University, 100193 Beijing, China
| | - Zhihui Li
- State Key Laboratory of Agrobiotechnology and Beijing Advanced Innovation Center for Food Nutrition and Human Health, College of Biological Sciences, China Agricultural University, 100193 Beijing, China
| | - Yunlong Liu
- State Key Laboratory of Agrobiotechnology and Beijing Advanced Innovation Center for Food Nutrition and Human Health, College of Biological Sciences, China Agricultural University, 100193 Beijing, China
| | - Zhenhang Chen
- State Key Laboratory of Agrobiotechnology and Beijing Advanced Innovation Center for Food Nutrition and Human Health, College of Biological Sciences, China Agricultural University, 100193 Beijing, China
| | - Meiting Yang
- State Key Laboratory of Agrobiotechnology and Beijing Advanced Innovation Center for Food Nutrition and Human Health, College of Biological Sciences, China Agricultural University, 100193 Beijing, China
| | - Lulu Ma
- State Key Laboratory of Agrobiotechnology and Beijing Advanced Innovation Center for Food Nutrition and Human Health, College of Biological Sciences, China Agricultural University, 100193 Beijing, China
| | - Cody Caba
- Department of Chemistry and Biochemistry, University of Windsor, Windsor, ON N9B 3P4 Canada
| | - Yufeng Tong
- Department of Chemistry and Biochemistry, University of Windsor, Windsor, ON N9B 3P4 Canada
| | - Hon-Ming Lam
- School of Life Sciences and Center for Soybean Research of the State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
| | - Shaodong Dai
- Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, CO 80045 USA
| | - Zhongzhou Chen
- State Key Laboratory of Agrobiotechnology and Beijing Advanced Innovation Center for Food Nutrition and Human Health, College of Biological Sciences, China Agricultural University, 100193 Beijing, China
| |
|
39
|
Pal S, Hoinka J, Przytycka TM. Co-SELECT reveals sequence non-specific contribution of DNA shape to transcription factor binding in vitro. Nucleic Acids Res 2020; 47:6632-6641. [PMID: 31226207 DOI: 10.1093/nar/gkz540] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 10/02/2018] [Revised: 05/31/2019] [Accepted: 06/06/2019] [Indexed: 12/22/2022] Open
Abstract
Understanding the principles of DNA binding by transcription factors (TFs) is of primary importance for studying gene regulation. Recently, several lines of evidence suggested that both DNA sequence and shape contribute to TF binding. However, the following compelling question is yet to be considered: in the absence of any sequence similarity to the binding motif, can DNA shape still increase binding probability? To address this challenge, we developed Co-SELECT, a computational approach to analyze the results of in vitro HT-SELEX experiments for TF-DNA binding. Specifically, Co-SELECT leverages the presence of motif-free sequences in late HT-SELEX rounds; their enrichment in weak binders allows Co-SELECT to detect evidence for the role of DNA shape features in TF binding. Our approach revealed that, even in the absence of the sequence motif, TFs have a propensity to bind to DNA molecules of the shape consistent with motif-specific binding. This provides the first direct evidence that shape features that accompany the preferred sequence motifs also bestow an advantage for weak, sequence non-specific binding.
Affiliation(s)
- Soumitra Pal
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
| | - Jan Hoinka
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
| | - Teresa M Przytycka
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
| |
|
40
|
Abstract
Biology is reaching a convergence point of its historic reductionist and modern holistic approaches to understanding the living system. Structural biology has historically taken the reductionist approach to deeply probe the inner workings of complex molecular machines. In contrast, systems biology and genome-scale modeling have organically grown out of the wealth of data now being generated by diverse omics measurements. In the late 2000s, a proposed interdisciplinary field of structural systems biology pitched the merger of these two approaches, with widespread applications in pharmacology, disease modeling, protein engineering, and evolutionary studies. In this commentary, we highlight the challenges of integrating these two fields, with a focus on genome-scale metabolic modeling, and the novel findings that are made possible from such a merger.
Affiliation(s)
- Nathan Mih
- Department of Bioengineering, University of California San Diego, La Jolla, CA, USA
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA, USA
| | - Bernhard O Palsson
- Department of Bioengineering, University of California San Diego, La Jolla, CA, USA
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Lyngby, Denmark
| |
|
41
|
Hancock SP, Cascio D, Johnson RC. Cooperative DNA binding by proteins through DNA shape complementarity. Nucleic Acids Res 2019; 47:8874-8887. [PMID: 31616952 PMCID: PMC7145599 DOI: 10.1093/nar/gkz642] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 05/29/2019] [Revised: 07/11/2019] [Accepted: 07/15/2019] [Indexed: 01/13/2023] Open
Abstract
Localized arrays of proteins cooperatively assemble onto chromosomes to control DNA activity in many contexts. Binding cooperativity is often mediated by specific protein-protein interactions, but cooperativity through DNA structure is becoming increasingly recognized as an additional mechanism. During the site-specific DNA recombination reaction that excises phage λ from the chromosome, the bacterial DNA architectural protein Fis recruits multiple λ-encoded Xis proteins to the attR recombination site. Here, we report X-ray crystal structures of DNA complexes containing Fis + Xis, which show little, if any, contacts between the two proteins. Comparisons with structures of DNA complexes containing only Fis or Xis, together with mutant protein and DNA binding studies, support a mechanism for cooperative protein binding solely by DNA allostery. Fis binding both molds the minor groove to potentiate insertion of the Xis β-hairpin wing motif and bends the DNA to facilitate Xis-DNA contacts within the major groove. The Fis-structured minor groove shape that is optimized for Xis binding requires a precisely positioned pyrimidine-purine base-pair step, whose location has been shown to modulate minor groove widths in Fis-bound complexes to different DNA targets.
MESH Headings
- Allosteric Site
- Bacteriophage lambda/genetics
- Bacteriophage lambda/metabolism
- Base Sequence
- Binding Sites
- Chromosomes, Bacterial/chemistry
- Chromosomes, Bacterial/metabolism
- Cloning, Molecular
- Crystallography, X-Ray
- DNA Nucleotidyltransferases/chemistry
- DNA Nucleotidyltransferases/genetics
- DNA Nucleotidyltransferases/metabolism
- DNA, Bacterial/chemistry
- DNA, Bacterial/genetics
- DNA, Bacterial/metabolism
- Escherichia coli/genetics
- Escherichia coli/metabolism
- Escherichia coli Proteins/chemistry
- Escherichia coli Proteins/genetics
- Escherichia coli Proteins/metabolism
- Factor For Inversion Stimulation Protein/chemistry
- Factor For Inversion Stimulation Protein/genetics
- Factor For Inversion Stimulation Protein/metabolism
- Gene Expression
- Genetic Vectors/chemistry
- Genetic Vectors/metabolism
- Kinetics
- Models, Molecular
- Nucleic Acid Conformation
- Protein Binding
- Protein Conformation, alpha-Helical
- Protein Conformation, beta-Strand
- Protein Interaction Domains and Motifs
- Recombinant Proteins/chemistry
- Recombinant Proteins/genetics
- Recombinant Proteins/metabolism
- Recombinational DNA Repair
- Sequence Alignment
- Thermodynamics
- Viral Proteins/chemistry
- Viral Proteins/genetics
- Viral Proteins/metabolism
Affiliation(s)
- Stephen P Hancock
- Department of Biological Chemistry, David Geffen School of Medicine at the University of California at Los Angeles, Los Angeles, CA 90095-1737, USA
- Department of Chemistry, Towson University, 8000 York Rd., Towson, MD 21252, USA
| | - Duilio Cascio
- University of California at Los Angeles-Department of Energy Institute of Genomics and Proteomics, University of California at Los Angeles, Los Angeles, CA 90095-1570, USA
| | - Reid C Johnson
- Department of Biological Chemistry, David Geffen School of Medicine at the University of California at Los Angeles, Los Angeles, CA 90095-1737, USA
- Molecular Biology Institute, University of California at Los Angeles, Los Angeles, CA 90095, USA
| |
|
42
|
Comoglio F, Simonatto M, Polletti S, Liu X, Smale ST, Barozzi I, Natoli G. Dissection of acute stimulus-inducible nucleosome remodeling in mammalian cells. Genes Dev 2019; 33:1159-1174. [PMID: 31371436 PMCID: PMC6719622 DOI: 10.1101/gad.326348.119] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 03/11/2019] [Accepted: 07/03/2019] [Indexed: 12/22/2022]
Abstract
Accessibility of the genomic regulatory information is largely controlled by the nucleosome-organizing activity of transcription factors (TFs). While stimulus-induced TFs bind to genomic regions that are maintained accessible by lineage-determining TFs, they also increase accessibility of thousands of cis-regulatory elements. Nucleosome remodeling events underlying such changes and their interplay with basal positioning are unknown. Here, we devised a novel quantitative framework discriminating different types of nucleosome remodeling events in micrococcal nuclease ChIP-seq (chromatin immunoprecipitation [ChIP] combined with high-throughput sequencing) data sets and used it to analyze nucleosome dynamics at stimulus-regulated cis-regulatory elements. At enhancers, remodeling preferentially affected poorly positioned nucleosomes while sparing well-positioned nucleosomes flanking the enhancer core, indicating that inducible TFs do not suffice to overrule basal nucleosomal organization maintained by lineage-determining TFs. Remodeling events appeared to be combinatorially driven by multiple TFs, with distinct TFs showing, however, different remodeling efficiencies. Overall, these data provide a systematic view of the impact of stimulation on nucleosome organization and genome accessibility in mammalian cells.
Affiliation(s)
- Federico Comoglio
- Division of Gene Regulation, Netherlands Cancer Institute, 1066 CX Amsterdam, the Netherlands
- Department of Hematology, University of Cambridge, Cambridge CB2 0XY, United Kingdom
| | - Marta Simonatto
- Humanitas University (Hunimed), Pieve Emanuele, Milano 20090, Italy
| | - Sara Polletti
- Humanitas University (Hunimed), Pieve Emanuele, Milano 20090, Italy
| | - Xin Liu
- Department of Microbiology, Immunology, and Molecular Genetics, University of California at Los Angeles (UCLA), Los Angeles, California 90095, USA
| | - Stephen T Smale
- Department of Microbiology, Immunology, and Molecular Genetics, University of California at Los Angeles (UCLA), Los Angeles, California 90095, USA
| | - Iros Barozzi
- Department of Surgery and Cancer, Imperial College London, London W12 00N, United Kingdom
| | - Gioacchino Natoli
- Humanitas University (Hunimed), Pieve Emanuele, Milano 20090, Italy
- Humanitas Istituto di Ricovero e Cura a Carattere Scientifico, Rozzano, Milano 20089, Italy
| |
|
43
|
Layek S, Agrahari B, Dey S, Ganguly R, Pathak DD. Copper(II)-faciliated synthesis of substituted thioethers and 5-substituted 1H-tetrazoles: Experimental and theoretical studies. J Organomet Chem 2019. [DOI: 10.1016/j.jorganchem.2019.06.008] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 12/11/2022]
|
44
|
Azad RN, Zafiropoulos D, Ober D, Jiang Y, Chiu TP, Sagendorf JM, Rohs R, Tullius TD. Experimental maps of DNA structure at nucleotide resolution distinguish intrinsic from protein-induced DNA deformations. Nucleic Acids Res 2019; 46:2636-2647. [PMID: 29390080 PMCID: PMC5946862 DOI: 10.1093/nar/gky033] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 03/24/2017] [Accepted: 01/15/2018] [Indexed: 12/22/2022] Open
Abstract
Recognition of DNA by proteins depends on DNA sequence and structure. Often unanswered is whether the structure of naked DNA persists in a protein–DNA complex, or whether protein binding changes DNA shape. While X-ray structures of protein–DNA complexes are numerous, the structure of naked cognate DNA is seldom available experimentally. We present here an experimental and computational analysis pipeline that uses hydroxyl radical cleavage to map, at single-nucleotide resolution, DNA minor groove width, a recognition feature widely exploited by proteins. For 11 protein–DNA complexes, we compared experimental maps of naked DNA minor groove width with minor groove width measured from X-ray co-crystal structures. Seven sites had similar minor groove widths as naked DNA and when bound to protein. For four sites, part of the DNA in the complex had the same structure as naked DNA, and part changed structure upon protein binding. We compared the experimental map with minor groove patterns of DNA predicted by two computational approaches, DNAshape and ORChID2, and found good but not perfect concordance with both. This experimental approach will be useful in mapping structures of DNA sequences for which high-resolution structural data are unavailable. This approach allows probing of protein family-dependent readout mechanisms.
Affiliation(s)
- Robert N Azad
- Department of Chemistry, Boston University, Boston, MA 02215, USA
| | | | - Douglas Ober
- Department of Chemistry, Boston University, Boston, MA 02215, USA
| | - Yining Jiang
- Department of Chemistry, Boston University, Boston, MA 02215, USA
| | - Tsu-Pei Chiu
- Computational Biology and Bioinformatics Program, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
| | - Jared M Sagendorf
- Computational Biology and Bioinformatics Program, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
| | - Remo Rohs
- Computational Biology and Bioinformatics Program, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
| | - Thomas D Tullius
- Department of Chemistry, Boston University, Boston, MA 02215, USA; Program in Bioinformatics, Boston University, Boston, MA 02215, USA
| |
|
45
|
Wang X, Zhou T, Wunderlich Z, Maurano MT, DePace AH, Nuzhdin SV, Rohs R. Analysis of Genetic Variation Indicates DNA Shape Involvement in Purifying Selection. Mol Biol Evol 2019; 35:1958-1967. [PMID: 29850830 PMCID: PMC6063282 DOI: 10.1093/molbev/msy099] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 12/18/2022] Open
Abstract
Noncoding DNA sequences, which play various roles in gene expression and regulation, are under evolutionary pressure. Gene regulation requires specific protein–DNA binding events, and our previous studies showed that both DNA sequence and shape readout are employed by transcription factors (TFs) to achieve DNA binding specificity. By investigating the shape-disrupting properties of single nucleotide polymorphisms (SNPs) in human regulatory regions, we established a link between disruptive local DNA shape changes and loss of specific TF binding. Furthermore, we described cases where disease-associated SNPs may alter TF binding through DNA shape changes. This link led us to hypothesize that local DNA shape within and around TF binding sites is under selection pressure. To verify this hypothesis, we analyzed SNP data derived from 216 natural strains of Drosophila melanogaster. Comparing SNPs located in functional and nonfunctional regions within experimentally validated cis-regulatory modules (CRMs) from D. melanogaster that are active in the blastoderm stage of development, we found that SNPs within functional regions tended to cause smaller DNA shape variations. Furthermore, SNPs with higher minor allele frequency were more likely to result in smaller DNA shape variations. The same analysis based on a large number of SNPs in putative CRMs of the D. melanogaster genome derived from DNase I accessibility data confirmed these observations. Taken together, our results indicate that common SNPs in functional regions tend to maintain DNA shape, whereas shape-disrupting SNPs are more likely to be eliminated through purifying selection.
Affiliation(s)
- Xiaofei Wang
- Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA
| | - Tianyin Zhou
- Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA
| | - Zeba Wunderlich
- Department of Developmental and Cell Biology, University of California Irvine, Irvine, CA
| | - Matthew T Maurano
- Institute for Systems Genetics, New York University Medical Center, New York, NY
| | - Angela H DePace
- Department of Systems Biology, Harvard Medical School, Boston, MA
| | - Sergey V Nuzhdin
- Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA
| | - Remo Rohs
- Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA; Departments of Chemistry, Physics and Astronomy, and Computer Science, University of Southern California, Los Angeles, CA
| |
|
46
|
Emamjomeh A, Choobineh D, Hajieghrari B, MahdiNezhad N, Khodavirdipour A. DNA-protein interaction: identification, prediction and data analysis. Mol Biol Rep 2019; 46:3571-3596. [PMID: 30915687 DOI: 10.1007/s11033-019-04763-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 11/22/2018] [Accepted: 03/14/2019] [Indexed: 12/30/2022]
Abstract
Life in living organisms depends on specific and purposeful interactions between molecules. Such interactions make the various processes inside the cells and bodies of living organisms possible. Among all types of molecular interactions, DNA-protein interactions are of considerable importance. With the development of numerous experimental techniques, diverse methods are now available for recognizing and investigating such interactions. While the traditional experimental techniques to identify DNA-protein complexes are time-consuming and unsuitable for genome-scale studies, current high-throughput approaches are more efficient at determining such interactions at a large scale, but they are too costly to be practical for daily applications. Hence, the availability of extensive information on different biological sequences, a clearer picture of the conditions under which such interactions form, and developments in computing, mathematics, and statistics have motivated scientists to develop bioinformatics tools for predicting the interaction site(s). Much progress has been made in this field. In this review, the factors and conditions governing the interaction and the laboratory techniques for examining such interactions are addressed. In addition, the bioinformatics tools developed for this purpose are introduced and compared and, in the end, several suggestions are offered for improving the precision of these tools' predictions.
Affiliation(s)
- Abbasali Emamjomeh
- Laboratory of Computational Biotechnology and Bioinformatics (CBB), Department of Plant Breeding and Biotechnology (PBB), University of Zabol, Zabol, 98615-538, Iran.
| | - Darush Choobineh
- Agricultural Biotechnology, Department of Plant Breeding and Biotechnology (PBB), Faculty of Agriculture, University of Zabol, Zabol, Iran
| | - Behzad Hajieghrari
- Department of Agricultural Biotechnology, College of Agriculture, Jahrom University, Jahrom, 74135-111, Iran.
| | - Nafiseh MahdiNezhad
- Laboratory of Computational Biotechnology and Bioinformatics (CBB), Department of Plant Breeding and Biotechnology (PBB), University of Zabol, Zabol, 98615-538, Iran
| | - Amir Khodavirdipour
- Division of Human Genetics, Department of Anatomy, St. John's hospital, Bangalore, India
| |
|
47
|
Dou X, Meints GA, Sedaghat-Herati R. New Insights into the Interactions of a DNA Oligonucleotide with mPEGylated-PAMAM by Circular Dichroism and Solution NMR. J Phys Chem B 2019; 123:666-674. [PMID: 30562015 DOI: 10.1021/acs.jpcb.8b08517] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 12/25/2022]
Abstract
Dendrimers are well-defined, highly branched, synthetic three-dimensional molecules with a large number of reactive end groups. PAMAM dendrimers form stable complexes with DNA chemistries and constitute an important class of nonviral, cationic vectors in gene delivery. The aim of this study is to examine the interactions of a 12 bp DNA oligonucleotide with PAMAM-G2 and mPEG-b-PAMAM-G3 having eight surface amine groups under physiological conditions, using constant DNA concentration but varying dendrimer concentration. 1D 31P NMR, 2D NOESY, and CD spectroscopic methods were employed to investigate the interactions between the dendrimer and the DNA. The CD experiments carried out with a constant DNA concentration of 25 μM and dendrimer concentrations from 0 to 100 μM indicated minimal change to the chirality of the DNA for both types of dendrimers. While the PAMAM-G2 dendrimer caused aggregation of the majority of the DNA, the 2D NMR data of the DNA with an mPEG-b-PAMAM-G3 dendrimer indicated general broadening of the 1D 31P peaks from the DNA phosphates, a small number of 1H chemical shift perturbations (CSPs), and reduction of specific 1H-1H NOE intensities. These data suggest there is minimal structural alteration of the DNA in the complex and indicate preferential binding of the dendrimer to the central AATT region of the DNA sequence. The results herein are the first such results demonstrating a soluble DNA complex with the mPEG-b-PAMAM-G3 dendrimer analyzed by multidimensional NMR.
Affiliation(s)
- Xiaozheng Dou
- Department of Chemistry, Missouri State University, Springfield, Missouri 65897, United States
| | - Gary A Meints
- Department of Chemistry, Missouri State University, Springfield, Missouri 65897, United States
| | - Reza Sedaghat-Herati
- Department of Chemistry, Missouri State University, Springfield, Missouri 65897, United States
| |
|
48
|
The interaction landscape between transcription factors and the nucleosome. Nature 2018; 562:76-81. [PMID: 30250250 PMCID: PMC6173309 DOI: 10.1038/s41586-018-0549-5] [Citation(s) in RCA: 200] [Impact Index Per Article: 28.6] [Received: 09/23/2017] [Accepted: 08/06/2018] [Indexed: 01/01/2023]
Abstract
Nucleosomes cover most of the genome and are thought to be displaced by transcription factors in regions that direct gene expression. However, the modes of interaction between transcription factors and nucleosomal DNA remain largely unknown. Here we systematically explore interactions between the nucleosome and 220 transcription factors representing diverse structural families. Consistent with earlier observations, we find that the majority of the studied transcription factors have less access to nucleosomal DNA than to free DNA. The motifs recovered from transcription factors bound to nucleosomal and free DNA are generally similar. However, steric hindrance and scaffolding by the nucleosome result in specific positioning and orientation of the motifs. Many transcription factors preferentially bind close to the end of nucleosomal DNA, or to periodic positions on the solvent-exposed side of the DNA. In addition, several transcription factors usually bind to nucleosomal DNA in a particular orientation. Some transcription factors specifically interact with DNA located at the dyad position at which only one DNA gyre is wound, whereas other transcription factors prefer sites spanning two DNA gyres and bind specifically to each of them. Our work reveals notable differences in the binding of transcription factors to free and nucleosomal DNA, and uncovers a diverse interaction landscape between transcription factors and the nucleosome.
|
49
|
Structural Insights into the CRTC2–CREB Complex Assembly on CRE. J Mol Biol 2018; 430:1926-1939. [DOI: 10.1016/j.jmb.2018.04.038] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 10/26/2017] [Revised: 04/19/2018] [Accepted: 04/25/2018] [Indexed: 11/18/2022]
|
50
|
Xin B, Rohs R. Relationship between histone modifications and transcription factor binding is protein family specific. Genome Res 2018; 28:321-333. [PMID: 29326300 PMCID: PMC5848611 DOI: 10.1101/gr.220079.116] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 12/23/2016] [Accepted: 01/10/2018] [Indexed: 12/20/2022]
Abstract
The very small fraction of putative binding sites (BSs) that are occupied by transcription factors (TFs) in vivo can be highly variable across different cell types. This observation has been partly attributed to changes in chromatin accessibility and histone modification (HM) patterns surrounding BSs. Previous studies focusing on BSs within DNA regulatory regions found correlations between HM patterns and TF binding specificities. However, a mechanistic understanding of TF-DNA binding specificity determinants is still not available. The ability to predict in vivo TF binding on a genome-wide scale requires the identification of features that determine TF binding based on evolutionary relationships of DNA binding proteins. To reveal protein family-dependent mechanisms of TF binding, we conducted comprehensive comparisons of HM patterns surrounding BSs and non-BSs with exactly matched core motifs for TFs in three cell lines: 33 TFs in GM12878, 37 TFs in K562, and 18 TFs in H1-hESC. These TFs displayed protein family-specific preferences for HM patterns surrounding BSs, with high agreement among cell lines. Moreover, compared to models based on DNA sequence and shape at flanking regions of BSs, HM-augmented quantitative machine-learning methods resulted in increased performance in a TF family-specific manner. Analysis of the relative importance of features in these models indicated that TFs, displaying larger HM pattern differences between BSs and non-BSs, bound DNA in an HM-specific manner on a protein family-specific basis. We propose that TF family-specific HM preferences reveal distinct mechanisms that assist in guiding TFs to their cognate BSs by altering chromatin structure and accessibility.
Affiliation(s)
- Beibei Xin
- Computational Biology and Bioinformatics Program, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, California 90089, USA
| | - Remo Rohs
- Computational Biology and Bioinformatics Program, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, California 90089, USA
| |
|