1
Li J, Rohs R. Deep DNAshape webserver: prediction and real-time visualization of DNA shape considering extended k-mers. Nucleic Acids Res 2024; 52:W7-W12. [PMID: 38801070 PMCID: PMC11223853 DOI: 10.1093/nar/gkae433] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2024] [Revised: 04/30/2024] [Accepted: 05/08/2024] [Indexed: 05/29/2024]
Abstract
Sequence-dependent DNA shape plays an important role in understanding protein-DNA binding mechanisms. High-throughput prediction of DNA shape features has become a valuable tool in the fields of protein-DNA recognition, transcription factor-DNA binding specificity, and gene regulation. However, our widely used webserver, DNAshape, relies on statistically summarized pentamer query tables to look up DNA shape features. These query tables do not consider flanking regions longer than two base pairs, and acquiring a query table for hexamers or higher-order k-mers is still unrealistic because molecular simulations and structural biology experiments cannot yet provide sufficient statistical coverage. A recent deep-learning method, Deep DNAshape, trained on limited simulation data, can predict DNA shape features at the core of a DNA fragment while considering flanking regions of up to seven base pairs. However, Deep DNAshape is rather complicated to install and, unlike the pentamer-based DNAshape webserver, must be run locally, creating a barrier for users. Here, we present the Deep DNAshape webserver, which combines the benefits of both methods while being accurate, fast, and accessible to all users. Additional improvements of the webserver include real-time detection of user input, interactive visualization tools, and different modes of analysis. URL: https://deepdnashape.usc.edu.
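As a rough illustration of the pentamer query-table idea mentioned in this abstract, the Python sketch below assigns each base the table value of the pentamer centered on it. The function name `pentamer_lookup`, the table `TOY_TABLE`, and its numeric values are hypothetical placeholders, not entries from the actual DNAshape tables or the webserver's API. The sketch only shows why bases more than two positions from the center cannot influence a pentamer-based prediction.

```python
from typing import Dict, List, Optional

# Hypothetical placeholder values for illustration only;
# these are NOT real DNAshape query-table entries.
TOY_TABLE: Dict[str, float] = {
    "AAAAA": 1.0,
    "AAAAC": 2.0,
    "AAACG": 3.0,
}


def pentamer_lookup(seq: str, table: Dict[str, float]) -> List[Optional[float]]:
    """Assign each position the table value of the pentamer centered on it.

    Positions within two bases of either end have no full pentamer and stay None,
    as do positions whose pentamer is missing from the (toy) table.
    """
    seq = seq.upper()
    values: List[Optional[float]] = [None] * len(seq)
    for i in range(2, len(seq) - 2):
        # The window covers the central base plus two flanking bases per side.
        values[i] = table.get(seq[i - 2 : i + 3])
    return values


# Two sequences that share the central pentamer AAAAA but differ three
# positions away from the center (G...C versus T...G).
a = pentamer_lookup("GAAAAAC", TOY_TABLE)
b = pentamer_lookup("TAAAAAG", TOY_TABLE)
```

In this toy setup, `a[3]` and `b[3]` are identical because both central pentamers are `AAAAA`; a model that considers longer flanks could, in principle, distinguish the two contexts.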
Affiliation(s)
- Jinsen Li
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Remo Rohs
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Department of Chemistry, University of Southern California, Los Angeles, CA 90089, USA
- Department of Physics and Astronomy, University of Southern California, Los Angeles, CA 90089, USA
- Thomas Lord Department of Computer Science, University of Southern California, Los Angeles, CA 90089, USA
2
Ali MZ, Guharajan S, Parisutham V, Brewster RC. Regulatory properties of transcription factors with diverse mechanistic function. PLoS Comput Biol 2024; 20:e1012194. [PMID: 38857275 PMCID: PMC11192337 DOI: 10.1371/journal.pcbi.1012194] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2024] [Revised: 06/21/2024] [Accepted: 05/24/2024] [Indexed: 06/12/2024]
Abstract
Transcription factors (TFs) regulate the process of transcription through the modulation of different kinetic steps. Although models can often describe the observed transcriptional output of a measured gene, predicting a TF's role on a given promoter requires an understanding of how the TF alters each step of the transcription process. In this work, we use a simple model of transcription to assess the role of promoter identity, and the degree to which TFs alter binding of RNAP (stabilization) and initiation of transcription (acceleration), on three primary characteristics: the range of steady-state regulation, cell-to-cell variability in expression, and the dynamic response time of a regulated gene. We find that steady-state regulation and the response time of a gene behave uniquely for TFs that regulate incoherently, i.e., that speed up one step but slow the other. We also find that incoherent TFs have dynamic implications, with one type of incoherent mode configuring the promoter to respond more slowly at intermediate TF concentrations. We also demonstrate that the noise of gene expression for these TFs is sensitive to promoter strength, with a distinct non-monotonic profile that is apparent under stronger promoters. Taken together, our work uncovers the coupling between promoters and TF regulatory modes, with implications for understanding natural promoters and engineering synthetic gene circuits with desired expression properties.
Affiliation(s)
- Md Zulfikar Ali
- Department of Systems Biology, University of Massachusetts Medical School, Worcester, Massachusetts, United States of America
- Department of Microbiology and Physiological Systems, University of Massachusetts Medical School, Worcester, Massachusetts, United States of America
- Department of Geology, Physics and Environmental Science, University of Southern Indiana, Evansville, Indiana, United States of America
- Sunil Guharajan
- Department of Systems Biology, University of Massachusetts Medical School, Worcester, Massachusetts, United States of America
- Department of Microbiology and Physiological Systems, University of Massachusetts Medical School, Worcester, Massachusetts, United States of America
- Vinuselvi Parisutham
- Department of Systems Biology, University of Massachusetts Medical School, Worcester, Massachusetts, United States of America
- Department of Microbiology and Physiological Systems, University of Massachusetts Medical School, Worcester, Massachusetts, United States of America
- Robert C. Brewster
- Department of Systems Biology, University of Massachusetts Medical School, Worcester, Massachusetts, United States of America
- Department of Microbiology and Physiological Systems, University of Massachusetts Medical School, Worcester, Massachusetts, United States of America
3
Hua K, Wu C, Lin C, Chen C. E2F1 promotes cell migration in hepatocellular carcinoma via FNDC3B. FEBS Open Bio 2024; 14:687-694. [PMID: 38403291 PMCID: PMC10988749 DOI: 10.1002/2211-5463.13783] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2023] [Revised: 01/23/2024] [Accepted: 02/16/2024] [Indexed: 02/27/2024]
Abstract
FNDC3B (fibronectin type III domain containing 3B) is highly expressed in hepatocellular carcinoma (HCC) and other cancer types, and fusion genes involving FNDC3B have been identified in HCC and leukemia. Growing evidence suggests the significance of FNDC3B in tumorigenesis, particularly in cell migration and tumor metastasis. However, its regulatory mechanisms remain elusive. In this study, we employed bioinformatic, gene regulation, and protein-DNA interaction screening to investigate the transcription factors (TFs) involved in regulating FNDC3B. Initially, 338 candidate TFs were selected based on previous chromatin immunoprecipitation (ChIP)-seq experiments available in public domain databases. Through TF knockdown screening and ChIP coupled with Droplet Digital PCR assays, we identified that E2F1 (E2F transcription factor 1) is crucial for the activation of FNDC3B. Overexpression or knockdown of E2F1 significantly impacts the expression of FNDC3B. In conclusion, our study elucidated the mechanistic link between FNDC3B and E2F1. These findings contribute to a better understanding of FNDC3B in tumorigenesis and provide insights into potential therapeutic targets for cancer treatment.
Affiliation(s)
- Kate Hua
- Cancer Progression Research Center, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Chen‐Tang Wu
- Cancer Progression Research Center, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Chin‐Hui Lin
- Cancer Progression Research Center, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Chian‐Feng Chen
- Cancer Progression Research Center, National Yang Ming Chiao Tung University, Taipei, Taiwan
4
Jin R, He B, Qin Y, Du Z, Cao C, Li J. Unveiling the role of bZIP transcription factors CREB and CEBP in detoxification metabolism of Nilaparvata lugens (Stål). Int J Biol Macromol 2023; 253:126576. [PMID: 37648128 DOI: 10.1016/j.ijbiomac.2023.126576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2023] [Revised: 08/24/2023] [Accepted: 08/26/2023] [Indexed: 09/01/2023]
Abstract
The basic leucine zipper (bZIP) superfamily is a crucial group of transcription factors involved in the xenobiotic response of insects. However, little is known about the function of CCAAT/enhancer-binding protein (CEBP) and cAMP response element-binding protein (CREB) in Nilaparvata lugens. In the present study, NlCEBP and NlCREB were cloned and identified. Quantitative real-time polymerase chain reaction (qRT-PCR) analysis showed that the expression of NlCEBP and NlCREB was significantly induced after exposure to chemical insecticides. Silencing of NlCEBP and NlCREB increased the susceptibility of N. lugens to insecticides, and the detoxification enzyme activities were also significantly decreased. In addition, comparative transcriptome analysis revealed that 174 genes were significantly co-down-regulated after interfering with the two transcription factors. GO analysis showed that the co-down-regulated genes are mostly related to energy transport and metabolic functions, indicating a potential regulatory role of NlCEBP and NlCREB in detoxification metabolism. Our research sheds light on the functional roles of the transcription factors NlCEBP and NlCREB in the detoxification metabolism of N. lugens, providing a theoretical basis for pest management and comprehensive control of this pest and increasing our understanding of insect toxicology.
Affiliation(s)
- Ruoheng Jin
- National Biopesticide Engineering Research Centre, Hubei Biopesticide Engineering Research Centre, Hubei Academy of Agricultural Science, Wuhan 430064, PR China; Hubei Insect Resources Utilization and Sustainable Pest Management Key Laboratory, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, PR China
- Biyan He
- Hubei Insect Resources Utilization and Sustainable Pest Management Key Laboratory, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, PR China; Tongling Municipal Bureau of Agricultural and Rural Affairs, Tongling 244002, PR China
- Yao Qin
- Hubei Insect Resources Utilization and Sustainable Pest Management Key Laboratory, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, PR China
- Zuyi Du
- Hubei Insect Resources Utilization and Sustainable Pest Management Key Laboratory, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, PR China
- Chunxia Cao
- National Biopesticide Engineering Research Centre, Hubei Biopesticide Engineering Research Centre, Hubei Academy of Agricultural Science, Wuhan 430064, PR China.
- Jianhong Li
- Hubei Insect Resources Utilization and Sustainable Pest Management Key Laboratory, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, PR China.
5
Martyn GE, Montgomery MT, Jones H, Guo K, Doughty BR, Linder J, Chen Z, Cochran K, Lawrence KA, Munson G, Pampari A, Fulco CP, Kelley DR, Lander ES, Kundaje A, Engreitz JM. Rewriting regulatory DNA to dissect and reprogram gene expression. bioRxiv 2023:2023.12.20.572268. [PMID: 38187584 PMCID: PMC10769263 DOI: 10.1101/2023.12.20.572268] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
Regulatory DNA sequences within enhancers and promoters bind transcription factors to encode cell type-specific patterns of gene expression. However, the regulatory effects and programmability of such DNA sequences remain difficult to map or predict because we have lacked scalable methods to precisely edit regulatory DNA and quantify the effects in an endogenous genomic context. Here we present an approach to measure the quantitative effects of hundreds of designed DNA sequence variants on gene expression, by combining pooled CRISPR prime editing with RNA fluorescence in situ hybridization and cell sorting (Variant-FlowFISH). We apply this method to mutagenize and rewrite regulatory DNA sequences in an enhancer and the promoter of PPIF in two immune cell lines. Of 672 variant-cell type pairs, we identify 497 that affect PPIF expression. These variants appear to act through a variety of mechanisms including disruption or optimization of existing transcription factor binding sites, as well as creation of de novo sites. Disrupting a single endogenous transcription factor binding site often led to large changes in expression (up to -40% in the enhancer, and -50% in the promoter). The same variant often had different effects across cell types and states, demonstrating a highly tunable regulatory landscape. We use these data to benchmark performance of sequence-based predictive models of gene regulation, and find that certain types of variants are not accurately predicted by existing models. Finally, we computationally design 185 small sequence variants (≤10 bp) and optimize them for specific effects on expression in silico. 84% of these rationally designed edits showed the intended direction of effect, and some had dramatic effects on expression (-100% to +202%). 
Variant-FlowFISH thus provides a powerful tool to map the effects of variants and transcription factor binding sites on gene expression, test and improve computational models of gene regulation, and reprogram regulatory DNA.
Affiliation(s)
- Gabriella E Martyn
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Science and Engineering Initiative, Stanford Children's Health, Betty Irene Moore Children's Heart Center, Stanford, CA, USA
- Michael T Montgomery
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Science and Engineering Initiative, Stanford Children's Health, Betty Irene Moore Children's Heart Center, Stanford, CA, USA
- Hank Jones
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Science and Engineering Initiative, Stanford Children's Health, Betty Irene Moore Children's Heart Center, Stanford, CA, USA
- Katherine Guo
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Science and Engineering Initiative, Stanford Children's Health, Betty Irene Moore Children's Heart Center, Stanford, CA, USA
- Benjamin R Doughty
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Ziwei Chen
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Kelly Cochran
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Kathryn A Lawrence
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Glen Munson
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Anusri Pampari
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Charles P Fulco
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Present Address: Sanofi, Cambridge, MA, USA
- Eric S Lander
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biology, MIT, Cambridge, MA, USA
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Anshul Kundaje
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Jesse M Engreitz
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Science and Engineering Initiative, Stanford Children's Health, Betty Irene Moore Children's Heart Center, Stanford, CA, USA
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanford Cardiovascular Institute, Stanford University, Stanford, CA, USA
6
Stellner NI, Rerop ZS, Mehlmer N, Masri M, Ringel M, Brück TB. Expanding the genetic toolbox for Cutaneotrichosporon oleaginosus employing newly identified promoters and a novel antibiotic resistance marker. BMC Biotechnol 2023; 23:40. [PMID: 37723521 PMCID: PMC10506223 DOI: 10.1186/s12896-023-00812-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Accepted: 09/08/2023] [Indexed: 09/20/2023]
Abstract
BACKGROUND Cutaneotrichosporon oleaginosus is an oleaginous yeast that can produce up to 80% lipid per dry weight. Its high capacity for the biosynthesis of single cell oil makes it highly interesting for the production of engineered lipids or oleochemicals for industrial applications. However, the genetic toolbox for metabolic engineering of this non-conventional yeast has not yet been systematically expanded. Only three long endogenous promoter sequences have been used for heterologous gene expression; in addition, three dominant markers and one auxotrophic marker have been established. RESULTS In this study, the structure of putative endogenous promoter sequences was analyzed based on more than 280 highly expressed genes. The identified motifs of regulatory elements and translational initiation sites were used to annotate the four endogenous putative promoter sequences D9FADp, UBIp, PPIp, and 60Sp. The promoter sequences were tested in a construct regulating the known dominant marker hygromycin B phosphotransferase. The four newly described promoters and the previously established GAPDHp successfully initiated expression of the resistance gene, and PPIp was selected for further marker development. The geneticin G418 resistance gene (aminoglycoside 3'-phosphotransferase, APH) and the nourseothricin resistance gene N-acetyl transferase (NAT) were tested for applicability in C. oleaginosus. Both markers showed high transformation efficiency and positive rate, and were compatible for combined use in a successive and simultaneous manner. CONCLUSIONS The implementation of four endogenous promoters and one novel dominant resistance marker for C. oleaginosus opens up new opportunities for genetic engineering and strain development. In combination with recently developed methods for targeted genomic integration, the established toolbox allows a wide spectrum of new strategies for genetic and metabolic engineering of this industrially highly relevant yeast.
Affiliation(s)
- Nikolaus I Stellner
- TUM School of Natural Sciences, Department of Chemistry, Werner Siemens-Chair for Synthetic Biotechnology, Technical University of Munich, Lichtenbergstr. 4, 85748, Garching, Germany
- TUM CREATE Ltd, 1 Create Way, #10-02 CREATE Tower, Singapore, 138602, Singapore
- Zora S Rerop
- TUM School of Natural Sciences, Department of Chemistry, Werner Siemens-Chair for Synthetic Biotechnology, Technical University of Munich, Lichtenbergstr. 4, 85748, Garching, Germany
- Norbert Mehlmer
- TUM School of Natural Sciences, Department of Chemistry, Werner Siemens-Chair for Synthetic Biotechnology, Technical University of Munich, Lichtenbergstr. 4, 85748, Garching, Germany
- Mahmoud Masri
- TUM School of Natural Sciences, Department of Chemistry, Werner Siemens-Chair for Synthetic Biotechnology, Technical University of Munich, Lichtenbergstr. 4, 85748, Garching, Germany
- Marion Ringel
- TUM School of Natural Sciences, Department of Chemistry, Werner Siemens-Chair for Synthetic Biotechnology, Technical University of Munich, Lichtenbergstr. 4, 85748, Garching, Germany
- Thomas B Brück
- TUM School of Natural Sciences, Department of Chemistry, Werner Siemens-Chair for Synthetic Biotechnology, Technical University of Munich, Lichtenbergstr. 4, 85748, Garching, Germany.
7
Gosai SJ, Castro RI, Fuentes N, Butts JC, Kales S, Noche RR, Mouri K, Sabeti PC, Reilly SK, Tewhey R. Machine-guided design of synthetic cell type-specific cis-regulatory elements. bioRxiv 2023:2023.08.08.552077. [PMID: 37609287 PMCID: PMC10441439 DOI: 10.1101/2023.08.08.552077] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 08/24/2023]
Abstract
Cis-regulatory elements (CREs) control gene expression, orchestrating tissue identity, developmental timing, and stimulus responses, which collectively define the thousands of unique cell types in the body. While there is great potential for strategically incorporating CREs in therapeutic or biotechnology applications that require tissue specificity, there is no guarantee that an optimal CRE for an intended purpose has arisen naturally through evolution. Here, we present a platform to engineer and validate synthetic CREs capable of driving gene expression with programmed cell type specificity. We leverage innovations in deep neural network modeling of CRE activity across three cell types, efficient in silico optimization, and massively parallel reporter assays (MPRAs) to design and empirically test thousands of CREs. Through in vitro and in vivo validation, we show that synthetic sequences outperform natural sequences from the human genome in driving cell type-specific expression. Synthetic sequences leverage unique sequence syntax to promote activity in the on-target cell type and simultaneously reduce activity in off-target cells. Together, we provide a generalizable framework to prospectively engineer CREs and demonstrate the required literacy to write regulatory code that is fit-for-purpose in vivo across vertebrates.
Affiliation(s)
- SJ Gosai
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard Graduate Program in Biological and Biomedical Science, Boston MA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- RI Castro
- The Jackson Laboratory, Bar Harbor, ME, USA
- N Fuentes
- The Jackson Laboratory, Bar Harbor, ME, USA
- Harvard College, Harvard University, Cambridge, MA, USA
- JC Butts
- The Jackson Laboratory, Bar Harbor, ME, USA
- Graduate School of Biomedical Sciences and Engineering, University of Maine, Orono, ME, USA
- S Kales
- The Jackson Laboratory, Bar Harbor, ME, USA
- RR Noche
- Department of Comparative Medicine, Yale School of Medicine, New Haven, CT, USA
- Yale Zebrafish Research Core, Yale School of Medicine, New Haven, CT, USA
- K Mouri
- The Jackson Laboratory, Bar Harbor, ME, USA
- PC Sabeti
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- SK Reilly
- Department of Genetics, Yale School of Medicine, New Haven, CT, USA
- Wu Tsai Institute, Yale University, New Haven, CT, USA
- R Tewhey
- The Jackson Laboratory, Bar Harbor, ME, USA
- Graduate School of Biomedical Sciences and Engineering, University of Maine, Orono, ME, USA
- Graduate School of Biomedical Sciences, Tufts University School of Medicine, Boston, MA, USA
8
Hartwig T, Banf M, Prietsch GP, Zhu JY, Mora-Ramírez I, Schippers JHM, Snodgrass SJ, Seetharam AS, Huettel B, Kolkman JM, Yang J, Engelhorn J, Wang ZY. Hybrid allele-specific ChIP-seq analysis identifies variation in brassinosteroid-responsive transcription factor binding linked to traits in maize. Genome Biol 2023; 24:108. [PMID: 37158941 PMCID: PMC10165856 DOI: 10.1186/s13059-023-02909-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2022] [Accepted: 03/23/2023] [Indexed: 05/10/2023]
Abstract
BACKGROUND Genetic variation in regulatory sequences that alter transcription factor (TF) binding is a major cause of phenotypic diversity. Brassinosteroid is a growth hormone that has major effects on plant phenotypes. Genetic variation in brassinosteroid-responsive cis-elements likely contributes to trait variation. Pinpointing such regulatory variations and quantitative genomic analysis of the variation in TF-target binding, however, remains challenging. How variation in transcriptional targets of signaling pathways such as the brassinosteroid pathway contributes to phenotypic variation is an important question to be investigated with innovative approaches. RESULTS Here, we use a hybrid allele-specific chromatin binding sequencing (HASCh-seq) approach and identify variations in target binding of the brassinosteroid-responsive TF ZmBZR1 in maize. HASCh-seq in the B73xMo17 F1s identifies thousands of target genes of ZmBZR1. Allele-specific ZmBZR1 binding (ASB) has been observed for 18.3% of target genes and is enriched in promoter and enhancer regions. About a quarter of the ASB sites correlate with sequence variation in BZR1-binding motifs and another quarter correlate with haplotype-specific DNA methylation, suggesting that both genetic and epigenetic variations contribute to the high level of variation in ZmBZR1 occupancy. Comparison with GWAS data shows linkage of hundreds of ASB loci to important yield and disease-related traits. CONCLUSION Our study provides a robust method for analyzing genome-wide variations of TF occupancy and identifies genetic and epigenetic variations of the brassinosteroid response transcription network in maize.
Affiliation(s)
- Thomas Hartwig
- Department of Plant Biology, Carnegie Institution for Science, 260 Panama Street, Stanford, CA, 94305, USA.
- Heinrich-Heine University, Universitätsstraße 1, Düsseldorf, NRW, 40225, Germany.
- Max Planck Institute for Plant Breeding Research, Carl-von-Linné-Weg 10, Cologne, NRW, 50829, Germany.
- Michael Banf
- Department of Plant Biology, Carnegie Institution for Science, 260 Panama Street, Stanford, CA, 94305, USA
- Gisele Passaia Prietsch
- Department of Plant Biology, Carnegie Institution for Science, 260 Panama Street, Stanford, CA, 94305, USA
- Jia-Ying Zhu
- Leibniz-Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, Seeland, SA, 06466, Germany
- Isabel Mora-Ramírez
- Leibniz-Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, Seeland, SA, 06466, Germany
- Jos H M Schippers
- Leibniz-Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, Seeland, SA, 06466, Germany
- Samantha J Snodgrass
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, 339A Bessey Hall, Ames, IA, 50011, USA
- Arun S Seetharam
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, 339A Bessey Hall, Ames, IA, 50011, USA
- Bruno Huettel
- Max Planck Institute for Plant Breeding Research, Carl-von-Linné-Weg 10, Cologne, NRW, 50829, Germany
- Judith M Kolkman
- School of Integrative Plant Science, Plant Pathology and Plant-Microbe Biology Section, Cornell University, 413 Bradfield Hall, Ithaca, NY, 14853, USA
- Jinliang Yang
- Department of Agronomy and Horticulture, University of Nebraska-Lincoln, 363 Keim Hall, Lincoln, NE, 68583, USA
- Julia Engelhorn
- Heinrich-Heine University, Universitätsstraße 1, Düsseldorf, NRW, 40225, Germany
- Max Planck Institute for Plant Breeding Research, Carl-von-Linné-Weg 10, Cologne, NRW, 50829, Germany
- Zhi-Yong Wang
- Department of Plant Biology, Carnegie Institution for Science, 260 Panama Street, Stanford, CA, 94305, USA.
9
Li M, Yao T, Lin W, Hinckley WE, Galli M, Muchero W, Gallavotti A, Chen JG, Huang SSC. Double DAP-seq uncovered synergistic DNA binding of interacting bZIP transcription factors. Nat Commun 2023; 14:2600. [PMID: 37147307 PMCID: PMC10163045 DOI: 10.1038/s41467-023-38096-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2022] [Accepted: 04/15/2023] [Indexed: 05/07/2023]
Abstract
Many eukaryotic transcription factors (TFs) form homodimer or heterodimer complexes to regulate gene expression. Dimerization of BASIC LEUCINE ZIPPER (bZIP) TFs is critical for their function, but the molecular mechanism underlying the DNA binding and functional specificity of homo- versus heterodimers remains elusive. To address this gap, we present the double DNA Affinity Purification-sequencing (dDAP-seq) technique, which maps heterodimer binding sites on endogenous genomic DNA. Using dDAP-seq, we profile twenty pairs of C/S1 bZIP heterodimers and S1 homodimers in Arabidopsis and show that heterodimerization significantly expands the DNA binding preferences of these TFs. Analysis of dDAP-seq binding sites reveals the function of bZIP9 in abscisic acid response and the role of bZIP53 heterodimer-specific binding in seed maturation. The C/S1 heterodimers show distinct preferences for the ACGT elements recognized by plant bZIPs and for motifs resembling the yeast GCN4 cis-elements. This study demonstrates the potential of dDAP-seq in deciphering the DNA binding specificities of interacting TFs that are key for combinatorial gene regulation.
Affiliation(s)
- Miaomiao Li
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, 10003, USA
- Tao Yao
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
- Wanru Lin
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, 10003, USA
- Will E Hinckley
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, 10003, USA
- Mary Galli
- Waksman Institute of Microbiology, Rutgers University, Piscataway, NJ, 08854-8020, USA
- Wellington Muchero
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
- Andrea Gallavotti
- Waksman Institute of Microbiology, Rutgers University, Piscataway, NJ, 08854-8020, USA
- Jin-Gui Chen
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
- Shao-Shan Carol Huang
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, 10003, USA.
10
Song BP, Ragsac MF, Tellez K, Jindal GA, Grudzien JL, Le SH, Farley EK. Diverse logics and grammar encode notochord enhancers. Cell Rep 2023; 42:112052. [PMID: 36729834 PMCID: PMC10387507 DOI: 10.1016/j.celrep.2023.112052] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 07/29/2022] [Revised: 11/07/2022] [Accepted: 01/17/2023] [Indexed: 02/03/2023]
Abstract
The notochord is a defining feature of all chordates. The transcription factors Zic and ETS regulate enhancer activity within the notochord. We conduct high-throughput screens of genomic elements within developing Ciona embryos to understand how Zic and ETS sites encode notochord activity. Our screen discovers an enhancer located near Lama, a gene critical for notochord development. Reversing the orientation of an ETS site within this enhancer abolishes expression, indicating that enhancer grammar is critical for notochord activity. Similarly organized clusters of Zic and ETS sites occur within mouse and human Lama1 introns. Within a Brachyury (Bra) enhancer, FoxA and Bra, in combination with Zic and ETS binding sites, are necessary and sufficient for notochord expression. This binding site logic also occurs within other Ciona and vertebrate Bra enhancers. Collectively, this study uncovers the importance of grammar within notochord enhancers and discovers signatures of enhancer logic and grammar conserved across chordates.
Affiliation(s)
- Benjamin P Song
- Department of Medicine, Health Sciences, University of California San Diego, La Jolla, CA 92093, USA; Department of Molecular Biology, Biological Sciences, University of California San Diego, La Jolla, CA 92093, USA; Biological Sciences Graduate Program, University of California San Diego, La Jolla, CA 92093, USA
| | - Michelle F Ragsac
- Department of Medicine, Health Sciences, University of California San Diego, La Jolla, CA 92093, USA; Department of Molecular Biology, Biological Sciences, University of California San Diego, La Jolla, CA 92093, USA; Bioinformatics and Systems Biology Graduate Program, University of California San Diego, La Jolla, CA 92093, USA
| | - Krissie Tellez
- Department of Medicine, Health Sciences, University of California San Diego, La Jolla, CA 92093, USA; Department of Molecular Biology, Biological Sciences, University of California San Diego, La Jolla, CA 92093, USA
| | - Granton A Jindal
- Department of Medicine, Health Sciences, University of California San Diego, La Jolla, CA 92093, USA; Department of Molecular Biology, Biological Sciences, University of California San Diego, La Jolla, CA 92093, USA
| | - Jessica L Grudzien
- Department of Medicine, Health Sciences, University of California San Diego, La Jolla, CA 92093, USA; Department of Molecular Biology, Biological Sciences, University of California San Diego, La Jolla, CA 92093, USA
| | - Sophia H Le
- Department of Medicine, Health Sciences, University of California San Diego, La Jolla, CA 92093, USA; Department of Molecular Biology, Biological Sciences, University of California San Diego, La Jolla, CA 92093, USA
| | - Emma K Farley
- Department of Medicine, Health Sciences, University of California San Diego, La Jolla, CA 92093, USA; Department of Molecular Biology, Biological Sciences, University of California San Diego, La Jolla, CA 92093, USA.
| |
|
11
|
Kari H, Bandi SMS, Kumar A, Yella VR. DeePromClass: Delineator for Eukaryotic Core Promoters Employing Deep Neural Networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:802-807. [PMID: 35353704 DOI: 10.1109/tcbb.2022.3163418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Computational promoter identification in eukaryotes is a classical biological problem that should be revisited given the avalanche of experimental data now available and emerging deep learning technologies. Current knowledge indicates that eukaryotic core promoters display multifarious signals such as the TATA-Box, Inr element, TCT, and Pause-button, and structural motifs such as G-quadruplexes. In the present study, we combined the power of deep learning with a plethora of promoter motifs to delineate promoters and non-promoters gleaned from the statistical properties of DNA sequence arrangement. To this end, we implemented a convolutional neural network (CNN) and long short-term memory (LSTM) recurrent neural network architecture for five model systems, with the [-100 to +50] segment relative to the transcription start site taken as the core promoter. Unlike previous state-of-the-art tools, which furnish a binary decision of promoter or non-promoter, we classify a 151-mer sequence as a promoter, along with its consensus signal type, or as a non-promoter. The combined CNN-LSTM model, which we call "DeePromClass", achieved testing accuracies of 90.6%, 93.6%, 91.8%, 86.5%, and 84.0% for S. cerevisiae, C. elegans, D. melanogaster, Mus musculus, and Homo sapiens, respectively. In total, our tool provides an insightful update on next-generation promoter prediction tools for promoter biologists.
|
12
|
Liang Y, Xu H, Cheng T, Fu Y, Huang H, Qian W, Wang J, Zhou Y, Qian P, Yin Y, Xu P, Zou W, Chen B. Gene activation guided by nascent RNA-bound transcription factors. Nat Commun 2022; 13:7329. [PMID: 36443367 PMCID: PMC9705438 DOI: 10.1038/s41467-022-35041-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/02/2022] [Accepted: 11/16/2022] [Indexed: 11/29/2022] Open
Abstract
Technologies for gene activation are valuable tools for the study of gene functions and have a wide range of potential applications in bioengineering and medicine. In contrast to existing methods based on recruiting transcriptional modulators via DNA-binding proteins, we developed a strategy termed Narta (nascent RNA-guided transcriptional activation) to achieve gene activation by recruiting artificial transcription factors (aTFs) to transcription sites through nascent RNAs of the target gene. Using Narta, we demonstrate robust activation of a broad range of exogenous and endogenous genes in various cell types, including zebrafish embryos, mouse and human cells. Importantly, the activation is reversible, tunable and specific. Moreover, Narta provides better activation potency of some expressed genes than CRISPRa and, when used in combination with CRISPRa, has an enhancing effect on gene activation. Quantitative imaging illustrated that nascent RNA-directed aTFs could induce the high-density assembly of coactivators at transcription sites, which may explain the larger transcriptional burst size induced by Narta. Overall, our work expands the gene activation toolbox for biomedical research.
Affiliation(s)
- Ying Liang
- Department of Cell Biology and Bone Marrow Transplantation Center of the First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China; Liangzhu Laboratory, Zhejiang University Medical Center, Hangzhou, China
| | - Haiyue Xu
- Department of Cell Biology and Bone Marrow Transplantation Center of the First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China; Liangzhu Laboratory, Zhejiang University Medical Center, Hangzhou, China
| | - Tao Cheng
- Women’s Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Yujuan Fu
- Department of Cell Biology and Bone Marrow Transplantation Center of the First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Hanwei Huang
- Department of Cell Biology and Bone Marrow Transplantation Center of the First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Wenchang Qian
- Center of Stem Cell and Regenerative Medicine, Zhejiang University School of Medicine, Hangzhou, China
| | - Junyan Wang
- Department of Cell Biology and Bone Marrow Transplantation Center of the First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Yuenan Zhou
- Department of Cell Biology, Zhejiang University School of Medicine, Hangzhou, China
| | - Pengxu Qian
- Center of Stem Cell and Regenerative Medicine, Zhejiang University School of Medicine, Hangzhou, China
| | - Yafei Yin
- Department of Cell Biology, Zhejiang University School of Medicine, Hangzhou, China
| | - Pengfei Xu
- Women’s Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Wei Zou
- The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu, China; Institute of Translational Medicine, Zhejiang University, Hangzhou, China
| | - Baohui Chen
- Department of Cell Biology and Bone Marrow Transplantation Center of the First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China; Liangzhu Laboratory, Zhejiang University Medical Center, Hangzhou, China; Institute of Hematology, Zhejiang University & Zhejiang Engineering Laboratory for Stem Cell and Immunotherapy, Hangzhou, China; Zhejiang Provincial Key Laboratory of Genetic & Developmental Disorders, Hangzhou, China
| |
|
13
|
Patra P, Gao YQ. Sequence-Specific Structural Features and Solvation Properties of Transcription Factor Binding DNA Motifs: Insights from Molecular Dynamics Simulation. J Phys Chem B 2022; 126:9187-9206. [PMID: 36322688 DOI: 10.1021/acs.jpcb.2c05749] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
Abstract
Sequence-specific recognition of transcription factor (TF) binding motifs in the target site of DNA over the vast amount of non-target DNA is of primary importance for the transcriptional regulation of gene expression by TFs. Binding of TFs to the target site of DNA relies not only on direct contact formation but also on the structural and conformational features of DNA. Recognition of DNA structural features, or shape readout, by proteins is an important factor in TF-DNA interaction. Based on atomistic molecular simulations, we report the sequence-dependent structural features, solvation, and ion-binding properties of biologically relevant AT- and GC-rich human TF binding motifs in DNA. Counterion and water distribution around the motif is found to be sensitive to the motif sequence, which is accompanied by the DNA shape features. The motif sequence affects the electrostatic potential along the grooves, and cytosine methylation alters the DNA shape features. Characteristic solvation properties of TF binding motif DNA fragments imply that the ionic environment and hydration are essential to describing TF-DNA interactions.
Affiliation(s)
- Piya Patra
- Shenzhen Bay Laboratory, Institute of Systems and Physical Biology, Shenzhen 518107, China
| | - Yi Qin Gao
- Shenzhen Bay Laboratory, Institute of Systems and Physical Biology, Shenzhen 518107, China; Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China; Biomedical Pioneering Innovation Center, Peking University, Beijing 100871, China
| |
|
14
|
Fernandez-Lopez R, Ruiz R, del Campo I, Gonzalez-Montes L, Boer D, de la Cruz F, Moncalian G. Structural basis of direct and inverted DNA sequence repeat recognition by helix-turn-helix transcription factors. Nucleic Acids Res 2022; 50:11938-11947. [PMID: 36370103 PMCID: PMC9723621 DOI: 10.1093/nar/gkac1024] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/15/2022] [Revised: 10/13/2022] [Accepted: 10/25/2022] [Indexed: 11/13/2022] Open
Abstract
Some transcription factors bind DNA motifs containing direct or inverted sequence repeats. Preference for each of these DNA topologies is dictated by structural constraints. Most prokaryotic regulators form symmetric oligomers, which require operators with a dyad structure. Binding to direct repeats requires breaking the internal symmetry, a property restricted to a few regulators, most of them from the AraC family. The KorA family of transcriptional repressors, involved in plasmid propagation and stability, includes members that form symmetric dimers and recognize inverted repeats. Our structural analyses show that ArdK, a member of this family, can form a symmetric dimer similar to that observed for KorA, yet it binds direct sequence repeats as a non-symmetric dimer. This is possible by the 180° rotation of one of the helix-turn-helix domains. We then probed and confirmed that ArdK shows affinity for an inverted repeat, which, surprisingly, is also recognized by a non-symmetrical dimer. Our results indicate that structural flexibility at different positions in the dimerization interface constrains transcription factors to bind DNA sequences with one of these two alternative DNA topologies.
Affiliation(s)
- Raul Fernandez-Lopez
- Departamento de Biología Molecular, Universidad de Cantabria and Instituto de Biomedicina y Biotecnología de Cantabria (IBBTEC), CSIC-Universidad de Cantabria, 39011, Santander, Spain
| | - Raul Ruiz
- Departamento de Biología Molecular, Universidad de Cantabria and Instituto de Biomedicina y Biotecnología de Cantabria (IBBTEC), CSIC-Universidad de Cantabria, 39011, Santander, Spain
| | - Irene del Campo
- Departamento de Biología Molecular, Universidad de Cantabria and Instituto de Biomedicina y Biotecnología de Cantabria (IBBTEC), CSIC-Universidad de Cantabria, 39011, Santander, Spain
| | - Lorena Gonzalez-Montes
- Departamento de Biología Molecular, Universidad de Cantabria and Instituto de Biomedicina y Biotecnología de Cantabria (IBBTEC), CSIC-Universidad de Cantabria, 39011, Santander, Spain
| | - D Roeland Boer
- Alba Synchrotron, Cerdanyola del Vallès, 08290, Barcelona, Spain
| | | | | |
|
15
|
Donohue LK, Guo MG, Zhao Y, Jung N, Bussat RT, Kim DS, Neela PH, Kellman LN, Garcia OS, Meyers RM, Altman RB, Khavari PA. A cis-regulatory lexicon of DNA motif combinations mediating cell-type-specific gene regulation. Cell Genomics 2022; 2:100191. [PMID: 36742369 PMCID: PMC9894309 DOI: 10.1016/j.xgen.2022.100191] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
Abstract
Gene expression is controlled by transcription factors (TFs) that bind cognate DNA motif sequences in cis-regulatory elements (CREs). The combinations of DNA motifs acting within homeostasis and disease, however, are unclear. Gene expression, chromatin accessibility, TF footprinting, and H3K27ac-dependent DNA looping data were generated and a random-forest-based model was applied to identify 7,531 cell-type-specific cis-regulatory modules (CRMs) across 15 diploid human cell types. A co-enrichment framework within CRMs nominated 838 cell-type-specific, recurrent heterotypic DNA motif combinations (DMCs), which were functionally validated using massively parallel reporter assays. Cancer cells engaged DMCs linked to neoplasia-enabling processes operative in normal cells while also activating new DMCs only seen in the neoplastic state. This integrative approach identifies cell-type-specific cis-regulatory combinatorial DNA motifs in diverse normal and diseased human cells and represents a general framework for deciphering cis-regulatory sequence logic in gene regulation.
Affiliation(s)
- Laura K.H. Donohue
- Program in Epithelial Biology, Stanford University School of Medicine, Stanford, CA, USA; Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA; Synthego, Redwood City, CA, USA; these authors contributed equally
| | - Margaret G. Guo
- Program in Epithelial Biology, Stanford University School of Medicine, Stanford, CA, USA; Stanford Program in Biomedical Informatics, Stanford University, Stanford, CA, USA; these authors contributed equally
| | - Yang Zhao
- Program in Epithelial Biology, Stanford University School of Medicine, Stanford, CA, USA; Synthego, Redwood City, CA, USA
| | - Namyoung Jung
- Program in Epithelial Biology, Stanford University School of Medicine, Stanford, CA, USA; Department of Life Science, Pohang University of Science and Technology, Pohang, Korea
| | - Rose T. Bussat
- Program in Epithelial Biology, Stanford University School of Medicine, Stanford, CA, USA; 23andMe, Inc., Sunnyvale, CA, USA
| | - Daniel S. Kim
- Program in Epithelial Biology, Stanford University School of Medicine, Stanford, CA, USA; Stanford Program in Biomedical Informatics, Stanford University, Stanford, CA, USA
| | - Poornima H. Neela
- Program in Epithelial Biology, Stanford University School of Medicine, Stanford, CA, USA; Fauna Bio, Emeryville, CA, USA
| | - Laura N. Kellman
- Program in Epithelial Biology, Stanford University School of Medicine, Stanford, CA, USA; Stanford Program in Cancer Biology, Stanford University, Stanford, CA, USA
| | - Omar S. Garcia
- Program in Epithelial Biology, Stanford University School of Medicine, Stanford, CA, USA
| | - Robin M. Meyers
- Program in Epithelial Biology, Stanford University School of Medicine, Stanford, CA, USA; Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Russ B. Altman
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA; Stanford Program in Biomedical Informatics, Stanford University, Stanford, CA, USA; Department of Bioengineering, Stanford University, Stanford, CA, USA
| | - Paul A. Khavari
- Program in Epithelial Biology, Stanford University School of Medicine, Stanford, CA, USA; Stanford Program in Cancer Biology, Stanford University, Stanford, CA, USA; Veterans Affairs Palo Alto Healthcare System, Palo Alto, CA, USA; lead contact
| |
|
16
|
Ghoshdastidar D, Bansal M. Flexibility of flanking DNA is a key determinant of transcription factor affinity for the core motif. Biophys J 2022; 121:3987-4000. [PMID: 35978548 PMCID: PMC9674967 DOI: 10.1016/j.bpj.2022.08.015] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 03/21/2022] [Revised: 07/28/2022] [Accepted: 08/15/2022] [Indexed: 11/02/2022] Open
Abstract
Selective gene regulation is mediated by recognition of specific DNA sequences by transcription factors (TFs). The extremely challenging task of searching out specific cognate DNA binding sites among several million putative sites within the eukaryotic genome is achieved by complex molecular recognition mechanisms. Elements of this recognition code include the core binding sequence, the flanking sequence context, and the shape and conformational flexibility of the composite binding site. To unravel the extent to which DNA flexibility modulates TF binding, in this study, we employed experimentally guided molecular dynamics simulations of ternary complex of closely related Hox heterodimers Exd-Ubx and Exd-Scr with DNA. Results demonstrate that flexibility signatures embedded in the flanking sequences impact TF binding at the cognate binding site. A DNA sequence has intrinsic shape and flexibility features. While shape features are localized, our analyses reveal that flexibility features of the flanking sequences percolate several basepairs and allosterically modulate TF binding at the core. We also show that lack of flexibility in the motif context can render the cognate site resistant to protein-induced shape changes and subsequently lower TF binding affinity. Overall, this study suggests that flexibility-guided DNA shape, and not merely the static shape, is a key unexplored component of the complex DNA-TF recognition code.
Affiliation(s)
| | - Manju Bansal
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560012, Karnataka, India.
| |
|
17
|
Yang MG, Ling E, Cowley CJ, Greenberg ME, Vierbuchen T. Characterization of sequence determinants of enhancer function using natural genetic variation. eLife 2022; 11:76500. [PMID: 36043696 PMCID: PMC9662815 DOI: 10.7554/elife.76500] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 12/18/2021] [Accepted: 08/30/2022] [Indexed: 02/04/2023] Open
Abstract
Sequence variation in enhancers that control cell-type-specific gene transcription contributes significantly to phenotypic variation within human populations. However, it remains difficult to predict precisely the effect of any given sequence variant on enhancer function due to the complexity of DNA sequence motifs that determine transcription factor (TF) binding to enhancers in their native genomic context. Using F1-hybrid cells derived from crosses between distantly related inbred strains of mice, we identified thousands of enhancers with allele-specific TF binding and/or activity. We find that genetic variants located within the central region of enhancers are most likely to alter TF binding and enhancer activity. We observe that the AP-1 family of TFs (Fos/Jun) are frequently required for binding of TEAD TFs and for enhancer function. However, many sequence variants outside of core motifs for AP-1 and TEAD also impact enhancer function, including sequences flanking core TF motifs and AP-1 half sites. Taken together, these data represent one of the most comprehensive assessments of allele-specific TF binding and enhancer function to date and reveal how sequence changes at enhancers alter their function across evolutionary timescales.
Affiliation(s)
- Marty G Yang
- Department of Neurobiology, Harvard Medical School, Boston, United States; Program in Neuroscience, Harvard Medical School, Boston, United States
| | - Emi Ling
- Department of Neurobiology, Harvard Medical School, Boston, United States
| | | | | | - Thomas Vierbuchen
- Developmental Biology Program, Sloan Kettering Institute for Cancer Research, New York, United States; Center for Stem Cell Biology, Sloan Kettering Institute for Cancer Research, New York, United States
| |
|
18
|
Barissi S, Sala A, Wieczór M, Battistini F, Orozco M. DNAffinity: a machine-learning approach to predict DNA binding affinities of transcription factors. Nucleic Acids Res 2022; 50:9105-9114. [PMID: 36018808 PMCID: PMC9458447 DOI: 10.1093/nar/gkac708] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 01/20/2022] [Revised: 07/21/2022] [Accepted: 08/08/2022] [Indexed: 12/24/2022] Open
Abstract
We present a physics-based machine learning approach to predict in vitro transcription factor binding affinities from structural and mechanical DNA properties directly derived from atomistic molecular dynamics simulations. The method is able to predict affinities obtained with techniques as different as uPBM, gcPBM and HT-SELEX with excellent performance, much better than existing algorithms. Due to its nature, the method can be extended to epigenetic variants, mismatches, mutations, or any non-coding nucleobases. When complemented with chromatin structure information, our in vitro trained method also provides good estimates of in vivo binding sites in yeast.
Affiliation(s)
| | | | - Miłosz Wieczór
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac 10–12, 08028 Barcelona, Spain; Department of Physical Chemistry, Gdansk University of Technology, 80-233 Gdańsk, Poland
| | | | - Modesto Orozco
- Correspondence may also be addressed to Modesto Orozco. Tel: +34 934 037 156;
| |
|
19
|
DeepSTARR predicts enhancer activity from DNA sequence and enables the de novo design of synthetic enhancers. Nat Genet 2022; 54:613-624. [PMID: 35551305 DOI: 10.1038/s41588-022-01048-5] [Citation(s) in RCA: 69] [Impact Index Per Article: 34.5] [Received: 09/20/2021] [Accepted: 03/08/2022] [Indexed: 02/06/2023]
Abstract
Enhancer sequences control gene expression and comprise binding sites (motifs) for different transcription factors (TFs). Despite extensive genetic and computational studies, the relationship between DNA sequence and regulatory activity is poorly understood, and de novo enhancer design has been challenging. Here, we built a deep-learning model, DeepSTARR, to quantitatively predict the activities of thousands of developmental and housekeeping enhancers directly from DNA sequence in Drosophila melanogaster S2 cells. The model learned relevant TF motifs and higher-order syntax rules, including functionally nonequivalent instances of the same TF motif that are determined by motif-flanking sequence and intermotif distances. We validated these rules experimentally and demonstrated that they can be generalized to humans by testing more than 40,000 wildtype and mutant Drosophila and human enhancers. Finally, we designed and functionally validated synthetic enhancers with desired activities de novo.
|
20
|
Tareen A, Kooshkbaghi M, Posfai A, Ireland WT, McCandlish DM, Kinney JB. MAVE-NN: learning genotype-phenotype maps from multiplex assays of variant effect. Genome Biol 2022; 23:98. [PMID: 35428271 PMCID: PMC9011994 DOI: 10.1186/s13059-022-02661-7] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 07/16/2021] [Accepted: 03/24/2022] [Indexed: 12/17/2022] Open
Abstract
Multiplex assays of variant effect (MAVEs) are a family of methods that includes deep mutational scanning experiments on proteins and massively parallel reporter assays on gene regulatory sequences. Despite their increasing popularity, a general strategy for inferring quantitative models of genotype-phenotype maps from MAVE data is lacking. Here we introduce MAVE-NN, a neural-network-based Python package that implements a broadly applicable information-theoretic framework for learning genotype-phenotype maps—including biophysically interpretable models—from MAVE datasets. We demonstrate MAVE-NN in multiple biological contexts, and highlight the ability of our approach to deconvolve mutational effects from otherwise confounding experimental nonlinearities and noise.
|
21
|
Kim NM, Sinnott RW, Rothschild LN, Sandoval NR. Elucidation of Sequence-Function Relationships for an Improved Biobutanol In Vivo Biosensor in E. coli. Front Bioeng Biotechnol 2022; 10:821152. [PMID: 35265600 PMCID: PMC8899819 DOI: 10.3389/fbioe.2022.821152] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2021] [Accepted: 01/17/2022] [Indexed: 11/30/2022] Open
Abstract
Transcription factor (TF)–promoter pairs have been repurposed from native hosts to provide tools to measure intracellular biochemical production titer and dynamically control gene expression. Most often, native TF–promoter systems require rigorous screening to obtain desirable characteristics optimized for biotechnological applications. High-throughput techniques may provide a rational and less labor-intensive strategy to engineer user-defined TF–promoter pairs using fluorescence-activated cell sorting and deep sequencing methods (sort-seq). Based on the designed promoter library’s distribution characteristics, we elucidate sequence–function interactions between the TF and DNA. In this work, we use the sort-seq method to study the sequence–function relationship of a σ54-dependent, butanol-responsive TF–promoter pair, BmoR-PBMO derived from Thauera butanivorans, at the nucleotide level to improve biosensor characteristics, specifically the dynamic range. Promoters from a mutagenized PBMO library were sorted by activity based on gfp expression and subsequently deep sequenced to correlate site-specific sequences with changes in dynamic range. We identified site-specific mutations that increase the sensor output. A double mutant and a single mutant in the PBMO promoter, CA(129,130)TC and G(205)A, increased the dynamic range 4-fold and 1.65-fold, respectively, compared with the native system. In addition, sort-seq identified essential sites required for the proper function of the σ54-dependent promoter biosensor in the context of the host. This work can enable high-throughput screening methods for strain development.
Affiliation(s)
- Nancy M Kim
- Interdisciplinary Bioinnovation PhD Program, Tulane University, New Orleans, LA, United States
| | - Riley W Sinnott
- Department of Chemical & Biomolecular Engineering, Tulane University, New Orleans, LA, United States
| | - Lily N Rothschild
- Department of Chemical & Biomolecular Engineering, Tulane University, New Orleans, LA, United States
| | - Nicholas R Sandoval
- Department of Chemical & Biomolecular Engineering, Tulane University, New Orleans, LA, United States
| |
|
22
|
Vanaja A, Yella VR. Delineation of the DNA Structural Features of Eukaryotic Core Promoter Classes. ACS Omega 2022; 7:5657-5669. [PMID: 35224327 PMCID: PMC8867553 DOI: 10.1021/acsomega.1c04603] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/23/2021] [Accepted: 01/27/2022] [Indexed: 05/02/2023]
Abstract
Eukaryotic transcription is orchestrated from a DNA region known as the core promoter. Multifarious and precisely positioned core promoter signals, viz., TATA-box, Inr, BREs, and Pause Button, are associated with a subset of genes and regulate their spatiotemporal expression. However, the core promoter architecture linked with these signals has not been investigated exhaustively for several species. In this study, we attempted to envisage the adaptive binding landscape of the transcription initiation machinery as a function of DNA structure. To this end, we deployed a set of k-mer based DNA structural estimates and regular expression models derived from experiments, molecular dynamics simulations, and theoretical frameworks, and high-throughput promoter data sets retrieved from the eukaryotic promoter database. We categorized protein-coding gene core promoters based on characteristic motifs at precise locations and analyzed the B-DNA structural properties and non-B-DNA structural motifs for 15 different eukaryotic genomes. We observed that Inr, BREd, and no-motif classes display common patterns of DNA sequence and structural environment. TATA-containing, BREu, and Pause Button classes show deviant behavior, with the TATA class displaying varied axial and twisting flexibility while BREu and Pause Button leaned toward G-quadruplex motif enrichment. Intriguingly, DNA meltability and shape signals are conserved irrespective of the presence or absence of distinct core promoter motifs in the majority of species. Altogether, we delineated the conserved DNA structural signals associated with several promoter classes that may contribute to the chromatin configuration, orchestration of transcription machinery, and DNA duplex melting during the transcription process.
Affiliation(s)
- Akkinepally Vanaja
- Department of Biotechnology, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur 522502, Andhra Pradesh, India
- KL College of Pharmacy, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur 522502, Andhra Pradesh, India
| | - Venkata Rajesh Yella
- Department of Biotechnology, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur 522502, Andhra Pradesh, India
- Tel: +91-863-2399999, Extn-1021. Website: https://www.kluniversity.in/bt/faculty-list.aspx
| |
|
23
|
Wu T, Jiang D, Zou M, Sun W, Wu D, Cui J, Huntress I, Peng X, Li G. Coupling high-throughput mapping with proteomics analysis delineates cis-regulatory elements at high resolution. Nucleic Acids Res 2022; 50:e5. [PMID: 34634809 PMCID: PMC8754656 DOI: 10.1093/nar/gkab890] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/19/2021] [Revised: 08/20/2021] [Accepted: 09/17/2021] [Indexed: 12/30/2022] Open
Abstract
Growing evidence suggests that functional cis-regulatory elements (cis-REs) not only exist in epigenetically marked but also in unmarked sites of the human genome. While it is already difficult to identify cis-REs in the epigenetically marked sites, interrogating cis-REs residing within the unmarked sites is even more challenging. Here, we report adapting Reel-seq, an in vitro high-throughput (HTP) technique, to fine-map cis-REs at high resolution over a large region of the human genome in a systematic and continuous manner. Using Reel-seq, as a proof-of-principle, we identified 408 candidate cis-REs by mapping a 58 kb core region on the aging-related CDKN2A/B locus that harbors p16INK4a. By coupling Reel-seq with FREP-MS, a proteomics analysis technique, we characterized two cis-REs, one in an epigenetically marked site and the other in an epigenetically unmarked site. These elements are shown to regulate the p16INK4a expression over an ∼100 kb distance by recruiting the poly(A) binding protein PABPC1 and the transcription factor FOXC2. Downregulation of either PABPC1 or FOXC2 in human endothelial cells (ECs) can induce the p16INK4a-dependent cellular senescence. Thus, we confirmed the utility of Reel-seq and FREP-MS analyses for the systematic identification of cis-REs at high resolution over a large region of the human genome.
Affiliation(s)
- Ting Wu
- Aging Institute, University of Pittsburgh, Pittsburgh, PA 15219, USA
- Department of Medicine, Xiangya School of Medicine, Central South University, Changsha 410083, China
| | - Danli Jiang
- Aging Institute, University of Pittsburgh, Pittsburgh, PA 15219, USA
| | - Meijuan Zou
- Aging Institute, University of Pittsburgh, Pittsburgh, PA 15219, USA
| | - Wei Sun
- Center for Pulmonary Vascular Biology and Medicine, Pittsburgh Heart, Lung, Blood, and Vascular Medicine Institute, University of Pittsburgh School of Medicine and University of Pittsburgh Medical Center, Pittsburgh, PA 15261, USA
| | - Di Wu
- Division of Oral Craniofacial Health Science, Adams School of Dentistry, Department of Biostatistics, UNC Gillings School of Global Public Health, University of North Carolina, NC 27599, USA
| | - Jing Cui
- Department of Medicine, Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital, Boston, MA 02115, USA
| | - Ian Huntress
- Department of Molecular Biomedical Sciences, North Carolina State University College of Veterinary Medicine, Raleigh, NC 27607, USA
- Bioinformatics Graduate Program, North Carolina State University, Raleigh, NC 27695, USA
| | - Xinxia Peng
- Bioinformatics Graduate Program, North Carolina State University, Raleigh, NC 27695, USA
- Bioinformatics Research Center, North Carolina State University, Raleigh, NC 27695, USA
| | - Gang Li
- Aging Institute, University of Pittsburgh, Pittsburgh, PA 15219, USA
- Department of Medicine, Division of Cardiology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15223, USA
| |
|
24
|
Ren N, Li B, Liu Q, Yang L, Liu X, Huang Q. Dinucleotide tag-based parallel reporter gene assay method enables efficient identification of regulatory mutations. Biotechnol J 2021; 17:e2100341. [PMID: 34894203 DOI: 10.1002/biot.202100341] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2021] [Revised: 12/07/2021] [Accepted: 12/09/2021] [Indexed: 11/06/2022]
Abstract
BACKGROUND The causal single nucleotide polymorphisms (SNPs) leading to increased cancer predisposition mainly function as gene regulatory elements, the evaluation of which largely relies on the parallel reporter gene assay system. However, the common DNA barcodes used in parallel reporter gene assay systems typically cause nucleotide composition bias, and many barcodes must be allocated for each sequence to reduce the bias effect. MAIN METHODS AND MAJOR RESULTS Here, a versatile dinucleotide-tag reporter system (DiR) that enables parallel analysis of regulatory elements with minimized bias based on next-generation sequencing is described. The DiR system is more robust than the classical luciferase assay method, particularly for the investigation of moderate-level regulatory elements. The authors applied the DiR-seq assay in the functional evaluation of SNPs associated with prostate cancer risk and nominated two and six regulatory SNPs in PC-3 and LNCaP cells, respectively. CONCLUSIONS AND IMPLICATIONS The DiR system has great potential to advance the functional study of SNPs associated with polygenic disease risks.
Affiliation(s)
- Naixia Ren
- Shandong Provincial Key Laboratory of Animal Cell and Developmental Biology, School of Life Sciences, Shandong University, Qingdao, China
| | - Bo Li
- Shandong Provincial Key Laboratory of Animal Cell and Developmental Biology, School of Life Sciences, Shandong University, Qingdao, China
| | - Qingqing Liu
- Shandong Provincial Key Laboratory of Animal Cell and Developmental Biology, School of Life Sciences, Shandong University, Qingdao, China
| | - Lele Yang
- Shandong Provincial Key Laboratory of Animal Cell and Developmental Biology, School of Life Sciences, Shandong University, Qingdao, China
| | - Xiaodan Liu
- Shandong Provincial Key Laboratory of Animal Cell and Developmental Biology, School of Life Sciences, Shandong University, Qingdao, China
| | - Qilai Huang
- Shandong Provincial Key Laboratory of Animal Cell and Developmental Biology, School of Life Sciences, Shandong University, Qingdao, China
| |
|
25
|
Mauduit D, Taskiran II, Minnoye L, de Waegeneer M, Christiaens V, Hulselmans G, Demeulemeester J, Wouters J, Aerts S. Analysis of long and short enhancers in melanoma cell states. eLife 2021; 10:e71735. [PMID: 34874265 PMCID: PMC8691835 DOI: 10.7554/elife.71735] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 06/28/2021] [Accepted: 12/06/2021] [Indexed: 12/14/2022] Open
Abstract
Understanding how enhancers drive cell-type specificity and efficiently identifying them is essential for the development of innovative therapeutic strategies. In melanoma, the melanocytic (MEL) and the mesenchymal-like (MES) states respond differently to therapy, making the identification of state-specific enhancers highly relevant. Using massively parallel reporter assays (MPRAs) in a panel of patient-derived melanoma lines (MM lines), we set out to identify and decipher melanoma enhancers by first focusing on regions with state-specific H3K27 acetylation close to differentially expressed genes. An in-depth evaluation of those regions was then pursued by investigating the activity of overlapping ATAC-seq peaks along with a full tiling of the acetylated regions with 190 bp sequences. Activity was observed in more than 60% of the selected regions, and we were able to precisely locate the active enhancers within ATAC-seq peaks. Comparison of sequence content with activity, using the deep learning model DeepMEL2, revealed that AP-1 alone is responsible for the MES enhancer activity. In contrast, SOX10 and MITF both influence MEL enhancer function, with SOX10 being required to achieve high levels of activity. Overall, our MPRAs shed light on the relationship between long and short sequences in terms of their sequence content, enhancer activity, and specificity across melanoma cell states.
Affiliation(s)
- David Mauduit
- VIB-KU Leuven Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
| | - Ibrahim Ihsan Taskiran
- VIB-KU Leuven Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
| | - Liesbeth Minnoye
- VIB-KU Leuven Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
| | - Maxime de Waegeneer
- VIB-KU Leuven Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
| | - Valerie Christiaens
- VIB-KU Leuven Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
| | - Gert Hulselmans
- VIB-KU Leuven Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
| | - Jonas Demeulemeester
- VIB-KU Leuven Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
- Cancer Genomics Laboratory, The Francis Crick Institute, London, United Kingdom
| | - Jasper Wouters
- VIB-KU Leuven Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
| | - Stein Aerts
- VIB-KU Leuven Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
| |
|
26
|
The dynamic, combinatorial cis-regulatory lexicon of epidermal differentiation. Nat Genet 2021; 53:1564-1576. [PMID: 34650237 PMCID: PMC8763320 DOI: 10.1038/s41588-021-00947-3] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 11/09/2020] [Accepted: 09/01/2021] [Indexed: 01/24/2023]
Abstract
Transcription factors bind DNA sequence motif vocabularies in cis-regulatory elements (CREs) to modulate chromatin state and gene expression during cell state transitions. A quantitative understanding of how motif lexicons influence dynamic regulatory activity has been elusive due to the combinatorial nature of the cis-regulatory code. To address this, we undertook multiomic data profiling of chromatin and expression dynamics across epidermal differentiation to identify 40,103 dynamic CREs associated with 3,609 dynamically expressed genes, then applied an interpretable deep-learning framework to model the cis-regulatory logic of chromatin accessibility. This analysis framework identified cooperative DNA sequence rules in dynamic CREs regulating synchronous gene modules with diverse roles in skin differentiation. Massively parallel reporter assay analysis validated temporal dynamics and cooperative cis-regulatory logic. Variants linked to human polygenic skin disease were enriched in these time-dependent combinatorial motif rules. This integrative approach shows the combinatorial cis-regulatory lexicon of epidermal differentiation and represents a general framework for deciphering the organizational principles of the cis-regulatory code of dynamic gene regulation.
|
27
|
Ren N, Li Y, Xiong Y, Li P, Ren Y, Huang Q. Functional Screenings Identify Regulatory Variants Associated with Breast Cancer Susceptibility. Curr Issues Mol Biol 2021; 43:1756-1777. [PMID: 34889888 PMCID: PMC8928974 DOI: 10.3390/cimb43030124] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/23/2021] [Revised: 10/12/2021] [Accepted: 10/15/2021] [Indexed: 12/14/2022] Open
Abstract
Genome-wide association studies (GWAS) have identified more than 2000 single nucleotide polymorphisms (SNPs) associated with breast cancer susceptibility, most of which are located in the non-coding region. However, the causal SNPs functioning as gene regulatory elements still remain largely undisclosed. Here, we applied a Dinucleotide Parallel Reporter sequencing (DiR-seq) assay to evaluate 288 breast cancer risk SNPs in nine different breast cancer cell lines. Further multi-omics analysis with the ATAC-seq (Assay for Transposase-Accessible Chromatin using sequencing), DNase-seq (DNase I hypersensitive sites sequencing) and histone modification ChIP-seq (Chromatin Immunoprecipitation sequencing) nominated seven functional SNPs in breast cancer cells. Functional investigations show that rs4808611 affects breast cancer progression by altering the gene expression of NR2F6. For the other site, rs2236007, the alteration promotes the binding of the suppressive transcription factor EGR1 and results in the downregulation of PAX9 expression. The downregulated expression of PAX9 causes cancer malignancies and is associated with the poor prognosis of breast cancer patients. Our findings contribute to defining the functional risk SNPs and the related genes for breast cancer risk prediction.
|
28
|
Eukaryotic Genomes Show Strong Evolutionary Conservation of k-mer Composition and Correlation Contributions between Introns and Intergenic Regions. Genes (Basel) 2021; 12:genes12101571. [PMID: 34680967 PMCID: PMC8536142 DOI: 10.3390/genes12101571] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/13/2021] [Revised: 09/24/2021] [Accepted: 09/29/2021] [Indexed: 01/22/2023] Open
Abstract
Several strongly conserved DNA sequence patterns in and between introns and intergenic regions (IIRs) consisting of short tandem repeats (STRs) with repeat lengths <3 bp have already been described in the kingdom of Animalia. In this work, we expanded the search and analysis of conserved DNA sequence patterns to a wider range of eukaryotic genomes. Our aims were to confirm the conservation of these patterns, to support the hypothesis on their functional constraints, and/or to identify unknown patterns. We pairwise compared genomic DNA sequences of genes, exons, CDS, introns and intergenic regions of 34 Embryophyta (land plants), 30 Protista and 29 Fungi using established k-mer-based (alignment-free) comparison methods. Additionally, the results were compared with values derived for Animalia in former studies. We confirmed strong correlations between the sequence structures of IIRs spanning the entire domain of Eukaryotes. We found that the high correlations within introns, within intergenic regions, and between the two are a result of conserved abundances of STRs with repeat units ≤2 bp (e.g., (AT)n). For some sequence patterns and their inverse complementary sequences, we found a violation of equal distribution on complementary DNA strands in a subset of genomes. Looking at mismatches within the identified STR patterns, we found specific preferences for certain nucleotides that are stable over all four phylogenetic kingdoms. We conclude that all of these conserved patterns between IIRs indicate a shared function of these sequence structures related to STRs.
|
29
|
Zhang Y, Mo Q, Xue L, Luo J. Evaluation of deep learning approaches for modeling transcription factor sequence specificity. Genomics 2021; 113:3774-3781. [PMID: 34534646 DOI: 10.1016/j.ygeno.2021.09.009] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/07/2021] [Revised: 07/19/2021] [Accepted: 09/11/2021] [Indexed: 11/16/2022]
Abstract
As a key component of gene regulation, transcription factors (TFs) play an important role in a number of biological processes. To fully understand the underlying mechanism of TF-mediated gene regulation, it is therefore critical to accurately identify TF binding sites and predict their affinities. Recently, deep learning (DL) algorithms have achieved promising results in the prediction of DNA-TF binding; however, the various deep learning architectures have not been systematically compared, and the relative merit of each architecture remains unclear. To address this problem, we applied four different deep learning architectures to SELEX-seq and HT-SELEX data, covering three species and 35 families. We evaluated and compared the performance of the different deep neural models using 10-fold cross-validation. Our results indicate that the hybrid CNN + DNN model shows the best performance. We expect that our study will be broadly applicable to modeling and predicting TF binding specificity when more high-throughput affinity data are available.
Affiliation(s)
- Yonglin Zhang
- Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou 646000, China
| | - Qi Mo
- Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou 646000, China
| | - Li Xue
- School of Public Health, Southwest Medical University, Luzhou 646000, China
| | - Jiesi Luo
- Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou 646000, China; Department of Pharmacy, The Affiliated Hospital of Southwest Medical University, Luzhou 646000, China; Sichuan Key Medical Laboratory of New Drug Discovery and Druggability Evaluation, Luzhou Key Laboratory of Activity Screening and Druggability Evaluation for Chinese Materia Medica, Southwest Medical University, Luzhou 646000, China.
| |
|
30
|
Santiago-Frangos A, Buyukyoruk M, Wiegand T, Krishna P, Wiedenheft B. Distribution and phasing of sequence motifs that facilitate CRISPR adaptation. Curr Biol 2021; 31:3515-3524.e6. [PMID: 34174210 PMCID: PMC8552246 DOI: 10.1016/j.cub.2021.05.068] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 02/11/2021] [Revised: 04/30/2021] [Accepted: 05/28/2021] [Indexed: 12/11/2022]
Abstract
CRISPR-associated proteins (Cas1 and Cas2) integrate foreign DNA at the "leader" end of CRISPR loci. Several CRISPR leader sequences are reported to contain a binding site for a DNA-bending protein called integration host factor (IHF). IHF-induced DNA bending kinks the leader of type I-E CRISPRs, recruiting an upstream sequence motif that helps dock Cas1-2 onto the first repeat of the CRISPR locus. To determine the prevalence of IHF-directed CRISPR adaptation, we analyzed 15,274 bacterial and archaeal CRISPR leaders. These experiments reveal multiple IHF binding sites and diverse upstream sequence motifs in a subset of the I-C, I-E, I-F, and II-C CRISPR leaders. We identify subtype-specific motifs and show that the phase of these motifs is critical for CRISPR adaptation. Collectively, this work clarifies the prevalence and mechanism(s) of IHF-dependent CRISPR adaptation and suggests that leader sequences and adaptation proteins may coevolve under the selective pressures of foreign genetic elements like plasmids or phages.
Affiliation(s)
| | - Murat Buyukyoruk
- Department of Microbiology and Immunology, Montana State University, Bozeman, MT 59717, USA
| | - Tanner Wiegand
- Department of Microbiology and Immunology, Montana State University, Bozeman, MT 59717, USA
| | - Pushya Krishna
- Department of Microbiology and Immunology, Montana State University, Bozeman, MT 59717, USA
| | - Blake Wiedenheft
- Department of Microbiology and Immunology, Montana State University, Bozeman, MT 59717, USA.
| |
|
31
|
Moutsopoulos I, Maischak L, Lauzikaite E, Vasquez Urbina S, Williams E, Drost HG, Mohorianu I. noisyR: enhancing biological signal in sequencing datasets by characterizing random technical noise. Nucleic Acids Res 2021; 49:e83. [PMID: 34076236 PMCID: PMC8373073 DOI: 10.1093/nar/gkab433] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/18/2021] [Revised: 04/16/2021] [Accepted: 05/06/2021] [Indexed: 01/22/2023] Open
Abstract
High-throughput sequencing enables an unprecedented resolution in transcript quantification, at the cost of magnifying the impact of technical noise. The consistent reduction of random background noise to capture functionally meaningful biological signals is still challenging. Intrinsic sequencing variability introducing low-level expression variations can obscure patterns in downstream analyses. We introduce noisyR, a comprehensive noise filter to assess the variation in signal distribution and achieve an optimal information-consistency across replicates and samples; this selection also facilitates meaningful pattern recognition outside the background-noise range. noisyR is applicable to count matrices and sequencing data; it outputs sample-specific signal/noise thresholds and filtered expression matrices. We exemplify the effects of minimizing technical noise on several datasets, across various sequencing assays: coding, non-coding RNAs and interactions, at bulk and single-cell level. An immediate consequence of filtering out noise is the convergence of predictions (differential-expression calls, enrichment analyses and inference of gene regulatory networks) across different approaches.
Affiliation(s)
- Ilias Moutsopoulos
- Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge CB2 0AW, UK
| | - Lukas Maischak
- Computational Biology Group, Department of Molecular Biology, Max Planck Institute for Developmental Biology, Max-Planck Ring 1, 72076 Tübingen, Germany
| | - Elze Lauzikaite
- Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge CB2 0AW, UK
| | - Sergio A Vasquez Urbina
- Computational Biology Group, Department of Molecular Biology, Max Planck Institute for Developmental Biology, Max-Planck Ring 1, 72076 Tübingen, Germany
| | - Eleanor C Williams
- Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge CB2 0AW, UK
| | - Hajk-Georg Drost
- Computational Biology Group, Department of Molecular Biology, Max Planck Institute for Developmental Biology, Max-Planck Ring 1, 72076 Tübingen, Germany
| | - Irina I Mohorianu
- Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge CB2 0AW, UK
| |
|
32
|
Ren N, Liu Q, Yan L, Huang Q. Parallel Reporter Assays Identify Altered Regulatory Role of rs684232 in Leading to Prostate Cancer Predisposition. Int J Mol Sci 2021; 22:8792. [PMID: 34445492 PMCID: PMC8395720 DOI: 10.3390/ijms22168792] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/09/2021] [Revised: 08/07/2021] [Accepted: 08/13/2021] [Indexed: 02/06/2023] Open
Abstract
Functional characterization of cancer risk-associated single nucleotide polymorphisms (SNPs) identified by genome-wide association studies (GWAS) has become a major challenge. To identify the regulatory risk SNPs that can lead to transcriptional misregulation, we performed parallel reporter gene assays with both alleles of 213 prostate cancer risk-associated GWAS SNPs in 22Rv1 cells. We identified 32 regulatory SNPs whose two alleles exhibited different regulatory activities. For one of the regulatory SNPs, rs684232, we found that the variation altered chromatin binding of transcription factor FOXA1 on the DNA region and led to aberrant gene expression of VPS53, FAM57A, and GEMIN4, which play vital roles in prostate cancer malignancy. Our findings reveal the roles and underlying mechanism of rs684232 in prostate cancer progression and hold great promise for benefiting prostate cancer patients through prognostic prediction and targeted therapies.
Affiliation(s)
- Qilai Huang
- Shandong Provincial Key Laboratory of Animal Cell and Developmental Biology, School of Life Sciences, Shandong University, Qingdao 266237, China; (N.R.); (Q.L.); (L.Y.)
| |
|
33
|
Savadel SD, Hartwig T, Turpin ZM, Vera DL, Lung PY, Sui X, Blank M, Frommer WB, Dennis JH, Zhang J, Bass HW. The native cistrome and sequence motif families of the maize ear. PLoS Genet 2021; 17:e1009689. [PMID: 34383745 PMCID: PMC8360572 DOI: 10.1371/journal.pgen.1009689] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/02/2020] [Accepted: 06/30/2021] [Indexed: 01/22/2023] Open
Abstract
Elucidating the transcriptional regulatory networks that underlie growth and development requires robust ways to define the complete set of transcription factor (TF) binding sites. Although TF-binding sites are known to be generally located within accessible chromatin regions (ACRs), pinpointing these DNA regulatory elements globally remains challenging. Current approaches primarily identify binding sites for a single TF (e.g. ChIP-seq), or globally detect ACRs but lack the resolution to consistently define TF-binding sites (e.g. DNase-seq, ATAC-seq). To address this challenge, we developed MNase-defined cistrome-Occupancy Analysis (MOA-seq), a high-resolution (< 30 bp), high-throughput, and genome-wide strategy to globally identify putative TF-binding sites within ACRs. We used MOA-seq on developing maize ears as a proof of concept and were able to define a cistrome of 145,000 MOA footprints (MFs). While a substantial majority (76%) of the known ATAC-seq ACRs intersected with the MFs, only a minority of MFs overlapped with the ATAC peaks, indicating that the majority of MFs were novel and not detected by ATAC-seq. MFs were associated with promoters and significantly enriched for TF-binding and long-range chromatin interaction sites, including for the well-characterized FASCIATED EAR4, KNOTTED1, and TEOSINTE BRANCHED1. Importantly, the MOA-seq strategy improved the spatial resolution of TF-binding prediction and allowed us to identify 215 motif families collectively distributed over more than 100,000 non-overlapping, putatively-occupied binding sites across the genome. Our study presents a simple, efficient, and high-resolution approach to identify putative TF footprints and binding motifs genome-wide, to ultimately define a native cistrome atlas.
Understanding gene regulation remains a central goal of modern biology. Delineating the full set of regulatory DNA elements that orchestrate this regulation requires information at two scales: the broad landscape of accessible chromatin, and the site-specific binding of transcription factors (TFs) at discrete cis-regulatory DNA elements. Here we describe a single assay that uses micrococcal nuclease (MNase) as a structural probe to simultaneously reveal regions of accessible chromatin in addition to high-resolution footprints with signatures of TF-occupied cis-elements. We have used maize developing ear tissue as proof of concept, showing the method detects known TF-binding sites. This genome-wide assay not only defines chromatin landscapes, but crucially enables global discovery and mapping of sequence motifs underlying small footprints of ~30 bp to produce an atlas of candidate TF occupancy.
Affiliation(s)
- Savannah D. Savadel
- Department of Biological Science, Florida State University, Tallahassee, Florida, United States of America
| | - Thomas Hartwig
- Institute for Molecular Physiologie, Heinrich-Heine-Universität, Düsseldorf, Germany
- Independent research groups, Max Planck Institute for Plant Breeding Research, Cologne, Germany
| | - Zachary M. Turpin
- Department of Biological Science, Florida State University, Tallahassee, Florida, United States of America
| | - Daniel L. Vera
- Department of Biological Science, Florida State University, Tallahassee, Florida, United States of America
| | - Pei-Yau Lung
- Department of Statistics, Florida State University, Tallahassee, Florida, United States of America
| | - Xin Sui
- Department of Statistics, Florida State University, Tallahassee, Florida, United States of America
| | - Max Blank
- Institute for Molecular Physiologie, Heinrich-Heine-Universität, Düsseldorf, Germany
- Independent research groups, Max Planck Institute for Plant Breeding Research, Cologne, Germany
| | - Wolf B. Frommer
- Institute for Molecular Physiologie, Heinrich-Heine-Universität, Düsseldorf, Germany
- Independent research groups, Max Planck Institute for Plant Breeding Research, Cologne, Germany
| | - Jonathan H. Dennis
- Department of Biological Science, Florida State University, Tallahassee, Florida, United States of America
| | - Jinfeng Zhang
- Department of Statistics, Florida State University, Tallahassee, Florida, United States of America
| | - Hank W. Bass
- Department of Biological Science, Florida State University, Tallahassee, Florida, United States of America
| |
|
34
|
Cazier AP, Blazeck J. Advances in promoter engineering: novel applications and predefined transcriptional control. Biotechnol J 2021; 16:e2100239. [PMID: 34351706 DOI: 10.1002/biot.202100239] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 05/06/2021] [Revised: 07/30/2021] [Accepted: 08/03/2021] [Indexed: 11/08/2022]
Abstract
Synthetic biology continues to progress by relying on more robust tools for transcriptional control, of which promoters are the most fundamental component. Numerous studies have sought to characterize promoter function, determine principles to guide their engineering, and create promoters with stronger expression or tailored inducible control. In this review, we summarize promoter architecture and highlight recent advances in the field, focusing on the novel applications of inducible promoter design and engineering towards metabolic engineering and cellular therapeutic development. Additionally, we highlight how the expansion of new machine learning techniques for modeling and engineering promoter sequences is enabling more accurate prediction of promoter characteristics.
Affiliation(s)
- Andrew P Cazier
- School of Chemical and Biomolecular Engineering, Georgia Institute of Technology, 311 Ferst St. NW, Atlanta, Georgia, 30332, USA
| | - John Blazeck
- School of Chemical and Biomolecular Engineering, Georgia Institute of Technology, 311 Ferst St. NW, Atlanta, Georgia, 30332, USA
| |
|
35
|
Saidi A, Hajibarat Z, Hajibarat Z. Phylogeny, gene structure and GATA genes expression in different tissues of Solanaceae species. Biocatalysis and Agricultural Biotechnology 2021. [DOI: 10.1016/j.bcab.2021.102015] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 01/22/2023]
|
36
|
Ahmed Z, Renart EG, Zeeshan S. Genomics pipelines to investigate susceptibility in whole genome and exome sequenced data for variant discovery, annotation, prediction and genotyping. PeerJ 2021; 9:e11724. [PMID: 34395068 PMCID: PMC8320519 DOI: 10.7717/peerj.11724] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/29/2021] [Accepted: 06/14/2021] [Indexed: 12/12/2022] Open
Abstract
Over the last few decades, genomics has been leading toward an audacious future and changing our views about conducting biomedical research, studying diseases, and understanding diversity in our society across the human species. Whole genome and exome sequencing (WGS/WES) are two of the most popular next-generation sequencing (NGS) methodologies currently used to detect genetic variations of clinical significance. Investigating WGS/WES data for variant discovery and genotyping is based on the nexus of different data analytic applications. Although several bioinformatics applications have been developed, and many of those are freely available and published, timely finding and interpreting genetic variants are still challenging tasks among diagnostic laboratories and clinicians. In this study, we are interested in understanding, evaluating, and reporting the current state of solutions available to process NGS data of variable lengths and types for the identification of variants, alleles, and haplotypes. Within this scope, we consulted high-quality peer-reviewed literature published in the last 10 years. We focused on the standalone and networked bioinformatics applications proposed to efficiently process WGS and WES data and support downstream analysis for gene-variant discovery, annotation, prediction, and interpretation. We discuss our findings in this manuscript, which include but are not limited to the set of operations, workflow, data handling, involved tools, technologies and algorithms, and limitations of the assessed applications.
Affiliation(s)
- Zeeshan Ahmed
  - Institute for Health, Health Care Policy and Aging Research, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA
  - Department of Medicine, Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA
- Eduard Gibert Renart
  - Institute for Health, Health Care Policy and Aging Research, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA
- Saman Zeeshan
  - Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA
|
37
|
Degtyareva AO, Antontseva EV, Merkulova TI. Regulatory SNPs: Altered Transcription Factor Binding Sites Implicated in Complex Traits and Diseases. Int J Mol Sci 2021; 22:6454. [PMID: 34208629 PMCID: PMC8235176 DOI: 10.3390/ijms22126454] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 05/05/2021] [Revised: 06/15/2021] [Accepted: 06/15/2021] [Indexed: 12/19/2022] Open
Abstract
The vast majority of the genetic variants (mainly SNPs) associated with various human traits and diseases map to the noncoding part of the genome and are enriched in its regulatory compartment, suggesting that many causal variants may affect gene expression. The leading mechanism of action of these SNPs is the alteration of transcription factor binding via creation or disruption of transcription factor binding sites (TFBSs) or a change in the affinity of these regulatory proteins for their cognate sites. In this review, we first focus on the history of the discovery of regulatory SNPs (rSNPs) and a systematized description of the existing methodical approaches to their study. Then, we briefly describe recent comprehensive examples of rSNPs studied from the discovery of the changes in the TFBS sequence as a result of a nucleotide substitution to the identification of its effect on target gene expression and, eventually, on phenotype. We also describe state-of-the-art genome-wide approaches to the identification of regulatory variants, including both making molecular sense of genome-wide association studies (GWAS) and alternative approaches whose primary goal is to determine the functionality of genetic variants. Among these approaches, special attention is paid to expression quantitative trait loci (eQTL) analysis and the search for allele-specific events in RNA-seq (ASE events) as well as in ChIP-seq, DNase-seq, and ATAC-seq (ASB events) data.
Affiliation(s)
- Arina O. Degtyareva
  - Department of Molecular Genetic, Institute of Cytology and Genetics, 630090 Novosibirsk, Russia
- Elena V. Antontseva
  - Department of Molecular Genetic, Institute of Cytology and Genetics, 630090 Novosibirsk, Russia
- Tatiana I. Merkulova
  - Department of Molecular Genetic, Institute of Cytology and Genetics, 630090 Novosibirsk, Russia
  - Department of Natural Sciences, Novosibirsk State University, 630090 Novosibirsk, Russia
|
38
|
Zrimec J, Buric F, Kokina M, Garcia V, Zelezniak A. Learning the Regulatory Code of Gene Expression. Front Mol Biosci 2021; 8:673363. [PMID: 34179082 PMCID: PMC8223075 DOI: 10.3389/fmolb.2021.673363] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/27/2021] [Accepted: 05/24/2021] [Indexed: 11/13/2022] Open
Abstract
Data-driven machine learning is the method of choice for predicting molecular phenotypes from nucleotide sequence, modeling gene expression events including protein-DNA binding, chromatin states as well as mRNA and protein levels. Deep neural networks automatically learn informative sequence representations and interpreting them enables us to improve our understanding of the regulatory code governing gene expression. Here, we review the latest developments that apply shallow or deep learning to quantify molecular phenotypes and decode the cis-regulatory grammar from prokaryotic and eukaryotic sequencing data. Our approach is to build from the ground up, first focusing on the initiating protein-DNA interactions, then specific coding and non-coding regions, and finally on advances that combine multiple parts of the gene and mRNA regulatory structures, achieving unprecedented performance. We thus provide a quantitative view of gene expression regulation from nucleotide sequence, concluding with an information-centric overview of the central dogma of molecular biology.
Affiliation(s)
- Jan Zrimec
  - Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Filip Buric
  - Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Mariia Kokina
  - Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby, Denmark
- Victor Garcia
  - School of Life Sciences and Facility Management, Zurich University of Applied Sciences, Wädenswil, Switzerland
- Aleksej Zelezniak
  - Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
  - Science for Life Laboratory, Stockholm, Sweden
|
39
|
Peculiarities of Plasmodium falciparum Gene Regulation and Chromatin Structure. Int J Mol Sci 2021; 22:5168. [PMID: 34068393 PMCID: PMC8153576 DOI: 10.3390/ijms22105168] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/16/2021] [Revised: 05/10/2021] [Accepted: 05/10/2021] [Indexed: 12/14/2022] Open
Abstract
The highly complex life cycle of the human malaria parasite, Plasmodium falciparum, is based on an orchestrated and tightly regulated gene expression program. In general, eukaryotic transcription regulation is determined by a combination of sequence-specific transcription factors binding to regulatory DNA elements and the packaging of DNA into chromatin as an additional layer. The accessibility of regulatory DNA elements is controlled by the nucleosome occupancy and changes of their positions by an active process called nucleosome remodeling. These epigenetic mechanisms are poorly explored in P. falciparum. The parasite genome is characterized by an extraordinarily high AT-content and the distinct architecture of functional elements, and chromatin-related proteins also exhibit high sequence divergence compared to other eukaryotes. Together with the distinct biochemical properties of nucleosomes, these features suggest substantial differences in chromatin-dependent regulation. Here, we highlight the peculiarities of epigenetic mechanisms in P. falciparum, addressing chromatin structure and dynamics with respect to their impact on transcriptional control. We focus on the specialized chromatin remodeling enzymes and discuss their essential function in P. falciparum gene regulation.
|
40
|
Figla promotes secondary follicle growth in mature mice. Sci Rep 2021; 11:9842. [PMID: 33972571 PMCID: PMC8110814 DOI: 10.1038/s41598-021-89052-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/04/2021] [Accepted: 04/20/2021] [Indexed: 11/09/2022] Open
Abstract
The in vitro growth (IVG) of human follicles is a potential fertility option for women for whom cryopreserved ovarian tissues cannot be transplanted due to the risk of cancer cell reintroduction; however, there is currently no established method. Furthermore, optimal IVG conditions may differ between the follicles of adult and pre-pubertal females due to molecular differences suggested by basic research. To systematically identify differences between the secondary follicles of adult and pre-pubertal females, a comparative transcriptomic study using mice was conducted herein. Among differentially expressed genes (DEGs), Figla was up-regulated in mature mice. We successfully down-regulated Figla expression in secondary follicle oocytes by a Figla siRNA microinjection, and the subsequent IVG of follicles showed that the diameter of these follicles was smaller than those of controls in mature mice, whereas no significant difference was observed in premature mice. The canonical pathways of DEGs between control and Figla-reduced secondary follicles suggest that Figla up-regulates VDR/RXR activation and down-regulates stem cell pluripotency as well as estrogen signaling. We demonstrated for the first time that folliculogenesis of the secondary follicles of premature and mature mice may be regulated by different factors, such as Figla with its possible target genes, providing insights into optimal IVG conditions for adult and pre-pubertal females, respectively.
|
41
|
Zhang D, Lam J, Blobel GA. Engineering three-dimensional genome folding. Nat Genet 2021; 53:602-611. [PMID: 33958782 DOI: 10.1038/s41588-021-00860-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/02/2020] [Accepted: 03/29/2021] [Indexed: 02/02/2023]
Abstract
Animal genomes are partitioned and folded at various scales that contribute distinctly to nuclear processes. While structural features have been disrupted either globally or at select loci in loss-of-function studies, gain-of-function studies that probe the role of genome architecture have lagged behind. Here we examine recent advances in experimentally creating chromatin loops, contact domains, boundaries and compartments. Furthermore, we explore parallels between this emerging theme and natural evolution of mammalian genomes with increasing architectural complexity. Finally, we provide a perspective on how insights arising from recent gain-of-function studies may inform future endeavors toward engineering the three-dimensional genome.
Affiliation(s)
- Di Zhang
  - Division of Hematology, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Jessica Lam
  - Division of Hematology, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Gerd A Blobel
  - Division of Hematology, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
|
42
|
Panigrahi A, O'Malley BW. Mechanisms of enhancer action: the known and the unknown. Genome Biol 2021; 22:108. [PMID: 33858480 PMCID: PMC8051032 DOI: 10.1186/s13059-021-02322-1] [Citation(s) in RCA: 128] [Impact Index Per Article: 42.7] [Received: 09/22/2020] [Accepted: 03/23/2021] [Indexed: 12/13/2022] Open
Abstract
Differential gene expression mechanisms ensure cellular differentiation and plasticity to shape ontogenetic and phylogenetic diversity of cell types. Key regulators of differential gene expression programs are the enhancers, the gene-distal cis-regulatory sequences that govern spatiotemporal and quantitative expression dynamics of target genes. Enhancers are widely believed to physically contact the target promoters to effect transcriptional activation. However, our understanding of the full complement of regulatory proteins and the definitive mechanics of enhancer action is incomplete. Here, we review recent findings to present some emerging concepts on enhancer action and also outline a set of outstanding questions.
Affiliation(s)
- Anil Panigrahi
  - Department of Molecular and Cellular Biology, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Bert W O'Malley
  - Department of Molecular and Cellular Biology, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
|
43
|
Froehlich JJ, Uyar B, Herzog M, Theil K, Glažar P, Akalin A, Rajewsky N. Parallel genetics of regulatory sequences using scalable genome editing in vivo. Cell Rep 2021; 35:108988. [PMID: 33852857 DOI: 10.1016/j.celrep.2021.108988] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/24/2020] [Revised: 01/13/2021] [Accepted: 03/23/2021] [Indexed: 12/27/2022] Open
Abstract
How regulatory sequences control gene expression is fundamental for explaining phenotypes in health and disease. Regulatory elements must ultimately be understood within their genomic environment and development- or tissue-specific contexts. Because this is technically challenging, few regulatory elements have been characterized in vivo. Here, we use inducible Cas9 and multiplexed guide RNAs to create hundreds of mutations in enhancers/promoters and 3' UTRs of 16 genes in C. elegans. Our software crispr-DART analyzes indel mutations in targeted DNA sequencing. We quantify the impact of mutations on expression and fitness by targeted RNA sequencing and DNA sampling. When applying our approach to the lin-41 3' UTR, generating hundreds of mutants, we find that the two adjacent binding sites for the miRNA let-7 can regulate lin-41 expression independently of each other. Finally, we map regulatory genotypes to phenotypic traits for several genes. Our approach enables parallel analysis of regulatory sequences directly in animals.
Affiliation(s)
- Jonathan J Froehlich
  - Systems Biology of Gene Regulatory Elements, Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Hannoversche Str. 28, 10115 Berlin, Germany
- Bora Uyar
  - Bioinformatics and Omics Data Science Platform, Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Hannoversche Str. 28, 10115 Berlin, Germany
- Margareta Herzog
  - Systems Biology of Gene Regulatory Elements, Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Hannoversche Str. 28, 10115 Berlin, Germany
- Kathrin Theil
  - Systems Biology of Gene Regulatory Elements, Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Hannoversche Str. 28, 10115 Berlin, Germany
- Petar Glažar
  - Systems Biology of Gene Regulatory Elements, Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Hannoversche Str. 28, 10115 Berlin, Germany
- Altuna Akalin
  - Bioinformatics and Omics Data Science Platform, Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Hannoversche Str. 28, 10115 Berlin, Germany
- Nikolaus Rajewsky
  - Systems Biology of Gene Regulatory Elements, Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Hannoversche Str. 28, 10115 Berlin, Germany
|
44
|
Ruiz Ramírez AV, Flores-Saiffe Farías A, Chávez Álvarez RDC, Prado Montes de Oca E. Predicted regulatory SNPs reveal potential drug targets and novel companion diagnostics in psoriasis. J Transl Autoimmun 2021; 4:100096. [PMID: 33898962 PMCID: PMC8060581 DOI: 10.1016/j.jtauto.2021.100096] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/14/2020] [Revised: 02/27/2021] [Accepted: 03/10/2021] [Indexed: 11/25/2022] Open
Abstract
Psoriasis is an autoimmune disease associated with interleukins, their receptors, key transcription factors and, more recently, antimicrobial peptides (AMPs). Cathelicidin LL-37 is an AMP proposed to play a fundamental role in psoriasis etiology. With our proprietary software SNPClinic v.1.0, we analyzed 203 common SNPs (MAF > 1%) in proximal promoters of 22 genes associated with psoriasis. These include nine genes whose protein products are classic drug targets for psoriasis (TNF, IL17A, IL17B, IL17C, IL17F, IL17RA, IL12A, IL12B and IL23A). SNPClinic predictions were run with DNAseI-HUP chromatin accessibility data in eight psoriasis/epithelia-relevant cell lines from ENCODE including keratinocytes (NHEK), TH1 and TH17 lymphocytes. Results were ranked quantitatively by transcriptional relevance according to our novel Functional Impact Factor (FIF) parameter. We found six rSNPs in five genes (CAMP/cathelicidin, S100A7/psoriasin, IL17C, IL17RA and TNF), and each was confirmed as a true rSNP in at least one public eQTL database including GTEx portal and ENCODE (Phase 3). Predicted regulatory SNPs in cathelicidin, IL17C and IL17RA genes may explain hyperproliferation of keratinocytes. Predicted rSNPs in psoriasin, IL17C and cathelicidin may contribute to activation and polarization of lymphocytes. Predicted rSNPs in the TNF gene are concordant with the epithelial-mesenchymal transition. Although these results must be validated in vitro and in vivo with a functional genomics approach, we propose FOXP2, RUNX2, NR2F1, ELF1 and HESX1 transcription factors (those with the highest FIF on each gene) as novel drug targets for psoriasis. Furthermore, four out of six rSNPs uncovered by SNPClinic v.1.0 software could also be validated in the clinic as companion diagnostics/pharmacogenetics assays for psoriasis prescribed drugs that block TNF-α (e.g. Etanercept), IL-17 (e.g. Secukinumab) and IL-17 receptor (Brodalumab).
We found six putative regulatory SNPs in cathelicidin (LL-37), psoriasin (S100A7), IL17C, IL17RA and TNF genes. These rSNPs could be validated also as companion diagnostics/pharmacogenetics assays for most approved psoriasis drugs. Regulatory SNPs in TNF gene are concordant with the epithelial-mesenchymal transition. Regulatory SNPs in IL17C and IL17RA may partially explain hyperproliferation of keratinocytes. Regulatory SNP rs12049559 in psoriasin (S100A7) may contribute to T-cell polarization.
Affiliation(s)
- Andrea Virginia Ruiz Ramírez
  - Laboratory of Regulatory SNPs, Personalized Medicine National Laboratory (LAMPER), Medical and Pharmaceutical Biotechnology, Research Center of Technology and Design Assistance of Jalisco State (CIATEJ A.C.), National Council of Science and Technology (CONACYT), C.P. 44270, Guadalajara, Jalisco, Mexico
  - Doctorate Program in Human Genetics, Health Sciences Campus (CUCS), Guadalajara University, Sierra Mojada 950, Col. Independencia, C.P. 44340, Guadalajara, Jalisco, Mexico
- Adolfo Flores-Saiffe Farías
  - Laboratory of Regulatory SNPs, Personalized Medicine National Laboratory (LAMPER), Medical and Pharmaceutical Biotechnology, Research Center of Technology and Design Assistance of Jalisco State (CIATEJ A.C.), National Council of Science and Technology (CONACYT), C.P. 44270, Guadalajara, Jalisco, Mexico
- Rocío Del Carmen Chávez Álvarez
  - Laboratory of Regulatory SNPs, Personalized Medicine National Laboratory (LAMPER), Medical and Pharmaceutical Biotechnology, Research Center of Technology and Design Assistance of Jalisco State (CIATEJ A.C.), National Council of Science and Technology (CONACYT), C.P. 44270, Guadalajara, Jalisco, Mexico
- Ernesto Prado Montes de Oca
  - Laboratory of Regulatory SNPs, Personalized Medicine National Laboratory (LAMPER), Medical and Pharmaceutical Biotechnology, Research Center of Technology and Design Assistance of Jalisco State (CIATEJ A.C.), National Council of Science and Technology (CONACYT), C.P. 44270, Guadalajara, Jalisco, Mexico
  - Laboratory of Pharmacogenomics and Preventive Medicine, LAMPER, Pharmaceutical and Medical Biotechnology, CIATEJ, A.C., CONACYT, C.P. 44270, Guadalajara, Jalisco, Mexico
  - Scripps Research Translational Institute, 3344 North Torrey Pines Court, Suite 300, La Jolla, CA, 92037, USA
  - Integrative Structural and Computational Biology, Scripps Research Institute, 10550 North Torrey Pines Road, SGM 300, La Jolla, CA, 92037, USA
|
45
|
Jindal GA, Farley EK. Enhancer grammar in development, evolution, and disease: dependencies and interplay. Dev Cell 2021; 56:575-587. [PMID: 33689769 PMCID: PMC8462829 DOI: 10.1016/j.devcel.2021.02.016] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Received: 10/13/2020] [Revised: 02/15/2021] [Accepted: 02/16/2021] [Indexed: 12/19/2022]
Abstract
Each language has standard books describing that language's grammatical rules. Biologists have searched for similar, albeit more complex, principles relating enhancer sequence to gene expression. Here, we review the literature on enhancer grammar. We introduce dependency grammar, a model where enhancers encode information based on dependencies between enhancer features shaped by mechanistic, evolutionary, and biological constraints. Classifying enhancers based on the types of dependencies may identify unifying principles relating enhancer sequence to gene expression. Such rules would allow us to read the instructions for development within genomes and pinpoint causal enhancer variants underlying disease and evolutionary changes.
Affiliation(s)
- Granton A Jindal
  - Division of Cardiology, Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA
  - Division of Biological Sciences, Section of Molecular Biology, University of California San Diego, La Jolla, CA 92093, USA
- Emma K Farley
  - Division of Cardiology, Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA
  - Division of Biological Sciences, Section of Molecular Biology, University of California San Diego, La Jolla, CA 92093, USA
|
46
|
Fine gene expression regulation by minor sequence variations downstream of the polyadenylation signal. Mol Biol Rep 2021; 48:1539-1547. [PMID: 33517473 DOI: 10.1007/s11033-021-06160-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/01/2020] [Accepted: 01/12/2021] [Indexed: 12/22/2022]
Abstract
The termination of transcription is a complex process that substantially contributes to gene regulation in eukaryotes. Previously, it was noted that a single cytosine deletion at position +32 bp relative to the single polyadenylation signal AAUAAA (hereafter the dC mutation) causes a 2-fold increase in the transcription level of the upstream eGFP reporter in mouse embryonic stem cells. Here, we analyzed the conservation of this phenomenon in immortalized mouse, human and Drosophila cell lines and the influence of the dC mutation on the choice of the pre-mRNA cleavage sites. We constructed dual-reporter plasmids to accurately measure the effect of the dC mutation and other nearby mutations on the eGFP mRNA level by RT-qPCR. In this way, we found that the dC mutation leads to a 2-fold increase in the expression level of the upstream eGFP reporter gene in cultured mouse and human cells, but not in Drosophila cells. In addition, 3' RACE analysis demonstrated that eGFP pre-mRNAs are cut at multiple positions between +14 and +31, and that the most proximal cleavage site becomes almost exclusively utilized in the presence of the dC mutation. We also identified new short sequence variations located within positions +25 to +40 and +33 to +48 that increase eGFP expression up to ~2-4-fold. Altogether, the positive effect of the dC mutation seems to be conserved in mouse embryonic stem cells, mouse embryonic 3T3 fibroblasts and human HEK293T cells. In the latter cells, the dC mutation appears to be involved in regulating pre-mRNA cleavage site selection. Finally, a multiplexed approach is proposed to identify motifs located downstream of cleavage site(s) that are essential for transcription termination.
|
47
|
Kolmykov S, Yevshin I, Kulyashov M, Sharipov R, Kondrakhin Y, Makeev VJ, Kulakovskiy IV, Kel A, Kolpakov F. GTRD: an integrated view of transcription regulation. Nucleic Acids Res 2021; 49:D104-D111. [PMID: 33231677 PMCID: PMC7778956 DOI: 10.1093/nar/gkaa1057] [Citation(s) in RCA: 126] [Impact Index Per Article: 42.0] [Received: 09/15/2020] [Revised: 10/18/2020] [Accepted: 11/03/2020] [Indexed: 12/24/2022] Open
Abstract
The Gene Transcription Regulation Database (GTRD; http://gtrd.biouml.org/) contains uniformly annotated and processed NGS data related to gene transcription regulation: ChIP-seq, ChIP-exo, DNase-seq, MNase-seq, ATAC-seq and RNA-seq. With the latest release, the database has reached a new level of data integration. All cell types (cell lines and tissues) presented in the GTRD were arranged into a dictionary and linked with different ontologies (BRENDA, Cell Ontology, Uberon, Cellosaurus and Experimental Factor Ontology) and with related experiments in specialized databases on transcription regulation (FANTOM5, ENCODE and GTEx). The updated version of the GTRD provides an integrated view of transcription regulation through a dedicated web interface with advanced browsing and search capabilities, an integrated genome browser, and table reports by cell types, transcription factors, and genes of interest.
Affiliation(s)
- Semyon Kolmykov
  - BIOSOFT.RU, LLC, Novosibirsk 630090, Russian Federation
  - Federal Research Center for Information and Computational Technologies, Novosibirsk 630090, Russian Federation
  - Federal Research Center Institute of Cytology and Genetics SB RAS, Novosibirsk 630090, Russian Federation
- Ivan Yevshin
  - BIOSOFT.RU, LLC, Novosibirsk 630090, Russian Federation
  - Federal Research Center for Information and Computational Technologies, Novosibirsk 630090, Russian Federation
- Mikhail Kulyashov
  - BIOSOFT.RU, LLC, Novosibirsk 630090, Russian Federation
  - Federal Research Center for Information and Computational Technologies, Novosibirsk 630090, Russian Federation
  - Novosibirsk State University, Novosibirsk 630090, Russian Federation
- Ruslan Sharipov
  - BIOSOFT.RU, LLC, Novosibirsk 630090, Russian Federation
  - Federal Research Center for Information and Computational Technologies, Novosibirsk 630090, Russian Federation
  - Novosibirsk State University, Novosibirsk 630090, Russian Federation
- Yury Kondrakhin
  - BIOSOFT.RU, LLC, Novosibirsk 630090, Russian Federation
  - Federal Research Center for Information and Computational Technologies, Novosibirsk 630090, Russian Federation
- Vsevolod J Makeev
  - Vavilov Institute of General Genetics RAS, Moscow 119991, Russian Federation
  - Moscow Institute of Physics and Technology (State University), Dolgoprudny 141700, Russian Federation
  - NRC «Kurchatov Institute» - GOSNIIGENETIKA, Kurchatov Genomic Center, Moscow 123182, Russian Federation
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow 119991, Russian Federation
- Ivan V Kulakovskiy
  - Vavilov Institute of General Genetics RAS, Moscow 119991, Russian Federation
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow 119991, Russian Federation
  - Institute of Protein Research, Russian Academy of Sciences, Pushchino 142290, Russian Federation
- Alexander Kel
  - BIOSOFT.RU, LLC, Novosibirsk 630090, Russian Federation
  - geneXplain GmbH, 38302 Wolfenbüttel, Germany
  - Institute of Chemical Biology and Fundamental Medicine SB RAS, Novosibirsk 630090, Russian Federation
- Fedor Kolpakov
  - BIOSOFT.RU, LLC, Novosibirsk 630090, Russian Federation
  - Federal Research Center for Information and Computational Technologies, Novosibirsk 630090, Russian Federation
|
48
|
Jana T, Brodsky S, Barkai N. Speed-Specificity Trade-Offs in the Transcription Factors Search for Their Genomic Binding Sites. Trends Genet 2021; 37:421-432. [PMID: 33414013 DOI: 10.1016/j.tig.2020.12.001] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 10/11/2020] [Revised: 12/04/2020] [Accepted: 12/07/2020] [Indexed: 12/17/2022]
Abstract
Transcription factors (TFs) regulate gene expression by binding DNA sequences recognized by their DNA-binding domains (DBDs). DBD-recognized motifs are short and highly abundant in genomes. The ability of TFs to bind a specific subset of motif-containing sites, and to do so rapidly upon activation, is fundamental for gene expression in all eukaryotes. Despite extensive interest, our understanding of the TF-target search process is fragmented; although binding specificity and detection speed are two facets of this same process, trade-offs between them are rarely addressed. In this opinion article, we discuss potential speed-specificity trade-offs in the context of existing models. We further discuss the recently described 'distributed specificity' paradigm, suggesting that intrinsically disordered regions (IDRs) promote specificity while reducing the TF-target search time.
Affiliation(s)
- Tamar Jana
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Sagie Brodsky
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Naama Barkai
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
|
49
|
Paredes O, Romo-Vázquez R, Román-Godínez I, Vélez-Pérez H, Salido-Ruiz RA, Morales JA. Frequency spectra characterization of noncoding human genomic sequences. Genes Genomics 2020; 42:1215-1226. [DOI: 10.1007/s13258-020-00980-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/22/2019] [Accepted: 04/27/2020] [Indexed: 11/28/2022]
|
50
|
Bansal R, Hussain S, Chanana UB, Bisht D, Goel I, Muthuswami R. SMARCAL1, the annealing helicase and the transcriptional co-regulator. IUBMB Life 2020; 72:2080-2096. [PMID: 32754981 DOI: 10.1002/iub.2354] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/03/2020] [Revised: 06/26/2020] [Accepted: 07/07/2020] [Indexed: 12/15/2022]
Abstract
The ATP-dependent chromatin remodeling proteins play an important role in DNA repair. The energy released by ATP hydrolysis is used for myriad functions ranging from nucleosome repositioning and nucleosome eviction to histone variant exchange. In addition, the distant member of the family, SMARCAL1, uses the energy to reanneal stalled replication forks in response to DNA damage. Biophysical studies have shown that this protein has the unique ability to recognize and bind specifically to DNA structures possessing double-strand to single-strand transition regions. Mutations in SMARCAL1 have been linked to Schimke immuno-osseous dysplasia, an autosomal recessive disorder that exhibits variable penetrance and expressivity. It has long been hypothesized that the variable expressivity and pleiotropic phenotypes observed in the patients might be due to the ability of SMARCAL1 to co-regulate the expression of a subset of genes within the genome. Recently, the role of SMARCAL1 in regulating transcription has been delineated. In this review, we discuss the biophysical and functional properties of the protein that help it to transcriptionally co-regulate DNA damage response as well as to bind to the stalled replication fork and stabilize it, thus ensuring genomic stability. We also discuss the role of SMARCAL1 in cancer and the possibility of using this protein as a chemotherapeutic target.
Affiliation(s)
- Ritu Bansal
  - Chromatin Remodeling Laboratory, School of Life Sciences, Jawaharlal Nehru University, New Delhi, India
- Saddam Hussain
  - Chromatin Remodeling Laboratory, School of Life Sciences, Jawaharlal Nehru University, New Delhi, India
- Upasana Bedi Chanana
  - Chromatin Remodeling Laboratory, School of Life Sciences, Jawaharlal Nehru University, New Delhi, India
- Deepa Bisht
  - Chromatin Remodeling Laboratory, School of Life Sciences, Jawaharlal Nehru University, New Delhi, India
- Isha Goel
  - Chromatin Remodeling Laboratory, School of Life Sciences, Jawaharlal Nehru University, New Delhi, India
- Rohini Muthuswami
  - Chromatin Remodeling Laboratory, School of Life Sciences, Jawaharlal Nehru University, New Delhi, India
|