1
Orlov MA, Sorokin AA. DNA sequence, physics, and promoter function: Analysis of high-throughput data on T7 promoter variants activity. J Bioinform Comput Biol 2021; 18:2040001. [PMID: 32404013 DOI: 10.1142/s0219720020400016] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/18/2022]
Abstract
RNA polymerase/promoter recognition is a basic problem of molecular biology. Despite decades of effort in the area, certain challenges persist. The use of suitable model systems is pivotal for this research. The system of T7 bacteriophage RNA polymerase and the native T7 promoter is an exceptional example for the purpose. Moreover, it is the most studied such system and has been successfully applied in biotechnology and bioengineering. Both the structural simplicity and the high specificity of this molecular duo are the reason for this. Despite the highly similar sequences of distinct native T7 promoters, T7 RNA polymerase is capable of binding the respective promoter in a highly specific and adjustable manner. One explanation is that the process relies primarily on DNA physical properties rather than nucleotide sequence. Here, we address the issue by analyzing the large dataset recently published by Komura and colleagues. That initial study employed next-generation sequencing (NGS) to quantify the activity of promoter variants, including ones with multiple substitutions. Our analysis found a substantial bias in the simultaneous occurrence of single-nucleotide sequence alterations: the highest rate of co-occurrence was observed within the specificity loop of the binding region, and the lowest in the initiation region of the promoter. If both the location and the kind of nucleotides involved in a replacement (both initial and resulting) are taken into consideration, N to A substitutions are the most preferred ones across the whole 19-bp sequence. At the same time, N to C substitutions are tolerated only at a crucial position in the recognition loop of the binding region, and N to G substitutions are uniformly the least tolerated. The complete set of variants was then split into groups with mutations (1) exclusively in the binding region; (2) exclusively in the melting region; (3) in both regions. Among these three groups, the second comprises extremely few variants (far fewer than the other two groups: 46 versus over one thousand and over six thousand). Yet these are all promoters with substantial to high activity. This second group appeared heterogeneous in primary sequence; indeed, upon further subdivision into above-average and below-average activity subgroups, the first was found to comprise promoters with negligible conservation at the -2 position of the melting region, while the second was hardly conserved in this region at all. This draws our attention to the perfect consensus sequence of the class III T7 promoter with the -2 nucleotide randomized (all four nucleotides are present in one to several copies in the previously published source dataset); there the picture becomes even more pronounced. We therefore suggest that mutations at this position do not cause significant changes in promoter activity. At the same time, such modifications dramatically change the DNA physical properties calculated in our study (namely electrostatic potential and propensity to bend). One possible suggestion is that the -2 nucleotide might function as a generic switch; if so, the -2A to -2T substitution has important regulatory consequences. Notably, the -2 bp is the most evidently different nucleotide between class II and class III promoters of the T7 genome, and it also distinguishes the class III promoters of the T7 genome from the promoters of the related but reproductively isolated bacteriophage T3. In other words, it appears feasible that a mutation at the -2 nucleotide does not impede promoter activity yet alters its physical properties, thus affecting differential RNA polymerase/promoter interaction.
Affiliation(s)
- Mikhail A Orlov
- Institute of Cell Biophysics of RAS, 3 Institutskaya str., Poushchino, 142290, Russia
- Anatoly A Sorokin
- Institute of Cell Biophysics of RAS, 3 Institutskaya str., Poushchino, 142290, Russia
2
Orlov M, Garanina I, Fisunov GY, Sorokin A. Comparative Analysis of Mycoplasma gallisepticum vlhA Promoters. Front Genet 2018; 9:569. [PMID: 30519256 PMCID: PMC6258824 DOI: 10.3389/fgene.2018.00569] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 08/08/2018] [Accepted: 11/06/2018] [Indexed: 12/15/2022]
Abstract
Mycoplasma gallisepticum is an intracellular parasite of the class Mollicutes that affects the respiratory tract of poultry. M. gallisepticum features numerous variable lipoprotein hemagglutinin genes (vlhA) that play a role in immune escape. The vlhA promoters have a set of distinct properties in comparison to the promoters of other genes. They carry a variable region of GAA repeats approximately 40 nt upstream of the transcription start site. The promoters have been considered active only in the presence of exactly 12 GAA repeats. The mechanisms of vlhA expression regulation and GAA number variation have not been described. Here we tried to understand these mechanisms using different computational methods. We conducted a comparative analysis among several M. gallisepticum strains. Nucleotide sequence analysis showed the presence of highly conserved regions flanking the repeated trinucleotides that are not linked to GAA number variation. VlhA genes with 12 GAA repeats and their orthologs in 12 M. gallisepticum strains are more conserved than other vlhA genes and have a narrower GAA number distribution. We conducted a comparative analysis of the physicochemical profiles of M. gallisepticum vlhA and sigma-70 promoters. Stress-induced duplex destabilization (SIDD) profiles showed that the sigma-70 group is characterized by the sharp maxima common to prokaryotic promoters, while vlhA promoters are hardly destabilized, with the region between the GAA repeats and the transcription start site having zero opening probability. Electrostatic potential profiles of vlhA promoters indicate the presence of distinct patterns that appear to govern the initial stages of specific DNA-protein recognition. Open state dynamics profiles of vlhA demonstrate a pattern that might facilitate transcription bubble formation. The data obtained could be the basis for experimental identification of the mechanisms of phase variation in M. gallisepticum.
Affiliation(s)
- Mikhail Orlov
- Institute of Cell Biophysics, Russian Academy of Sciences, Pushchino, Russia
- Irina Garanina
- Federal Research and Clinical Center of Physical-Chemical Medicine, Federal Medical-Biological Agency, Moscow, Russia
- Gleb Y Fisunov
- Federal Research and Clinical Center of Physical-Chemical Medicine, Federal Medical-Biological Agency, Moscow, Russia
- Anatoly Sorokin
- Institute of Cell Biophysics, Russian Academy of Sciences, Pushchino, Russia
3
Ryasik A, Orlov M, Zykova E, Ermak T, Sorokin A. Bacterial promoter prediction: Selection of dynamic and static physical properties of DNA for reliable sequence classification. J Bioinform Comput Biol 2018; 16:1840003. [DOI: 10.1142/s0219720018400036] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 11/18/2022]
Abstract
Predicting the promoter activity of a DNA fragment is an important task for computational biology. Approaches using physical properties of DNA to predict bacterial promoters have recently gained a lot of attention. To select an adequate set of physical properties for training a classifier, various characteristics of the DNA molecule should be taken into consideration. Here, we present a systematic approach that allows us to select less correlated properties for classification by means of both correlation and cophenetic coefficients as well as concordance matrices. To prove this concept, we have developed the first classifier that uses not only the sequence and static physical properties of a DNA fragment, but also the dynamic properties of DNA open states. As a result, the best performing models, with accuracy values up to 90% for all types of sequences, were obtained. Furthermore, we have demonstrated that the classifier can serve as a reliable tool enabling promoter DNA fragments to be distinguished from promoter islands despite the similarity of their nucleotide sequences.
Affiliation(s)
- Artem Ryasik
- Mechanism of Cell Genome Functioning Laboratory, Institute of Cell Biophysics, ul. Institutskaya 3, Pushchino 142290, Russia
- Mikhail Orlov
- Mechanism of Cell Genome Functioning Laboratory, Institute of Cell Biophysics, ul. Institutskaya 3, Pushchino 142290, Russia
- Evgenia Zykova
- Mechanism of Cell Genome Functioning Laboratory, Institute of Cell Biophysics, ul. Institutskaya 3, Pushchino 142290, Russia
- Department of Applied Research Informatization, State Institute of Information Technologies and Telecommunications (SIIT&T Informika), per. Brusov 21 st.2, Moscow, 125009, Russia
- Timofei Ermak
- Laboratory of Molecular Genetics Systems, Institute of Cytology and Genetics, pr. Akademika Lavrentyeva 10, Novosibirsk 630090, Russia
- Anatoly Sorokin
- Mechanism of Cell Genome Functioning Laboratory, Institute of Cell Biophysics, ul. Institutskaya 3, Pushchino 142290, Russia
4
Krutinin GG, Krutinina EA, Kamzolova SG, Osypov AA. Bacteriophage λ: Electrostatic properties of the genome and its elements. Mol Biol 2015. [DOI: 10.1134/s0026893315030115] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/23/2022]
5
Temlyakova EA, Dzhelyadin TR, Kamzolova SG, Sorokin AA. 70 Electrostatic properties of bacterial DNA and promoter predictions. J Biomol Struct Dyn 2013. [DOI: 10.1080/07391102.2013.786504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022]
6
Osypov AA, Krutinin GG, Krutinina EA, Kamzolova SG. DEPPDB - DNA electrostatic potential properties database. Electrostatic properties of genome DNA elements. J Bioinform Comput Biol 2012; 10:1241004. [PMID: 22809340 DOI: 10.1142/s0219720012410041] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/18/2022]
Abstract
Electrostatic properties of genome DNA are important to its interactions with different proteins, in particular those related to transcription. DEPPDB - DNA Electrostatic Potential (and other Physical) Properties Database - provides information on the electrostatic and other physical properties of genome DNA combined with its sequence and annotation of biological and structural properties of genomes and their elements. Genomes are organized on a taxonomical basis, supporting comparative and evolutionary studies. Currently, DEPPDB contains all completely sequenced bacterial, viral, mitochondrial, and plastid genomes according to the NCBI RefSeq, and some model eukaryotic genomes. Data for promoters, regulation sites, binding proteins, etc., are incorporated from established DBs and literature. The database is complemented by analytical tools. Calculations for user sequences are available. Case studies revealed electrostatics complementing DNA bending in the functioning of the E. coli plasmid BNT2 promoter, possibly affecting the host-environment metabolic switch. Transcription factor binding sites gravitate to high potential regions, confirming the universal importance of electrostatics in protein-DNA interactions beyond the classical promoter-RNA polymerase recognition and regulation. Other genome elements, such as terminators, also show electrostatic peculiarities. Most intriguing are gene starts, exhibiting taxonomic correlations. The necessity of studying genome electrostatic properties is discussed.
Affiliation(s)
- Alexander A Osypov
- Laboratory of Mechanisms of the Cell Genome Functioning, Institute of Cell Biophysics RAS, Pushchino, 142290, Russia.
7
Osypov AA, Krutinin GG, Kamzolova SG. Deppdb--DNA electrostatic potential properties database: electrostatic properties of genome DNA. J Bioinform Comput Biol 2010; 8:413-25. [PMID: 20556853 DOI: 10.1142/s0219720010004811] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 10/15/2009] [Revised: 01/28/2010] [Accepted: 02/12/2010] [Indexed: 11/18/2022]
Abstract
The electrostatic properties of genome DNA influence its interactions with different proteins, in particular, the regulation of transcription by RNA-polymerases. DEPPDB--DNA Electrostatic Potential Properties Database--was developed to hold and provide all available information on the electrostatic properties of genome DNA combined with its sequence and annotation of biological and structural properties of genome elements and whole genomes. Genomes in DEPPDB are organized on a taxonomical basis. Currently, the database contains all the completely sequenced bacterial and viral genomes according to NCBI RefSeq. General properties of the genome DNA electrostatic potential profile and principles of its formation are revealed. This potential correlates with the GC content but does not correspond to it exactly and strongly depends on both the sequence arrangement and its context (flanking regions). Analysis of the promoter regions for bacterial and viral RNA polymerases revealed a correspondence between the scale of these proteins' physical properties and electrostatic profile patterns. We also discovered a direct correlation between the potential value and the binding frequency of RNA polymerase to DNA, supporting the idea of the role of electrostatics in these interactions. This matches a pronounced tendency of the promoter regions to possess higher values of the electrostatic potential.
Affiliation(s)
- Alexander A Osypov
- Laboratory of Mechanisms of the Cell Genome Functioning, Institute of Cell Biophysics RAS, Pushchino 142290, Russia.
8
Sorokin AA, Osipov AA, Beskaravainyi PM, Kamzolova SG. Analysis of the nucleotide sequence and electrostatic potential distribution in the Escherichia coli genome. Biophysics (Nagoya-shi) 2007. [DOI: 10.1134/s0006350907020042] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/23/2022]
9
Sorokin AA, Osypov AA, Dzhelyadin TR, Beskaravainy PM, Kamzolova SG. Electrostatic properties of promoter recognized by E. coli RNA polymerase Esigma70. J Bioinform Comput Biol 2006; 4:455-67. [PMID: 16819795 DOI: 10.1142/s0219720006002077] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 10/01/2005] [Revised: 01/04/2006] [Accepted: 01/05/2006] [Indexed: 11/18/2022]
Abstract
A comparative analysis of electrostatic patterns for 359 sigma70-specific promoters and 359 nonpromoter regions on electrostatic map of Escherichia coli genome was carried out. It was found that DNA is not a uniformly charged molecule. There are some local inhomogeneities in its electrostatic profile which correlate with promoter sequences. Electrostatic patterns of promoter DNAs can be specified due to the presence of some distinctive motifs which differ for different promoter groups and may be involved as signal elements in differential recognition of various promoters by the enzyme. Some specific electrostatic elements which are responsible for modulating promoter activities due to ADP-ribosylation of RNA polymerase alpha-subunit were found in far upstream regions of T4 phage early promoters and E. coli ribosomal promoters.
Affiliation(s)
- Anatoly A Sorokin
- Laboratory of Mechanisms of the Cell Genome Functioning, Institute of Cell Biophysics RAS, Pushchino, 142290, Russia.
10
Lavelle C, Benecke A. Chromatin physics: Replacing multiple, representation-centered descriptions at discrete scales by a continuous, function-dependent self-scaled model. Eur Phys J E Soft Matter 2006; 19:379-84. [PMID: 16501873 DOI: 10.1140/epje/i2005-10059-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 12/12/2005] [Accepted: 01/03/2006] [Indexed: 05/06/2023]
Abstract
This commentary on the inspiring works and ideas by Langowski, Mangeol et al., Lee et al., Bundschuh and Gerland, Schiessel, Vaillant et al., Lesne and Victor, Claudet and Bednar, Fuks, Allemand et al., and Blossey, all appearing in this issue (Eur. Phys. J. E 19 (2006)), expresses our felt need of novel approaches to chromatin modeling.
Affiliation(s)
- C Lavelle
- Radiobiology and Oncology Group, Commissariat à l'Energie Atomique, B.P. 6, 92265, Fontenay-aux-Roses, France.
11
Bashford JD. Salerno's model of DNA re-analysed: could breather solitons have biological significance? J Biol Phys 2006; 32:27-47. [PMID: 19669433 DOI: 10.1007/s10867-006-2719-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/30/2022]
Abstract
We investigate the sequence-dependent behaviour of localised excitations in a toy, nonlinear model of DNA base-pair opening originally proposed by Salerno. Specifically we ask whether "breather" solitons could play a role in the facilitated location of promoters by RNA polymerase (RNAP). In an effective potential formalism, we find excellent correlation between potential minima and Escherichia coli promoter recognition sites in the T7 bacteriophage genome. Evidence for a similar relationship between phage promoters and downstream coding regions is found and alternative reasons for links between AT richness and transcriptionally-significant sites are discussed. Consideration of the soliton energy of translocation provides a novel dynamical picture of sliding: steep potential gradients correspond to deterministic motion, while "flat" regions, corresponding to homogeneous AT or GC content, are governed by random, thermal motion. Finally we demonstrate an interesting equivalence between planar, breather solitons and the helical motion of a sliding protein "particle" about a bent DNA axis.
Affiliation(s)
- J D Bashford
- School of Mathematics and Physics, University of Tasmania, Hobart 7001, Tasmania, Australia.
12
Bransburg-Zabary S, Nachliel E, Gutman M. Gauging of the PhoE channel by a single freely diffusing proton. Biophys J 2002; 83:2987-3000. [PMID: 12496072 PMCID: PMC1302380 DOI: 10.1016/s0006-3495(02)75305-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.4] [Indexed: 11/26/2022]
Abstract
In the present study we combined a continuum approximation with a detailed mapping of the electrostatic potential inside an ionic channel to define the most probable trajectory for proton propagation through the channel (propagation along a structure-supported trajectory (PSST)). The conversion of the three-dimensional diffusion space into propagation along a one-dimensional pathway permits reconstruction of ion motion by a short calculation (a few seconds on a state-of-the-art workstation) rather than laborious, time-consuming random walk simulations. The experimental system selected for testing the accuracy of this concept was the reversible dissociation of a proton from a single pyranine molecule (8-hydroxypyrene-1,2,3-trisulfonate) bound by electrostatic forces inside the PhoE ionic channel of the Escherichia coli outer membrane. The crystal structure coordinates were used for calculation of the intra-cavity electrostatic potential, and the reconstruction of the observed fluorescence decay curve was carried out using the dielectric constant of the intra-cavity space as an adjustable parameter. The fitting of past experimental observations (Shimoni, E., Y. Tsfadia, E. Nachliel, and M. Gutman. 1993. Biophys. J. 64:472-479) was carried out by a modified version of the Agmon geminate recombination program (Krissinel, E. B., and N. Agmon. 1996. J. Comp. Chem. 17:1085-1098), where the gradient of the electrostatic potential and the entropic terms were calculated by the PSST program. The best-fitted reconstruction of the observed dynamics was attained when the water in the cavity was assigned epsilon ≤ 55, corroborating the theoretical estimation of Sansom (Breed, J. R., I. D. Kerr, and M. S. P. Sansom. 1996. Biophys. J. 70:1643-1661). The dielectric constant calculated for reversed micelles of comparable size (Cohen, B., D. Huppert, K. M. Solntsev, Y. Tsfadia, E. Nachliel, and M. Gutman. 2002. JACS. 124:7539-7547) allows us to set a margin of epsilon = 50 ± 5.
Affiliation(s)
- Sharron Bransburg-Zabary
- Laser Laboratory for Fast Reactions in Biology, Department of Biochemistry, The George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv 69978, Israel
13
Kamzolova SG, Sivozhelezov VS, Sorokin AA, Dzhelyadin TR, Ivanova NN, Polozov RV. RNA polymerase--promoter recognition. Specific features of electrostatic potential of "early" T4 phage DNA promoters. J Biomol Struct Dyn 2000; 18:325-34. [PMID: 11149509 DOI: 10.1080/07391102.2000.10506669] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.5] [Indexed: 10/28/2022]
Abstract
Comparative analysis of electrostatic potential distribution for "early" T4 phage promoters was undertaken, along with calculation of topography of electrostatic potential around the native and ADP-ribosylated C-terminal domain of RNA polymerase alpha-subunit. The data obtained indicate that there is specific difference in the patterns of electrostatic potential distribution in far upstream regions of T4 promoters differing by their response to ADP-ribosylation of RNA polymerase. A specific change in profiles of electrostatic potential distribution for the native and ADP-ribosylated forms of RNA polymerase alpha-subunit was observed suggesting that this factor may be responsible for modulating T4 promoter activities in response to the enzyme modification.
Affiliation(s)
- S G Kamzolova
- Institute of Cell Biophysics of RAS, Pushchino Moscow Region, Russia.