1
The Hypersaline Archaeal Histones HpyA and HstA Are DNA Binding Proteins That Defy Categorization According to Commonly Used Functional Criteria. mBio 2023; 14:e0344922. [PMID: 36779711 PMCID: PMC10128011 DOI: 10.1128/mbio.03449-22] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 02/14/2023] Open
Abstract
Histone proteins are found across diverse lineages of Archaea, many of which package DNA and form chromatin. However, previous research has led to the hypothesis that the histone-like proteins of high-salt-adapted archaea, or halophiles, function differently. The sole histone protein encoded by the model halophilic species Halobacterium salinarum, HpyA, is nonessential and expressed at levels too low to enable genome-wide DNA packaging. Instead, HpyA mediates the transcriptional response to salt stress. Here we compare the features of genome-wide binding of HpyA to those of HstA, the sole histone of another model halophile, Haloferax volcanii. hstA, like hpyA, is a nonessential gene. To better understand HpyA and HstA functions, protein-DNA binding data (chromatin immunoprecipitation sequencing [ChIP-seq]) of these halophilic histones are compared to publicly available ChIP-seq data from DNA binding proteins across all domains of life, including transcription factors (TFs), nucleoid-associated proteins (NAPs), and histones. These analyses demonstrate that HpyA and HstA bind the genome infrequently in discrete regions, which is similar to TFs but unlike NAPs, which bind a much larger genomic fraction. However, unlike TFs that typically bind in intergenic regions, HpyA and HstA binding sites are located in both coding and intergenic regions. The genome-wide dinucleotide periodicity known to facilitate histone binding was undetectable in the genomes of both species. Instead, TF-like and histone-like binding sequence preferences were detected for HstA and HpyA, respectively. Taken together, these data suggest that halophilic archaeal histones are unlikely to facilitate genome-wide chromatin formation and that their function defies categorization as a TF, NAP, or histone.
IMPORTANCE Most cells in eukaryotic species, from yeast to humans, possess histone proteins that pack and unpack DNA in response to environmental cues. These essential proteins regulate genes necessary for important cellular processes, including development and stress protection. Although the histone fold domain originated in the domain of life Archaea, the function of archaeal histone-like proteins is not well understood relative to those of eukaryotes. We recently discovered that, unlike histones of eukaryotes, histones in hypersaline-adapted archaeal species do not package DNA and can act as transcription factors (TFs) to regulate stress response gene expression. However, the function of histones across species of hypersaline-adapted archaea still remains unclear. Here, we compare hypersaline histone function to a variety of DNA binding proteins across the tree of life, revealing histone-like behavior in some respects and specific transcriptional regulatory function in others.
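The two binding features compared in this abstract (the genomic fraction covered by binding sites, and whether sites fall in coding or intergenic regions) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the half-open `(start, end)` interval convention, the midpoint rule for labelling a peak, and the toy coordinates are all assumptions made for illustration.

```python
def merge_intervals(intervals):
    """Merge overlapping half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_fraction(peaks, genome_length):
    """Fraction of the genome covered by at least one peak."""
    covered = sum(end - start for start, end in merge_intervals(peaks))
    return covered / genome_length


def peak_location(peak, genes):
    """Label a peak 'coding' if its midpoint lies inside a gene, else 'intergenic'.

    The midpoint rule is an assumption of this sketch, not a published criterion.
    """
    midpoint = (peak[0] + peak[1]) // 2
    inside = any(start <= midpoint < end for start, end in genes)
    return "coding" if inside else "intergenic"


# Hypothetical toy data: a 1,000 bp genome, three peaks, two genes.
toy_peaks = [(100, 150), (140, 200), (600, 650)]
toy_genes = [(0, 120), (300, 700)]
toy_fraction = covered_fraction(toy_peaks, 1000)
toy_labels = [peak_location(p, toy_genes) for p in toy_peaks]
```

Under these assumptions, a TF-like protein would give a small covered fraction, whereas a NAP-like protein would cover a much larger share of the genome.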
2
Behle A, Dietsch M, Goldschmidt L, Murugathas W, Berwanger L, Burmester J, Yao L, Brandt D, Busche T, Kalinowski J, Hudson E, Ebenhöh O, Axmann I, Machné R. Manipulation of topoisomerase expression inhibits cell division but not growth and reveals a distinctive promoter structure in Synechocystis. Nucleic Acids Res 2022; 50:12790-12808. [PMID: 36533444 PMCID: PMC9825172 DOI: 10.1093/nar/gkac1132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2021] [Revised: 11/03/2022] [Accepted: 11/10/2022] [Indexed: 12/23/2022] Open
Abstract
In cyanobacteria, DNA supercoiling varies over the diurnal cycle and is integrated with temporal programs of transcription and replication. We manipulated DNA supercoiling in Synechocystis sp. PCC 6803 by CRISPRi-based knockdown of gyrase subunits and overexpression of topoisomerase I (TopoI). Cell division was blocked but cell growth continued in all strains. The small endogenous plasmids were only transiently relaxed, then became strongly supercoiled in the TopoI overexpression strain. Transcript abundances showed a pronounced 5'/3' gradient along transcription units, including the rRNA genes, in the gyrase knockdown strains. These observations are consistent with the basic tenets of the homeostasis and twin-domain models of supercoiling in bacteria. TopoI induction initially led to downregulation of G+C-rich and upregulation of A+T-rich genes. The transcriptional response quickly bifurcated into six groups which overlap with diurnally co-expressed gene groups. Each group shows distinct deviations from a common core promoter structure, where helically phased A-tracts are in phase with the transcription start site. Together, our data show that major co-expression groups (regulons) in Synechocystis all respond differentially to DNA supercoiling, and suggest that the long-standing question of the role of A-tracts in bacterial promoters should be re-evaluated.
Affiliation(s)
- Louis Goldschmidt
- Institut f. Quantitative u. Theoretische Biologie, Heinrich-Heine Universität Düsseldorf, Universitätsstrasse 1, 40225 Düsseldorf, Germany
- Wandana Murugathas
- Institut f. Synthetische Mikrobiologie, Heinrich-Heine Universität Düsseldorf, Universitätsstrasse 1, 40225 Düsseldorf, Germany
- Lutz C Berwanger
- Institut f. Synthetische Mikrobiologie, Heinrich-Heine Universität Düsseldorf, Universitätsstrasse 1, 40225 Düsseldorf, Germany
- Jonas Burmester
- Institut f. Synthetische Mikrobiologie, Heinrich-Heine Universität Düsseldorf, Universitätsstrasse 1, 40225 Düsseldorf, Germany
- Lun Yao
- School of Engineering Sciences in Chemistry, Biotechnology and Health, Science for Life Laboratory, KTH – Royal Institute of Technology, Stockholm, Sweden
- David Brandt
- Centrum für Biotechnologie (CeBiTec), Universität Bielefeld, Universitätsstrasse 27, 33615 Bielefeld, Germany
- Tobias Busche
- Centrum für Biotechnologie (CeBiTec), Universität Bielefeld, Universitätsstrasse 27, 33615 Bielefeld, Germany
- Jörn Kalinowski
- Centrum für Biotechnologie (CeBiTec), Universität Bielefeld, Universitätsstrasse 27, 33615 Bielefeld, Germany
- Elton P Hudson
- School of Engineering Sciences in Chemistry, Biotechnology and Health, Science for Life Laboratory, KTH – Royal Institute of Technology, Stockholm, Sweden
- Oliver Ebenhöh
- Institut f. Quantitative u. Theoretische Biologie, Heinrich-Heine Universität Düsseldorf, Universitätsstrasse 1, 40225 Düsseldorf, Germany; Cluster of Excellence on Plant Sciences (CEPLAS), Heinrich-Heine-Universität Düsseldorf, Universitätsstraße 1, 40225 Düsseldorf, Germany
- Ilka M Axmann
- Institut f. Synthetische Mikrobiologie, Heinrich-Heine Universität Düsseldorf, Universitätsstrasse 1, 40225 Düsseldorf, Germany
- Rainer Machné
- To whom correspondence should be addressed. Tel: +49 211 81 12923;
3
Sarkar S, Dey U, Khohliwe TB, Yella VR, Kumar A. Analysis of nucleoid-associated protein-binding regions reveals DNA structural features influencing genome organization in Mycobacterium tuberculosis. FEBS Lett 2021; 595:2504-2521. [PMID: 34387867 DOI: 10.1002/1873-3468.14178] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2021] [Revised: 08/01/2021] [Accepted: 08/11/2021] [Indexed: 11/10/2022]
Abstract
Nucleoid-associated proteins (NAPs) maintain bacterial nucleoid configuration through their architectural properties of DNA bending, wrapping, and bridging. However, the contribution of DNA structural alterations to DNA-NAP recognition at the genomic scale remains unresolved. The present work dissects the DNA sequence, shape, and altered structural preferences at a genomic scale for six NAPs in Mycobacterium tuberculosis. The results suggest that narrower minor groove width and higher DNA rigidity mark the binding sites of EspR and Lsr2, while mIHF, MtHU and NapM have heterogeneous DNA structural predilections. In contrast, WhiB4-DNA binding sites were characterized by wider minor groove width and by highly deformable and less curved DNA. This work provides systematic insight into NAP-mediated genome organization as a function of DNA structural features.
Affiliation(s)
- Sharmilee Sarkar
- Department of Molecular Biology and Biotechnology, Tezpur University, Assam, India
- Upalabdha Dey
- Department of Molecular Biology and Biotechnology, Tezpur University, Assam, India
- Venkata Rajesh Yella
- Department of Biotechnology, Koneru Lakshmaiah Education Foundation, Andhra Pradesh, India
- Aditya Kumar
- Department of Molecular Biology and Biotechnology, Tezpur University, Assam, India
4
Abstract
Today massive amounts of sequenced metagenomic and metatranscriptomic data from different ecological niches and environmental locations are available. Scientific progress depends critically on methods that allow extracting useful information from the various types of sequence data. Here, we will first discuss types of information contained in the various flavours of biological sequence data, and how this information can be interpreted to increase our scientific knowledge and understanding. We argue that a mechanistic understanding of biological systems analysed from different perspectives is required to consistently interpret experimental observations, and that this understanding is greatly facilitated by the generation and analysis of dynamic mathematical models. We conclude that, in order to construct mathematical models and to test mechanistic hypotheses, time-series data are of critical importance. We review diverse techniques to analyse time-series data and discuss various approaches by which time-series of biological sequence data have been successfully used to derive and test mechanistic hypotheses. Analysing the bottlenecks of current strategies in the extraction of knowledge and understanding from data, we conclude that combined experimental and theoretical efforts should be implemented as early as possible during the planning phase of individual experiments and scientific research projects. This article is part of the theme issue ‘Integrative research perspectives on marine conservation’.
Affiliation(s)
- Ovidiu Popa
- Institute of Quantitative and Theoretical Biology, CEPLAS, Heinrich-Heine University Düsseldorf, Germany
- Ellen Oldenburg
- Institute of Quantitative and Theoretical Biology, CEPLAS, Heinrich-Heine University Düsseldorf, Germany
- Oliver Ebenhöh
- Institute of Quantitative and Theoretical Biology, CEPLAS, Heinrich-Heine University Düsseldorf, Germany; Cluster of Excellence on Plant Sciences, CEPLAS, Heinrich-Heine University Düsseldorf, Germany
5
Atzinger A, Lawrence JG. Selection for ancient periodic motifs that do not impart DNA bending. PLoS Genet 2020; 16:e1009042. [PMID: 33022009 PMCID: PMC7537859 DOI: 10.1371/journal.pgen.1009042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2020] [Accepted: 08/11/2020] [Indexed: 11/19/2022] Open
Abstract
A ~10-11 bp periodicity in dinucleotides imparting DNA bending, with shorter periods found in organisms with positively-supercoiled DNA and longer periods found in organisms with negatively-supercoiled DNA, was previously suggested to assist in DNA compaction. However, when measured with more robust methods, variation in the observed periods between organisms with different growth temperatures is not consistent with that hypothesis. We demonstrate that dinucleotide periodicity does not arise solely by mutational biases but is under selection. We found variation between genomes in both the period and the suite of dinucleotides that are periodic. Whereas organisms with similar growth temperatures have highly variable periods, differences in periods increase with phylogenetic distance between organisms. In addition, while the suites of dinucleotides under selection for periodicity become more dissimilar among more distantly-related organisms, there is a core set of dinucleotides that are strongly periodic among genomes in all domains of life. Notably, this core set of periodic motifs is not involved in DNA bending. These data indicate that dinucleotide periodicity is an ancient genomic architecture which may play a role in shaping the evolution of genes and genomes.
Affiliation(s)
- Aletheia Atzinger
- University of Pittsburgh, Department of Biological Sciences, Pittsburgh, United States of America
- Jeffrey G Lawrence
- University of Pittsburgh, Department of Biological Sciences, Pittsburgh, United States of America
6
McManus JB, Emanuel PA, Murray RM, Lux MW. A method for cost-effective and rapid characterization of engineered T7-based transcription factors by cell-free protein synthesis reveals insights into the regulation of T7 RNA polymerase-driven expression. Arch Biochem Biophys 2019; 674:108045. [PMID: 31326518 DOI: 10.1016/j.abb.2019.07.010] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 04/19/2019] [Revised: 06/26/2019] [Accepted: 07/13/2019] [Indexed: 12/20/2022]
Abstract
The T7 bacteriophage RNA polymerase (T7 RNAP) serves as a model for understanding RNA synthesis, as a tool for protein expression, and as an actuator for synthetic gene circuit design in bacterial cells and cell-free extract. T7 RNAP is an attractive tool for orthogonal protein expression in bacteria owing to its compact single subunit structure and orthogonal promoter specificity. Understanding the mechanisms underlying T7 RNAP regulation is important to the design of engineered T7-based transcription factors, which can be used in gene circuit design. To explore regulatory mechanisms for T7 RNAP-driven expression, we developed a rapid and cost-effective method to characterize engineered T7-based transcription factors using cell-free protein synthesis and an acoustic liquid handler. Using this method, we investigated the effects of the tetracycline operator's proximity to the T7 promoter on the regulation of T7 RNAP-driven expression. Our results reveal a mechanism for regulation that functions by interfering with the transition of T7 RNAP from initiation to elongation and validate the use of the method described here to engineer future T7-based transcription factors.
Affiliation(s)
- John B McManus
- Army Research Laboratory - West Campus, California Institute of Technology, 1200 East California Blvd, Pasadena, CA, 91125, USA
- Peter A Emanuel
- US Army Combat Capabilities Development Command Chemical Biological Center, 8198 Blackhawk Rd, APG, MD, 21010, USA
- Richard M Murray
- California Institute of Technology, Biology and Biological Engineering, 1200 East California Blvd, Pasadena, CA, 91125, USA
- Matthew W Lux
- US Army Combat Capabilities Development Command Chemical Biological Center, 8198 Blackhawk Rd, APG, MD, 21010, USA.
7
Chechetkin VR, Lobzin VV. Large-scale chromosome folding versus genomic DNA sequences: A discrete double Fourier transform technique. J Theor Biol 2017; 426:162-179. [PMID: 28552553 DOI: 10.1016/j.jtbi.2017.05.033] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 01/08/2017] [Revised: 04/23/2017] [Accepted: 05/23/2017] [Indexed: 12/15/2022]
Abstract
State-of-the-art techniques combining imaging methods and high-throughput genomic mapping tools have led to significant progress in detailing the chromosome architecture of various organisms. However, a gap still remains between the rapidly growing structural data on chromosome folding and the large-scale genome organization. Could part of the information on chromosome folding be obtained directly from the underlying genomic DNA sequences abundantly stored in the databanks? To answer this question, we developed an original discrete double Fourier transform (DDFT). DDFT serves for the detection of large-scale genome regularities associated with domains/units at the different levels of hierarchical chromosome folding. The method is versatile and can be applied to both genomic DNA sequences and corresponding physico-chemical parameters such as base-pairing free energy. The latter characteristic is closely related to replication and transcription and can also be used for the assessment of temperature or supercoiling effects on chromosome folding. We tested the method on the genome of E. coli K-12 and found good correspondence with the annotated domains/units established experimentally. As a brief illustration of further abilities of DDFT, a study of large-scale genome organization for bacteriophage PHIX174 and the bacterium Caulobacter crescentus was also added. The combined experimental, modeling, and bioinformatic DDFT analysis should yield more complete knowledge of chromosome architecture and genome organization.
Affiliation(s)
- V R Chechetkin
- Engelhardt Institute of Molecular Biology of Russian Academy of Sciences, Vavilov str., 32, Moscow 119334, Russia; Theoretical Department of Division for Perspective Investigations, Troitsk Institute of Innovation and Thermonuclear Investigations (TRINITI), Moscow, Troitsk District 108840, Russia.
- V V Lobzin
- School of Physics, University of Sydney, Sydney, NSW 2006, Australia.
8
Forsdyke DR. Complexity. Evol Bioinform Online 2016. [DOI: 10.1007/978-3-319-28755-3_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022] Open
9
Variation and constraints in species-specific promoter sequences. J Theor Biol 2014; 363:357-66. [PMID: 25149367 DOI: 10.1016/j.jtbi.2014.08.006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/26/2014] [Revised: 07/30/2014] [Accepted: 08/04/2014] [Indexed: 11/24/2022]
Abstract
A vast literature is now devoted to the search for correlations between transcription-related functions and the composition of sequences upstream of the transcription start site. Little is known about the possible functional effects of nucleotide distributions on the conformational landscape of DNA in such regions. We have used suitable statistical indicators to identify sequences that may play an important role in regulating transcription processes. In particular, we have analyzed base composition, periodicity and information content in sets of aligned promoters clustered according to functional information, in order to gain insight into the main structural differences between promoters regulating genes with different functions. Our results show that when we select promoters according to some biological information, in a single species, at least in vertebrates, we observe structurally different classes of sequences. The highly variable and differentiated gene expression patterns may explain the great extent of structural differentiation observed in complex organisms. Although our analysis is focused on Homo sapiens, we also provide a comparison with other species, selected at different positions in the phylogenetic tree.
10
Abstract
A periodic bias in nucleotide frequency with a period of about 11 bp is characteristic for bacterial genomes. This signal is commonly interpreted to relate to the helical pitch of negatively supercoiled DNA. Functions in supercoiling-dependent RNA transcription or as a 'structural code' for DNA packaging have been suggested. Cyanobacterial genomes showed especially strong periodic signals and, on the other hand, DNA supercoiling and supercoiling-dependent transcription are highly dynamic and underlie circadian rhythms of these phototrophic bacteria. Focusing on this phylum and dinucleotides, we find that a minimal motif of AT-tracts (AT2) yields the strongest signal. Strong genome-wide periodicity is ancestral to a clade of unicellular and polyploid species but lost upon morphological transitions into two baeocyte-forming and a symbiotic species. The signal is intermediate in heterocystous species and weak in monoploid picocyanobacteria. A pronounced 'structural code' may support efficient nucleoid condensation and segregation in polyploid cells. The major source of the AT2 signal is protein-coding regions, where it is encoded preferentially in the first and third codon positions. The signal shows only few relations to supercoiling-dependent and diurnal RNA transcription in Synechocystis sp. PCC 6803. Strong and specific signals in two distinct transposons suggest roles in transposase transcription and transpososome formation.
Affiliation(s)
- Robert Lehmann
- Institute for Theoretical Biology, Humboldt University, Berlin, Invalidenstraße 43, D-10115, Berlin, Germany
- Rainer Machné
- Institute for Theoretical Biology, Humboldt University, Berlin, Invalidenstraße 43, D-10115, Berlin, Germany; Institute for Theoretical Chemistry, University of Vienna, Währinger Straße 17, A-1090, Vienna, Austria
- Hanspeter Herzel
- Institute for Theoretical Biology, Humboldt University, Berlin, Invalidenstraße 43, D-10115, Berlin, Germany
11
12
Tong H, Mrázek J. Investigating the interplay between nucleoid-associated proteins, DNA curvature, and CRISPR elements using comparative genomics. PLoS One 2014; 9:e90940. [PMID: 24595272 PMCID: PMC3940949 DOI: 10.1371/journal.pone.0090940] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 05/03/2013] [Accepted: 02/06/2014] [Indexed: 02/03/2023] Open
Abstract
Many prokaryotic and eukaryotic genomes feature a characteristic periodic signal in the distribution of short runs of A or T (A-tracts) phased with the DNA helical period of ∼10-11 bp. Such periodic spacing of A-tracts has been associated with intrinsic DNA curvature. In eukaryotes, this periodicity is a major component of the nucleosome positioning signal but its physiological role in prokaryotes is not clear. One hypothesis centers on a possible role of intrinsic DNA bends in nucleoid compaction. We use comparative genomics to investigate a possible relationship between the A-tract periodicity and nucleoid-associated proteins in prokaryotes. We found that genomes with DNA-bridging proteins tend to exhibit stronger A-tract periodicity, presumably indicative of more prevalent intrinsic DNA curvature. A weaker relationship was detected for nucleoid-associated proteins that do not form DNA bridges. We consider these results an indication that intrinsic DNA curvature acts collaboratively with DNA-bridging proteins in maintaining the compact structure of the nucleoid, and that previously observed differences among prokaryotic genomes in terms of DNA curvature-related sequence periodicity may reflect differences in nucleoid organization. We subsequently investigated the relationship between A-tract periodicity and the presence of CRISPR elements and found that genomes with CRISPR tend to have stronger A-tract periodicity. This result is consistent with our earlier hypothesis that extensive A-tract periodicity could help protect the chromosome against integration of prophages, possibly due to its role in compaction of the nucleoid.
Affiliation(s)
- Hao Tong
- Department of Statistics, University of Georgia, Athens, Georgia, United States of America
- Jan Mrázek
- Department of Microbiology and Institute of Bioinformatics, University of Georgia, Athens, Georgia, United States of America
- * E-mail:
13
Kravatskaya GI, Chechetkin VR, Kravatsky YV, Tumanyan VG. Structural attributes of nucleotide sequences in promoter regions of supercoiling-sensitive genes: how to relate microarray expression data with genomic sequences. Genomics 2012; 101:1-11. [PMID: 23085385 DOI: 10.1016/j.ygeno.2012.10.003] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 07/14/2012] [Revised: 09/10/2012] [Accepted: 10/11/2012] [Indexed: 11/18/2022]
Abstract
The level of supercoiling in the chromosome can affect gene expression. To clarify the basis of supercoiling sensitivity, we analyzed the structural features of nucleotide sequences in the vicinity of promoters for the genes with expression enhanced and decreased in response to loss of chromosomal supercoiling in Escherichia coli. Fourier analysis of promoter sequences for supercoiling-sensitive genes reveals the tendency in selection of sequences with helical periodicities close to 10nt for relaxation-induced genes and to 11nt for relaxation-repressed genes. The helical periodicities in the subsets of promoters recognized by RNA polymerase with different sigma factors were also studied. A special procedure was developed for the study of correlations between the intensities of periodicities in promoter sequences and the expression levels of corresponding genes. Significant correlations of expression with the AT content and with AT periodicities about 10, 11, and 50nt indicate their role in regulation of supercoiling-sensitive genes.
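The idea of comparing helical periodicities close to 10 nt and 11 nt in a sequence can be sketched with a plain discrete Fourier sum evaluated at chosen periods. This is a simplified stand-in, not the procedure developed in the paper: the A/T indicator signal, the normalization by sequence length, the function names, and the toy sequence are assumptions made for illustration.

```python
import cmath
import math


def spectral_power(sequence, period, bases="AT"):
    """Fourier power of a base-indicator signal at one period (in nt).

    The signal is 1 where the base is in `bases`, 0 elsewhere; the mean is
    subtracted so that overall base composition does not dominate.
    """
    signal = [1.0 if base in bases else 0.0 for base in sequence.upper()]
    n = len(signal)
    mean = sum(signal) / n
    coefficient = sum(
        (value - mean) * cmath.exp(-2j * math.pi * position / period)
        for position, value in enumerate(signal)
    )
    return abs(coefficient) ** 2 / n


def dominant_period(sequence, periods, bases="AT"):
    """Return the candidate period with the highest spectral power."""
    return max(periods, key=lambda p: spectral_power(sequence, p, bases))


# Hypothetical toy sequence: an A-tract repeated every 10 nt.
toy_sequence = "AAAGCGCGCC" * 10
toy_best = dominant_period(toy_sequence, [10.0, 10.5, 11.0])
```

On real promoter sets, the power at each period would normally be compared with shuffled or background sequences before drawing conclusions; that step is omitted here.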
Affiliation(s)
- Galina I Kravatskaya
- Engelhardt Institute of Molecular Biology of Russian Academy of Sciences, Russia.
14
Abel J, Mrázek J. Differences in DNA curvature-related sequence periodicity between prokaryotic chromosomes and phages, and relationship to chromosomal prophage content. BMC Genomics 2012; 13:188. [PMID: 22587570 PMCID: PMC3431218 DOI: 10.1186/1471-2164-13-188] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 12/23/2011] [Accepted: 05/07/2012] [Indexed: 02/07/2023] Open
Abstract
Background: Periodic spacing of A-tracts (short runs of A or T) with the DNA helical period of ~10–11 bp is characteristic of intrinsically bent DNA. In eukaryotes, the DNA bending is related to chromatin structure and nucleosome positioning. However, the physiological role of strong sequence periodicity detected in many prokaryotic genomes is not clear. Results: We developed measures of intensity and persistency of DNA curvature-related sequence periodicity and applied them to prokaryotic chromosomes and phages. The results indicate that strong periodic signals present in chromosomes are generally absent in phage genomes. Moreover, chromosomes containing prophages are less likely to possess a persistent periodic signal than chromosomes with no prophages. Conclusions: Absence of DNA curvature-related sequence periodicity in phages could arise from constraints associated with DNA packaging in the viral capsid. Lack of prophages in chromosomes with persistent periodic signal suggests that the sequence periodicity and concomitant DNA curvature could play a role in protecting the chromosomes from integration of phage DNA.
Affiliation(s)
- Jacob Abel
- Department of Microbiology, University of Georgia, Athens, GA 30602, USA
15
Mrázek J, Chaudhari T, Basu A. PerPlot & PerScan: tools for analysis of DNA curvature-related periodicity in genomic nucleotide sequences. Microbial Informatics and Experimentation 2011; 1:13. [PMID: 22587738 PMCID: PMC3372288 DOI: 10.1186/2042-5783-1-13] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 06/01/2011] [Accepted: 11/28/2011] [Indexed: 04/12/2023]
Abstract
Background: Periodic spacing of short adenine or thymine runs phased with the DNA helical period of ~10.5 bp is associated with intrinsic DNA curvature and deformability, which play important roles in DNA-protein interactions and in the organization of chromosomes in both eukaryotes and prokaryotes. Local differences in DNA sequence periodicity have been linked to differences in gene expression in some organisms. Despite the significance of these periodic patterns, there are virtually no publicly accessible tools for their analysis. Results: We present novel tools suitable for assessments of DNA curvature-related sequence periodicity in nucleotide sequences at the genome scale. Utility of the present software is demonstrated on a comparison of sequence periodicities in the genomes of Haemophilus influenzae, Methanocaldococcus jannaschii, Saccharomyces cerevisiae, and Arabidopsis thaliana. The software can be accessed through a web interface and the programs are also available for download. Conclusions: The present software is suitable for comparing DNA curvature-related sequence periodicity among different genomes as well as for analysis of intrachromosomal heterogeneity of the sequence periodicity. It provides a quick and convenient way to detect anomalous regions of chromosomes that could have unusual structural and functional properties and/or distinct evolutionary history.
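The kind of signal such tools look for can be illustrated by counting the distances between short A or T runs. The sketch below is not PerPlot or PerScan and does not reproduce their algorithms: the definition of an A-tract as a homopolymer run of at least three bases, the function names, the distance cutoff, and the toy sequence are all assumptions made for illustration.

```python
import re
from collections import Counter


def a_tract_starts(sequence, min_len=3):
    """0-based start positions of runs of A or runs of T at least min_len long."""
    pattern = re.compile("A{%d,}|T{%d,}" % (min_len, min_len))
    return [match.start() for match in pattern.finditer(sequence.upper())]


def spacing_histogram(starts, max_distance=40):
    """Count distances between all pairs of tract starts up to max_distance."""
    counts = Counter()
    for index, first in enumerate(starts):
        for second in starts[index + 1:]:
            distance = second - first
            if distance > max_distance:
                break  # starts are sorted, so later pairs are farther apart
            counts[distance] += 1
    return counts


# Hypothetical toy sequence: one A-tract every 11 bp.
toy_sequence = "AAAGCGCGCCC" * 6
toy_counts = spacing_histogram(a_tract_starts(toy_sequence))
```

In a genome with curvature-related periodicity, such a histogram would be expected to show excess counts near multiples of the helical period; a real analysis would also need a background expectation, which this sketch leaves out.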
Affiliation(s)
- Jan Mrázek
- Department of Microbiology and Institute of Bioinformatics, University of Georgia, Athens, GA 30602-2605, USA.
16
CAGO: a software tool for dynamic visual comparison and correlation measurement of genome organization. PLoS One 2011; 6:e27080. [PMID: 22114666 PMCID: PMC3219657 DOI: 10.1371/journal.pone.0027080] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/16/2011] [Accepted: 10/10/2011] [Indexed: 11/26/2022] Open
Abstract
CAGO (Comparative Analysis of Genome Organization) is developed to address two critical shortcomings of conventional genome atlas plotters: lack of dynamic exploratory functions and absence of signal analysis for genomic properties. With dynamic exploratory functions, users can directly manipulate chromosome tracks of a genome atlas and intuitively identify distinct genomic signals by visual comparison. Signal analysis of genomic properties can further detect inconspicuous patterns from noisy genomic properties and calculate correlations between genomic properties across various genomes. To implement dynamic exploratory functions, CAGO presents each genome atlas in Scalable Vector Graphics (SVG) format and allows users to interact with it using a SVG viewer through JavaScript. Signal analysis functions are implemented using R statistical software and a discrete wavelet transformation package waveslim. CAGO is not only a plotter for generating complex genome atlases, but also a platform for exploring genome atlases with dynamic exploratory functions for visual comparison and with signal analysis for comparing genomic properties across multiple organisms. The web-based application of CAGO, its source code, user guides, video demos, and live examples are publicly available and can be accessed at http://cbs.ym.edu.tw/cago.
|
17
|
Coexistence of different base periodicities in prokaryotic genomes as related to DNA curvature, supercoiling, and transcription. Genomics 2011; 98:223-31. [PMID: 21722724 DOI: 10.1016/j.ygeno.2011.06.006] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 03/14/2011] [Revised: 05/30/2011] [Accepted: 06/13/2011] [Indexed: 01/15/2023]
Abstract
We analyzed the periodic patterns in E. coli promoters and compared the distributions of the corresponding patterns in promoters and in the complete genome to elucidate their function. Except for the three-base periodicity, which coincides with that in the coding regions and grows stronger in the region downstream from the transcription start (TS), all other salient periodicities peak upstream of the TS. We found that helical periodicities with lengths of about the B-helix pitch (~10.2-10.5 bp) and the A-helix pitch (~10.8-11.1 bp) coexist in the genomic sequences. We mapped the distributions of stretches with A-, B-, and Z-like DNA periodicities onto the E. coli genome. All three periodicities tend to concentrate within non-coding regions as their intensity becomes stronger, and they prevail in the promoter sequences. The comparison with available experimental data indicates that promoters with the most pronounced periodicities may be related to the supercoiling-sensitive genes.
|
18
|
Rapoport AE, Frenkel ZM, Trifonov EN. Nucleosome positioning pattern derived from oligonucleotide compositions of genomic sequences. J Biomol Struct Dyn 2011; 28:567-74. [PMID: 21142224 DOI: 10.1080/07391102.2011.10531243] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Indexed: 10/28/2022]
Abstract
Availability of nucleosome positioning pattern(s) is crucial for chromatin studies. The matrix form of the pattern has been recently derived (I. Gabdank, D. Barash, E. N. Trifonov. J Biomol Struct Dyn 26, 403-412 (2009), and E. N. Trifonov. J Biomol Struct Dyn 27, 741-746 (2010)). In its simplified linear form it is described by the motif CGRAAATTTYCG. Oligonucleotide components of the motif (say, triplets GRA, RAA, AAA, etc.) would be expected to appear in eukaryotic sequences more frequently. In this work we attempted the reconstruction of the bendability patterns for 13 genomes by a novel approach: extension of the highest-frequency trinucleotides. The consensus of the patterns reconstructed on the basis of trinucleotide frequencies in 13 eukaryotic genomes is derived: CRAAAATTTTYG. It conforms to the earlier established sequence motif. The reconstruction thus attests to the universality of the nucleosome DNA bendability pattern.
Affiliation(s)
- Alexandra E Rapoport
- Genome Diversity Center, Institute of Evolution, University of Haifa, Mount Carmel, Haifa 31905, Israel.
|
19
|
Liu H, Duan X, Yu S, Sun X. Analysis of nucleosome positioning determined by DNA helix curvature in the human genome. BMC Genomics 2011; 12:72. [PMID: 21269520 PMCID: PMC3037905 DOI: 10.1186/1471-2164-12-72] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Received: 06/08/2010] [Accepted: 01/27/2011] [Indexed: 12/03/2022] Open
Abstract
Background Nucleosome positioning has an important role in gene regulation. However, dynamic positioning in vivo casts doubt on the reliability of predictions based on DNA sequence characteristics. What role does sequence-dependent positioning play? In this paper, using a curvature profile model, nucleosomes are predicted in the human genome and patterns of nucleosomes near some key sites are investigated. Results Curvature profiling revealed that in the vicinity of a transcription start site, there is also a nucleosome-free region. Near transcription factor binding sites, curvature profiling showed a trough, indicating nucleosome depletion. The trough of the curvature profile corresponds well to the high binding scores of transcription factors. Moreover, our analysis suggests that nucleosome positioning has a selective protection role. Target sites of miRNAs are occupied by nucleosomes, while single nucleotide polymorphism sites are depleted of nucleosomes. Conclusions The results indicate that DNA sequences play an important role in nucleosome positioning, and the positioning is important not only in gene regulation, but also in genetic variation and miRNA functions.
Affiliation(s)
- Hongde Liu
- State Key Laboratory of Bioelectronics, Southeast University, Nanjing 210096, China
|
20
|
Comparative analysis of sequence periodicity among prokaryotic genomes points to differences in nucleoid structure and a relationship to gene expression. J Bacteriol 2010; 192:3763-72. [PMID: 20494989 DOI: 10.1128/jb.00149-10] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Indexed: 12/13/2022] Open
Abstract
Regular spacing of short runs of A or T nucleotides in DNA sequences with a period close to the helical period of the DNA double helix has been associated with intrinsic DNA bending and nucleosome positioning in eukaryotes. Analogous periodic signals were also observed in prokaryotic genomes. While the exact role of this periodicity in prokaryotes is not known, it has been proposed to facilitate the DNA packaging in the prokaryotic nucleoid and/or to promote negative or positive supercoiling. We developed a methodology for assessments of intragenomic heterogeneity of these periodic patterns and applied it in analysis of 1,025 prokaryotic chromosomes. This technique allows more detailed analysis of sequence periodicity than previous methods where sequence periodicity was assessed in an integral form across the whole chromosome. We found that most genomes have the periodic signal confined to several chromosomal segments while most of the chromosome lacks a strong sequence periodicity. Moreover, there are significant differences among different prokaryotes in both the intensity and persistency of sequence periodicity related to DNA curvature. We proffer that the prokaryotic nucleoid consists of relatively rigid sections stabilized by short intrinsically bent DNA segments and characterized by locally strong periodic patterns alternating with regions featuring a weak periodic signal, which presumably permits higher structural flexibility. This model applies to most bacteria and archaea. In genomes with an exceptionally persistent periodic signal, highly expressed genes tend to concentrate in aperiodic sections, suggesting that structural heterogeneity of the nucleoid is related to local differences in transcriptional activity.
|
21
|
INVERTER: INtegrated Variable numbER Tandem rEpeat findeR. Communications in Computer and Information Science 2010. [DOI: 10.1007/978-3-642-16750-8_14] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 01/28/2023]
|
22
|
Seaman JD, Sanford JC. Skittle: A 2-Dimensional Genome Visualization Tool. BMC Bioinformatics 2009; 10:452. [PMID: 20042093 PMCID: PMC2817707 DOI: 10.1186/1471-2105-10-452] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 08/05/2009] [Accepted: 12/30/2009] [Indexed: 11/16/2022] Open
Abstract
Background It is increasingly evident that there are multiple and overlapping patterns within the genome, and that these patterns contain different types of information - regarding both genome function and genome history. In order to discover additional genomic patterns which may have biological significance, novel strategies are required. To partially address this need, we introduce a new data visualization tool entitled Skittle. Results This program first creates a 2-dimensional nucleotide display by assigning four colors to the four nucleotides, and then text-wraps to a user adjustable width. This nucleotide display is accompanied by a "repeat map" which comprehensively displays all local repeating units, based upon analysis of all possible local alignments. Skittle includes a smooth-zooming interface which allows the user to analyze genomic patterns at any scale. Skittle is especially useful in identifying and analyzing tandem repeats, including repeats not normally detectable by other methods. However, Skittle is also more generally useful for analysis of any genomic data, allowing users to correlate published annotations and observable visual patterns, and allowing for sequence and construct quality control. Conclusions Preliminary observations using Skittle reveal intriguing genomic patterns not otherwise obvious, including structured variations inside tandem repeats. The striking visual patterns revealed by Skittle appear to be useful for hypothesis development, and have already led the authors to theorize that imperfect tandem repeats could act as information carriers, and may form tertiary structures within the interphase nucleus.
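The text-wrapping idea described in this abstract can be illustrated with a small sketch. It is not Skittle's implementation: the abstract says the repeat map is based on local alignments, whereas the hypothetical helpers below (`wrap_sequence`, `column_match_fraction`, `best_wrap_width`) use a much simpler assumption, namely that a tandem repeat lines up in vertical columns when the wrap width equals a multiple of the repeat unit length.

```python
def wrap_sequence(seq, width):
    """Split a sequence into rows of `width` characters (the 2-D display)."""
    return [seq[i:i + width] for i in range(0, len(seq), width)]


def column_match_fraction(seq, width):
    """Fraction of positions whose base equals the base one row below.

    A value near 1.0 means the rows stack into uniform columns, which is
    what a tandem repeat with unit length `width` would look like.
    """
    pairs = len(seq) - width
    if pairs <= 0:
        return 0.0
    matches = sum(1 for i in range(pairs) if seq[i] == seq[i + width])
    return matches / pairs


def best_wrap_width(seq, min_width=2, max_width=50):
    """Return the wrap width in the given range with the best column alignment."""
    return max(range(min_width, max_width + 1),
               key=lambda w: column_match_fraction(seq, w))


if __name__ == "__main__":
    repeat = "ACGGTCA" * 10          # synthetic tandem repeat, unit length 7
    width = best_wrap_width(repeat, 2, 12)
    print(width)
    print("\n".join(wrap_sequence(repeat, width)[:4]))
```

Scanning widths this way is only a crude stand-in for interactive width adjustment; it ignores mismatches caused by insertions or deletions, which an alignment-based repeat map would tolerate.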
|
23
|
Nov Klaiman T, Hosid S, Bolshoy A. Upstream curved sequences in E. coli are related to the regulation of transcription initiation. Comput Biol Chem 2009; 33:275-82. [PMID: 19646927 DOI: 10.1016/j.compbiolchem.2009.06.007] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 04/01/2009] [Accepted: 06/17/2009] [Indexed: 01/03/2023]
Abstract
Advances in Escherichia coli genome research have made information on the transcription start sites of many genes available. A study relying on the availability of transcription start locations was performed. The first question addressed was what an average DNA curvature profile upstream of genes would look like when these genes are aligned by transcription start sites in comparison to alignment by translation start sites. Since it was hypothesized that curvature plays a role in transcription regulation, the expectation was that curvature measurements relative to transcription starts, rather than translation starts, should strengthen the signal. Our study justified this expectation. The second question aimed to clarify the relation between DNA curvature and promoter strength. Through clustering based on DNA curvature profiles along promoter regions, a strong positive correlation between promoter strength and DNA curvature was found. The third question dealt with dinucleotide periodicity in E. coli to see whether a periodicity pattern specific to promoter regions exists. Such an unknown pattern might shed new light on transcription regulation mechanisms in E. coli. A sequence periodicity of about 11 bp is characteristic of the whole E. coli genome, and is especially well-expressed in intergenic regions. Here it was shown that regions of about 100-150 bp centered 70-100 bp upstream of transcription starts carry hidden periodicity with a period of about 10.3 bp.
Affiliation(s)
- Tamar Nov Klaiman
- Department of Evolutionary and Environmental Biology, University of Haifa, Haifa 31905, Israel
|
24
|
Mrazek J. Phylogenetic Signals in DNA Composition: Limitations and Prospects. Mol Biol Evol 2009; 26:1163-9. [DOI: 10.1093/molbev/msp032] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.1] [Indexed: 01/11/2023] Open
|
25
|
Luijsterburg MS, White MF, van Driel R, Dame RT. The major architects of chromatin: architectural proteins in bacteria, archaea and eukaryotes. Crit Rev Biochem Mol Biol 2009; 43:393-418. [PMID: 19037758 DOI: 10.1080/10409230802528488] [Citation(s) in RCA: 155] [Impact Index Per Article: 10.3] [Indexed: 10/21/2022]
Abstract
The genomic DNA of all organisms across the three kingdoms of life needs to be compacted and functionally organized. Key players in these processes are DNA supercoiling, macromolecular crowding and architectural proteins that shape DNA by binding to it. The architectural proteins in bacteria, archaea and eukaryotes generally do not exhibit sequence or structural conservation especially across kingdoms. Instead, we propose that they are functionally conserved. Most of these proteins can be classified according to their architectural mode of action: bending, wrapping or bridging DNA. In order for DNA transactions to occur within a compact chromatin context, genome organization cannot be static. Indeed chromosomes are subject to a whole range of remodeling mechanisms. In this review, we discuss the role of (i) DNA supercoiling, (ii) macromolecular crowding and (iii) architectural proteins in genome organization, as well as (iv) mechanisms used to remodel chromosome structure and to modulate genomic activity. We conclude that the underlying mechanisms that shape and remodel genomes are remarkably similar among bacteria, archaea and eukaryotes.
Affiliation(s)
- Martijn S Luijsterburg
- Swammerdam Institute for Life Sciences, University of Amsterdam, Kruislaan, Amsterdam, The Netherlands
|
26
|
Warnecke T, Batada NN, Hurst LD. The impact of the nucleosome code on protein-coding sequence evolution in yeast. PLoS Genet 2008; 4:e1000250. [PMID: 18989456 PMCID: PMC2570795 DOI: 10.1371/journal.pgen.1000250] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.3] [Received: 07/02/2008] [Accepted: 10/02/2008] [Indexed: 11/18/2022] Open
Abstract
Coding sequence evolution was once thought to be the result of selection on optimal protein function alone. Selection can, however, also act at the RNA level, for example, to facilitate rapid translation or ensure correct splicing. Here, we ask whether the way DNA works also imposes constraints on coding sequence evolution. We identify nucleosome positioning as a likely candidate to set up such a DNA-level selective regime and use high-resolution microarray data in yeast to compare the evolution of coding sequence bound to or free from nucleosomes. Controlling for gene expression and intra-gene location, we find a nucleosome-free "linker" sequence to evolve on average 5-6% slower at synonymous sites. A reduced rate of evolution in linker is especially evident at the 5' end of genes, where the effect extends to non-synonymous substitution rates. This is consistent with regular nucleosome architecture in this region being important in the context of gene expression control. As predicted, codons likely to generate a sequence unfavourable to nucleosome formation are enriched in linker sequence. Amino acid content is likewise skewed as a function of nucleosome occupancy. We conclude that selection operating on DNA to maintain correct positioning of nucleosomes impacts codon choice, amino acid choice, and synonymous and non-synonymous rates of evolution in coding sequence. The results support the exclusion model for nucleosome positioning and provide an alternative interpretation for runs of rare codons. As the intimate association of histones and DNA is a universal characteristic of genic sequence in eukaryotes, selection on coding sequence composition imposed by nucleosome positioning should be phylogenetically widespread.
Affiliation(s)
- Tobias Warnecke
- Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
- Nizar N. Batada
- Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
- Laurence D. Hurst
- Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
|
27
|
Liu H, Wu J, Xie J, Yang X, Lu Z, Sun X. Characteristics of nucleosome core DNA and their applications in predicting nucleosome positions. Biophys J 2008; 94:4597-604. [PMID: 18326654 PMCID: PMC2397361 DOI: 10.1529/biophysj.107.117028] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 07/09/2007] [Accepted: 01/18/2008] [Indexed: 11/18/2022] Open
Abstract
By analyzing dinucleotide position-frequency data of yeast nucleosome-bound DNA sequences, dinucleotide periodicities of core DNA sequences were investigated. Within frequency domains, weakly bound dinucleotides (AA, AT, and the combinations AA-TT-TA and AA-TT-TA-AT) present doublet peaks in a periodicity range of 10-11 bp, and strongly bound dinucleotides present a single peak. A time-frequency analysis, based on wavelet transformation, indicated that weakly bound dinucleotides of core DNA sequences were spaced more closely (approximately 10.3 bp) at the two ends, with wider spacing (approximately 11.1 bp) in the middle section. The finding was supported by DNA curvature and was prevalent in all core DNA sequences. Therefore, three approaches were developed to predict nucleosome positions. After analyzing a 2200-bp DNA sequence, results indicated that the predictions were feasible; areas near protein-DNA binding sites resulted in periodicity profiles with irregular signals. The effects of five dinucleotide patterns were evaluated, indicating that the AA-TT pattern exhibited better performance. A chromosome-scale prediction demonstrated that periodicity profiles perform better than previously described, with up to 59% accuracy. Based on predictions, nucleosome distributions near the beginning and end of open reading frames were analyzed. Results indicated that the majority of open reading frames' start and end sites were occupied by nucleosomes.
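A frequency-domain look at dinucleotide periodicity, as mentioned in this abstract, can be sketched roughly as follows. This is an illustration only and not the authors' pipeline: the helper names, the AA/TT indicator encoding, and the scanned period range of 7 to 14 bp are all assumptions made for the example, and the wavelet-based time-frequency step is not covered.

```python
import numpy as np


def dinucleotide_indicator(seq, dinucs=("AA", "TT")):
    """Return a 0/1 array marking positions where one of `dinucs` starts."""
    seq = seq.upper()
    return np.array([1.0 if seq[i:i + 2] in dinucs else 0.0
                     for i in range(len(seq) - 1)])


def power_at_periods(signal, periods):
    """Evaluate the Fourier power of `signal` at arbitrary (non-integer) periods."""
    x = signal - signal.mean()            # remove the constant offset
    n = np.arange(len(x))
    return np.array([abs(np.sum(x * np.exp(-2j * np.pi * n / p))) ** 2 / len(x)
                     for p in periods])


def strongest_period(seq, periods=None):
    """Return the scanned period (in bp) with the highest spectral power."""
    if periods is None:
        # Assumed scan range; chosen to bracket the 10-11 bp helical repeat.
        periods = np.arange(7.0, 14.01, 0.1)
    power = power_at_periods(dinucleotide_indicator(seq), periods)
    return float(periods[int(np.argmax(power))])


if __name__ == "__main__":
    synthetic = ("AA" + "GCGCGCGC") * 30   # AA placed every 10 bp
    print(strongest_period(synthetic))
```

Evaluating the transform directly at chosen periods, rather than using an FFT grid, makes it easy to resolve non-integer periods such as 10.3 or 11.1 bp on short sequences, at the cost of speed.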
Affiliation(s)
- Hongde Liu
- State Key Laboratory of Bioelectronics, Southeast University, Nanjing 210096, China
|
28
|
Wang Z, Smith CE, Atchley WR. Application of complex demodulation on bZIP and bHLH-PAS protein domains. Math Biosci 2007; 207:204-18. [PMID: 17374384 DOI: 10.1016/j.mbs.2007.01.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/21/2006] [Revised: 12/29/2006] [Accepted: 01/10/2007] [Indexed: 11/22/2022]
Abstract
Proteins are built with molecular modular building blocks such as an alpha-helix, beta-sheet, loop region and other structures. This is an economical way of constructing complex molecules. Periodicity analysis of protein sequences has allowed us to obtain meaningful information concerning their structure, function and evolution. In this work, complex demodulation (CDM) is introduced to detect functional regions in protein sequences data. More specifically, we analyzed bZIP and bHLH-PAS protein domains. Complex demodulation provided insightful information about changing amplitudes of periodic components in protein sequences. Furthermore, it was found that the local amplitude minimum or local amplitude maximum of the 3.6-aa periodic component is associated with protein structural or functional information due to the observation that the extrema are mainly located in the boundary area of two structural or functional regions.
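Complex demodulation in its textbook form can be sketched in a few lines. The example below is a generic version and makes no claim to match the authors' implementation: it assumes the protein sequence has already been converted to a numeric series (for instance with a hydrophobicity scale, which is not shown), and the function name, the moving-average low-pass filter, and the window length are all choices made for illustration.

```python
import numpy as np


def complex_demodulate(values, period, window):
    """Estimate the local amplitude of the component with the given period.

    Steps: (1) shift the target frequency to zero by multiplying with a
    complex exponential, (2) low-pass filter with a moving average,
    (3) take twice the magnitude as the amplitude estimate.
    """
    x = np.asarray(values, dtype=float)
    x = x - x.mean()
    t = np.arange(len(x))
    shifted = x * np.exp(-2j * np.pi * t / period)
    kernel = np.ones(window) / window          # simple moving-average filter
    smoothed = np.convolve(shifted, kernel, mode="same")
    return 2.0 * np.abs(smoothed)


if __name__ == "__main__":
    t = np.arange(120)
    # Synthetic series: a 3.6-residue oscillation in the first half only.
    series = np.where(t < 60, np.cos(2 * np.pi * t / 3.6), 0.0)
    amplitude = complex_demodulate(series, period=3.6, window=18)
    print(amplitude[30], amplitude[100])
```

In this synthetic case the amplitude is high where the 3.6-residue oscillation is present and drops where it stops, which mirrors the abstract's observation that amplitude extrema tend to fall near boundaries between regions. Values close to the ends of the series are affected by the filter's edge handling and should be read with care.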
Affiliation(s)
- Zhi Wang
- Graduate Program in Biomathematics, North Carolina State University, Raleigh, NC 27695-8203, USA
|
29
|
Cohanim AB, Trifonov EN, Kashi Y. Specific Selection Pressure at the Third Codon Positions: Contribution to 10- to 11-Base Periodicity in Prokaryotic Genomes. J Mol Evol 2006; 63:393-400. [PMID: 16897261 DOI: 10.1007/s00239-005-0258-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 11/02/2005] [Accepted: 04/03/2006] [Indexed: 10/24/2022]
Abstract
Prokaryotic sequences are responsible for more than just protein coding. There are two 10- to 11-base periodical patterns superimposed on the protein coding message within the same sequence. Positional auto- and cross-correlation analysis of the sequences shows that these two patterns are a short-range counter-phase oscillation of AA and TT dinucleotides and a medium-range in-phase oscillation of the same dinucleotides, spanning distances of up to approximately 30 and approximately 100 bases, respectively. The short-range oscillation is encoded by the amino acid sequences themselves, apparently, due to the presence of amphipathic alpha-helices in the proteins. The medium-range oscillation, related to DNA folding in the cell, is created largely by a special choice of the bases in the third positions of the codons. Interestingly, the amino acid sequences do contribute to that signal as well. That is, the very amino acid sequences are, to some extent, degenerate to serve the same oscillating pattern that is associated with the degenerate third codon positions.
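The positional auto- and cross-correlation mentioned in this abstract can be approximated by counting how often one dinucleotide occurs at a given distance downstream of another. The sketch below is a minimal version under that assumption; the function names are hypothetical, and no normalization against expected counts is applied, which a real analysis would likely need.

```python
def dinucleotide_positions(seq, dinuc):
    """Return the start positions of every occurrence of `dinuc` in `seq`."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc]


def distance_histogram(seq, first, second, max_dist):
    """Count pairs where `second` starts d bases downstream of `first`.

    Index d of the returned list holds the count for distance d
    (index 0 is unused). With first == second this is an auto-correlation,
    otherwise a cross-correlation.
    """
    starts = dinucleotide_positions(seq, first)
    targets = set(dinucleotide_positions(seq, second))
    counts = [0] * (max_dist + 1)
    for i in starts:
        for d in range(1, max_dist + 1):
            if i + d in targets:
                counts[d] += 1
    return counts


if __name__ == "__main__":
    # Synthetic sequence: AA every 10 bases, TT offset from AA by 5 bases.
    synthetic = "AACCCTTCCC" * 20
    print(distance_histogram(synthetic, "AA", "AA", 20))
    print(distance_histogram(synthetic, "AA", "TT", 20))
```

In the synthetic example the AA to TT peaks fall halfway between the AA to AA peaks, which is what a counter-phase oscillation would look like in such a histogram; an in-phase oscillation would instead place both sets of peaks at the same distances.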
Affiliation(s)
- Amir B Cohanim
- Department of Biotechnology and Food Engineering, Technion, Haifa, 32000, Israel
|
30
|
Abstract
Extensive DNA sequence analysis of three eukaryotes, S. cerevisiae, C. elegans, and D. melanogaster, reveals two different AA/TT periodical patterns associated with the nucleosome positioning. The first pattern is the counter-phase oscillation of AA and TT dinucleotides, which has been frequently considered as the nucleosome DNA pattern. This represents the sequence rule I for chromatin structure. The second pattern is the in-phase oscillation of the AA and TT dinucleotides with the same nucleosome DNA period, 10.4 bases. This pattern apparently corresponds to curved DNA, that also participates in the nucleosome formation, and represents the sequence rule II for chromatin. The positional correlations of AA and TT dinucleotides also indicate that the nucleosomes are separated by specific linker sizes (preferably 8, 18, ... bases), dictated by the steric exclusion rules. Thus, the sequence positions of the neighboring nucleosomes are correlated, and this represents the sequence rule III.
Affiliation(s)
- Amir B Cohanim
- Department of Biotechnology and Food Engineering, Technion, Haifa 32000, Israel
|
31
|
Complexity. Evol Bioinform Online 2006. [DOI: 10.1007/978-0-387-33419-6_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022] Open
|
32
|
Wang J, Hannenhalli S. Generalizations of Markov model to characterize biological sequences. BMC Bioinformatics 2005; 6:219. [PMID: 16144548 PMCID: PMC1236913 DOI: 10.1186/1471-2105-6-219] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Received: 04/29/2005] [Accepted: 09/06/2005] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The currently used kth order Markov models estimate the probability of generating a single nucleotide conditional upon the immediately preceding (gap = 0) k units. However, this neither takes into account the joint dependency of multiple neighboring nucleotides, nor does it consider the long range dependency with gap > 0. RESULT We describe a configurable tool to explore generalizations of the standard Markov model. We evaluated whether the sequence classification accuracy can be improved by using an alternative set of model parameters. The evaluation was done on four classes of biological sequences--CpG-poor promoters, all promoters, exons and nucleosome positioning sequences. Using di- and tri-nucleotide as the model unit significantly improved the sequence classification accuracy relative to the standard single nucleotide model. In the case of nucleosome positioning sequences, optimal accuracy was achieved at a gap length of 4. Furthermore in the plot of classification accuracy versus the gap, a periodicity of 10-11 bps was observed which might indicate structural preferences in the nucleosome positioning sequence. The tool is implemented in Java and is available for download at ftp://ftp.pcbi.upenn.edu/GMM/. CONCLUSION Markov modeling is an important component of many sequence analysis tools. We have extended the standard Markov model to incorporate joint and long range dependencies between the sequence elements. The proposed generalizations of the Markov model are likely to improve the overall accuracy of sequence analysis tools.
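The gap parameter described in this abstract can be illustrated with a small sketch. It is not the authors' Java tool: the functions below are hypothetical, predict a single nucleotide only (the paper also uses di- and tri-nucleotides as the model unit), and use add-one smoothing as an assumed default. With `gap=0` the model reduces to a standard kth-order Markov model.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def train_gapped_markov(seqs, k=1, gap=0, pseudocount=1.0):
    """Estimate P(base | k-mer ending `gap` positions before the base)."""
    counts = defaultdict(lambda: defaultdict(float))
    for s in seqs:
        s = s.upper()
        for i in range(k + gap, len(s)):
            context = s[i - gap - k:i - gap]   # k-mer separated from s[i] by the gap
            counts[context][s[i]] += 1
    model = {}
    for context, nxt in counts.items():
        total = sum(nxt.values()) + pseudocount * len(ALPHABET)
        model[context] = {b: (nxt.get(b, 0.0) + pseudocount) / total
                          for b in ALPHABET}
    return model


def log_likelihood(model, seq, k=1, gap=0):
    """Log-probability of `seq` under the model (unseen contexts are uniform)."""
    seq = seq.upper()
    uniform = 1.0 / len(ALPHABET)
    total = 0.0
    for i in range(k + gap, len(seq)):
        context = seq[i - gap - k:i - gap]
        total += math.log(model.get(context, {}).get(seq[i], uniform))
    return total


if __name__ == "__main__":
    class_a = train_gapped_markov(["ACGTT" * 20], k=1, gap=4)
    class_b = train_gapped_markov(["AACCGGTT" * 12], k=1, gap=4)
    query = "ACGTT" * 4
    # Classify by comparing log-likelihoods under the two class models.
    print(log_likelihood(class_a, query, 1, 4) > log_likelihood(class_b, query, 1, 4))
```

Classification then amounts to training one model per sequence class and assigning a query to the class with the higher log-likelihood; repeating this over a range of `gap` values would produce the accuracy-versus-gap curve the abstract refers to.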
Affiliation(s)
- Junwen Wang
- Penn Center for Bioinformatics, Department of Genetics, University of Pennsylvania Philadelphia, PA 19104-6021, USA
- Sridhar Hannenhalli
- Penn Center for Bioinformatics, Department of Genetics, University of Pennsylvania Philadelphia, PA 19104-6021, USA
|
33
|
Larsabal E, Danchin A. Genomes are covered with ubiquitous 11 bp periodic patterns, the "class A flexible patterns". BMC Bioinformatics 2005; 6:206. [PMID: 16120222 PMCID: PMC1242344 DOI: 10.1186/1471-2105-6-206] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Received: 06/22/2005] [Accepted: 08/24/2005] [Indexed: 11/17/2022] Open
Abstract
Background The genomes of prokaryotes and lower eukaryotes display a very strong 11 bp periodic bias in the distribution of their nucleotides. This bias is present throughout a given genome, both in coding and non-coding sequences. Until now this bias remained of unknown origin. Results Using a technique for analysis of auto-correlations based on linear projection, we identified the sequences responsible for the bias. Prokaryotic and lower eukaryotic genomes are covered with ubiquitous patterns that we termed "class A flexible patterns". Each pattern is composed of up to ten conserved nucleotides or dinucleotides distributed into a discontinuous motif. Each occurrence spans a region up to 50 bp in length. They belong to what we named the "flexible pattern" type, in that there is some limited fluctuation in the distances between the nucleotides composing each occurrence of a given pattern. When taken together, these patterns cover up to half of the genome in the majority of prokaryotes. They generate the previously recognized 11 bp periodic bias. Conclusion Judging from the structure of the patterns, we suggest that they may define a dense network of protein interaction sites in chromosomes.
Affiliation(s)
- Etienne Larsabal
- Unité de Génétique des Génomes Bactériens, Institut Pasteur, URA CNRS 2171, 28, rue du Docteur Roux, 75724 Paris Cedex 15, France
- Antoine Danchin
- Unité de Génétique des Génomes Bactériens, Institut Pasteur, URA CNRS 2171, 28, rue du Docteur Roux, 75724 Paris Cedex 15, France
|