1. Jia Y, Li H, Wang J, Meng H, Yang Z. Spectrum structures and biological functions of 8-mers in the human genome. Genomics 2018. [PMID: 29522801 DOI: 10.1016/j.ygeno.2018.03.006] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]
Abstract
The spectra of k-mer frequencies can reveal the structure and evolution of genome sequences. We confirmed that the trimodal spectrum of 8-mers in human genome sequences is distinguished only by the CG2, CG1 and CG0 8-mer sets, which contain 2, 1 or 0 CpG dinucleotides, respectively. This phenomenon is called the independent selection law. The three types of CG 8-mers were considered to be different functional elements. We conjectured that (1) nucleosome binding motifs are mainly characterized by CG1 8-mers and (2) the core structural units of CpG island (CGI) sequences are predominantly characterized by CG2 8-mers. To validate these conjectures, nucleosome-occupied sequences and CGI sequences were extracted, and sequence parameters were constructed from the information in each of the three CG 8-mer sets. ROC analysis showed that CG1 8-mers are preferentially found in nucleosome-occupied segments (AUC > 0.7) and CG2 8-mers are preferentially found in CGI sequences (AUC > 0.99). This validates our conjectures in principle.
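The CG0/CG1/CG2 partition described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names are hypothetical, and treating 8-mers with more than two CpG as CG2 is an assumption made here for completeness.

```python
from collections import Counter

def classify_8mer(kmer: str) -> str:
    """Assign an 8-mer to CG0, CG1 or CG2 by its number of CG dinucleotides."""
    n_cpg = kmer.upper().count("CG")  # "CG" cannot overlap itself, so count() is exact
    if n_cpg == 0:
        return "CG0"
    if n_cpg == 1:
        return "CG1"
    return "CG2"  # assumption: two or more CpG are grouped together

def kmer_counts(seq: str, k: int = 8) -> Counter:
    """Count every overlapping k-mer in seq, skipping windows with non-ACGT symbols."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts

def class_totals(counts: Counter) -> Counter:
    """Sum k-mer occurrences within each CG class."""
    totals = Counter()
    for kmer, n in counts.items():
        totals[classify_8mer(kmer)] += n
    return totals
```

A frequency spectrum for one class would then be a histogram of the per-8-mer counts restricted to that class; how the counts are binned and normalized is not specified in the abstract.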
Affiliation(s)
- Yun Jia
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China; College of Science, Inner Mongolia University of Technology, Hohhot 010051, China
- Hong Li
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Jingfeng Wang
  - College of Science, Inner Mongolia University of Technology, Hohhot 010051, China
- Hu Meng
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Zhenhua Yang
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
2. Meng H, Li H, Zheng Y, Yang Z, Jia Y, Bo S. Evolutionary analysis of nucleosome positioning sequences based on New Symmetric Relative Entropy. Genomics 2017; 110:154-161. [PMID: 28917635 DOI: 10.1016/j.ygeno.2017.09.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/04/2017] [Revised: 09/06/2017] [Accepted: 09/12/2017] [Indexed: 10/18/2022]
Abstract
New Symmetric Relative Entropy (NSRE) was applied to analyze the nucleosome sequences in S. cerevisiae, S. pombe and Drosophila. NSRE distributions reflect the characteristic differences in nucleosome sequences among the three organisms, and the differences indicate a concerted evolution in nucleosome sequence usage. Further analysis of the nucleosomes around transcription start sites (TSS) shows that the constitutive property of the +1/-1 nucleosomes in S. cerevisiae differs from that in S. pombe and Drosophila, which indicates that S. cerevisiae has a different nucleosome-based transcription regulation mechanism. However, in either case, the nucleosome dyad region is conserved and always has a higher NSRE. Base composition analysis shows that this conserved property of the nucleosome dyad region is mainly determined by bases A and T, and the degrees of dependence on bases A and T are consistent across the three organisms.
Affiliation(s)
- Hu Meng
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Hong Li
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Yan Zheng
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Zhenhua Yang
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Yun Jia
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Suling Bo
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
3. Zheng Y, Li H, Wang Y, Meng H, Zhang Q, Zhao X. Evolutionary mechanism and biological functions of 8-mers containing CG dinucleotide in yeast. Chromosome Res 2017; 25:173-189. [PMID: 28181048 DOI: 10.1007/s10577-017-9554-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/31/2016] [Revised: 12/27/2016] [Accepted: 01/27/2017] [Indexed: 01/01/2023]
Abstract
The rules of non-random k-mer usage and their biological functions deserve special attention. First, the article studied human 8-mer spectra and found that, when the 8-mers were classified into three subsets under each of the 16 dinucleotide classifications, only the cytosine-guanine (CG) dinucleotide classification yielded independent unimodal distributions. Second, the distribution rules were reproduced in seven other species, including yeast, which showed that the evolutionary phenomenon is universal across species. We therefore proposed two theoretical conjectures: (1) CG1 motifs (8-mers containing one CG) are the nucleosome-binding motifs; (2) CG2 motifs (8-mers containing two or more CG) are the modular units of CpG islands. Our conjectures were confirmed in yeast by the following results: the maximum average area under the receiver operating characteristic curve (AUC) was obtained from CG1 information when nucleosome core sequences and linker sequences were distinguished by the three CG subsets; there was a one-to-one relationship between abundant CG1 signal regions and histone positions; sequence changes in squeezed nucleosomes were related to the strength of CG1 signals; and an AUC of 0.986 was obtained from CG2 information when CpG islands and non-CpG islands were distinguished by the three CG subsets.
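For readers unfamiliar with the AUC values quoted in this abstract, the following is a minimal sketch of how an AUC can be computed from per-sequence scores. It assumes each sequence has already been reduced to a single score; the scoring itself and the function name `auc` are hypothetical and are not taken from the paper.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve as the probability that a randomly chosen
    positive example scores higher than a randomly chosen negative one
    (ties count as one half)."""
    if not pos_scores or not neg_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))
```

An AUC of 0.5 corresponds to a score that does not separate the two classes at all, and 1.0 to perfect separation. The pairwise loop is quadratic in the number of sequences, which is acceptable for illustration but not for genome-scale data.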
Affiliation(s)
- Yan Zheng
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot, 010021, China
- Hong Li
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot, 010021, China; No. 235, West University Street, Hohhot, Inner Mongolia, China
- Yue Wang
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot, 010021, China
- Hu Meng
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot, 010021, China
- Qiang Zhang
  - College of Science, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Xiaoqing Zhao
  - Biotechnology research centre, Inner Mongolia Academy of Agricultural and Animal Husbandry Science, Hohhot, 010021, China
4. Awazu A. Prediction of nucleosome positioning by the incorporation of frequencies and distributions of three different nucleotide segment lengths into a general pseudo k-tuple nucleotide composition. Bioinformatics 2016; 33:42-48. [PMID: 27563027 PMCID: PMC5860184 DOI: 10.1093/bioinformatics/btw562] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 06/16/2016] [Revised: 08/02/2016] [Accepted: 08/19/2016] [Indexed: 11/13/2022] Open
Abstract
Motivation: Nucleosome positioning plays important roles in many eukaryotic intranuclear processes, such as transcriptional regulation and chromatin structure formation. The investigations of nucleosome positioning rules provide a deeper understanding of these intracellular processes.
Results: Nucleosome positioning prediction was performed using a model consisting of three types of variables characterizing a DNA sequence: the number of five-nucleotide sequences, the number of three-nucleotide combinations in one period of a helix, and mono- and di-nucleotide distributions in DNA fragments. Using recently proposed stringent benchmark datasets with low biases for Saccharomyces cerevisiae, Homo sapiens, Caenorhabditis elegans and Drosophila melanogaster, the present model was shown to have a better prediction performance than the recently proposed predictors. This model was able to display the common and organism-dependent factors that affect nucleosome forming and inhibiting sequences as well. Therefore, the predictors developed here can accurately predict nucleosome positioning and help determine the key factors influencing this process.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Akinori Awazu
  - Department of Mathematical and Life Sciences; Research Center for Mathematics on Chromatin Live Dynamics, Hiroshima University, Kagami-yama 1-3-1, Higashi-Hiroshima, 739-8526, Japan
5. Wu X, Liu H, Liu H, Su J, Lv J, Cui Y, Wang F, Zhang Y. Z curve theory-based analysis of the dynamic nature of nucleosome positioning in Saccharomyces cerevisiae. Gene 2013; 530:8-18. [DOI: 10.1016/j.gene.2013.08.018] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 03/28/2013] [Revised: 07/30/2013] [Accepted: 08/03/2013] [Indexed: 01/01/2023]
6. Battistini F, Hunter CA, Moore IK, Widom J. Structure-based identification of new high-affinity nucleosome binding sequences. J Mol Biol 2012; 420:8-16. [PMID: 22472421 DOI: 10.1016/j.jmb.2012.03.026] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 09/06/2011] [Revised: 03/20/2012] [Accepted: 03/28/2012] [Indexed: 10/28/2022]
Abstract
The substrate for the proteins that express genetic information in the cell is not naked DNA but an assembly of nucleosomes, where the DNA is wrapped around histone proteins. The organization of these nucleosomes on genomic DNA is influenced by the DNA sequence. Here, we present a structure-based computational approach that translates sequence information into the energy required to bend DNA into a nucleosome-bound conformation. The calculations establish the relationship between DNA sequence and histone octamer binding affinity. In silico selection using this model identified several new DNA sequences, which were experimentally found to have histone octamer affinities comparable to the highest-affinity sequences known. The results provide insights into the molecular mechanism through which DNA sequence information encodes its organization. A quantitative appreciation of the thermodynamics of nucleosome positioning and rearrangement will be one of the key factors in understanding the regulation of transcription and in the design of new promoter architectures for the purposes of tuning gene expression dynamics.
7. Teif VB, Shkrabkou AV, Egorova VP, Krot VI. Nucleosomes in gene regulation: Theoretical approaches. Mol Biol 2012. [DOI: 10.1134/s002689331106015x] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/22/2022]
8. Computational analysis suggests a highly bendable, fragile structure for nucleosomal DNA. Gene 2011; 476:10-4. [DOI: 10.1016/j.gene.2011.02.004] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 10/18/2010] [Revised: 02/05/2011] [Accepted: 02/14/2011] [Indexed: 11/21/2022]
9. Nikolaou C, Althammer S, Beato M, Guigó R. Structural constraints revealed in consistent nucleosome positions in the genome of S. cerevisiae. Epigenetics Chromatin 2010; 3:20. [PMID: 21073701 PMCID: PMC2994855 DOI: 10.1186/1756-8935-3-20] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 06/29/2010] [Accepted: 11/12/2010] [Indexed: 11/15/2022] Open
Abstract
Background: Recent advances in the field of high-throughput genomics have rendered possible the performance of genome-scale studies to define the nucleosomal landscapes of eukaryote genomes. Such analyses are aimed towards providing a better understanding of the process of nucleosome positioning, for which several models have been suggested. Nevertheless, questions regarding the sequence constraints of nucleosomal DNA and how they may have been shaped through evolution remain open. In this paper, we analyze in detail different experimental nucleosome datasets with the aim of providing a hypothesis for the emergence of nucleosome-forming sequences.
Results: We compared the complete sets of nucleosome positions for the budding yeast (Saccharomyces cerevisiae) as defined in the output of two independent experiments with the use of two different experimental techniques. We found that < 10% of the experimentally defined nucleosome positions were consistently positioned in both datasets. This subset of well-positioned nucleosomes, when compared with the bulk, was shown to have particular properties at both sequence and structural levels. Consistently positioned nucleosomes were also shown to occur preferentially in pairs of dinucleosomes, and to be surprisingly less conserved compared with their adjacent nucleosome-free linkers.
Conclusion: Our findings may be combined into a hypothesis for the emergence of a weak nucleosome-positioning code. According to this hypothesis, consistent nucleosomes may be partly guided by nearby nucleosome-free regions through statistical positioning. Once established, a set of well-positioned consistent nucleosomes may impose secondary constraints that further shape the structure of the underlying DNA. We were able to capture these constraints through the application of a recently introduced structural property that is related to the symmetry of DNA curvature. Furthermore, we found that both consistently positioned nucleosomes and their adjacent nucleosome-free regions show an increased tendency towards conservation of this structural feature.
Affiliation(s)
- Christoforos Nikolaou
  - Bioinformatics and Genomics Group, Centre for Genomic Regulation (CRG), Biomedical Research Park of Barcelona (PRBB), Barcelona, 08003, Catalunya, Spain