1
Herbert A. ALU non-B-DNA conformations, flipons, binary codes and evolution. Royal Society Open Science 2020; 7:200222. [PMID: 32742689 PMCID: PMC7353975 DOI: 10.1098/rsos.200222] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 02/17/2020] [Accepted: 05/18/2020] [Indexed: 05/08/2023]
Abstract
ALUs contribute to genetic diversity by altering DNA's linear sequence through retrotransposition, recombination and repair. ALUs also have the potential to form alternative non-B-DNA conformations such as Z-DNA, triplexes and quadruplexes that alter the read-out of information from the genome. I suggest here these structures enable the rapid reprogramming of cellular pathways to offset DNA damage and regulate inflammation. The experimental data supporting this form of genetic encoding is presented. ALU sequence motifs that form non-B-DNA conformations under physiological conditions are called flipons. Flipons are binary switches. They are dissipative structures that trade energy for information. By efficiently targeting cellular machines to active genes, flipons expand the repertoire of RNAs compiled from a gene. Their action greatly increases the informational capacity of linearly encoded genomes. Flipons are programmable by epigenetic modification, synchronizing cellular events by altering both chromatin state and nucleosome phasing. Different classes of flipon exist. Z-flipons are based on Z-DNA and modify the transcripts compiled from a gene. T-flipons are based on triplexes and localize non-coding RNAs that direct the assembly of cellular machines. G-flipons are based on G-quadruplexes and sense DNA damage, then trigger the appropriate protective responses. Flipon conformation is dynamic, changing with context. When frozen in one state, flipons often cause disease. The propagation of flipons throughout the genome by ALU elements represents a novel evolutionary innovation that allows for rapid change. Each ALU insertion creates variability by extracting a different set of information from the neighbourhood in which it lands. By elaborating on already successful adaptations, the newly compiled transcripts work with the old to enhance survival. Systems that optimize flipon settings through learning can adapt faster than with other forms of evolution. 
They avoid the risk of relying on random and irreversible codon rewrites.
2
Li C, Luscombe NM. Nucleosome positioning stability is a modulator of germline mutation rate variation across the human genome. Nat Commun 2020; 11:1363. [PMID: 32170069 PMCID: PMC7070026 DOI: 10.1038/s41467-020-15185-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 10/18/2019] [Accepted: 02/23/2020] [Indexed: 02/08/2023]
Abstract
Nucleosome organization has been suggested to affect local mutation rates in the genome. However, the lack of de novo mutation and high-resolution nucleosome data has limited the investigation of this hypothesis. Additionally, analyses using indirect mutation rate measurements have yielded contradictory and potentially confounding results. Here, we combine data on >300,000 human de novo mutations with high-resolution nucleosome maps and find substantially elevated mutation rates around translationally stable (‘strong’) nucleosomes. We show that the mutational mechanisms affected by strong nucleosomes are low-fidelity replication, insufficient mismatch repair and increased double-strand breaks. Strong nucleosomes preferentially locate within young SINE/LINE transposons, suggesting that when subject to increased mutation rates, transposons are then more rapidly inactivated. Depletion of strong nucleosomes in older transposons suggests frequent positioning changes during evolution. The findings have important implications for human genetics and genome evolution. Nucleosome organization has been suggested to affect local mutation rates in the genome. Here, the authors analyse data on >300,000 human de novo mutations and high-resolution nucleosome maps and provide evidence that nucleosome positioning stability modulates germline mutation rate variation across the human genome.
Affiliation(s)
- Cai Li: The Francis Crick Institute, London, NW1 1AT, UK; School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, China.
- Nicholas M Luscombe: The Francis Crick Institute, London, NW1 1AT, UK; Okinawa Institute of Science & Technology Graduate University, Okinawa, 904-0495, Japan; UCL Genetics Institute, University College London, London, WC1E 6BT, UK.
3
Rozenberg JM, Taylor JM, Mack CP. RBPJ binds to consensus and methylated cis elements within phased nucleosomes and controls gene expression in human aortic smooth muscle cells in cooperation with SRF. Nucleic Acids Res 2019; 46:8232-8244. [PMID: 29931229 PMCID: PMC6144787 DOI: 10.1093/nar/gky562] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 10/10/2017] [Accepted: 06/07/2018] [Indexed: 11/15/2022]
Abstract
Given our previous demonstration that RBPJ binds a methylated repressor element and regulates smooth muscle cell (SMC)-specific gene expression, we used genome-wide approaches to identify RBPJ binding regions in human aortic SMC and to assess RBPJ's effects on chromatin structure and gene expression. RBPJ bound to consensus cis elements, but also to TCmGGGA sequences within Alu repeats that were less transcriptionally active as assessed by DNAse hypersensitivity, H3K9 acetylation, and Notch3 and RNA Pol II binding. Interestingly, RBPJ binding was frequently detected at the borders of open chromatin, and a large fraction of genes induced or repressed by RBPJ depletion were associated with this cluster of RBPJ binding sites. RBPJ binding dramatically co-localized with serum response factor (SRF) and RNA seq experiments in RBPJ- and SRF-depleted SMC demonstrated that these factors interact functionally to regulate the contraction and inflammatory gene programs that help define SMC phenotype. Finally, we showed that RBPJ bound preferentially to phased nucleosomes independent of active chromatin marks and to cis elements positioned at the beginning and middle of the nucleosome dyad. These novel findings add important insight into RBPJ's role in chromatin structure and gene expression in SMC.
Affiliation(s)
- Julian M Rozenberg: Department of Pathology, University of North Carolina, Chapel Hill, NC 27599, USA.
- Joan M Taylor: Department of Pathology, University of North Carolina, Chapel Hill, NC 27599, USA.
- Christopher P Mack: Department of Pathology, University of North Carolina, Chapel Hill, NC 27599, USA.
4
Enhancer histone-QTLs are enriched on autoimmune risk haplotypes and influence gene expression within chromatin networks. Nat Commun 2018; 9:2905. [PMID: 30046115 PMCID: PMC6060153 DOI: 10.1038/s41467-018-05328-9] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 01/03/2018] [Accepted: 07/02/2018] [Indexed: 01/23/2023]
Abstract
Genetic variants can confer risk to complex genetic diseases by modulating gene expression through changes to the epigenome. To assess the degree to which genetic variants influence epigenome activity, we integrate epigenetic and genotypic data from lupus patient lymphoblastoid cell lines to identify variants that induce allelic imbalance in the magnitude of histone post-translational modifications, referred to herein as histone quantitative trait loci (hQTLs). We demonstrate that enhancer hQTLs are enriched on autoimmune disease risk haplotypes and disproportionately influence gene expression variability compared with non-hQTL variants in strong linkage disequilibrium. We show that the epigenome regulates HLA class II genes differently in individuals who carry HLA-DR3 or HLA-DR15 haplotypes, resulting in differential 3D chromatin conformation and gene expression. Finally, we identify significant expression QTL (eQTL) x hQTL interactions that reveal substructure within eQTL gene expression, suggesting potential implications for functional genomic studies that leverage eQTL data for subject selection and stratification. Disease risk variants can exert their influence on phenotypes by altering epigenome function. Here, Pelikan et al. show that variants inducing allelic imbalance in histone marks in lymphoblastoid cell lines from lupus patients are enriched in autoimmune disease haplotypes and influence gene expression.
5
Collings CK, Anderson JN. Links between DNA methylation and nucleosome occupancy in the human genome. Epigenetics Chromatin 2017; 10:18. [PMID: 28413449 PMCID: PMC5387343 DOI: 10.1186/s13072-017-0125-5] [Citation(s) in RCA: 44] [Impact Index Per Article: 5.5] [Received: 02/15/2017] [Accepted: 04/03/2017] [Indexed: 12/20/2022]
Abstract
Background DNA methylation is an epigenetic modification that is enriched in heterochromatin but depleted at active promoters and enhancers. However, the debate on whether or not DNA methylation is a reliable indicator of high nucleosome occupancy has not been settled. For example, the methylation levels of DNA flanking CTCF sites are higher in linker DNA than in nucleosomal DNA, while other studies have shown that the nucleosome core is the preferred site of methylation. In this study, we make progress toward understanding these conflicting phenomena by implementing a bioinformatics approach that combines MNase-seq and NOMe-seq data and by comprehensively profiling DNA methylation and nucleosome occupancy throughout the human genome. Results The results demonstrated that increasing methylated CpG density is correlated with nucleosome occupancy in the total genome and within nearly all subgenomic regions. Features with elevated methylated CpG density such as exons, SINE-Alu sequences, H3K36-trimethylated peaks, and methylated CpG islands are among the highest nucleosome occupied elements in the genome, while some of the lowest occupancies are displayed by unmethylated CpG islands and unmethylated transcription factor binding sites. Additionally, outside of CpG islands, the density of CpGs within nucleosomes was shown to be important for the nucleosomal location of DNA methylation with low CpG frequencies favoring linker methylation and high CpG frequencies favoring core particle methylation. Prominent exceptions to the correlations between methylated CpG density and nucleosome occupancy include CpG islands marked by H3K27me3 and CpG-poor heterochromatin marked by H3K9me3, and these modifications, along with DNA methylation, distinguish the major silencing mechanisms of the human epigenome. 
Conclusions Thus, the relationship between DNA methylation and nucleosome occupancy is influenced by the density of methylated CpG dinucleotides and by other epigenomic components in chromatin.
Affiliation(s)
- Clayton K Collings: Department of Biochemistry and Molecular Genetics, Northwestern University Feinberg School of Medicine, 320 E. Superior Street, Chicago, IL 60611, USA.
- John N Anderson: Department of Biological Sciences, Purdue University, 915 W. State Street, West Lafayette, IN 47907, USA.
6
Rodríguez-Martínez M, Pinzón N, Ghommidh C, Beyne E, Seitz H, Cayrou C, Méchali M. The gastrula transition reorganizes replication-origin selection in Caenorhabditis elegans. Nat Struct Mol Biol 2017; 24:290-299. [PMID: 28112731 DOI: 10.1038/nsmb.3363] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 08/26/2016] [Accepted: 12/13/2016] [Indexed: 01/09/2023]
Abstract
Although some features underlying replication-origin activation in metazoan cells have been determined, little is known about their regulation during metazoan development. Using the nascent-strand purification method, here we identified replication origins throughout Caenorhabditis elegans embryonic development and found that the origin repertoire is thoroughly reorganized after gastrulation onset. During the pluripotent embryonic stages (pregastrula), potential cruciform structures and open chromatin are determining factors that establish replication origins. The observed enrichment of replication origins in transcription factor-binding sites and their presence in promoters of highly transcribed genes, particularly operons, suggest that transcriptional activity contributes to replication initiation before gastrulation. After the gastrula transition, when embryonic differentiation programs are set, new origins are selected at enhancers, close to CpG-island-like sequences, and at noncoding genes. Our findings suggest that origin selection coordinates replication initiation with transcriptional programs during metazoan development.
Affiliation(s)
- Charles Ghommidh: Agropolymer Engineering and Emerging Technologies, University of Montpellier, Montpellier, France.
- Hervé Seitz: Institute of Human Genetics, CNRS, Montpellier, France.
7
Jordà M, Díez-Villanueva A, Mallona I, Martín B, Lois S, Barrera V, Esteller M, Vavouri T, Peinado MA. The epigenetic landscape of Alu repeats delineates the structural and functional genomic architecture of colon cancer cells. Genome Res 2016; 27:118-132. [PMID: 27999094 PMCID: PMC5204336 DOI: 10.1101/gr.207522.116] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Received: 03/24/2016] [Accepted: 11/10/2016] [Indexed: 12/16/2022]
Abstract
Cancer cells exhibit multiple epigenetic changes with prominent local DNA hypermethylation and widespread hypomethylation affecting large chromosomal domains. Epigenome studies often disregard the study of repeat elements owing to technical complexity and their undefined role in genome regulation. We have developed NSUMA (Next-generation Sequencing of UnMethylated Alu), a cost-effective approach allowing the unambiguous interrogation of DNA methylation in more than 130,000 individual Alu elements, the most abundant retrotransposon in the human genome. DNA methylation profiles of Alu repeats have been analyzed in colon cancers and normal tissues using NSUMA and whole-genome bisulfite sequencing. Normal cells show a low proportion of unmethylated Alu (1%–4%) that may increase up to 10-fold in cancer cells. In normal cells, unmethylated Alu elements tend to locate in the vicinity of functionally rich regions and display epigenetic features consistent with a direct impact on genome regulation. In cancer cells, Alu repeats are more resistant to hypomethylation than other retroelements. Genome segmentation based on high/low rates of Alu hypomethylation allows the identification of genomic compartments with differential genetic, epigenetic, and transcriptomic features. Alu hypomethylated regions show low transcriptional activity, late DNA replication, and its extent is associated with higher chromosomal instability. Our analysis demonstrates that Alu retroelements contribute to define the epigenetic landscape of normal and cancer cells and provides a unique resource on the epigenetic dynamics of a principal, but largely unexplored, component of the primate genome.
Affiliation(s)
- Mireia Jordà: Germans Trias i Pujol Health Science Research Institute (IGTP), Badalona 08916, Catalonia, Spain; Institute of Predictive and Personalized Medicine of Cancer (IMPPC), Badalona 08916, Catalonia, Spain.
- Anna Díez-Villanueva: Germans Trias i Pujol Health Science Research Institute (IGTP), Badalona 08916, Catalonia, Spain; Institute of Predictive and Personalized Medicine of Cancer (IMPPC), Badalona 08916, Catalonia, Spain.
- Izaskun Mallona: Germans Trias i Pujol Health Science Research Institute (IGTP), Badalona 08916, Catalonia, Spain; Institute of Predictive and Personalized Medicine of Cancer (IMPPC), Badalona 08916, Catalonia, Spain.
- Berta Martín: Germans Trias i Pujol Health Science Research Institute (IGTP), Badalona 08916, Catalonia, Spain; Institute of Predictive and Personalized Medicine of Cancer (IMPPC), Badalona 08916, Catalonia, Spain.
- Sergi Lois: Germans Trias i Pujol Health Science Research Institute (IGTP), Badalona 08916, Catalonia, Spain; Institute of Predictive and Personalized Medicine of Cancer (IMPPC), Badalona 08916, Catalonia, Spain.
- Víctor Barrera: Germans Trias i Pujol Health Science Research Institute (IGTP), Badalona 08916, Catalonia, Spain; Institute of Predictive and Personalized Medicine of Cancer (IMPPC), Badalona 08916, Catalonia, Spain.
- Manel Esteller: Cancer Epigenetics and Biology Program (PEBC), Bellvitge Biomedical Research Institute (IDIBELL), Barcelona 08908, Catalonia, Spain; Department of Physiological Sciences II, School of Medicine, University of Barcelona, Barcelona 08907, Catalonia, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona 08010, Catalonia, Spain.
- Tanya Vavouri: Germans Trias i Pujol Health Science Research Institute (IGTP), Badalona 08916, Catalonia, Spain; Josep Carreras Leukaemia Research Institute (IJC), Badalona 08916, Catalonia, Spain.
- Miguel A Peinado: Germans Trias i Pujol Health Science Research Institute (IGTP), Badalona 08916, Catalonia, Spain; Institute of Predictive and Personalized Medicine of Cancer (IMPPC), Badalona 08916, Catalonia, Spain.
8
Wu X, Liu H, Liu H, Su J, Lv J, Cui Y, Wang F, Zhang Y. Z curve theory-based analysis of the dynamic nature of nucleosome positioning in Saccharomyces cerevisiae. Gene 2013; 530:8-18. [DOI: 10.1016/j.gene.2013.08.018] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 03/28/2013] [Revised: 07/30/2013] [Accepted: 08/03/2013] [Indexed: 01/01/2023]
9
Periodic distribution of a putative nucleosome positioning motif in human, nonhuman primates, and archaea: mutual information analysis. Int J Genomics 2013; 2013:963956. [PMID: 23841049 PMCID: PMC3691935 DOI: 10.1155/2013/963956] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 02/13/2013] [Accepted: 04/29/2013] [Indexed: 12/12/2022]
Abstract
Recently, Trifonov's group proposed a 10-mer DNA motif YYYYYRRRRR as a solution of the long-standing problem of sequence-based nucleosome positioning. To test whether this generic decamer represents a biological meaningful signal, we compare the distribution of this motif in primates and Archaea, which are known to contain nucleosomes, and in Eubacteria, which do not possess nucleosomes. The distribution of the motif is analyzed by the mutual information function (MIF) with a shifted version of itself (MIF profile). We found common features in the patterns of this generic decamer on MIF profiles among primate species, and interestingly we found conspicuous but dissimilar MIF profiles for each Archaea tested. The overall MIF profiles for each chromosome in each primate species also follow a similar pattern. Trifonov's generic decamer may be a highly conserved motif for the nucleosome positioning, but we argue that this is not the only motif. The distribution of this generic decamer exhibits previously unidentified periodicities, which are associated to highly repetitive sequences in the genome. Alu repetitive elements contribute to the most fundamental structure of nucleosome positioning in higher Eukaryotes. In some regions of primate chromosomes, the distribution of the decamer shows symmetrical patterns including inverted repeats.
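The mutual information analysis described in this abstract can be illustrated with a minimal sketch. It assumes the motif is scored as a binary indicator (1 where five pyrimidines are followed by five purines) and that the MIF at shift k is the mutual information between that indicator and a copy of itself shifted by k positions. The function names `motif_indicator` and `mutual_information` are hypothetical and are not taken from the paper, whose exact estimator may differ.

```python
import math
from collections import Counter

PYRIMIDINES = set("CT")  # Y in IUPAC notation
PURINES = set("AG")      # R in IUPAC notation


def motif_indicator(seq, half=5):
    """Return 1 at each start position where `half` pyrimidines are
    followed by `half` purines (YYYYYRRRRR for half=5), else 0."""
    seq = seq.upper()
    width = 2 * half
    out = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        hit = (all(c in PYRIMIDINES for c in window[:half])
               and all(c in PURINES for c in window[half:]))
        out.append(1 if hit else 0)
    return out


def mutual_information(x, k):
    """Mutual information in bits between x[i] and x[i + k], for k >= 1."""
    if k < 1:
        raise ValueError("shift k must be at least 1")
    pairs = list(zip(x[:-k], x[k:]))
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        # p_ab / (p_a * p_b) simplifies to count * n / (left[a] * right[b])
        mi += (count / n) * math.log2(count * n / (left[a] * right[b]))
    return mi


# Toy sequence (an assumption, not real data): the motif recurs every 20 bases.
toy = ("CCCCCAAAAA" + "A" * 10) * 50
x = motif_indicator(toy)
profile = {k: mutual_information(x, k) for k in range(1, 41)}
```

On a sequence with a strictly periodic motif, the profile peaks at multiples of the period, which is the kind of signature the abstract attributes to repetitive elements.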
10
Trifonov EN. Nucleosome Positioning by Sequence, State of the Art and Apparent Finale. J Biomol Struct Dyn 2012; 27:741-6. [DOI: 10.1080/073911010010524944] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Indexed: 10/24/2022]
11
Thirty years of multiple sequence codes. Genomics Proteomics & Bioinformatics 2011; 9:1-6. [PMID: 21641556 PMCID: PMC5054146 DOI: 10.1016/s1672-0229(11)60001-6] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 09/15/2010] [Accepted: 12/09/2010] [Indexed: 11/23/2022]
Abstract
An overview is presented on the status of studies on multiple codes in genetic sequences. Indirectly, the existence of multiple codes is recognized in the form of several rediscoveries of Second Genetic Code that is different each time. A due credit is given to earlier seminal work related to the codes often neglected in literature. The latest developments in the field of chromatin code are discussed, as well as perspectives of single-base resolution studies of nucleosome positioning, including rotational setting of DNA on the surface of the histone octamers.
12
Bettecken T, Frenkel ZM, Trifonov EN. Human nucleosomes: special role of CG dinucleotides and Alu-nucleosomes. BMC Genomics 2011; 12:273. [PMID: 21627783 PMCID: PMC3117857 DOI: 10.1186/1471-2164-12-273] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Received: 05/28/2010] [Accepted: 05/31/2011] [Indexed: 11/28/2022]
Abstract
Background The periodical occurrence of dinucleotides with a period of 10.4 bases now is undeniably a hallmark of nucleosome positioning. Whereas many eukaryotic genomes contain visible and even strong signals for periodic distribution of dinucleotides, the human genome is rather featureless in this respect. The exact sequence features in the human genome that govern the nucleosome positioning remain largely unknown. Results When analyzing the human genome sequence with the positional autocorrelation method, we found that only the dinucleotide CG shows the 10.4 base periodicity, which is indicative of the presence of nucleosomes. There is a high occurrence of CG dinucleotides that are either 31 (10.4 × 3) or 62 (10.4 × 6) base pairs apart from one another - a sequence bias known to be characteristic of Alu-sequences. In a similar analysis with repetitive sequences removed, peaks of repeating CG motifs can be seen at positions 10, 21 and 31, the nearest integers of multiples of 10.4. Conclusions Although the CG dinucleotides are dominant, other elements of the standard nucleosome positioning pattern are present in the human genome as well. The positional autocorrelation analysis of the human genome demonstrates that the CG dinucleotide is, indeed, one visible element of the human nucleosome positioning pattern, which appears both in Alu sequences and in sequences without repeats. The dominant role that CG dinucleotides play in organizing human chromatin is to indicate the involvement of human nucleosomes in tuning the regulation of gene expression and chromatin structure, which is very likely due to cytosine-methylation/-demethylation in CG dinucleotides contained in the human nucleosomes. This is further confirmed by the positions of CG-periodical nucleosomes on Alu sequences. Alu repeats appear as monomers, dimers and trimers, harboring two to six nucleosomes in a run. 
Considering the exceptional role CG dinucleotides play in the nucleosome positioning, we hypothesize that Alu-nucleosomes, especially, those that form tightly positioned runs, could serve as "anchors" in organizing the chromatin in human cells.
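The positional autocorrelation idea in this abstract can be sketched as a histogram of distances between CG occurrences, together with the arithmetic behind the quoted peak positions. This is a simplified illustration, not the authors' program: the function names are hypothetical, and the histogram counts raw co-occurrences without the normalization a real analysis would need.

```python
from collections import Counter


def dinucleotide_positions(seq, dinuc="CG"):
    """Start positions of every occurrence of `dinuc` in `seq`."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc]


def distance_histogram(positions, max_dist=70):
    """Count pairs of occurrences separated by each distance 1..max_dist."""
    present = set(positions)
    counts = Counter()
    for p in positions:
        for d in range(1, max_dist + 1):
            if p + d in present:
                counts[d] += 1
    return counts


def nearest_multiples(period=10.4, n=6):
    """Nearest integers to the first n multiples of the helical period."""
    return [round(period * k) for k in range(1, n + 1)]


# Toy example (assumed sequence): two CG dinucleotides 31 bases apart.
toy = "CG" + "A" * 29 + "CG"
hist = distance_histogram(dinucleotide_positions(toy))
expected_peaks = nearest_multiples()  # [10, 21, 31, 42, 52, 62]
```

The multiples 10.4 × 3 = 31.2 and 10.4 × 6 = 62.4 round to the 31 and 62 base-pair spacings named in the abstract, and the first three multiples round to the peaks at 10, 21 and 31.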
Affiliation(s)
- Thomas Bettecken: CAGT-Center for Applied Genotyping, Max Planck Institute of Psychiatry, Kraepelinstr. 2-10, D-80804 Munich, Germany.
13
Babbitt GA, Cotter CR. Functional conservation of nucleosome formation selectively biases presumably neutral molecular variation in yeast genomes. Genome Biol Evol 2010; 3:15-22. [PMID: 21135411 PMCID: PMC3014273 DOI: 10.1093/gbe/evq081] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Indexed: 01/03/2023]
Abstract
One prominent pattern of mutational frequency, long appreciated in comparative genomics, is the bias of purine/pyrimidine conserving substitutions (transitions) over purine/pyrimidine altering substitutions (transversions). Traditionally, this transitional bias has been thought to be driven by the underlying rates of DNA mutation and/or repair. However, recent sequencing studies of mutation accumulation lines in model organisms demonstrate that substitutions generally do not accumulate at rates that would indicate a transitional bias. These observations have called into question a very basic assumption of molecular evolution; that naturally occurring patterns of molecular variation in noncoding regions accurately reflect the underlying processes of randomly accumulating neutral mutation in nuclear genomes. Here, in Saccharomyces yeasts, we report a very strong inverse association (r = −0.951, P < 0.004) between the genome-wide frequency of substitutions and their average energetic effect on nucleosome formation, as predicted by a structurally based energy model of DNA deformation around the nucleosome core. We find that transitions occurring at sites positioned nearest the nucleosome surface, which are believed to function most importantly in nucleosome formation, alter the deformation energy of DNA to the nucleosome core by only a fraction of the energy changes typical of most transversions. When we examined the same substitutions set against random background sequences as well as an existing study reporting substitutions arising in mutation accumulation lines of Saccharomyces cerevisiae, we failed to find a similar relationship. These results support the idea that natural selection acting to functionally conserve chromatin organization may contribute significantly to genome-wide transitional bias, even in noncoding regions. 
Because nucleosome core structure is highly conserved across eukaryotes, our observations may also help to further explain locally elevated transition bias at CpG islands, which are known to destabilize nucleosomes at vertebrate promoters.
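The transition/transversion distinction that this abstract relies on can be made concrete with a short sketch. It only classifies single-base substitutions and computes their ratio; the nucleosome deformation-energy model from the paper is not reproduced. The function names and the example substitution list are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_substitution(ref, alt):
    """Return 'transition' if ref and alt are both purines or both
    pyrimidines, otherwise 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def ts_tv_ratio(substitutions):
    """Ratio of transitions to transversions in a list of (ref, alt) pairs."""
    kinds = [classify_substitution(r, a) for r, a in substitutions]
    ts = kinds.count("transition")
    tv = kinds.count("transversion")
    return float("inf") if tv == 0 else ts / tv


# Illustrative input only; these pairs are not data from the study.
example = [("A", "G"), ("C", "T"), ("A", "T"), ("G", "C")]
ratio = ts_tv_ratio(example)
```

Of the twelve possible single-base changes, four are transitions and eight are transversions, so a ratio near 0.5 is what uniform substitution would give; values above that indicate the transitional bias the abstract discusses.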
14
Arya G, Maitra A, Grigoryev SA. A structural perspective on the where, how, why, and what of nucleosome positioning. J Biomol Struct Dyn 2010; 27:803-20. [PMID: 20232935 DOI: 10.1080/07391102.2010.10508585] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Indexed: 12/14/2022]
Abstract
The DNA in eukaryotic chromatin is packed by histones into arrays of repeating units called nucleosomes. Each nucleosome contains a nucleosome core, where the DNA is wrapped around a histone octamer, and a stretch of relatively unconstrained DNA called the linker DNA. Since nucleosome cores occlude the DNA from many DNA-binding factors, their positions provide important clues for understanding chromatin packing and gene regulation. Here we review the recent advances in the genome-wide mapping of nucleosome positions, the molecular and structural determinants of nucleosome positioning, and the importance of nucleosome positioning in chromatin higher order folding and transcriptional regulation.
Affiliation(s)
- Gaurav Arya: Department of NanoEngineering, University of California at San Diego, MC 0448, 9500 Gilman Drive, La Jolla, CA 92093, USA.
15
Gabdank I, Barash D, Trifonov EN. Single-base resolution nucleosome mapping on DNA sequences. J Biomol Struct Dyn 2010; 28:107-22. [PMID: 20476799 DOI: 10.1080/07391102.2010.10507347] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.7] [Indexed: 10/28/2022]
Abstract
Nucleosome DNA bendability pattern extracted from large nucleosome DNA database of C. elegans is used for construction of full length (116 dinucleotide positions) nucleosome DNA bendability matrix. The matrix can be used for sequence-directed mapping of the nucleosomes on the sequences. Several alternative positions for a given nucleosome are typically predicted, separated by multiples of nucleosome DNA period. The corresponding computer program is successfully tested on best known experimental examples of accurately positioned nucleosomes. The uncertainty of the computational mapping is +/-1 base. The procedure is placed on publicly accessible server and can be applied to any DNA sequence of interest.
Affiliation(s)
- I Gabdank: Department of Computer Science, Ben Gurion University of the Negev, P.O.B 653 Be'er Sheva 84105, Israel.
16
Goldmann R, Tichý L, Freiberger T, Zapletalová P, Letocha O, Soska V, Fajkus J, Fajkusová L. Genomic characterization of large rearrangements of the LDLR gene in Czech patients with familial hypercholesterolemia. BMC Medical Genetics 2010; 11:115. [PMID: 20663204 PMCID: PMC2923121 DOI: 10.1186/1471-2350-11-115] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.9] [Received: 01/12/2010] [Accepted: 07/27/2010] [Indexed: 02/02/2023]
Abstract
Background Mutations in the LDLR gene are the most frequent cause of Familial hypercholesterolemia, an autosomal dominant disease characterised by elevated concentrations of LDL in blood plasma. In many populations, large genomic rearrangements account for approximately 10% of mutations in the LDLR gene. Methods DNA diagnostics of large genomic rearrangements was based on Multiple Ligation dependent Probe Amplification (MLPA). Subsequent analyses of deletion and duplication breakpoints were performed using long-range PCR, PCR, and DNA sequencing. Results In set of 1441 unrelated FH patients, large genomic rearrangements were found in 37 probands. Eight different types of rearrangements were detected, from them 6 types were novel, not described so far. In all rearrangements, we characterized their exact extent and breakpoint sequences. Conclusions Sequence analysis of deletion and duplication breakpoints indicates that intrachromatid non-allelic homologous recombination (NAHR) between Alu elements is involved in 6 events, while a non-homologous end joining (NHEJ) is implicated in 2 rearrangements. Our study thus describes for the first time NHEJ as a mechanism involved in genomic rearrangements in the LDLR gene.
Affiliation(s)
- Radan Goldmann: University Hospital Brno, Centre of Molecular Biology and Gene Therapy, Cernopolní 9, CZ-62500 Brno, Czech Republic.
17
|
Learning a weighted sequence model of the nucleosome core and linker yields more accurate predictions in Saccharomyces cerevisiae and Homo sapiens. PLoS Comput Biol 2010; 6:e1000834. [PMID: 20628623 PMCID: PMC2900294 DOI: 10.1371/journal.pcbi.1000834] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Received: 09/07/2009] [Accepted: 05/25/2010] [Indexed: 11/28/2022]
Abstract
DNA in eukaryotes is packaged into a chromatin complex, the most basic element of which is the nucleosome. The precise positioning of the nucleosome cores allows for selective access to the DNA, and the mechanisms that control this positioning are important pieces of the gene expression puzzle. We describe a large-scale nucleosome pattern that jointly characterizes the nucleosome core and the adjacent linkers and is predominantly characterized by long-range oscillations in the mono-, di- and tri-nucleotide content of the DNA sequence, and we show that this pattern can be used to predict nucleosome positions in both Homo sapiens and Saccharomyces cerevisiae more accurately than previously published methods. Surprisingly, in both H. sapiens and S. cerevisiae, the most informative individual features are the mono-nucleotide patterns, although the inclusion of di- and tri-nucleotide features results in improved performance. Our approach combines a much longer pattern than has been previously used to predict nucleosome positioning from sequence (301 base pairs, centered at the position to be scored) with a novel discriminative classification approach that selectively weights the contributions from each of the input features. The resulting scores are relatively insensitive to local AT-content and can be used to accurately discriminate putative dyad positions from adjacent linker regions without requiring an additional dynamic programming step and without the attendant edge effects and assumptions about linker length modeling and overall nucleosome density. Our approach produces the best dyad-linker classification results published to date in H. sapiens, and outperforms two recently published models on a large set of S. cerevisiae nucleosome positions. Our results suggest that in both genomes, a comparable and relatively small fraction of nucleosomes are well-positioned and that these positions are predictable based on sequence alone. We believe that the bulk of the remaining nucleosomes follow a statistical positioning model.
Author Summary
DNA in eukaryotes is packaged into a chromatin complex, the most basic element of which is the nucleosome. The precise positioning of the nucleosome cores allows for selective access to the DNA, and the mechanisms that control this positioning are important pieces of the gene expression puzzle. In this work, we describe a large-scale DNA sequence pattern that jointly characterizes the sequence preferences of the nucleosome core and the adjacent linkers. We show that this pattern can be used to predict nucleosome positions in both H. sapiens and S. cerevisiae more accurately than previously published methods. The model is most accurate in predicting the most stably positioned nucleosomes, and describes a sequence composition pattern that determines a locally optimal dyad (nucleosomal DNA mid-point) position. In contrast to some previous models, this model is not based primarily on excluding poly-A/T sequences, nor does the model prefer 10 bp periodicity. Our results suggest that local sequence composition is one of many factors that direct the positioning of nucleosomes, while dynamic processes such as transcriptional elongation and the actions of chromatin remodeling complexes also play a significant role in the overall chromatin landscape.
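The window-based scoring described in this abstract can be pictured with a minimal sketch, assuming a purely linear model over mono-nucleotide features only. The function name `window_score`, the `weights` layout (one dictionary of per-base weights for each offset in the window) and the toy values are hypothetical and chosen for illustration; the published model also uses di- and tri-nucleotide features and a discriminative training procedure that is not reproduced here.

```python
def window_score(seq, center, weights, half_width=150):
    """Score a candidate dyad position with a weighted window.

    `weights` is a list of 2 * half_width + 1 dictionaries, one per offset
    in the window, mapping a base to its (hypothetical) learned weight.
    With half_width=150 the window spans 301 bp centred on `center`.
    Returns None when the window would run past either end of `seq`.
    """
    start = center - half_width
    end = center + half_width + 1
    if start < 0 or end > len(seq):
        return None
    window = seq[start:end].upper()
    # Sum the weight of the observed base at every offset; bases with no
    # weight at that offset contribute nothing.
    return sum(weights[offset].get(base, 0.0)
               for offset, base in enumerate(window))


# Toy usage with a 3 bp window (half_width=1) and made-up weights.
toy_weights = [{"A": 1.0}, {"C": 2.0}, {"G": 0.5}]
toy_score = window_score("TACGT", 2, toy_weights, half_width=1)
```

Scoring every position of a chromosome this way and comparing dyad candidates with their neighbouring linker positions is the kind of use the abstract describes, without any dynamic programming step.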
|
18
|
Cui F, Zhurkin VB. Structure-based analysis of DNA sequence patterns guiding nucleosome positioning in vitro. J Biomol Struct Dyn 2010; 27:821-41. [PMID: 20232936 PMCID: PMC2993692 DOI: 10.1080/073911010010524947] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.0] [Indexed: 10/24/2022]
Abstract
Recent studies of genome-wide nucleosomal organization suggest that the DNA sequence is one of the major determinants of nucleosome positioning. Although the search for underlying patterns encoded in nucleosomal DNA has been going on for about 30 years, our knowledge of these patterns still remains limited. Based on our evaluations of DNA deformation energy, we developed new scoring functions to predict nucleosome positioning. There are three principal differences between our approach and earlier studies: (i) we assume that the length of nucleosomal DNA varies from 146 to 147 bp; (ii) we consider the anisotropic flexibility of pyrimidine-purine (YR) dimeric steps in the context of their neighbors (e.g., YYRR versus RYRY); (iii) we postulate that alternating AT-rich and GC-rich motifs reflect sequence-dependent interactions between histone arginines and DNA in the minor groove. Using these functions, we analyzed 20 nucleosome positions mapped in vitro at single nucleotide resolution (including clones 601, 603, 605, the pGUB plasmid, chicken beta-globin and three 5S rDNA genes). We predicted 15 of the 20 positions with 1-bp precision, and two positions with 2-bp precision. The predicted position of the '601' nucleosome (i.e., the optimum of the computed score) deviates from the experimentally determined unique position by no more than 1 bp, an accuracy exceeding that of earlier predictions. Our analysis reveals a clear heterogeneity of the nucleosomal sequences, which can be divided into two groups based on the positioning 'rules' they follow. The sequences of one group are enriched in highly deformable YR/YYRR motifs at the minor-groove bending sites SHL ±3.5 and ±5.5, which is similar to the alpha-satellite sequence used in most crystallized nucleosomes. Apparently, the positioning of these nucleosomes is determined by the interactions between histones H2A/H2B and the terminal parts of nucleosomal DNA. In the other group (that includes the '601' clone) the same YR/YYRR motifs occur predominantly at the sites SHL ±1.5. The interaction between the H3/H4 tetramer and the central part of the nucleosomal DNA is likely to be responsible for the positioning of nucleosomes of this group, and the DNA trajectory in these nucleosomes may differ in detail from the published structures. Thus, from the stereochemical perspective, the in vitro nucleosomes studied here follow either an X-ray-like pattern (with strong deformations in the terminal parts of nucleosomal DNA), or an alternative pattern (with the deformations occurring predominantly in the central part of the nucleosomal DNA). The results presented here may be useful for genome-wide classification of nucleosomes, linking together structural and thermodynamic characteristics of nucleosomes with the underlying DNA sequence patterns guiding their positions.
Affiliation(s)
- Feng Cui
- Laboratory of Cell Biology, National Cancer Institute, NIH, Bethesda, MD 20892, USA
- Victor B. Zhurkin
- Laboratory of Cell Biology, National Cancer Institute, NIH, Bethesda, MD 20892, USA
|
19
|
Il'icheva IA, Vlasov PK, Esipova NG, Tumanyan VG. The Intramolecular Impact to the Sequence Specificity of B→A Transition: Low Energy Conformational Variations in AA/TT and GG/CC Steps. J Biomol Struct Dyn 2010; 27:677-693. [DOI: 10.1080/07391102.2010.10508581] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 10/28/2022]
|
20
|
Karambataki M, Malousi A, Maglaveras N, Kouidou S. Synonymous polymorphisms at splicing regulatory sites are associated with CpGs in neurodegenerative disease-related genes. Neuromolecular Med 2010; 12:260-9. [PMID: 20077034 DOI: 10.1007/s12017-009-8111-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 10/06/2009] [Accepted: 12/17/2009] [Indexed: 01/10/2023]
Abstract
Neuronal plasticity is associated with alternative splicing and epigenetic modulation. Recent evidence reveals the association of cytosine methylation with alternative splicing and splicing regulatory mechanisms. Single nucleotide polymorphisms (SNPs) are generally less frequent in conserved coding regions, and probably in splice sites, than in non-coding regions. CpG polymorphisms in coding regions and splice sites and their association with splicing regulatory elements have not previously been investigated. We analyzed the CpG variability in 28 genes (361 constitutive and 105 alternative exons and the corresponding splice sites) associated with neurodegenerative diseases (ND). CpG polymorphisms in the splice sites of these genes are particularly frequent when compared to those at AG sequences. Moreover, in both constitutive and alternative exons, polymorphisms in CpGs are more frequent than in AG and GT sequences. In contrast, C/T conservation is prominent in the polypyrimidine acceptor sequence, indicating that the sequence of cytosines and thymines is preserved at this locus. Bioinformatic analysis of the splicing-associated regulatory elements in these exons and splice sites reveals that 18 out of a total of 39 SNPs which could strongly affect splicing (>1.5 score difference) contain CpG sequences. Cytosines are considerably more frequent and variable than expected at the position preceding the GT splice donors, while sites of epigenetic modification are absent from acceptors. The high CpG frequency in polymorphic splicing-associated sites implicates epigenetic mechanisms in the splicing selection decisions regulated by these sites, and indicates the complexity of genetic studies involving these potentially critical polymorphisms in ND.
Affiliation(s)
- Maria Karambataki
- Laboratory of Biological Chemistry, School of Medicine, Aristotle University of Thessaloniki, Thessaloniki, Greece.
|
21
|
Bettecken T, Trifonov EN. Repertoires of the nucleosome-positioning dinucleotides. PLoS One 2009; 4:e7654. [PMID: 19888331 PMCID: PMC2765632 DOI: 10.1371/journal.pone.0007654] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.8] [Received: 06/01/2009] [Accepted: 10/02/2009] [Indexed: 11/18/2022]
Abstract
It is generally accepted that the organization of eukaryotic DNA into chromatin is strongly governed by a code inherent in the genomic DNA sequence. This code, as well as other codes, is superposed on the triplets coding for amino acids. The history of the chromatin code started three decades ago with the discovery of the periodic appearance of certain dinucleotides, with AA/TT and RR/YY giving the strongest signals, all with a period of 10.4 bases. Every base-pair stack in the DNA duplex has specific deformation properties, thus favoring DNA bending in a specific direction. The appearance of the corresponding dinucleotide at a distance of 10.4 × n bases will facilitate DNA bending in that direction, which corresponds to the minimum energy of DNA folding in the nucleosome. We have analyzed the periodic appearances of all 16 dinucleotides in the genomes of thirteen different eukaryotic organisms. Our data show that a large variety of dinucleotides (if not all) apparently contribute to the nucleosome positioning code. The choice of periodic dinucleotides differs considerably from one organism to another. Among other 10.4 base periodicities, a strong and very regular 10.4 base signal was observed for CG dinucleotides in the genome of the honey bee A. mellifera. Also, the dinucleotide CG appears as the only periodic component in the human genome. This observation seems especially relevant since CpG methylation is well known to modulate chromatin packing and regularity. Thus, the selection of the dinucleotides contributing to the chromatin code is species specific, and may differ from region to region, depending on the sequence context.
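As a rough illustration of the periodicity analysis described in this abstract, the sketch below counts how often a chosen dinucleotide recurs at each distance along a sequence; peaks near multiples of about 10.4 bases would be the kind of signal the abstract reports. The function name `dinucleotide_distance_counts`, the `max_dist` parameter and the toy sequence are assumptions made for illustration, not the authors' code, and a real analysis would also need normalization against background composition.

```python
from collections import Counter


def dinucleotide_distance_counts(seq, dinuc, max_dist=40):
    """Count pairs of `dinuc` occurrences separated by each distance.

    Returns a list whose element d - 1 is the number of occurrence pairs
    lying exactly d bases apart, for d = 1 .. max_dist.
    """
    seq = seq.upper()
    dinuc = dinuc.upper()
    # Start positions of every (possibly overlapping) occurrence.
    positions = [i for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc]
    counts = Counter()
    for idx, first in enumerate(positions):
        for second in positions[idx + 1:]:
            distance = second - first
            if distance > max_dist:
                break  # positions are sorted, so later ones are farther
            counts[distance] += 1
    return [counts.get(d, 0) for d in range(1, max_dist + 1)]


# Toy usage: CG placed every 10 bases in an otherwise uniform sequence.
toy_sequence = ("CG" + "A" * 8) * 5
toy_counts = dinucleotide_distance_counts(toy_sequence, "CG", max_dist=40)
```

Because a 10.4-base period is not an integer, genomic data would show the signal spread over neighbouring integer distances (10 and 11, 20 and 21, and so on) rather than at a single value as in the toy sequence.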
Affiliation(s)
- Thomas Bettecken
- CAGT-Center for Applied Genotyping, Max Planck Institute of Psychiatry, Munich, Germany.
|
22
|
Tolstorukov MY, Kharchenko PV, Goldman JA, Kingston RE, Park PJ. Comparative analysis of H2A.Z nucleosome organization in the human and yeast genomes. Genome Res 2009; 19:967-77. [PMID: 19246569 DOI: 10.1101/gr.084830.108] [Citation(s) in RCA: 84] [Impact Index Per Article: 5.3] [Indexed: 12/22/2022]
Abstract
Eukaryotic DNA is wrapped around a histone protein core to constitute the fundamental repeating units of chromatin, the nucleosomes. The affinity of the histone core for DNA depends on the nucleotide sequence; however, it is unclear to what extent DNA sequence determines nucleosome positioning in vivo, and if the same rules of sequence-directed positioning apply to genomes of varying complexity. Using the data generated by high-throughput DNA sequencing combined with chromatin immunoprecipitation, we have identified positions of nucleosomes containing the H2A.Z histone variant and histone H3 trimethylated at lysine 4 in human CD4(+) T-cells. We find that the 10-bp periodicity observed in nucleosomal sequences in yeast and other organisms is not pronounced in human nucleosomal sequences. This result was confirmed for a broader set of mononucleosomal fragments that were not selected for any specific histone variant or modification. We also find that human H2A.Z nucleosomes protect only approximately 120 bp of DNA from MNase digestion and exhibit specific sequence preferences, suggesting a novel mechanism of nucleosome organization for the H2A.Z variant.
Affiliation(s)
- Michael Y Tolstorukov
- Center for Biomedical Informatics, Harvard Medical School, Boston, Massachusetts 02115, USA
|
23
|
Gabdank I, Barash D, Trifonov EN. Nucleosome DNA Bendability Matrix (C. elegans). J Biomol Struct Dyn 2009; 26:403-11. [DOI: 10.1080/07391102.2009.10507255] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.9] [Indexed: 10/28/2022]
|
24
|
Salih F, Salih B, Trifonov EN. Sequence Structure of Hidden 10.4-base Repeat in the Nucleosomes of C. elegans. J Biomol Struct Dyn 2008; 26:273-82. [DOI: 10.1080/07391102.2008.10531241] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.1] [Indexed: 10/28/2022]
|