1
Benham CJ. DNA superhelicity. Nucleic Acids Res 2024; 52:22-48. [PMID: 37994702 PMCID: PMC10783518 DOI: 10.1093/nar/gkad1092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2023] [Revised: 10/20/2023] [Accepted: 11/06/2023] [Indexed: 11/24/2023]
Abstract
Closing each strand of a DNA duplex upon itself fixes its linking number L. This topological condition couples together the secondary and tertiary structures of the resulting ccDNA topoisomer, a constraint that is not present in otherwise identical nicked or linear DNAs. Fixing L has a range of structural, energetic and functional consequences. Here we consider how L having different integer values (that is, different superhelicities) affects ccDNA molecules. The approaches used are primarily theoretical, and are developed from a historical perspective. In brief, processes that either relax or increase superhelicity, or repartition what is there, may either release or require free energy. The energies involved can be substantial, sufficient to influence many events, directly or indirectly. Here two examples are developed. The changes of unconstrained superhelicity that occur during nucleosome attachment and release are examined. And a simple theoretical model of superhelically driven DNA structural transitions is described that calculates equilibrium distributions for populations of identical topoisomers. This model is used to examine how these distributions change with superhelicity and other factors, and applied to analyze several situations of biological interest.
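The coupling between secondary and tertiary structure mentioned in this abstract is usually written with the standard linking-number decomposition. The block below is a brief sketch; the symbols Tw (twist), Wr (writhe), L_0 (relaxed linking number), sigma (superhelical density) and N (length in base pairs) are conventional notation assumed here, not taken from the abstract, and 10.5 bp per turn is the usual approximate helical repeat of relaxed B-DNA.

```latex
% Assumed standard notation; only L appears in the abstract itself.
L = Tw + Wr, \qquad
\Delta L = L - L_0, \qquad
\sigma = \frac{\Delta L}{L_0}, \qquad
L_0 \approx \frac{N}{10.5}
```

Because L is a fixed integer in a closed circular molecule, any change in twist (secondary structure) must be offset by an opposite change in writhe (tertiary structure), which is the constraint absent from nicked or linear DNA.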
Affiliation(s)
- Craig J Benham
- UC Davis Genome Center, University of California, One Shields Avenue, Davis, CA 95616, USA
2
Kouzine F, Wojtowicz D, Przytycka TM, Levens D. Detection of Z-DNA Structures in Supercoiled Genome. Methods Mol Biol 2023; 2651:179-193. [PMID: 36892768 PMCID: PMC10512777 DOI: 10.1007/978-1-0716-3084-6_13] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/10/2023]
Abstract
Z-DNAs are nucleic acid secondary structures that form within a special pattern of nucleotides and are promoted by DNA supercoiling. Through Z-DNA formation, DNA encodes information by dynamic changes in its secondary structure. A growing body of evidence indicates that Z-DNA formation can play a role in gene regulation; it can affect chromatin architecture and demonstrates its association with genomic instability, genetic diseases, and genome evolution. Many functional roles of Z-DNA are yet to be discovered highlighting the need for techniques to detect genome-wide folding of DNA into this structure. Here, we describe an approach to convert linear genome into supercoiled genome sponsoring Z-DNA formation. Applying permanganate-based methodology and high-throughput sequencing to supercoiled genome allows genome-wide detection of single-stranded DNA. Single-stranded DNA is characteristic of the junctions between the classical B-form of DNA and Z-DNA. Consequently, analysis of single-stranded DNA map provides snapshots of the Z-DNA conformation over the whole genome.
Affiliation(s)
- Fedor Kouzine
- Laboratory of Pathology, NCI/NIH, Bethesda, MD, USA.
- David Levens
- Laboratory of Pathology, NCI/NIH, Bethesda, MD, USA
3
Bowater RP, Bohálová N, Brázda V. Interaction of Proteins with Inverted Repeats and Cruciform Structures in Nucleic Acids. Int J Mol Sci 2022; 23:6171. [PMID: 35682854 PMCID: PMC9180970 DOI: 10.3390/ijms23116171] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/29/2022] [Revised: 05/26/2022] [Accepted: 05/30/2022] [Indexed: 01/27/2023]
Abstract
Cruciforms occur when inverted repeat sequences in double-stranded DNA adopt intra-strand hairpins on opposing strands. Biophysical and molecular studies of these structures confirm their characterization as four-way junctions and have demonstrated that several factors influence their stability, including overall chromatin structure and DNA supercoiling. Here, we review our understanding of processes that influence the formation and stability of cruciforms in genomes, covering the range of sequences shown to have biological significance. It is challenging to accurately sequence repetitive DNA sequences, but recent advances in sequencing methods have deepened understanding about the amounts of inverted repeats in genomes from all forms of life. We highlight that, in the majority of genomes, inverted repeats are present in higher numbers than is expected from a random occurrence. It is, therefore, becoming clear that inverted repeats play important roles in regulating many aspects of DNA metabolism, including replication, gene expression, and recombination. Cruciforms are targets for many architectural and regulatory proteins, including topoisomerases, p53, Rif1, and others. Notably, some of these proteins can induce the formation of cruciform structures when they bind to DNA. Inverted repeat sequences also influence the evolution of genomes, and growing evidence highlights their significance in several human diseases, suggesting that the inverted repeat sequences and/or DNA cruciforms could be useful therapeutic targets in some cases.
Affiliation(s)
- Richard P. Bowater
- School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich NR4 7TJ, UK
- Natália Bohálová
- Department of Biophysical Chemistry and Molecular Oncology, Institute of Biophysics of the Czech Academy of Sciences, 61265 Brno, Czech Republic
- Department of Experimental Biology, Faculty of Science, Masaryk University, Kamenice 5, 62500 Brno, Czech Republic
- Václav Brázda
- Department of Biophysical Chemistry and Molecular Oncology, Institute of Biophysics of the Czech Academy of Sciences, 61265 Brno, Czech Republic
- Correspondence:
4
Zhang Z, Zhou K, Tran D, Saier M. Insertion Sequence (IS) Element-Mediated Activating Mutations of the Cryptic Aromatic β-Glucoside Utilization (BglGFB) Operon Are Promoted by the Anti-Terminator Protein (BglG) in Escherichia coli. Int J Mol Sci 2022; 23:1505. [PMID: 35163427 PMCID: PMC8836124 DOI: 10.3390/ijms23031505] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/29/2021] [Revised: 01/25/2022] [Accepted: 01/27/2022] [Indexed: 01/24/2023]
Abstract
The cryptic β-glucoside GFB (bglGFB) operon in Escherichia coli (E. coli) can be activated by mutations arising under starvation conditions in the presence of an aromatic β-glucoside. This may involve the insertion of an insertion sequence (IS) element into a "stress-induced DNA duplex destabilization" (SIDD) region upstream of the operon promoter, although other types of mutations can also activate the bgl operon. Here, we show that increased expression of the bglG gene, encoding a well-characterized transcriptional antiterminator, dramatically increases the frequency of both IS-mediated and IS-independent Bgl+ mutations occurring on salicin- and arbutin-containing agar plates. Both mutation rates increased with increasing levels of bglG expression but IS-mediated mutations were more prevalent at lower BglG levels. Mutations depended on the presence of both BglG and an aromatic β-glucoside, and bglG expression did not influence IS insertion in other IS-activated operons tested. The N-terminal mRNA-binding domain of BglG was essential for mutational activation, and alteration of BglG's binding site in the mRNA nearly abolished Bgl+ mutant appearances. Increased bglG expression promoted residual bgl operon expression in parallel with the increases in mutation rates. Possible mechanisms are proposed explaining how BglG enhances the frequencies of bgl operon activating mutations.
5
Non-B DNA-Forming Motifs Promote Mfd-Dependent Stationary-Phase Mutagenesis in Bacillus subtilis. Microorganisms 2021; 9:1284. [PMID: 34204686 PMCID: PMC8231525 DOI: 10.3390/microorganisms9061284] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/12/2021] [Revised: 06/08/2021] [Accepted: 06/09/2021] [Indexed: 02/07/2023]
Abstract
Transcription-induced mutagenic mechanisms limit genetic changes to times when expression happens and to coding DNA. It has been hypothesized that intrinsic sequences that have the potential to form alternate DNA structures, such as non-B DNA structures, influence these mechanisms. Non-B DNA structures are promoted by transcription and induce genome instability in eukaryotic cells, but their impact in bacterial genomes is less known. Here, we investigated if G4 DNA- and hairpin-forming motifs influence stationary-phase mutagenesis in Bacillus subtilis. We developed a system to measure the influence of non-B DNA on B. subtilis stationary-phase mutagenesis by deleting the wild-type argF at its chromosomal position and introducing IPTG-inducible argF alleles differing in their ability to form hairpin and G4 DNA structures into an ectopic locus. Using this system, we found that sequences predicted to form non-B DNA structures promoted mutagenesis in B. subtilis stationary-phase cells; such a response did not occur in growing conditions. We also found that the transcription-coupled repair factor Mfd promoted mutagenesis at these predicted structures. In summary, we showed that non-B DNA-forming motifs promote genetic instability, particularly in coding regions in stressed cells; therefore, non-B DNA structures may have a spatial and temporal mutagenic effect in bacteria. This study provides insights into mechanisms that prevent or promote mutagenesis and advances our understanding of processes underlying bacterial evolution.
6
Miglietta G, Russo M, Capranico G. G-quadruplex-R-loop interactions and the mechanism of anticancer G-quadruplex binders. Nucleic Acids Res 2020; 48:11942-11957. [PMID: 33137181 PMCID: PMC7708042 DOI: 10.1093/nar/gkaa944] [Citation(s) in RCA: 75] [Impact Index Per Article: 18.8] [Received: 07/16/2020] [Revised: 10/05/2020] [Accepted: 10/08/2020] [Indexed: 12/17/2022]
Abstract
Genomic DNA and cellular RNAs can form a variety of non-B secondary structures, including G-quadruplex (G4) and R-loops. G4s are constituted by stacked guanine tetrads held together by Hoogsteen hydrogen bonds and can form at key regulatory sites of eukaryote genomes and transcripts, including gene promoters, untranslated exon regions and telomeres. R-loops are 3-stranded structures wherein the two strands of a DNA duplex are melted and one of them is annealed to an RNA. Specific G4 binders are intensively investigated to discover new effective anticancer drugs based on a common rationale, i.e.: the selective inhibition of oncogene expression or specific impairment of telomere maintenance. However, despite the high number of known G4 binders, such a selective molecular activity has not been fully established and several published data point to a different mode of action. We will review published data that address the close structural interplay between G4s and R-loops in vitro and in vivo, and how these interactions can have functional consequences in relation to G4 binder activity. We propose that R-loops can play a previously-underestimated role in G4 binder action, in relation to DNA damage induction, telomere maintenance, genome and epigenome instability and alterations of gene expression programs.
Affiliation(s)
- Giulia Miglietta
- Department of Pharmacy and Biotechnology, Alma Mater Studiorum University of Bologna, via Selmi 3, 40126 Bologna, Italy
- Marco Russo
- Department of Pharmacy and Biotechnology, Alma Mater Studiorum University of Bologna, via Selmi 3, 40126 Bologna, Italy
- Giovanni Capranico
- Department of Pharmacy and Biotechnology, Alma Mater Studiorum University of Bologna, via Selmi 3, 40126 Bologna, Italy
7
Chedin F, Benham CJ. Emerging roles for R-loop structures in the management of topological stress. J Biol Chem 2020; 295:4684-4695. [PMID: 32107311 DOI: 10.1074/jbc.rev119.006364] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Indexed: 01/14/2023]
Abstract
R-loop structures are a prevalent class of alternative non-B DNA structures that form during transcription upon invasion of the DNA template by the nascent RNA. R-loops form universally in the genomes of organisms ranging from bacteriophages, bacteria, and yeasts to plants and animals, including mammals. A growing body of work has linked these structures to both physiological and pathological processes, in particular to genome instability. The rising interest in R-loops is placing new emphasis on understanding the fundamental physicochemical forces driving their formation and stability. Pioneering work in Escherichia coli revealed that DNA topology, in particular negative DNA superhelicity, plays a key role in driving R-loops. A clear role for DNA sequence was later uncovered. Here, we review and synthesize available evidence on the roles of DNA sequence and DNA topology in controlling R-loop formation and stability. Factoring in recent developments in R-loop modeling and single-molecule profiling, we propose a coherent model accounting for the interplay between DNA sequence and DNA topology in driving R-loop structure formation. This model reveals R-loops in a new light as powerful and reversible topological stress relievers, an insight that significantly expands the repertoire of R-loops' potential biological roles under both normal and aberrant conditions.
Affiliation(s)
- Frederic Chedin
- Department of Molecular and Cellular Biology, University of California, Davis, California 95616; Genome Center, University of California, Davis, California 95616
- Craig J Benham
- Genome Center, University of California, Davis, California 95616; Departments of Mathematics and Biomedical Engineering, University of California, Davis, California 95616
8
The Rich World of p53 DNA Binding Targets: The Role of DNA Structure. Int J Mol Sci 2019; 20:5605. [PMID: 31717504 PMCID: PMC6888028 DOI: 10.3390/ijms20225605] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 10/11/2019] [Revised: 10/29/2019] [Accepted: 11/08/2019] [Indexed: 12/14/2022]
Abstract
The tumor suppressor functions of p53 and its roles in regulating the cell cycle, apoptosis, senescence, and metabolism are accomplished mainly by its interactions with DNA. p53 works as a transcription factor for a significant number of genes. Most p53 target genes contain so-called p53 response elements in their promoters, consisting of 20 bp long canonical consensus sequences. Compared to other transcription factors, which usually bind to one concrete and clearly defined DNA target, the p53 consensus sequence is not strict, but contains two repeats of a 5′RRRCWWGYYY3′ sequence; therefore it varies remarkably among target genes. Moreover, p53 binds also to DNA fragments that at least partially and often completely lack this consensus sequence. p53 also binds with high affinity to a variety of non-B DNA structures including Holliday junctions, cruciform structures, quadruplex DNA, triplex DNA, DNA loops, bulged DNA, and hemicatenane DNA. In this review, we summarize information of the interactions of p53 with various DNA targets and discuss the functional consequences of the rich world of p53 DNA binding targets for its complex regulatory functions.
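The degenerate consensus quoted in this abstract can be made concrete with a small scan. The following is a minimal sketch, assuming the standard IUPAC codes (R = A or G, W = A or T, Y = C or T) and two directly adjacent half-sites, which gives the 20 bp element described above. The function names and the `max_spacer` parameter are hypothetical and only illustrative; the abstract does not describe any spacer or scanning procedure.

```python
import re

# Standard IUPAC nucleotide codes needed for the half-site RRRCWWGYYY.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "W": "[AT]", "Y": "[CT]"}

HALF_SITE = "RRRCWWGYYY"  # half-site quoted in the abstract


def iupac_to_regex(motif):
    """Translate an IUPAC motif into a regular-expression fragment."""
    return "".join(IUPAC[ch] for ch in motif)


def find_p53_sites(seq, max_spacer=0):
    """Return (start, matched_text) for two half-sites in tandem.

    max_spacer is an illustrative assumption: 0 reproduces the 20 bp
    element from the abstract; larger values allow a gap between halves.
    """
    half = iupac_to_regex(HALF_SITE)
    # A lookahead lets overlapping candidate sites be reported as well.
    pattern = re.compile(
        "(?=(" + half + "[ACGT]{0," + str(max_spacer) + "}" + half + "))"
    )
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    print(find_p53_sites("TTGGGCATGTCCAGACATGCCTAA"))
```

A scan of this kind only finds sequences that fit the consensus; as the abstract notes, p53 also binds targets that partially or completely lack it, so a negative result here says nothing about structure-dependent binding.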
9
Abstract
Three-stranded R-loop structures form during transcription when the nascent RNA transcript rehybridizes to the template DNA strand. This creates an RNA:DNA hybrid and forces the nontemplate DNA strand into a single-stranded, looped-out state. R-loops form universally over conserved hotspot regions. To date, the physicochemical bases underlying R-loop formation remain unclear. Using a “first-principle” mathematical approach backed by experimental validation, we elucidated the relative contributions of DNA sequence and DNA topology to R-loop formation. Our work provides a quantitative assessment of the energies underlying R-loop formation and of their interplay. It further reveals these structures as important regulators of the DNA topological state. R-loops are abundant three-stranded nucleic-acid structures that form in cis during transcription. Experimental evidence suggests that R-loop formation is affected by DNA sequence and topology. However, the exact manner by which these factors interact to determine R-loop susceptibility is unclear. To investigate this, we developed a statistical mechanical equilibrium model of R-loop formation in superhelical DNA. In this model, the energy involved in forming an R-loop includes four terms—junctional and base-pairing energies and energies associated with superhelicity and with the torsional winding of the displaced DNA single strand around the RNA:DNA hybrid. This model shows that the significant energy barrier imposed by the formation of junctions can be overcome in two ways. First, base-pairing energy can favor RNA:DNA over DNA:DNA duplexes in favorable sequences. Second, R-loops, by absorbing negative superhelicity, partially or fully relax the rest of the DNA domain, thereby returning it to a lower energy state. In vitro transcription assays confirmed that R-loops cause plasmid relaxation and that negative superhelicity is required for R-loops to form, even in a favorable region. 
Single-molecule R-loop footprinting following in vitro transcription showed a strong agreement between theoretical predictions and experimental mapping of stable R-loop positions and further revealed the impact of DNA topology on the R-loop distribution landscape. Our results clarify the interplay between base sequence and DNA superhelicity in controlling R-loop stability. They also reveal R-loops as powerful and reversible topology sinks that cells may use to nonenzymatically relieve superhelical stress during transcription.
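The four energy terms named in this abstract, together with the equilibrium framing, can be summarized schematically. This is only a sketch under assumptions: the state label s (an R-loop position and length, or the state with no R-loop), the symbols G, Z, R and T, and the subscripts are hypothetical notation chosen here, and the functional form of each term is not given in the abstract.

```latex
% Schematic only; symbol names are assumptions, not the authors' notation.
G(s) = G_{\mathrm{junction}} + G_{\mathrm{bp}}(s)
     + G_{\mathrm{superhelical}}(s) + G_{\mathrm{winding}}(s)

p(s) = \frac{e^{-G(s)/RT}}{Z}, \qquad
Z = \sum_{s'} e^{-G(s')/RT}
```

In this reading, a favorable base-pairing term or a large relief of superhelical energy can outweigh the junction penalty, which corresponds to the two routes over the energy barrier described in the abstract.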
10
Miura O, Ogake T, Yoneyama H, Kikuchi Y, Ohyama T. A strong structural correlation between short inverted repeat sequences and the polyadenylation signal in yeast and nucleosome exclusion by these inverted repeats. Curr Genet 2018; 65:575-590. [PMID: 30498953 PMCID: PMC6420913 DOI: 10.1007/s00294-018-0907-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 09/18/2018] [Revised: 11/14/2018] [Accepted: 11/15/2018] [Indexed: 11/22/2022]
Abstract
DNA sequences that read the same from 5′ to 3′ in either strand are called inverted repeat sequences or simply IRs. They are found throughout a wide variety of genomes, from prokaryotes to eukaryotes. Despite extensive research, their in vivo functions, if any, remain unclear. Using Saccharomyces cerevisiae, we performed genome-wide analyses for the distribution, occurrence frequency, sequence characteristics and relevance to chromatin structure, for the IRs that reportedly have a cruciform-forming potential. Here, we provide the first comprehensive map of these IRs in the S. cerevisiae genome. The statistically significant enrichment of the IRs was found in the close vicinity of the DNA positions corresponding to polyadenylation [poly(A)] sites and ~ 30 to ~ 60 bp downstream of start codon-coding sites (referred to as ‘start codons’). In the former, ApT- or TpA-rich IRs and A-tract- or T-tract-rich IRs are enriched, while in the latter, different IRs are enriched. Furthermore, we found a strong structural correlation between the former IRs and the poly(A) signal. In the chromatin formed on the gene end regions, the majority of the IRs causes low nucleosome occupancy. The IRs in the region ~ 30 to ~ 60 bp downstream of start codons are located in the + 1 nucleosomes. In contrast, fewer IRs are present in the adjacent region downstream of start codons. The current study suggests that the IRs play similar roles in Escherichia coli and S. cerevisiae to regulate or complete transcription at the RNA level.
Affiliation(s)
- Osamu Miura
- Department of Biology, Faculty of Education and Integrated Arts and Sciences, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
- Toshihiro Ogake
- Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
- Hiroki Yoneyama
- Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
- Yo Kikuchi
- Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
- Takashi Ohyama
- Department of Biology, Faculty of Education and Integrated Arts and Sciences, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan; Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan.
11
Miura O, Ogake T, Ohyama T. Requirement or exclusion of inverted repeat sequences with cruciform-forming potential in Escherichia coli revealed by genome-wide analyses. Curr Genet 2018; 64:945-958. [PMID: 29484452 PMCID: PMC6060812 DOI: 10.1007/s00294-018-0815-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 11/20/2017] [Revised: 02/16/2018] [Accepted: 02/19/2018] [Indexed: 12/31/2022]
Abstract
Inverted repeat (IR) sequences are DNA sequences that read the same from 5' to 3' in each strand. Some IRs can form cruciforms under the stress of negative supercoiling, and these IRs are widely found in genomes. However, their biological significance remains unclear. The aim of the current study is to explore this issue further. We constructed the first Escherichia coli genome-wide comprehensive map of IRs with cruciform-forming potential. Based on the map, we performed detailed and quantitative analyses. Here, we report that IRs with cruciform-forming potential are statistically enriched in the following five regions: the adjacent regions downstream of the stop codon-coding sites (referred to as the stop codons), on and around the positions corresponding to mRNA ends (referred to as the gene ends), ~ 20 to ~45 bp upstream of the start codon-coding sites (referred to as the start codons) within the 5'-UTR (untranslated region), ~ 25 to ~ 60 bp downstream of the start codons, and promoter regions. For the adjacent regions downstream of the stop codons and on and around the gene ends, most of the IRs with a repeat unit length of ≥ 8 bp and a spacer size of ≤ 8 bp were parts of the intrinsic terminators, regardless of the location, and presumably used for Rho-independent transcription termination. In contrast, fewer IRs were present in the small region preceding the start codons. In E. coli, IRs with cruciform-forming potential are actively placed or excluded in the regulatory regions for the initiation and termination of transcription and translation, indicating their deep involvement or influence in these processes.
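The IR definition in this abstract (an arm, a short spacer, then the reverse complement of the arm) lends itself to a direct scan. The sketch below assumes perfect arms of one fixed length and uses the 8 bp arm and 8 bp maximum spacer mentioned in the abstract as default values; the function names, the return format and the brute-force search are hypothetical and are not the authors' mapping pipeline.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, arm=8, max_spacer=8):
    """List perfect inverted repeats as (start, arm_length, spacer_length).

    arm and max_spacer default to the 8 bp values quoted in the abstract;
    treating them as fixed search parameters is an assumption of this sketch.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * arm + 1):
        # The right arm must equal the reverse complement of the left arm.
        target = revcomp(seq[i:i + arm])
        for spacer in range(max_spacer + 1):
            j = i + arm + spacer
            if j + arm > len(seq):
                break
            if seq[j:j + arm] == target:
                hits.append((i, arm, spacer))
    return hits


if __name__ == "__main__":
    example = "AA" + "GCGCCGAT" + "TTTT" + revcomp("GCGCCGAT") + "CC"
    print(find_inverted_repeats(example))
```

Finding such a sequence only indicates cruciform-forming potential; whether extrusion occurs depends on negative supercoiling, as the abstract points out.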
Affiliation(s)
- Osamu Miura
- Department of Biology, Faculty of Education and Integrated Arts and Sciences, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
- Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
- Toshihiro Ogake
- Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
- Takashi Ohyama
- Department of Biology, Faculty of Education and Integrated Arts and Sciences, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan.
- Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan.
12
Zou X, Morganella S, Glodzik D, Davies H, Li Y, Stratton MR, Nik-Zainal S. Short inverted repeats contribute to localized mutability in human somatic cells. Nucleic Acids Res 2017; 45:11213-11221. [PMID: 28977645 PMCID: PMC5737083 DOI: 10.1093/nar/gkx731] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 04/26/2017] [Revised: 07/14/2017] [Accepted: 08/10/2017] [Indexed: 01/09/2023]
Abstract
Selected repetitive sequences termed short inverted repeats (SIRs) have the propensity to form secondary DNA structures called hairpins. SIRs comprise palindromic arm sequences separated by short spacer sequences that form the hairpin stem and loop respectively. Here, we show that SIRs confer an increase in localized mutability in breast cancer, which is domain-dependent with the greatest mutability observed within spacer sequences (∼1.35-fold above background). Mutability is influenced by factors that increase the likelihood of formation of hairpins such as loop lengths (of 4-5 bp) and stem lengths (of 7-15 bp). Increased mutability is an intrinsic property of SIRs as evidenced by how almost all mutational processes demonstrate a higher rate of mutagenesis of spacer sequences. We further identified 88 spacer sequences showing enrichment from 1.8- to 90-fold of local mutability distributed across 283 sites in the genome that intriguingly, can be used to inform the biological status of a tumor.
Affiliation(s)
- Xueqing Zou
- Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK
- Dominik Glodzik
- Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK
- Helen Davies
- Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK
- Yilin Li
- Department of Biosciences, University of Helsinki, FI-00014 Helsinki, Finland
- Serena Nik-Zainal
- Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK
- East Anglian Medical Genetics Service, Cambridge University Hospitals NHS Foundation Trust, Cambridge CB2 9NB, UK
13
Permanganate/S1 Nuclease Footprinting Reveals Non-B DNA Structures with Regulatory Potential across a Mammalian Genome. Cell Syst 2017; 4:344-356.e7. [PMID: 28237796 DOI: 10.1016/j.cels.2017.01.013] [Citation(s) in RCA: 134] [Impact Index Per Article: 19.1] [Received: 02/11/2016] [Revised: 09/06/2016] [Accepted: 01/13/2017] [Indexed: 12/11/2022]
Abstract
DNA in cells is predominantly B-form double helix. Though certain DNA sequences in vitro may fold into other structures, such as triplex, left-handed Z form, or quadruplex DNA, the stability and prevalence of these structures in vivo are not known. Here, using computational analysis of sequence motifs, RNA polymerase II binding data, and genome-wide potassium permanganate-dependent nuclease footprinting data, we map thousands of putative non-B DNA sites at high resolution in mouse B cells. Computational analysis associates these non-B DNAs with particular structures and indicates that they form at locations compatible with an involvement in gene regulation. Further analyses support the notion that non-B DNA structure formation influences the occupancy and positioning of nucleosomes in chromatin. These results suggest that non-B DNAs contribute to the control of a variety of critical cellular and organismal processes.
14
Guo P, Lam SL. Unusual structures of CCTG repeats and their participation in repeat expansion. Biomol Concepts 2016; 7:331-340. [DOI: 10.1515/bmc-2016-0024] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 09/02/2016] [Accepted: 11/01/2016] [Indexed: 11/15/2022]
Abstract
CCTG repeat expansion in intron 1 of the cellular nucleic acid-binding protein (CNBP) gene has been identified to be the genetic cause of myotonic dystrophy type 2 (DM2). Yet the underlying reasons for the genetic instability in CCTG repeats remain elusive. In recent years, CCTG repeats have been found to form various types of unusual secondary structures including mini-dumbbell (MDB), hairpin and dumbbell, revealing that there is a high structural diversity in CCTG repeats intrinsically. Upon strand slippage, the formation of unusual structures in the nascent strand during DNA replication has been proposed to be the culprit of CCTG repeat expansions. On the one hand, the thermodynamic stability, size, and conformational dynamics of these unusual structures affect the propensity of strand slippage. On the other hand, these structural properties determine whether the unusual structure can successfully escape from DNA repair. In this short overview, we first summarize the recent advances in elucidating the solution structures of CCTG repeats. We then discuss the potential pathways by which these unusual structures bring about variable sizes of repeat expansion, high strand slippage propensity and efficient repair escape.
Affiliation(s)
- Pei Guo
- Department of Chemistry, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
- Sik Lok Lam
- Department of Chemistry, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
15
Controlling gene expression by DNA mechanics: emerging insights and challenges. Biophys Rev 2016; 8:23-32. [PMID: 28510218 DOI: 10.1007/s12551-016-0243-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 06/13/2016] [Accepted: 07/11/2016] [Indexed: 12/22/2022]
Abstract
Transcription initiation is a major control point for the precise regulation of gene expression. Our knowledge of this process has been mainly derived from protein-centric studies wherein cis-regulatory DNA sequences play a passive role, mainly in arranging the protein machinery to coalesce at the transcription start sites of genes in a spatial and temporal-specific manner. However, this is a highly dynamic process in which molecular motors such as RNA polymerase II (RNAPII), helicases, and other transcription factors, alter the level of mechanical force in DNA, rather than simply a set of static DNA-protein interactions. The double helix is a fiber that responds to flexural and torsional stress, which if accumulated, can affect promoter output as well as change DNA and chromatin structure. The relationship between DNA mechanics and the control of early transcription initiation events has been under-investigated. Genomic techniques to display topological stress and conformational variation in DNA across the mammalian genome provide an exciting new insight on the role of DNA mechanics in the early stages of the transcription cycle. Without understanding how torsional and flexural stresses are generated, transmitted, and dissipated, no model of transcription will be complete and accurate.
16
Levens D, Baranello L, Kouzine F. Controlling gene expression by DNA mechanics: emerging insights and challenges. Biophys Rev 2016; 8:259-268. [PMID: 28510225 DOI: 10.1007/s12551-016-0216-8] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 06/13/2016] [Accepted: 07/11/2016] [Indexed: 12/11/2022] Open
Abstract
Transcription initiation is a major control point for the precise regulation of gene expression. Our knowledge of this process has been mainly derived from protein-centric studies wherein cis-regulatory DNA sequences play a passive role, mainly in arranging the protein machinery to coalesce at the transcription start sites of genes in a spatial and temporal-specific manner. However, this is a highly dynamic process in which molecular motors such as RNA polymerase II (RNAPII), helicases, and other transcription factors, alter the level of mechanical force in DNA, rather than simply a set of static DNA-protein interactions. The double helix is a fiber that responds to flexural and torsional stress, which if accumulated, can affect promoter output as well as change DNA and chromatin structure. The relationship between DNA mechanics and the control of early transcription initiation events has been under-investigated. Genomic techniques to display topological stress and conformational variation in DNA across the mammalian genome provide an exciting new insight on the role of DNA mechanics in the early stages of the transcription cycle. Without understanding how torsional and flexural stresses are generated, transmitted, and dissipated, no model of transcription will be complete and accurate.
Affiliation(s)
- David Levens
- Laboratory of Pathology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
- Laura Baranello
- Laboratory of Pathology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
- Fedor Kouzine
- Laboratory of Pathology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
17
Delihas N. Complexity of a small non-protein coding sequence in chromosomal region 22q11.2: presence of specialized DNA secondary structures and RNA exon/intron motifs. BMC Genomics 2015; 16:785. [PMID: 26467088 PMCID: PMC4607176 DOI: 10.1186/s12864-015-1958-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 01/26/2015] [Accepted: 09/28/2015] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND: DiGeorge Syndrome is a genetic abnormality involving a ~3 Mb deletion in human chromosome 22, termed 22q11.2. To better understand the non-coding regions of 22q11.2, a small 10,000 bp non-protein-coding sequence close to the DiGeorge Critical Region 6 gene (DGCR6) was chosen for analysis and functional entities as the homologous sequence in the chimpanzee genome could be aligned and used for comparisons. METHODS: The GenBank database provided genomic sequences. In silico computer programs were used to find homologous DNA sequences in human and chimpanzee genomes, generate random sequences, determine DNA sequence alignments, sequence comparisons and nucleotide repeat copies, and to predict DNA secondary structures. RESULTS: At its 5' half, the 10,000 bp sequence has three distinct sections that represent phylogenetically variable sequences. These Variable Regions contain biased mutations with a very high A + T content, multiple copies of the motif TATAATATA and sequences that fold into long A:T-base-paired stem loops. The 3' half of the 10,000 bp unit, highly conserved between human and chimpanzee, has sequences representing exons of lncRNA genes and segments of introns of protein genes. Central to the 10,000 bp unit are the multiple copies of a sequence that originates from the flanking 5' end of the translocation breakpoint Type A sequence. This breakpoint flanking sequence carries the exon and intron motifs. The breakpoint Type A sequence seems to be a major player in the proliferation of these RNA motifs, as well as the proliferation of Variable Regions in the 10,000 bp segment and other regions within 22q11.2. CONCLUSIONS: The data indicate that a non-coding region of the chromosome may be reserved for highly biased mutations that lead to formation of specialized sequences and DNA secondary structures. On the other hand, the highly conserved nucleotide sequence of the non-coding region may form storage sites for RNA motifs.
Affiliation(s)
- Nicholas Delihas
- Department of Molecular Genetics and Microbiology, School of Medicine, Stony Brook University, Stony Brook, NY, 11794, USA
18
Amosova O, Alvarez-Dominguez JR, Fresco JR. Why the DNA self-depurination mechanism operates in HB-β but not in β-globin paralogs HB-δ, HB-ɛ1, HB-γ1 and HB-γ2. Mutat Res 2015; 778:11-7. [PMID: 26042536 DOI: 10.1016/j.mrfmmm.2015.05.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/25/2015] [Accepted: 05/07/2015] [Indexed: 02/02/2023]
Abstract
The human β-globin, δ-globin and ɛ-globin genes contain almost identical coding strand sequences centered about codon 6 having potential to form a stem-loop with a 5'GAGG loop. Provided with a sufficiently stable stem, such a structure can self-catalyze depurination of the loop 5'G residue, leading to a potential mutation hotspot. Previously, we showed that such a hotspot exists about codon 6 of β-globin, with by far the highest incidence of mutations across the gene, including those responsible for 6 anemias (notably Sickle Cell Anemia) and β-thalassemias. In contrast, we show here that despite identical loop sequences, there is no mutational hotspot in the δ- or ɛ1-globin potential self-depurination sites, which differ by only one or two base pairs in the stem region from that of the β-globin gene. These differences result in either one or two additional mismatches in the potential 7-base pair-forming stem region, thereby weakening its stability, so that either DNA cruciform extrusion from the duplex is rendered ineffective or the lifetime of the stem-loop becomes too short to permit self-catalysis to occur. Having that same loop sequence, paralogs HB-γ1 and HB-γ2 totally lack stem-forming potential. Hence the absence in δ- and ɛ1-globin genes of a mutational hotspot in what must now be viewed as non-functional homologs of the self-depurination site in β-globin. Such stem-destabilizing variants appeared early among vertebrates and remained conserved among mammals and primates. Thus, this study has revealed conserved sequence determinants of self-catalytic DNA depurination associated with variability of mutation incidence among human β-globin paralogs.
Affiliation(s)
- Olga Amosova
- Department of Molecular Biology, Princeton University, Princeton, NJ 08544, USA
- Jacques R Fresco
- Department of Molecular Biology, Princeton University, Princeton, NJ 08544, USA
19
Aygun N. Correlations between long inverted repeat (LIR) features, deletion size and distance from breakpoint in human gross gene deletions. Sci Rep 2015; 5:8300. [PMID: 25657065 PMCID: PMC4319165 DOI: 10.1038/srep08300] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 10/02/2014] [Accepted: 01/14/2015] [Indexed: 11/09/2022] Open
Abstract
Long inverted repeats (LIRs) have been shown to induce genomic deletions in yeast. In this study, LIRs were investigated within ±10 kb spanning each breakpoint from 109 human gross deletions, using Inverted Repeat Finder (IRF) software. LIR number was significantly higher at the breakpoint regions than in control segments (P < 0.001). In addition, a strong correlation was found between 5' and 3' LIR numbers, suggesting a contribution to DNA sequence evolution (r = 0.85, P < 0.001). 138 LIR features at ±3 kb breakpoints in 89 (81%) of 109 gross deletions were evaluated. Significant correlations were found between distance from breakpoint and loop length (r = -0.18, P < 0.05) and stem length (r = -0.18, P < 0.05), suggesting DNA strands are potentially broken in locations closer to bigger LIRs. In addition, bigger loops were associated with larger deletions (r = 0.19, P < 0.05). Moreover, loop length (r = 0.29, P < 0.02) and identity between stem copies (r = 0.30, P < 0.05) of 3' LIRs were more important in larger deletions. Consequently, DNA breaks may form via LIR-induced cruciform structure during replication. DNA ends may later be repaired by non-homologous end-joining (NHEJ), with subsequent deletion.
Affiliation(s)
- Nevim Aygun
- Department of Medical Biology, Faculty of Medicine, Dokuz Eylul University, Inciralti, Izmir, Turkey
20
Du X, Gertz EM, Wojtowicz D, Zhabinskaya D, Levens D, Benham CJ, Schäffer AA, Przytycka TM. Potential non-B DNA regions in the human genome are associated with higher rates of nucleotide mutation and expression variation. Nucleic Acids Res 2014; 42:12367-79. [PMID: 25336616 PMCID: PMC4227770 DOI: 10.1093/nar/gku921] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Indexed: 01/03/2023] Open
Abstract
While individual non-B DNA structures have been shown to impact gene expression, their broad regulatory role remains elusive. We utilized genomic variants and expression quantitative trait loci (eQTL) data to analyze genome-wide variation propensities of potential non-B DNA regions and their relation to gene expression. Independent of genomic location, these regions were enriched in nucleotide variants. Our results are consistent with previously observed mutagenic properties of these regions and counter a previous study concluding that G-quadruplex regions have a reduced frequency of variants. While such mutagenicity might undermine functionality of these elements, we identified in potential non-B DNA regions a signature of negative selection. Yet, we found a depletion of eQTL-associated variants in potential non-B DNA regions, opposite to what might be expected from their proposed regulatory role. However, we also observed that genes downstream of potential non-B DNA regions showed higher expression variation between individuals. This coupling between mutagenicity and tolerance for expression variability of downstream genes may be a result of evolutionary adaptation, which allows reconciling mutagenicity of non-B DNA structures with their location in functionally important regions and their potential regulatory role.
Affiliation(s)
- Xiangjun Du
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- E Michael Gertz
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Damian Wojtowicz
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Dina Zhabinskaya
- UC Davis Genome Center, University of California Davis, Davis, CA 95616, USA
- David Levens
- Laboratory of Pathology, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Craig J Benham
- UC Davis Genome Center, University of California Davis, Davis, CA 95616, USA
- Alejandro A Schäffer
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Teresa M Przytycka
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
21
Zhabinskaya D, Madden S, Benham CJ. SIST: stress-induced structural transitions in superhelical DNA. Bioinformatics 2014; 31:421-2. [PMID: 25282644 DOI: 10.1093/bioinformatics/btu657] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Indexed: 11/12/2022] Open
Abstract
SUMMARY: Supercoiling imposes stress on a DNA molecule that can drive susceptible sequences into alternative non-B form structures. This phenomenon occurs frequently in vivo and has been implicated in biological processes, such as replication, transcription, recombination and translocation. SIST is a software package that analyzes sequence-dependent structural transitions in kilobase length superhelical DNA molecules. The numerical algorithms in SIST are based on a statistical mechanical model that calculates the equilibrium probability of transition for each base pair in the domain. They are extensions of the original stress-induced duplex destabilization (SIDD) method, which analyzes stress-driven DNA strand separation. SIST also includes algorithms to analyze B-Z transitions and cruciform extrusion. The SIST pipeline has an option to use the DZCBtrans algorithm, which analyzes the competition among these three transitions within a superhelical domain. AVAILABILITY AND IMPLEMENTATION: The package and additional documentation are freely available at https://bitbucket.org/benhamlab/sist_codes. CONTACT: dzhabinskaya@ucdavis.edu.
Affiliation(s)
- Dina Zhabinskaya
- UC Davis Genome Center and Department of Mathematics, University of California, Davis, CA 95616, USA
- Sally Madden
- UC Davis Genome Center and Department of Mathematics, University of California, Davis, CA 95616, USA
- Craig J Benham
- UC Davis Genome Center and Department of Mathematics, University of California, Davis, CA 95616, USA
22
Zhang Y, Saini N, Sheng Z, Lobachev KS. Genome-wide screen reveals replication pathway for quasi-palindrome fragility dependent on homologous recombination. PLoS Genet 2013; 9:e1003979. [PMID: 24339793 PMCID: PMC3855049 DOI: 10.1371/journal.pgen.1003979] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Received: 06/03/2013] [Accepted: 10/12/2013] [Indexed: 02/07/2023] Open
Abstract
Inverted repeats capable of forming hairpin and cruciform structures present a threat to chromosomal integrity. They induce double strand breaks, which lead to gross chromosomal rearrangements, the hallmarks of cancers and hereditary diseases. Secondary structure formation at this motif has been proposed to be the driving force for the instability, although the mechanisms leading to the fragility are not well understood. We carried out a genome-wide screen to uncover the genetic players that govern fragility of homologous and homeologous Alu quasi-palindromes in the yeast Saccharomyces cerevisiae. We found that depletion or lack of components of the DNA replication machinery, proteins involved in Fe-S cluster biogenesis, the replication-pausing checkpoint pathway, the telomere maintenance complex or the Sgs1-Top3-Rmi1 dissolvasome augments fragility at Alu-IRs. Rad51, a component of the homologous recombination pathway, was found to be required for replication arrest and breakage at the repeats specifically in replication-deficient strains. These data demonstrate that Rad51 is required for the formation of breakage-prone secondary structures in situations when replication is compromised while another mechanism operates in DSB formation in replication-proficient strains.
Inverted repeats are found in many eukaryotic genomes including humans. They have a potential to cause chromosomal breakage and rearrangements that contribute to genome polymorphism and the development of diseases. Instability of inverted repeats is accounted for by their propensity to adopt DNA secondary structures that is negatively affected by the distance between the repeats and level of sequence divergence. However, the genetic factors that promote the abnormal structure formation or affect the ability of the repeats to break are largely unknown. Here, using a genome-wide screen we identified 38 mutants that destabilize imperfect human inverted Alu repeats and predispose them to breakage. The proteins that are required to maintain repeat stability belong to the core of the DNA replication machinery and to the accessory proteins that help the replication fork move through difficult templates. Remarkably, when the replication machinery is compromised, the proteins involved in homologous recombination promote the formation of secondary structures and replication block, thereby triggering breakage at the inverted repeats. These results reveal a powerful pathway for the destabilization of chromosomes containing inverted repeats that requires the activity of homologous recombination.
Affiliation(s)
- Yu Zhang
- School of Biology and Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- Natalie Saini
- School of Biology and Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- Ziwei Sheng
- School of Biology and Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- Kirill S. Lobachev
- School of Biology and Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, Georgia, United States of America