1
Scheuren M, Möhner J, Müller M, Zischler H. DSB profiles in human spermatozoa highlight the role of TMEJ in the male germline. Front Genet 2024; 15:1423674. [PMID: 39040993 PMCID: PMC11260735 DOI: 10.3389/fgene.2024.1423674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2024] [Accepted: 06/13/2024] [Indexed: 07/24/2024] Open
Abstract
The male mammalian germline is characterized by substantial chromatin remodeling associated with the transition from histones to protamines during spermatogenesis, followed by the reversal to nucleohistones in the male pronucleus preceding the zygotic genome activation. Both transitions are associated with the extensive formation of DNA double-strand breaks (DSBs), requiring an estimated 5 to 10 million transient DSBs per spermatozoon. Additionally, the high transcription rate in early stages of spermatogenesis leads to transcription-coupled damage preceding meiotic homologous recombination, potentially further contributing to the DSB landscape in mature spermatozoa. Once meiosis is completed, spermatozoa remain haploid and therefore cannot rely on error-free homologous recombination, but instead depend on error-prone classical non-homologous end joining (cNHEJ). This DNA damage/repair scenario is proposed to be one of the main causes of the observed paternal mutation propensity in human evolution. Recent studies have shown that DSBs in the male pronucleus are repaired by maternally provided Polθ in Caenorhabditis elegans through Polθ-mediated end joining (TMEJ). Additionally, population genetic datasets have revealed a preponderance of TMEJ signatures associated with human variation. Since these signatures are the result of the combined effect of TMEJ and DSB formation in spermatozoa and male pronuclei, we used a BLISS-based protocol to analyze recurrent DSBs in mature human sperm heads as a proxy of the male pronucleus before zygotic chromatin remodeling. The DSBs were found to be enriched in (YR)n short tandem repeats and in evolutionarily young SINEs, reminiscent of patterns observed in murine spermatids, indicating evolutionary hotspots of recurrent DSB formation in mammalian spermatozoa.
Additionally, we detected a similar DSB pattern in diploid human IMR90 cells when cNHEJ was selectively inhibited, indicating the significant impact of absent cNHEJ on the sperm DSB landscape. Strikingly, regions associated with most retained histones, and therefore less condensed chromatin, were not strongly enriched with recurrent DSBs. In contrast, the fraction of retained H3K27me3 in the mature spermatozoa displayed a strong association with recurrent DSBs. DSBs in H3K27me3 are associated with a preference for TMEJ over cNHEJ during repair. We hypothesize that the retained H3K27me3 may trigger transgenerational DNA repair by priming maternal Polθ to these regions.
Affiliation(s)
- Maurice Scheuren
  - Division of Anthropology, Faculty of Biology, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University Mainz, Mainz, Germany
- Jonas Möhner
  - Division of Anthropology, Faculty of Biology, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University Mainz, Mainz, Germany
- Max Müller
  - Institute for Pharmaceutical and Biomedical Sciences, Johannes Gutenberg-University Mainz, Mainz, Germany
- Hans Zischler
  - Division of Anthropology, Faculty of Biology, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University Mainz, Mainz, Germany
2
Jia H, Tan S, Cai Y, Guo Y, Shen J, Zhang Y, Ma H, Zhang Q, Chen J, Qiao G, Ruan J, Zhang YE. Low-input PacBio sequencing generates high-quality individual fly genomes and characterizes mutational processes. Nat Commun 2024; 15:5644. [PMID: 38969648 PMCID: PMC11226609 DOI: 10.1038/s41467-024-49992-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Accepted: 06/20/2024] [Indexed: 07/07/2024] Open
Abstract
Long-read sequencing, exemplified by PacBio, revolutionizes genomics, overcoming challenges like repetitive sequences. However, the high DNA requirement (>1 µg) is prohibitive for small organisms. We develop a low-input (100 ng), low-cost, and amplification-free library-generation method for PacBio sequencing (LILAP) using Tn5-based tagmentation and DNA circularization within one tube. We test LILAP with two Drosophila melanogaster individuals, and generate near-complete genomes, surpassing preexisting single-fly genomes. By analyzing variations in these two genomes, we characterize mutational processes: complex transpositions (transposon insertions together with extra duplications and/or deletions) prefer regions characterized by non-B DNA structures, and gene conversion of transposons occurs on both DNA and RNA levels. Concurrently, we generate two complete assemblies for the endosymbiotic bacterium Wolbachia in these flies and similarly detect transposon conversion. Thus, LILAP promises a broad PacBio sequencing adoption for not only mutational studies of flies and their symbionts but also explorations of other small organisms or precious samples.
Affiliation(s)
- Hangxing Jia
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Shengjun Tan
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Yingao Cai
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
- Yanyan Guo
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
- Jieyu Shen
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
- Yaqiong Zhang
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Huijing Ma
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Qingzhu Zhang
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
- Jinfeng Chen
  - University of Chinese Academy of Sciences, Beijing, China
  - State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Gexia Qiao
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
- Jue Ruan
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Yong E Zhang
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
3
Bartas M, Brázda V, Pečinka P. Special Issue "Bioinformatics of Unusual DNA and RNA Structures". Int J Mol Sci 2024; 25:5226. [PMID: 38791265 PMCID: PMC11121459 DOI: 10.3390/ijms25105226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Revised: 04/29/2024] [Accepted: 05/06/2024] [Indexed: 05/26/2024] Open
Abstract
Nucleic acids are not only static carriers of genetic information but also play vital roles in controlling cellular lifecycles through their fascinating structural diversity [...].
Affiliation(s)
- Martin Bartas
  - Department of Biology and Ecology, Faculty of Science, University of Ostrava, 710 00 Ostrava, Czech Republic
- Václav Brázda
  - Institute of Biophysics, Czech Academy of Sciences, Královopolská 135, 612 00 Brno, Czech Republic
- Petr Pečinka
  - Department of Biology and Ecology, Faculty of Science, University of Ostrava, 710 00 Ostrava, Czech Republic
4
Puzzo F, Crossley MP, Goswami A, Zhang F, Pekrun K, Garzon JL, Cimprich KA, Kay MA. AAV-mediated genome editing is influenced by the formation of R-loops. bioRxiv 2024:2024.05.07.592855. [PMID: 38766176 PMCID: PMC11100726 DOI: 10.1101/2024.05.07.592855] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024]
Abstract
Recombinant adeno-associated viral vectors (rAAV) hold an intrinsic ability to stimulate homologous recombination (AAV-HR) and are the most used in clinical settings for in vivo gene therapy. However, rAAVs also integrate throughout the genome. Here, we describe DNA-RNA immunoprecipitation sequencing (DRIP-seq) in murine HEPA1-6 hepatoma cells and whole murine liver to establish the similarities and differences in genomic R-loop formation in a transformed cell line and intact tissue. We show enhanced AAV-HR in mice upon genetic and pharmacological upregulation of R-loops. Selecting the highly expressed Albumin gene as a model locus for genome editing in both in vitro and in vivo experiments showed that the R-loop-prone 3' end of Albumin was efficiently edited by AAV-HR, whereas targeting the upstream R-loop-deficient region did not result in detectable vector integration. In addition, we found a positive correlation between previously reported off-target rAAV integration sites and R-loop-enriched genomic regions. Thus, we conclude that high levels of R-loops, present in highly transcribed genes, promote rAAV vector genome integration. These findings may shed light on potential mechanisms for improving the safety and efficacy of genome editing by modulating R-loops and may enhance our ability to predict regions most susceptible to off-target insertional mutagenesis by rAAV vectors.
Affiliation(s)
- Francesco Puzzo
  - Department of Genetics, Stanford University, Stanford, CA
  - Department of Pediatrics, Stanford University, Stanford, CA
- Aranyak Goswami
  - Department of Genetics, Stanford University, Stanford, CA
  - Department of Pediatrics, Stanford University, Stanford, CA
- Feijie Zhang
  - Department of Genetics, Stanford University, Stanford, CA
  - Department of Pediatrics, Stanford University, Stanford, CA
- Katja Pekrun
  - Department of Genetics, Stanford University, Stanford, CA
  - Department of Pediatrics, Stanford University, Stanford, CA
- Jada L Garzon
  - Department of Chemical and Systems Biology, Stanford University, Stanford, CA
- Karlene A Cimprich
  - Department of Chemical and Systems Biology, Stanford University, Stanford, CA
- Mark A Kay
  - Department of Genetics, Stanford University, Stanford, CA
  - Department of Pediatrics, Stanford University, Stanford, CA
5
Fang Y, Bansal K, Mostafavi S, Benoist C, Mathis D. AIRE relies on Z-DNA to flag gene targets for thymic T cell tolerization. Nature 2024; 628:400-407. [PMID: 38480882 PMCID: PMC11091860 DOI: 10.1038/s41586-024-07169-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Accepted: 02/06/2024] [Indexed: 03/18/2024]
Abstract
AIRE is an unconventional transcription factor that enhances the expression of thousands of genes in medullary thymic epithelial cells and promotes clonal deletion or phenotypic diversion of self-reactive T cells [1-4]. The biological logic of AIRE's target specificity remains largely unclear as, in contrast to many transcription factors, it does not bind to a particular DNA sequence motif. Here we implemented two orthogonal approaches to investigate AIRE's cis-regulatory mechanisms: construction of a convolutional neural network and leveraging natural genetic variation through analysis of F1 hybrid mice [5]. Both approaches nominated Z-DNA and NFE2-MAF as putative positive influences on AIRE's target choices. Genome-wide mapping studies revealed that Z-DNA-forming and NFE2L2-binding motifs were positively associated with the inherent ability of a gene's promoter to generate DNA double-stranded breaks, and promoters showing strong double-stranded break generation were more likely to enter a poised state with accessible chromatin and already-assembled transcriptional machinery. Consequently, AIRE preferentially targets genes with poised promoters. We propose a model in which Z-DNA anchors the AIRE-mediated transcriptional program by enhancing double-stranded break generation and promoter poising. Beyond resolving a long-standing mechanistic conundrum, these findings suggest routes for manipulating T cell tolerance.
Affiliation(s)
- Yuan Fang
  - Department of Immunology, Harvard Medical School, Boston, MA, USA
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
- Kushagra Bansal
  - Molecular Biology and Genetics Unit, Jawaharlal Nehru Centre for Advanced Scientific Research, Bangalore, India
- Sara Mostafavi
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA
  - Canadian Institute for Advanced Research, Toronto, Ontario, Canada
- Diane Mathis
  - Department of Immunology, Harvard Medical School, Boston, MA, USA
6
Xu Q, del Mundo IMA, Zewail-Foote M, Luke BT, Vasquez KM, Kowalski J. MoCoLo: a testing framework for motif co-localization. Brief Bioinform 2024; 25:bbae019. [PMID: 38521050 PMCID: PMC10960634 DOI: 10.1093/bib/bbae019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Revised: 01/08/2024] [Accepted: 01/09/2024] [Indexed: 03/25/2024] Open
Abstract
Sequence-level data offers insights into biological processes through the interaction of two or more genomic features from the same or different molecular data types. Within motifs, this interaction is often explored via the co-occurrence of feature genomic tracks using fixed segments or analytical tests, which respectively require window size determination and risk false positives from over-simplified models. Moreover, methods for robustly examining the co-localization of genomic features, and thereby understanding their spatial interaction, have been elusive. We present a new analytical method for examining feature interaction by introducing the notion of reciprocal co-occurrence, and we define statistics to estimate it and hypotheses to test for it. Our approach leverages conditional motif co-occurrence events between features to infer their co-localization. Using reverse conditional probabilities and introducing a novel simulation approach that retains motif properties (e.g. length, guanine-content), our method further accounts for potential confounders in testing. As a proof-of-concept, motif co-localization (MoCoLo) confirmed the co-occurrence of histone markers in a breast cancer cell line. As a novel analysis, MoCoLo identified significant co-localization of oxidative DNA damage within non-B DNA-forming regions that significantly differed between non-B DNA structures. Altogether, these findings demonstrate the potential utility of MoCoLo for testing spatial interactions between genomic features via their co-localization.
Affiliation(s)
- Qi Xu
  - Department of Molecular Biosciences, College of Natural Sciences, The University of Texas at Austin, Austin, TX, 78712, USA
  - Department of Oncology, Dell Medical School, The University of Texas at Austin, Austin, TX, 78712, USA
- Imee M A del Mundo
  - Dell Pediatric Research Institute, Division of Pharmacology and Toxicology, College of Pharmacy, The University of Texas at Austin, Austin, Texas, 78723, USA
- Maha Zewail-Foote
  - Department of Chemistry and Biochemistry, Southwestern University, Georgetown, TX, 78626, USA
- Brian T Luke
  - Bioinformatics and Computational Science, Frederick National Laboratory for Cancer Research, Frederick, Maryland, 21701, USA
- Karen M Vasquez
  - Dell Pediatric Research Institute, Division of Pharmacology and Toxicology, College of Pharmacy, The University of Texas at Austin, Austin, Texas, 78723, USA
- Jeanne Kowalski
  - Department of Oncology, Dell Medical School, The University of Texas at Austin, Austin, TX, 78712, USA
7
Gopalakrishnan V, Roy U, Srivastava S, Kariya KM, Sharma S, Javedakar SM, Choudhary B, Raghavan SC. Delineating the mechanism of fragility at BCL6 breakpoint region associated with translocations in diffuse large B cell lymphoma. Cell Mol Life Sci 2024; 81:21. [PMID: 38196006 PMCID: PMC11072719 DOI: 10.1007/s00018-023-05042-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Revised: 11/05/2023] [Accepted: 11/07/2023] [Indexed: 01/11/2024]
Abstract
BCL6 translocation is one of the most common chromosomal translocations in cancer and results in its enhanced expression in germinal center B cells. It involves the fusion of BCL6 with any of its twenty-six Ig and non-Ig translocation partners associated with diffuse large B cell lymphoma (DLBCL). Although the translocation was discovered long ago, the mechanism of BCL6 fragility is largely unknown. Analysis of the translocation breakpoints in the 5' UTR of BCL6 reveals the clustering of most of the breakpoints around a region termed Cluster II. In silico analysis of the breakpoint cluster sequence identified sequence motifs that could potentially fold into non-B DNA. Results revealed that the Cluster II sequence folded into overlapping hairpin structures and identified sequences that undergo base pairing at the stem region. Further, the formation of cruciform DNA blocked DNA replication. The sodium bisulfite modification assay revealed the single-strandedness of the region corresponding to hairpin DNA in both strands of the genome. Further, we report the formation of intramolecular parallel G4 and triplex DNA at Cluster II. Taken together, our studies reveal that multiple non-canonical DNA structures exist at the BCL6 Cluster II breakpoint region and contribute to the fragility leading to BCL6 translocation in DLBCL patients.
Affiliation(s)
- Vidya Gopalakrishnan
  - Department of Biochemistry, Indian Institute of Science, Bangalore, 560 012, India
  - Institute of Bioinformatics and Applied Biotechnology, Electronics City, Bangalore, 560 100, India
  - Department of Zoology, St. Joseph's College (Autonomous), Irinjalakuda, Kerala, 680121, India
- Urbi Roy
  - Department of Biochemistry, Indian Institute of Science, Bangalore, 560 012, India
- Shikha Srivastava
  - Department of Biochemistry, Indian Institute of Science, Bangalore, 560 012, India
  - Department of Bioscience and Biotechnology, Banasthali Vidyapith, Tonk, Rajasthan, 304022, India
- Khyati M Kariya
  - Department of Biochemistry, Indian Institute of Science, Bangalore, 560 012, India
- Shivangi Sharma
  - Department of Biochemistry, Indian Institute of Science, Bangalore, 560 012, India
- Saniya M Javedakar
  - Department of Biochemistry, Indian Institute of Science, Bangalore, 560 012, India
- Bibha Choudhary
  - Institute of Bioinformatics and Applied Biotechnology, Electronics City, Bangalore, 560 100, India
- Sathees C Raghavan
  - Department of Biochemistry, Indian Institute of Science, Bangalore, 560 012, India
8
Yang Y, Badura ML, O’Leary PC, Delavan HM, Robinson TM, Egusa EA, Zhong X, Swinderman JT, Li H, Zhang M, Kim M, Ashworth A, Feng FY, Chou J, Yang L. Large tandem duplications in cancer result from transcription and DNA replication collisions. medRxiv 2024:2023.05.17.23290140. [PMID: 38260434 PMCID: PMC10802642 DOI: 10.1101/2023.05.17.23290140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
Despite the abundance of somatic structural variations (SVs) in cancer, the underlying molecular mechanisms of their formation remain unclear. Here, we use 6,193 whole-genome sequenced tumors to study the contributions of transcription and DNA replication collisions to genome instability. After deconvoluting robust SV signatures in three independent pan-cancer cohorts, we detect transcription-dependent replicated-strand bias, the expected footprint of transcription-replication collision (TRC), in large tandem duplications (TDs). Large TDs are abundant in female-enriched, upper gastrointestinal tract and prostate cancers. They are associated with poor patient survival and mutations in TP53, CDK12, and SPOP. Upon inactivating CDK12, cells display significantly more TRCs, R-loops, and large TDs. Inhibition of G2/M checkpoint proteins, such as WEE1, CHK1, and ATR, selectively inhibits the growth of cells deficient in CDK12. Our data suggest that large TDs in cancer form due to TRCs, and their presence can be used as a biomarker for prognosis and treatment.
Affiliation(s)
- Yang Yang
  - Ben May Department for Cancer Research, University of Chicago, Chicago, IL, USA
- Michelle L. Badura
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, CA, USA
  - Departments of Radiation Oncology and Urology, University of California, San Francisco, CA, USA
- Patrick C. O’Leary
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, CA, USA
- Henry M. Delavan
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, CA, USA
  - Division of Hematology/Oncology, Department of Medicine, University of California, San Francisco, CA, USA
- Troy M. Robinson
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, CA, USA
  - Departments of Radiation Oncology and Urology, University of California, San Francisco, CA, USA
- Emily A. Egusa
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, CA, USA
  - Departments of Radiation Oncology and Urology, University of California, San Francisco, CA, USA
- Xiaoming Zhong
  - Ben May Department for Cancer Research, University of Chicago, Chicago, IL, USA
- Jason T. Swinderman
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, CA, USA
  - Departments of Radiation Oncology and Urology, University of California, San Francisco, CA, USA
- Haolong Li
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, CA, USA
  - Departments of Radiation Oncology and Urology, University of California, San Francisco, CA, USA
- Meng Zhang
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, CA, USA
  - Departments of Radiation Oncology and Urology, University of California, San Francisco, CA, USA
- Minkyu Kim
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, CA, USA
  - Department of Cellular Molecular Pharmacology, University of California San Francisco, San Francisco, CA, USA
- Alan Ashworth
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, CA, USA
  - Division of Hematology/Oncology, Department of Medicine, University of California, San Francisco, CA, USA
- Felix Y. Feng
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, CA, USA
  - Departments of Radiation Oncology and Urology, University of California, San Francisco, CA, USA
  - Division of Hematology/Oncology, Department of Medicine, University of California, San Francisco, CA, USA
- Jonathan Chou
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, CA, USA
  - Division of Hematology/Oncology, Department of Medicine, University of California, San Francisco, CA, USA
- Lixing Yang
  - Ben May Department for Cancer Research, University of Chicago, Chicago, IL, USA
  - Department of Human Genetics, University of Chicago, Chicago, IL, USA
  - University of Chicago Comprehensive Cancer Center, Chicago, IL, USA
9
Koh GCC, Boushaki S, Zhao SJ, Pregnall AM, Sadiyah F, Badja C, Memari Y, Georgakopoulos-Soares I, Nik-Zainal S. The chemotherapeutic drug CX-5461 is a potent mutagen in cultured human cells. Nat Genet 2024; 56:23-26. [PMID: 38036782 PMCID: PMC10786719 DOI: 10.1038/s41588-023-01602-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 08/21/2023] [Accepted: 10/30/2023] [Indexed: 12/02/2023]
Abstract
The chemotherapeutic agent CX-5461, or pidnarulex, has been fast-tracked by the United States Food and Drug Administration for early-stage clinical studies of BRCA1-, BRCA2- and PALB2-mutated cancers. It is under investigation in phase I and II trials. Here, we find that, although CX-5461 exhibits synthetic lethality in BRCA1-/BRCA2-deficient cells, it also causes extensive, nonselective, collateral mutagenesis in all three cell lines tested, to magnitudes that exceed known environmental carcinogens.
Affiliation(s)
- Gene Ching Chiek Koh
  - Department of Oncology, Early Cancer Institute, University of Cambridge, Cambridge, UK
  - Academic Department of Medical Genetics, School of Clinical Medicine, University of Cambridge, Cambridge, UK
- Soraya Boushaki
  - Academic Department of Medical Genetics, School of Clinical Medicine, University of Cambridge, Cambridge, UK
- Salome Jingchen Zhao
  - Academic Department of Medical Genetics, School of Clinical Medicine, University of Cambridge, Cambridge, UK
- Andrew Marcel Pregnall
  - Academic Department of Medical Genetics, School of Clinical Medicine, University of Cambridge, Cambridge, UK
- Firas Sadiyah
  - Department of Oncology, Early Cancer Institute, University of Cambridge, Cambridge, UK
  - Academic Department of Medical Genetics, School of Clinical Medicine, University of Cambridge, Cambridge, UK
- Cherif Badja
  - Department of Oncology, Early Cancer Institute, University of Cambridge, Cambridge, UK
  - Academic Department of Medical Genetics, School of Clinical Medicine, University of Cambridge, Cambridge, UK
- Yasin Memari
  - Department of Oncology, Early Cancer Institute, University of Cambridge, Cambridge, UK
  - Academic Department of Medical Genetics, School of Clinical Medicine, University of Cambridge, Cambridge, UK
- Ilias Georgakopoulos-Soares
  - Department of Biochemistry and Molecular Biology, Institute for Personalized Medicine, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Serena Nik-Zainal
  - Department of Oncology, Early Cancer Institute, University of Cambridge, Cambridge, UK
  - Academic Department of Medical Genetics, School of Clinical Medicine, University of Cambridge, Cambridge, UK
10
Lu Y, Lee J, Li J, Allu SR, Wang J, Kim H, Bullaughey KL, Fisher SA, Nordgren CE, Rosario JG, Anderson SA, Ulyanova AV, Brem S, Chen HI, Wolf JA, Grady MS, Vinogradov SA, Kim J, Eberwine J. CHEX-seq detects single-cell genomic single-stranded DNA with catalytical potential. Nat Commun 2023; 14:7346. [PMID: 37963886 PMCID: PMC10645931 DOI: 10.1038/s41467-023-43158-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Accepted: 11/02/2023] [Indexed: 11/16/2023] Open
Abstract
Genomic DNA (gDNA) undergoes structural interconversion between single- and double-stranded states during transcription, DNA repair and replication, which is critical for cellular homeostasis. We describe "CHEX-seq" which identifies the single-stranded DNA (ssDNA) in situ in individual cells. CHEX-seq uses 3'-terminal blocked, light-activatable probes to prime the copying of ssDNA into complementary DNA that is sequenced, thereby reporting the genome-wide single-stranded chromatin landscape. CHEX-seq is benchmarked in human K562 cells, and its utilities are demonstrated in cultures of mouse and human brain cells as well as immunostained spatially localized neurons in brain sections. The amount of ssDNA is dynamically regulated in response to perturbation. CHEX-seq also identifies single-stranded regions of mitochondrial DNA in single cells. Surprisingly, CHEX-seq identifies single-stranded loci in mouse and human gDNA that catalyze porphyrin metalation in vitro, suggesting a catalytic activity for genomic ssDNA. We posit that endogenous DNA enzymatic activity is a function of genomic ssDNA.
Collapse
Affiliation(s)
- Youtao Lu
- Department of Biology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Jaehee Lee
- Department of Systems Pharmacology and Translational Therapeutics Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Jifen Li
- Department of Systems Pharmacology and Translational Therapeutics Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Srinivasa Rao Allu
- Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Jinhui Wang
- Department of Systems Pharmacology and Translational Therapeutics Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - HyunBum Kim
- Department of Systems Pharmacology and Translational Therapeutics Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Kevin L Bullaughey
- Department of Biology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Stephen A Fisher
- Department of Biology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - C Erik Nordgren
- Department of Biology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Jean G Rosario
- Department of Biology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Stewart A Anderson
- Department of Psychiatry, Children's Hospital of Philadelphia, ARC 517, 3615 Civic Center Blvd, Philadelphia, PA, 19104, USA
| | - Alexandra V Ulyanova
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Steven Brem
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - H Isaac Chen
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - John A Wolf
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - M Sean Grady
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Sergei A Vinogradov
- Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Junhyong Kim
- Department of Biology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - James Eberwine
- Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.
| |
|
11
|
Yella VR, Vanaja A. Computational analysis on the dissemination of non-B DNA structural motifs in promoter regions of 1180 cellular genomes. Biochimie 2023; 214:101-111. [PMID: 37311475 DOI: 10.1016/j.biochi.2023.06.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2022] [Revised: 05/05/2023] [Accepted: 06/05/2023] [Indexed: 06/15/2023]
Abstract
The promoter regions of gene regulation are under evolutionary constraints and earlier studies uncovered that they are characterized by enrichment of functional non-B DNA structural signatures like curved DNA, cruciform DNA, G-quadruplex, triple-helical DNA, slipped DNA structures, and Z-DNA. However, these studies are restricted to a few model organisms, single non-B DNA motif types, or whole genomic sequences, and their comparative accumulation in promoter regions of different domains of life has not been reported comprehensively. In this study, for the first time, we investigated the preponderance of non-B DNA-prone motifs in promoter regions in 1180 genomes belonging to 28 taxonomic groups using the non-B DNA Motif Search Tool (nBMST). The trends suggest that they are predominant in promoters compared to the upstream and downstream regions of all three domains of life and variably linked to taxonomic groups. Cruciform DNA motif is the most abundant form of non-B DNA, spanning from archaea to lower eukaryotes. Curved DNA motifs are prominent in host-associated bacteria, and suppressed in mammals. Triplex-DNA and slipped DNA structure repeats are discretely dispersed in all lineages. G-quadruplex motifs are significantly enriched in mammals. We also observed that the unique enrichment of non-B DNA in promoters is strongly linked to genome GC, size, evolutionary time divergence, and ecological adaptations. Overall, our work systematically reports the unique non-B DNA structural landscape of cellular organisms from the perspective of the cis-regulatory code of genomes.
Affiliation(s)
- Venkata Rajesh Yella
- Department of Biotechnology, Koneru Lakshmaiah Education Foundation, Guntur, 522302, Andhra Pradesh, India.
| | - Akkinepally Vanaja
- Department of Biotechnology, Koneru Lakshmaiah Education Foundation, Guntur, 522302, Andhra Pradesh, India; KL College of Pharmacy, Koneru Lakshmaiah Education Foundation, Guntur, 522302, Andhra Pradesh, India
| |
|
12
|
Shimojima Yamamoto K, Tamura T, Okamoto N, Nishi E, Noguchi A, Takahashi I, Sawaishi Y, Shimizu M, Kanno H, Minakuchi Y, Toyoda A, Yamamoto T. Identification of small-sized intrachromosomal segments at the ends of INV-DUP-DEL patterns. J Hum Genet 2023; 68:751-757. [PMID: 37423943 DOI: 10.1038/s10038-023-01181-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2023] [Revised: 06/14/2023] [Accepted: 06/27/2023] [Indexed: 07/11/2023]
Abstract
The mechanism of chromosomal rearrangement associated with inverted-duplication-deletion (INV-DUP-DEL) pattern formation has been investigated by many researchers, and several possible mechanisms have been proposed. Currently, fold-back and subsequent dicentric chromosome formation have been established as the mechanisms of non-recurrent INV-DUP-DEL pattern formation. In the present study, we analyzed the breakpoint junctions of INV-DUP-DEL patterns in five patients using long-read whole-genome sequencing and detected 2.2-6.1 kb copy-neutral regions in all five patients. At the end of the INV-DUP-DEL, two patients exhibited chromosomal translocations, which are recognized as telomere capture, and one patient showed direct telomere healing. The remaining two patients had additional small-sized intrachromosomal segments at the end of the derivative chromosomes. These findings have not been previously reported, but they may only be explained by the presence of telomere capture breakage. Further investigations are required to better understand the mechanisms underlying this finding.
Affiliation(s)
- Keiko Shimojima Yamamoto
- Department of Transfusion Medicine and Cell Processing, Tokyo Women's Medical University, Tokyo, 162-8666, Japan
- Institute of Medical Genetics, Tokyo Women's Medical University, Tokyo, 162-8666, Japan
| | - Takeaki Tamura
- Department of Pediatrics and Child Health, Nihon University School of Medicine, Tokyo, 173-8610, Japan
- Division of Gene Medicine, Graduate School of Medical Science, Tokyo Women's Medical University, Tokyo, 162-8666, Japan
| | - Nobuhiko Okamoto
- Department of Medical Genetics, Osaka Women's and Children's Hospital, Izumi, 594-1101, Japan
| | - Eriko Nishi
- Department of Medical Genetics, Osaka Women's and Children's Hospital, Izumi, 594-1101, Japan
| | - Atsuko Noguchi
- Department of Pediatrics, Akita University Graduate School of Medicine, Akita, 010-8543, Japan
| | - Ikuko Takahashi
- Department of Pediatrics, Akita University Graduate School of Medicine, Akita, 010-8543, Japan
| | - Yukio Sawaishi
- Department of Pediatrics, Akita Prefectural Center on Development and Disability, Akita, 010-0000, Japan
| | - Masaki Shimizu
- Department of Pediatrics and Developmental Biology, Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University, Tokyo, 113-8519, Japan
| | - Hitoshi Kanno
- Department of Transfusion Medicine and Cell Processing, Tokyo Women's Medical University, Tokyo, 162-8666, Japan
- Institute of Medical Genetics, Tokyo Women's Medical University, Tokyo, 162-8666, Japan
| | - Yohei Minakuchi
- Department of Genomics and Evolutionary Biology, National Institute of Genetics, Mishima, Shizuoka, 411-0801, Japan
| | - Atsushi Toyoda
- Department of Genomics and Evolutionary Biology, National Institute of Genetics, Mishima, Shizuoka, 411-0801, Japan
| | - Toshiyuki Yamamoto
- Institute of Medical Genetics, Tokyo Women's Medical University, Tokyo, 162-8666, Japan.
- Division of Gene Medicine, Graduate School of Medical Science, Tokyo Women's Medical University, Tokyo, 162-8666, Japan.
| |
|
13
|
Yang Y, Yang L. Somatic structural variation signatures in pediatric brain tumors. Cell Rep 2023; 42:113276. [PMID: 37851574 PMCID: PMC10748741 DOI: 10.1016/j.celrep.2023.113276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 08/26/2023] [Accepted: 09/28/2023] [Indexed: 10/20/2023] Open
Abstract
Brain cancer is the leading cause of cancer-related death in children. Somatic structural variations (SVs), large-scale alterations in DNA, remain poorly understood in pediatric brain tumors. Here, we detect a total of 13,199 high-confidence somatic SVs in 744 whole-genome sequences of pediatric brain tumors from the Pediatric Brain Tumor Atlas. The somatic SV occurrences have tremendous diversity among the cohort and across different tumor types. We decompose mutational signatures of clustered complex SVs, non-clustered complex SVs, and simple SVs separately to infer their mutational mechanisms. Our finding of many tumor types carrying unique sets of SV signatures suggests that distinct molecular mechanisms shape genome instability in different tumor types. The patterns of somatic SV signatures in pediatric brain tumors are substantially different from those in adult cancers. The convergence of multiple SV signatures on several major cancer driver genes implies vital roles of somatic SVs in disease progression.
Affiliation(s)
- Yang Yang
- Ben May Department for Cancer Research, University of Chicago, Chicago, IL 60637, USA
| | - Lixing Yang
- Ben May Department for Cancer Research, University of Chicago, Chicago, IL 60637, USA; Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA; University of Chicago Comprehensive Cancer Center, Chicago, IL 60637, USA.
| |
|
14
|
Matos-Rodrigues G, Hisey JA, Nussenzweig A, Mirkin SM. Detection of alternative DNA structures and its implications for human disease. Mol Cell 2023; 83:3622-3641. [PMID: 37863029 DOI: 10.1016/j.molcel.2023.08.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2023] [Revised: 08/01/2023] [Accepted: 08/16/2023] [Indexed: 10/22/2023]
Abstract
Around 3% of the genome consists of simple DNA repeats that are prone to forming alternative (non-B) DNA structures, such as hairpins, cruciforms, triplexes (H-DNA), four-stranded guanine quadruplexes (G4-DNA), and others, as well as composite RNA:DNA structures (e.g., R-loops, G-loops, and H-loops). These DNA structures are dynamic and favored by the unwinding of duplex DNA. For many years, the association of alternative DNA structures with genome function was limited by the lack of methods to detect them in vivo. Here, we review the recent advancements in the field and present state-of-the-art technologies and methods to study alternative DNA structures. We discuss the limitations of these methods as well as how they are beginning to provide insights into causal relationships between alternative DNA structures, genome function and stability, and human disease.
Affiliation(s)
| | - Julia A Hisey
- Department of Biology, Tufts University, Medford, MA, USA
| | - André Nussenzweig
- Laboratory of Genome Integrity, National Cancer Institute, NIH, Bethesda, MD, USA.
| | | |
|
15
|
Singh D, Desai N, Shah V, Datta B. In Silico Identification of Potential Quadruplex Forming Sequences in LncRNAs of Cervical Cancer. Int J Mol Sci 2023; 24:12658. [PMID: 37628839 PMCID: PMC10454738 DOI: 10.3390/ijms241612658] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Revised: 08/07/2023] [Accepted: 08/08/2023] [Indexed: 08/27/2023] Open
Abstract
Long non-coding RNAs (lncRNAs) have emerged as auxiliary regulators of gene expression influencing tumor microenvironment, metastasis and radio-resistance in cancer. The presence of lncRNA in extracellular fluids makes them promising diagnostic markers. LncRNAs deploy higher-order structures to facilitate a complex range of functions. Among such structures, G-quadruplexes (G4s) can be detected or targeted by small molecular probes to drive theranostic applications. The in vitro identification of G4 formation in lncRNAs can be a tedious and expensive proposition. Bioinformatics-driven strategies can provide comprehensive and economic alternatives in conjunction with suitable experimental validation. We propose a pipeline to identify G4-forming sequences, protein partners and biological functions associated with dysregulated lncRNAs in cervical cancer. We identified 17 lncRNA clusters which possess transcripts that can fold into a G4 structure. We confirmed in vitro G4 formation in the four biologically active isoforms of SNHG20, MEG3, CRNDE and LINP1 by Circular Dichroism spectroscopy and Thioflavin-T-assisted fluorescence spectroscopy and reverse-transcriptase stop assay. Gene expression data demonstrated that these four lncRNAs can be potential prognostic biomarkers of cervical cancer. Two approaches were employed for identifying G4 specific protein partners for these lncRNAs and FMR2 was a potential interacting partner for all four clusters. We report a detailed investigation of G4 formation in lncRNAs that are dysregulated in cervical cancer. LncRNAs MEG3, CRNDE, LINP1 and SNHG20 are shown to influence cervical cancer progression and we report G4 specific protein partners for these lncRNAs. The protein partners and G4s predicted in lncRNAs can be exploited for theranostic objectives.
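As an illustration of the kind of in silico screen this abstract describes, the sketch below scans a sequence for putative G4-forming motifs. It is not the authors' pipeline: the abstract does not name the tool or thresholds used, so the pattern here (four runs of at least three guanines separated by loops of 1 to 7 nucleotides) is an assumed, commonly used heuristic, and the names `G4_PATTERN` and `find_putative_g4` are hypothetical.

```python
import re

# Assumed heuristic, not taken from the abstract: four G-runs (>= 3 G each)
# separated by three loops of 1-7 nucleotides. U is allowed so that RNA
# (e.g. lncRNA transcripts) can be scanned as well as DNA.
G4_PATTERN = re.compile(
    r"G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}"
)


def find_putative_g4(seq):
    """Return (start, end, motif) for each non-overlapping motif match.

    Coordinates are 0-based and end-exclusive. A match only indicates
    sequence potential; it does not show that a G4 actually folds.
    """
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Toy sequence for illustration only, not a real transcript.
    print(find_putative_g4("AAGGGAGGGAGGGAGGGAA"))
```

A motif hit of this kind is only a candidate; as the abstract notes, experimental validation (for example circular dichroism) is still needed to confirm G4 formation.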
Affiliation(s)
- Deepshikha Singh
- Department of Biological Engineering, Indian Institute of Technology Gandhinagar, Gandhinagar 382355, India; (D.S.); (N.D.); (V.S.)
| | - Nakshi Desai
- Department of Biological Engineering, Indian Institute of Technology Gandhinagar, Gandhinagar 382355, India; (D.S.); (N.D.); (V.S.)
| | - Viraj Shah
- Department of Biological Engineering, Indian Institute of Technology Gandhinagar, Gandhinagar 382355, India; (D.S.); (N.D.); (V.S.)
| | - Bhaskar Datta
- Department of Biological Engineering, Indian Institute of Technology Gandhinagar, Gandhinagar 382355, India; (D.S.); (N.D.); (V.S.)
- Department of Chemistry, Indian Institute of Technology Gandhinagar, Gandhinagar 382355, India
| |
|
16
|
Ma H, Ding W, Chen Y, Zhou J, Chen W, Lan C, Mao H, Li Q, Yan W, Su H. Centromere Plasticity With Evolutionary Conservation and Divergence Uncovered by Wheat 10+ Genomes. Mol Biol Evol 2023; 40:msad176. [PMID: 37541261 PMCID: PMC10422864 DOI: 10.1093/molbev/msad176] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 04/09/2023] [Revised: 06/26/2023] [Accepted: 07/28/2023] [Indexed: 08/06/2023] Open
Abstract
Centromeres (CEN) are the chromosomal regions that play a crucial role in maintaining genomic stability. The underlying highly repetitive DNA sequences can evolve quickly in most eukaryotes and promote karyotype evolution. How these widely variable sequences ensure the homeostasis of centromere function is not fully understood. In this study, we investigated the genetics and epigenetics of CEN in a population of wheat lines from global breeding programs. We captured a high degree of sequence, positioning, and epigenetic variation in the large and complex wheat CEN. We found that most CENH3-associated repeats are Cereba retrotransposon elements and exhibit phylogenetic homogenization across different wheat lines, but the less-associated repeat sequences diverge independently in each wheat line, implying specific mechanisms for selecting certain repeat types as functional core CEN. Furthermore, we observed that CENH3 nucleosome structures display looser wrapping of DNA termini on complex centromeric repeats, including the repositioned CEN. We also found that strict CENH3 nucleosome positioning and intrinsic DNA features play a role in determining centromere identity among different lines. Specific non-B form DNAs were substantially associated with CENH3 nucleosomes for the repositioned centromeres. These findings suggest that multiple mechanisms were involved in the adaptation of CENH3 nucleosomes that can stabilize CEN. Ultimately, we propose that centromere chromatin shows remarkable epigenetic plasticity within the diverse genomic context, and that its high robustness is crucial for maintaining centromere function and genome stability in wheat 10+ lines as a result of past breeding selections.
Affiliation(s)
- Huan Ma
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, China
| | - Wentao Ding
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, China
| | - Yiqian Chen
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, China
| | - Jingwei Zhou
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, China
| | - Wei Chen
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, China
| | - Caixia Lan
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, China
| | - Hailiang Mao
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, China
| | - Qiang Li
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, China
| | - Wenhao Yan
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, China
| | - Handong Su
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, China
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
| |
|
17
|
Maryami F, Davoudi-Dehaghani E, Khalesi N, Rismani E, Rahimi H, Talebi S, Zeinali S. Identification and characterization of the largest deletion in the PCCA gene causing severe acute early-onset form of propionic acidemia. Mol Genet Genomics 2023; 298:905-917. [PMID: 37131081 DOI: 10.1007/s00438-023-02023-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2022] [Accepted: 04/16/2023] [Indexed: 05/04/2023]
Abstract
Whole-exome sequencing (WES) is an excellent method for the diagnosis of diseases of uncertain or heterogeneous genetic origin. However, it has limitations in detecting structural variations such as InDels, of which bioinformatics analysts must be aware. This study aimed to use WES to evaluate the genetic cause of a metabolic crisis in a 3-day-old neonate who was admitted to the neonatal intensive care unit (NICU) and died after a few days. Tandem mass spectrometry (MS/MS) showed a significant increase in propionyl carnitine (C3), suggesting methylmalonic acidemia (MMA) or propionic acidemia (PA). WES demonstrated a homozygous missense variant in exon 4 of the BTD gene (NM_000060.4(BTD):c.1330G > C), responsible for partial biotinidase deficiency. Segregation analysis of the BTD variant revealed the homozygous status of the asymptomatic mother. Furthermore, inspection of the BAM file around genes responsible for PA or MMA with the Integrative Genomics Viewer (IGV) software revealed a large homozygous deletion in the PCCA gene. Comprehensive confirmatory studies identified and segregated a novel out-of-frame deletion of 217,877 bp, "NG_008768.1:g.185211_403087delinsTA", extending from intron 11 to 21 of PCCA, inducing a premature termination codon and activation of nonsense-mediated mRNA decay (NMD). Homology modeling of the mutant PCCA demonstrated elimination of the protein's active site and critical functional domains. This novel variant is therefore suggested to be the largest deletion in the PCCA gene, causing acute early-onset PA. These results could expand the spectrum of PCCA variants and improve existing knowledge of the molecular basis of PA, as well as provide new evidence of the pathogenicity of the variant NM_000060.4(BTD):c.1330G > C.
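The deletion size quoted in this abstract can be checked against the coordinates in the variant description. The sketch below is a minimal illustration, assuming the usual reading of a genomic range `start_end` as inclusive at both ends, so that the deleted span is `end - start + 1`; the function name `deleted_span_length` is hypothetical, and the parser handles only this simple `g.START_ENDdel...` form.

```python
import re


def deleted_span_length(hgvs):
    """Length in bp of the deleted range in a simple g.START_ENDdel... string.

    Assumes inclusive coordinates; inserted bases of a delins are ignored
    because only the deleted reference span is being measured.
    """
    match = re.search(r":g\.(\d+)_(\d+)del", hgvs)
    if match is None:
        raise ValueError("not a simple genomic range deletion: " + hgvs)
    start, end = int(match.group(1)), int(match.group(2))
    return end - start + 1


if __name__ == "__main__":
    variant = "NG_008768.1:g.185211_403087delinsTA"
    print(deleted_span_length(variant))  # 217877, matching the 217,877 bp stated
```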
Affiliation(s)
- Fereshteh Maryami
- Department of Molecular Medicine, Biotechnology Research Center, Pasteur Institute of Iran, Pasteur St., Tehran, Iran
| | - Elham Davoudi-Dehaghani
- Department of Molecular Medicine, Biotechnology Research Center, Pasteur Institute of Iran, Pasteur St., Tehran, Iran
| | - Nasrin Khalesi
- Department of Pediatrics and Neonatal Intensive Care Unit, Ali-Asghar Children's Hospital, Iran University of Medical Sciences, Shahid Vahid Dastgerdi Street, Modarres Highway, Tehran, Iran.
| | - Elham Rismani
- Department of Molecular Medicine, Biotechnology Research Center, Pasteur Institute of Iran, Pasteur St., Tehran, Iran
| | - Hamzeh Rahimi
- Department of Molecular Medicine, Biotechnology Research Center, Pasteur Institute of Iran, Pasteur St., Tehran, Iran
- Texas Biomedical Research Center, San Antonio, USA
| | - Saeed Talebi
- Department of Medical Genetics and Molecular Biology, Faculty of Medicine, Iran University of Medical Sciences (IUMS), Tehran, Iran
- Department of Medical Genetics, Ali-Asghar Children's Hospital, Iran University of Medical Sciences, Tehran, Iran
| | - Sirous Zeinali
- Department of Molecular Medicine, Biotechnology Research Center, Pasteur Institute of Iran, Pasteur St., Tehran, Iran.
- Medical Genetics Lab, Kawsar Human Genetics Research Center, No. 41 Majlesi St., ValiAsr St., 1595645513, Tehran, Iran.
- Iranian Molecular Medicine Network, Pasteur Institute of Iran, Pasteur St, Tehran, Iran.
| |
|
18
|
Zürcher JF, Kleefeldt AA, Funke LFH, Birnbaum J, Fredens J, Grazioli S, Liu KC, Spinck M, Petris G, Murat P, Rehm FBH, Sale JE, Chin JW. Continuous synthesis of E. coli genome sections and Mb-scale human DNA assembly. Nature 2023; 619:555-562. [PMID: 37380776 PMCID: PMC7614783 DOI: 10.1038/s41586-023-06268-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 07/20/2022] [Accepted: 05/26/2023] [Indexed: 06/30/2023]
Abstract
Whole-genome synthesis provides a powerful approach for understanding and expanding organism function [1-3]. To build large genomes rapidly, scalably and in parallel, we need (1) methods for assembling megabases of DNA from shorter precursors and (2) strategies for rapidly and scalably replacing the genomic DNA of organisms with synthetic DNA. Here we develop bacterial artificial chromosome (BAC) stepwise insertion synthesis (BASIS), a method for megabase-scale assembly of DNA in Escherichia coli episomes. We used BASIS to assemble 1.1 Mb of human DNA containing numerous exons, introns, repetitive sequences, G-quadruplexes, and long and short interspersed nuclear elements (LINEs and SINEs). BASIS provides a powerful platform for building synthetic genomes for diverse organisms. We also developed continuous genome synthesis (CGS), a method for continuously replacing sequential 100 kb stretches of the E. coli genome with synthetic DNA; CGS minimizes crossovers [1,4] between the synthetic DNA and the genome such that the output for each 100 kb replacement provides, without sequencing, the input for the next 100 kb replacement. Using CGS, we synthesized a 0.5 Mb section of the E. coli genome, a key intermediate in its total synthesis [1], from five episomes in 10 days. By parallelizing CGS and combining it with rapid oligonucleotide synthesis and episome assembly [5,6], along with rapid methods for compiling a single genome from strains bearing distinct synthetic genome sections [1,7,8], we anticipate that it will be possible to synthesize entire E. coli genomes from functional designs in less than 2 months.
Affiliation(s)
- Jérôme F Zürcher
- Medical Research Council Laboratory of Molecular Biology, Cambridge, UK
| | - Askar A Kleefeldt
- Medical Research Council Laboratory of Molecular Biology, Cambridge, UK
| | - Louise F H Funke
- Medical Research Council Laboratory of Molecular Biology, Cambridge, UK
- Department of Biomedical Engineering, National University of Singapore, Singapore, Singapore
| | - Jakob Birnbaum
- Medical Research Council Laboratory of Molecular Biology, Cambridge, UK
| | - Julius Fredens
- Medical Research Council Laboratory of Molecular Biology, Cambridge, UK
- Synthetic Biology for Clinical and Technological Innovation, Department of Biochemistry, National University of Singapore, Singapore, Singapore
| | - Simona Grazioli
- Medical Research Council Laboratory of Molecular Biology, Cambridge, UK
| | - Kim C Liu
- Medical Research Council Laboratory of Molecular Biology, Cambridge, UK
| | - Martin Spinck
- Medical Research Council Laboratory of Molecular Biology, Cambridge, UK
| | - Gianluca Petris
- Medical Research Council Laboratory of Molecular Biology, Cambridge, UK
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK
| | - Pierre Murat
- Medical Research Council Laboratory of Molecular Biology, Cambridge, UK
| | - Fabian B H Rehm
- Medical Research Council Laboratory of Molecular Biology, Cambridge, UK
| | - Julian E Sale
- Medical Research Council Laboratory of Molecular Biology, Cambridge, UK
| | - Jason W Chin
- Medical Research Council Laboratory of Molecular Biology, Cambridge, UK.
| |
|
19
|
Hosseini M, Palmer A, Manka W, Grady PGS, Patchigolla V, Bi J, O'Neill RJ, Chi Z, Aguiar D. Deep statistical modelling of nanopore sequencing translocation times reveals latent non-B DNA structures. Bioinformatics 2023; 39:i242-i251. [PMID: 37387144 DOI: 10.1093/bioinformatics/btad220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/01/2023] Open
Abstract
MOTIVATION: Non-canonical (or non-B) DNA are genomic regions whose three-dimensional conformation deviates from the canonical double helix. Non-B DNA play an important role in basic cellular processes and are associated with genomic instability, gene regulation, and oncogenesis. Experimental methods are low-throughput and can detect only a limited set of non-B DNA structures, while computational methods rely on non-B DNA base motifs, which are necessary but not sufficient indicators of non-B structures. Oxford Nanopore sequencing is an efficient and low-cost platform, but it is currently unknown whether nanopore reads can be used for identifying non-B structures. RESULTS: We build the first computational pipeline to predict non-B DNA structures from nanopore sequencing. We formalize non-B detection as a novelty detection problem and develop the GoFAE-DND, an autoencoder that uses goodness-of-fit (GoF) tests as a regularizer. A discriminative loss encourages non-B DNA to be poorly reconstructed and optimizing Gaussian GoF tests allows for the computation of P-values that indicate non-B structures. Based on whole genome nanopore sequencing of NA12878, we show that there exist significant differences between the timing of DNA translocation for non-B DNA bases compared with B-DNA. We demonstrate the efficacy of our approach through comparisons with novelty detection methods using experimental data and data synthesized from a new translocation time simulator. Experimental validations suggest that reliable detection of non-B DNA from nanopore sequencing is achievable. AVAILABILITY AND IMPLEMENTATION: Source code is available at https://github.com/bayesomicslab/ONT-nonb-GoFAE-DND.
Affiliation(s)
- Marjan Hosseini
- Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269-4155, United States
| | - Aaron Palmer
- Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269-4155, United States
| | - William Manka
- Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269-4155, United States
| | - Patrick G S Grady
- Institute for Systems Genomics and Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269-3003, United States
| | - Venkata Patchigolla
- Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269-4155, United States
| | - Jinbo Bi
- Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269-4155, United States
| | - Rachel J O'Neill
- Institute for Systems Genomics and Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269-3003, United States
| | - Zhiyi Chi
- Department of Statistics, University of Connecticut, Storrs, CT 06269-4120, United States
| | - Derek Aguiar
- Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269-4155, United States
| |
|
20
|
Weissensteiner MH, Cremona MA, Guiblet WM, Stoler N, Harris RS, Cechova M, Eckert KA, Chiaromonte F, Huang YF, Makova KD. Accurate sequencing of DNA motifs able to form alternative (non-B) structures. Genome Res 2023; 33:907-922. [PMID: 37433640 PMCID: PMC10519405 DOI: 10.1101/gr.277490.122] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/09/2022] [Accepted: 05/04/2023] [Indexed: 07/13/2023]
Abstract
Approximately 13% of the human genome, located at certain motifs, has the potential to form noncanonical (non-B) DNA structures (e.g., G-quadruplexes, cruciforms, and Z-DNA), which regulate many cellular processes but also affect the activity of polymerases and helicases. Because sequencing technologies use these enzymes, they might possess increased errors at non-B structures. To evaluate this, we analyzed error rates, read depth, and base quality of Illumina, Pacific Biosciences (PacBio) HiFi, and Oxford Nanopore Technologies (ONT) sequencing at non-B motifs. All technologies showed altered sequencing success for most non-B motif types, although this could be owing to several factors, including structure formation, biased GC content, and the presence of homopolymers. Single-nucleotide mismatch errors had low biases in HiFi and ONT for all non-B motif types but were increased for G-quadruplexes and Z-DNA in all three technologies. Deletion errors were increased for all non-B types but Z-DNA in Illumina and HiFi, as well as only for G-quadruplexes in ONT. Insertion errors for non-B motifs were highly, moderately, and slightly elevated in Illumina, HiFi, and ONT, respectively. Additionally, we developed a probabilistic approach to determine the number of false positives at non-B motifs depending on sample size and variant frequency, and applied it to publicly available data sets (1000 Genomes, Simons Genome Diversity Project, and gnomAD). We conclude that elevated sequencing errors at non-B DNA motifs should be considered in low-read-depth studies (single-cell, ancient DNA, and pooled-sample population sequencing) and in scoring rare variants. Combining technologies should maximize sequencing accuracy in future studies of non-B DNA.
Affiliation(s)
| | - Marzia A Cremona
- Department of Operations and Decision Systems, Université Laval, Quebec, Quebec G1V0A6, Canada
- Population Health and Optimal Health Practices, CHU de Québec-Université Laval Research Center, Québec, Quebec G1V4G2, Canada
- Center for Medical Genomics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Wilfried M Guiblet
- Department of Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Laboratory of Cell Biology, NCI-CCR, National Institutes of Health, Bethesda, Maryland 20892, USA
| | - Nicholas Stoler
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Robert S Harris
- Department of Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Monika Cechova
- Department of Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Faculty of Informatics, Masaryk University, 60200 Brno, Czech Republic
| | - Kristin A Eckert
- Center for Medical Genomics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Department of Pathology, The Pennsylvania State University, College of Medicine, Hershey, Pennsylvania 17033, USA
| | - Francesca Chiaromonte
- Center for Medical Genomics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Department of Statistics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Institute of Economics and L'EMbeDS, Sant'Anna School of Advanced Studies, Pisa 56127, Italy
| | - Yi-Fei Huang
- Department of Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Center for Medical Genomics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Kateryna D Makova
- Department of Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Center for Medical Genomics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| |
|
21
|
Bastos CAC, Afreixo V, Rodrigues JMOS, Pinho AJ. Concentration of inverted repeats along human DNA. J Integr Bioinform 2023; 20:jib-2022-0052. [PMID: 37486620 PMCID: PMC10561070 DOI: 10.1515/jib-2022-0052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2022] [Accepted: 02/27/2023] [Indexed: 07/25/2023] Open
Abstract
This work aims to describe the observed enrichment of inverted repeats in the human genome and to identify and describe, with detailed length profiles, the regions with significant and relevant enrichment of inverted repeats. The enrichment is assessed and tested with a recently proposed z-score-based measure. We simulate a genome using an order-7 Markov model trained on the real genome. The simulated genome is used to establish the critical values that serve as decision thresholds for identifying regions with significantly enriched concentrations. Several human genome regions are highly enriched in inverted repeats, and this is observed in all human chromosomes. The distribution of inverted repeat lengths varies along the genome. The majority of the regions with severely exaggerated enrichment contain mainly short inverted repeats. There are also regions with regular peaks along the inverted repeat length distribution (periodic regularities) and, less frequently, regions with exaggerated enrichment for long lengths. However, adjacent regions tend to have similar distributions.
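The simulation step mentioned in this abstract (a Markov model trained on a real sequence and then used to generate a null sequence) can be illustrated with a minimal sketch. This is not the authors' code: the function names `train_markov` and `simulate`, the toy input sequence, the uniform fallback for unseen contexts, and the use of order 3 instead of order 7 are all assumptions made to keep the example small.

```python
import random
from collections import Counter, defaultdict

def train_markov(seq, order):
    """Count, for every context of `order` bases, how often each base follows it."""
    counts = defaultdict(Counter)
    for i in range(len(seq) - order):
        counts[seq[i:i + order]][seq[i + order]] += 1
    return counts

def simulate(counts, order, length, seed_context, rng):
    """Generate a sequence by sampling each base given the previous `order` bases."""
    out = list(seed_context)
    while len(out) < length:
        context = "".join(out[-order:])
        followers = counts.get(context)
        if not followers:
            # Unseen context: fall back to a uniform draw (an assumption of this sketch).
            out.append(rng.choice("ACGT"))
            continue
        bases, weights = zip(*followers.items())
        out.append(rng.choices(bases, weights=weights)[0])
    return "".join(out[:length])

rng = random.Random(0)
# Toy stand-in for a real genome sequence (hypothetical data).
real = "".join(rng.choice("ACGT") for _ in range(5000))
# Order 3 keeps the toy example small; the abstract describes an order-7 model.
model = train_markov(real, order=3)
fake = simulate(model, order=3, length=1000, seed_context=real[:3], rng=rng)
```

In a setup like the one the abstract describes, a statistic computed on the simulated sequence would supply the critical values against which the real genome is compared; that statistic is not reproduced here.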
Affiliation(s)
- Carlos A. C. Bastos
- DETI – Department of Electronics, Telecommunications and Informatics, IEETA – Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, 3810-193 Aveiro, Portugal
- LASI – Intelligent Systems Associate Laboratory, Aveiro, Portugal
| | - Vera Afreixo
- CIDMA – Center for Research and Development in Mathematics and Applications, DMAT – Department of Mathematics, University of Aveiro, 3810-193 Aveiro, Portugal
| | - João M. O. S. Rodrigues
- DETI – Department of Electronics, Telecommunications and Informatics, IEETA – Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, 3810-193 Aveiro, Portugal
- LASI – Intelligent Systems Associate Laboratory, Aveiro, Portugal
| | - Armando J. Pinho
- DETI – Department of Electronics, Telecommunications and Informatics, IEETA – Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, 3810-193 Aveiro, Portugal
- LASI – Intelligent Systems Associate Laboratory, Aveiro, Portugal
| |
|
22
|
Yang Y, Yang L. Somatic structural variation signatures in pediatric brain tumors. medRxiv 2023:2023.05.18.23290139. [PMID: 37292789 PMCID: PMC10246126 DOI: 10.1101/2023.05.18.23290139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Brain cancer is the leading cause of cancer-related death in children. Somatic structural variations (SVs), large-scale alterations in DNA, remain poorly understood in pediatric brain tumors. Here, we detect a total of 13,199 high-confidence somatic SVs in 744 whole-genome-sequenced pediatric brain tumors from the Pediatric Brain Tumor Atlas. The somatic SV occurrences show tremendous diversity among the cohort and across different tumor types. We decompose mutational signatures of clustered complex SVs, non-clustered complex SVs, and simple SVs separately to infer the mutational mechanisms of SV formation. Our finding that many tumor types carry unique sets of SV signatures suggests that distinct molecular mechanisms are active in different tumor types to shape genome instability. The patterns of somatic SV signatures in pediatric brain tumors are substantially different from those in adult cancers. The convergence of multiple signatures to alter several major cancer driver genes suggests the functional importance of somatic SVs in disease progression.
Affiliation(s)
- Yang Yang
- Ben May Department for Cancer Research, University of Chicago, Chicago, IL, USA
| | - Lixing Yang
- Ben May Department for Cancer Research, University of Chicago, Chicago, IL, USA
- Department of Human Genetics, University of Chicago, Chicago, IL, USA
- University of Chicago Comprehensive Cancer Center, Chicago, IL, USA
| |
|
23
|
Xu Q, Kowalski J. NBBC: a non-B DNA burden explorer in cancer. Nucleic Acids Res 2023:7177884. [PMID: 37224529 DOI: 10.1093/nar/gkad379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Revised: 04/16/2023] [Accepted: 05/12/2023] [Indexed: 05/26/2023] Open
Abstract
Alternate (non-B) DNA-forming structures, such as Z-DNA, G-quadruplexes, and triplexes, have demonstrated a potential role in cancer etiology. It has been found that non-B DNA-forming sequences can stimulate genetic instability in human cancer genomes, implicating them in the development of cancer and other genetic diseases. While several non-B prediction tools and databases exist, they lack the ability to both analyze and visualize non-B data within a cancer context. Herein, we introduce NBBC, a non-B DNA burden explorer in cancer, that offers analyses and visualizations for non-B DNA-forming motifs. To do so, we introduce 'non-B burden' as a metric to summarize the prevalence of non-B DNA motifs at the gene, signature, and genomic site levels. Using our non-B burden metric, we developed two analysis modules within a cancer context to assist in exploring both gene- and motif-level non-B type heterogeneity among gene signatures. NBBC is designed to serve as a new analysis and visualization platform for the exploration of non-B DNA, guided by non-B burden as a novel marker.
Affiliation(s)
- Qi Xu
- Department of Oncology, Dell Medical School, The University of Texas at Austin, Austin, TX 78712, USA
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA
| | - Jeanne Kowalski
- Department of Oncology, Dell Medical School, The University of Texas at Austin, Austin, TX 78712, USA
| |
|
24
|
Wang G, Vasquez KM. Dynamic alternative DNA structures in biology and disease. Nat Rev Genet 2023; 24:211-234. [PMID: 36316397 DOI: 10.1038/s41576-022-00539-9] [Citation(s) in RCA: 42] [Impact Index Per Article: 42.0] [Accepted: 09/27/2022] [Indexed: 11/06/2022]
Abstract
Repetitive elements in the human genome, once considered 'junk DNA', are now known to adopt more than a dozen alternative (that is, non-B) DNA structures, such as self-annealed hairpins, left-handed Z-DNA, three-stranded triplexes (H-DNA) or four-stranded guanine quadruplex structures (G4 DNA). These dynamic conformations can act as functional genomic elements involved in DNA replication and transcription, chromatin organization and genome stability. In addition, recent studies have revealed a role for these alternative structures in triggering error-generating DNA repair processes, thereby actively enabling genome plasticity. As a driving force for genetic variation, non-B DNA structures thus contribute to both disease aetiology and evolution.
Affiliation(s)
- Guliang Wang
- Division of Pharmacology and Toxicology, College of Pharmacy, The University of Texas at Austin, Dell Paediatric Research Institute, Austin, TX, USA
| | - Karen M Vasquez
- Division of Pharmacology and Toxicology, College of Pharmacy, The University of Texas at Austin, Dell Paediatric Research Institute, Austin, TX, USA
| |
|
25
|
Revisiting mutagenesis at non-B DNA motifs in the human genome. Nat Struct Mol Biol 2023; 30:417-424. [PMID: 36914796 DOI: 10.1038/s41594-023-00936-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/23/2021] [Accepted: 02/03/2023] [Indexed: 03/16/2023]
Abstract
Non-B DNA structures formed by repetitive sequence motifs are known instigators of mutagenesis in experimental systems. Analyzing this phenomenon computationally in the human genome requires careful disentangling of intrinsic confounding factors, including overlapping and interrupted motifs and recurrent sequencing errors. Here, we show that accounting for these factors eliminates all signals of repeat-induced mutagenesis that extend beyond the motif boundary, and eliminates or dramatically shrinks the magnitude of mutagenesis within some motifs, contradicting previous reports. Mutagenesis not attributable to artifacts revealed several biological mechanisms. Polymerase slippage generates frequent indels within every variety of short tandem repeat motif, implicating slipped-strand structures. Interruption-correcting single nucleotide variants within short tandem repeats may originate from error-prone polymerases. Secondary-structure formation promotes single nucleotide variants within palindromic repeats and duplications within direct repeats. G-quadruplex motifs cause recurrent sequencing errors, whereas mutagenesis at Z-DNAs is conspicuously absent.
|
26
|
Specialized DNA Structures Act as Genomic Beacons for Integration by Evolutionarily Diverse Retroviruses. Viruses 2023; 15:v15020465. [PMID: 36851678 PMCID: PMC9962126 DOI: 10.3390/v15020465] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/30/2022] [Revised: 02/04/2023] [Accepted: 02/06/2023] [Indexed: 02/10/2023] Open
Abstract
Retroviral integration site targeting is not random and plays a critical role in expression and long-term survival of the integrated provirus. To better understand the genomic environment surrounding retroviral integration sites, we performed a meta-analysis of previously published integration site data from evolutionarily diverse retroviruses, including new experimental data from HIV-1 subtypes A, B, C and D. We show here that evolutionarily divergent retroviruses exhibit distinct integration site profiles with strong preferences for integration near non-canonical B-form DNA (non-B DNA). We also show that in vivo-derived HIV-1 integration sites are significantly more enriched in transcriptionally silent regions and transcription-silencing non-B DNA features of the genome compared to in vitro-derived HIV-1 integration sites. Integration sites from individuals infected with HIV-1 subtype A, B, C or D viruses exhibited different preferences for common genomic and non-B DNA features. In addition, we identified several integration site hotspots shared between different HIV-1 subtypes, all of which were located in the non-B DNA feature slipped DNA. Together, these data show that although evolutionarily divergent retroviruses exhibit distinct integration site profiles, they all target non-B DNA for integration. These findings provide new insight into how retroviruses integrate into genomes for long-term survival.
|
27
|
Fan C, Chen K, Wang Y, Ball EV, Stenson PD, Mort M, Bacolla A, Kehrer-Sawatzki H, Tainer JA, Cooper DN, Zhao H. Profiling human pathogenic repeat expansion regions by synergistic and multi-level impacts on molecular connections. Hum Genet 2023; 142:245-274. [PMID: 36344696 PMCID: PMC10290229 DOI: 10.1007/s00439-022-02500-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2022] [Accepted: 10/24/2022] [Indexed: 11/09/2022]
Abstract
Whilst DNA repeat expansions cause numerous heritable human disorders, their origins and underlying pathological mechanisms are often unclear. We collated a dataset comprising 224 human repeat expansions encompassing 203 different genes, and performed a systematic analysis with respect to key topological features at the DNA, RNA and protein levels. Comparison with controls without known pathogenicity and with genomic regions lacking repeats allowed the construction of the first tool to discriminate repeat regions harboring pathogenic repeat expansions (DPREx). At the DNA level, pathogenic repeat expansions exhibited stronger signals for DNA regulatory factors (e.g. H3K4me3, transcription factor-binding sites) in exons, promoters, 5'UTRs and 5'genes but were not significantly different from controls in introns, 3'UTRs and 3'genes. Pathogenic repeat expansions were also found to be enriched in non-B DNA structures. At the RNA level, pathogenic repeat expansions were characterized by lower free energy for forming RNA secondary structure and were closer to splice sites in introns, exons, promoters and 5'genes than controls. At the protein level, pathogenic repeat expansions exhibited a preference to form coil rather than other types of secondary structure, and tended to encode surface-located protein domains. Guided by these features, DPREx (http://biomed.nscc-gz.cn/zhaolab/geneprediction/#) achieved an Area Under the Curve (AUC) value of 0.88 in a test on an independent dataset. Pathogenic repeat expansions are thus located such that they exert a synergistic influence on the gene expression pathway involving inter-molecular connections at the DNA, RNA and protein levels.
Affiliation(s)
- Cong Fan
- Department of Medical Research Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, 107 Yan Jiang West Road, Guangzhou, 500001, People's Republic of China
| | - Ken Chen
- School of Computer Science and Engineering, Sun Yat-Sen University, Guangzhou, 500001, China
| | - Yukai Wang
- School of Life Science, Sun Yat-Sen University, Guangzhou, 500001, China
| | - Edward V Ball
- Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Peter D Stenson
- Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Matthew Mort
- Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Albino Bacolla
- Department of Molecular and Cellular Oncology, The University of Texas MD Anderson Cancer Center, 6767 Bertner Avenue, Houston, TX, 77030, USA
| | | | - John A Tainer
- Department of Molecular and Cellular Oncology, The University of Texas MD Anderson Cancer Center, 6767 Bertner Avenue, Houston, TX, 77030, USA
| | - David N Cooper
- Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Huiying Zhao
- Department of Medical Research Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, 107 Yan Jiang West Road, Guangzhou, 500001, People's Republic of China
| |
|
28
|
Ajoge HO, Renner TM, Bélanger K, Greig M, Dankar S, Kohio HP, Coleman MD, Ndashimye E, Arts EJ, Langlois MA, Barr SD. Antiretroviral APOBEC3 cytidine deaminases alter HIV-1 provirus integration site profiles. Nat Commun 2023; 14:16. [PMID: 36627271 PMCID: PMC9832166 DOI: 10.1038/s41467-022-35379-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2019] [Accepted: 11/30/2022] [Indexed: 01/12/2023] Open
Abstract
APOBEC3 (A3) proteins are host-encoded deoxycytidine deaminases that provide an innate immune barrier to retroviral infection, notably against HIV-1. Low levels of deamination are believed to contribute to the genetic evolution of HIV-1, while intense catalytic activity of these proteins can induce catastrophic hypermutation in proviral DNA leading to near-total HIV-1 restriction. So far, little is known about how A3 cytosine deaminases might impact HIV-1 proviral DNA integration sites in human chromosomal DNA. Using a deep sequencing approach, we analyze the influence of catalytically active and inactive APOBEC3F and APOBEC3G on HIV-1 integration site selection. Here we show that DNA editing is detected at the extremities of the long terminal repeat regions of the virus. Both catalytically active and non-catalytic A3 mutants decrease insertions into gene coding sequences and increase integration into SINE elements, oncogenes and transcription-silencing non-B DNA features. Our data implicate A3 as a host factor that influences HIV-1 integration site selection and promotes what appears to be a more latent expression profile.
Affiliation(s)
- Hannah O Ajoge
- Western University, Schulich School of Medicine and Dentistry, Department of Microbiology and Immunology, London, ON, Canada
| | - Tyler M Renner
- Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, Ottawa, ON, Canada
| | - Kasandra Bélanger
- Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, Ottawa, ON, Canada
| | - Matthew Greig
- Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, Ottawa, ON, Canada
| | - Samar Dankar
- Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, Ottawa, ON, Canada
| | - Hinissan P Kohio
- Western University, Schulich School of Medicine and Dentistry, Department of Microbiology and Immunology, London, ON, Canada
| | - Macon D Coleman
- Western University, Schulich School of Medicine and Dentistry, Department of Microbiology and Immunology, London, ON, Canada
| | - Emmanuel Ndashimye
- Western University, Schulich School of Medicine and Dentistry, Department of Microbiology and Immunology, London, ON, Canada
| | - Eric J Arts
- Western University, Schulich School of Medicine and Dentistry, Department of Microbiology and Immunology, London, ON, Canada
| | - Marc-André Langlois
- Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, Ottawa, ON, Canada
- Ottawa Center for Infection, Immunity and Inflammation (CI3), Ottawa, ON, Canada
| | - Stephen D Barr
- Western University, Schulich School of Medicine and Dentistry, Department of Microbiology and Immunology, London, ON, Canada
| |
|
29
|
Non-B-form DNA tends to form in centromeric regions and has undergone changes in polyploid oat subgenomes. Proc Natl Acad Sci U S A 2023; 120:e2211683120. [PMID: 36574697 PMCID: PMC9910436 DOI: 10.1073/pnas.2211683120] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 12/28/2022] Open
Abstract
Centromeres are the specialized regions of the chromosomes that direct faithful chromosome segregation during cell division. Despite their functional conservation, centromeres display features of rapidly evolving DNA and wide evolutionary diversity in size and organization. Previous work found that noncanonical (non-B-form) DNA structures are abundant in the centromeres of several eukaryotic species, with a possible implication for centromere specification. Thus far, systematic studies into the organization and function of non-B-form DNA in plants remain scarce. Here, we applied the oat system to investigate the role of non-B-form DNA in centromeres. We conducted chromatin immunoprecipitation sequencing using an antibody to the centromere-specific histone H3 variant (CENH3); this accurately positioned oat centromeres with different ploidy levels and identified a series of centromere-specific sequences including minisatellites and retrotransposons. To define genetic characteristics of oat centromeres, we surveyed the repeat sequences and found that dyad symmetries were abundant in oat centromeres and were predicted to form non-B-DNA structures in vivo. These structures, including bent DNA, slipped DNA, Z-DNA, G-quadruplexes, and R-loops, were prone to form within CENH3-binding regions. Dynamic conformational changes of predicted non-B-DNA occurred during the evolution from diploid to tetraploid to hexaploid oat. Furthermore, we applied the single-molecule technique of AFM and DNA:RNA immunoprecipitation with deep sequencing to validate R-loop enrichment in oat centromeres. Centromeric retrotransposons exhibited strong associations with R-loop formation. Taken together, our study elucidates the fundamental character of non-B-form DNA in the oat genome and reveals its potential role in centromeres.
|
30
|
Herbert A. Nucleosomes and flipons exchange energy to alter chromatin conformation, the readout of genomic information, and cell fate. Bioessays 2022; 44:e2200166. [DOI: 10.1002/bies.202200166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2022] [Revised: 09/24/2022] [Accepted: 09/28/2022] [Indexed: 11/27/2022]
|
31
|
Ajoge HO, Kohio HP, Paparisto E, Coleman MD, Wong K, Tom SK, Bain KL, Berry CC, Arts EJ, Barr SD. G-Quadruplex DNA and Other Non-Canonical B-Form DNA Motifs Influence Productive and Latent HIV-1 Integration and Reactivation Potential. Viruses 2022; 14:v14112494. [PMID: 36423103 PMCID: PMC9692945 DOI: 10.3390/v14112494] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Revised: 11/04/2022] [Accepted: 11/08/2022] [Indexed: 11/16/2022] Open
Abstract
The integration of the HIV-1 genome into the host genome is an essential step in the life cycle of the virus and it plays a critical role in the expression, long-term persistence, and reactivation of HIV expression. To better understand the local genomic environment surrounding HIV-1 proviruses, we assessed the influence of non-canonical B-form DNA (non-B DNA) on the HIV-1 integration site selection. We showed that productively and latently infected cells exhibit different integration site biases towards non-B DNA motifs. We identified a correlation between the integration sites of the latent proviruses and non-B DNA features known to potently influence gene expression (e.g., cruciform, guanine-quadruplex (G4), triplex, and Z-DNA). The reactivation potential of latent proviruses with latency reversal agents also correlated with their proximity to specific non-B DNA motifs. The perturbation of G4 structures in vitro using G4 structure-destabilizing or -stabilizing ligands resulted in a significant reduction in integration within 100 base pairs of G4 motifs. The stabilization of G4 structures increased the integration within 300-500 base pairs from G4 motifs, increased integration near transcription start sites, and increased the proportion of latently infected cells. Moreover, we showed that host lens epithelium-derived growth factor (LEDGF)/p75 and cleavage and polyadenylation specificity factor 6 (CPSF6) influenced the distribution of integration sites near several non-B DNA motifs, especially G4 DNA. Our findings identify non-B DNA motifs as important factors that influence productive and latent HIV-1 integration and the reactivation potential of latent proviruses.
Affiliation(s)
- Hannah O. Ajoge
- Schulich School of Medicine and Dentistry, Department of Microbiology and Immunology, Western University, Dental Sciences Building Room 3007, London, ON N6A 5C1, Canada
| | - Hinissan P. Kohio
- Schulich School of Medicine and Dentistry, Department of Microbiology and Immunology, Western University, Dental Sciences Building Room 3007, London, ON N6A 5C1, Canada
| | - Ermela Paparisto
- Schulich School of Medicine and Dentistry, Department of Microbiology and Immunology, Western University, Dental Sciences Building Room 3007, London, ON N6A 5C1, Canada
| | - Macon D. Coleman
- Schulich School of Medicine and Dentistry, Department of Microbiology and Immunology, Western University, Dental Sciences Building Room 3007, London, ON N6A 5C1, Canada
| | - Kemen Wong
- Schulich School of Medicine and Dentistry, Department of Microbiology and Immunology, Western University, Dental Sciences Building Room 3007, London, ON N6A 5C1, Canada
| | - Sean K. Tom
- Schulich School of Medicine and Dentistry, Department of Microbiology and Immunology, Western University, Dental Sciences Building Room 3007, London, ON N6A 5C1, Canada
| | - Katie L. Bain
- Schulich School of Medicine and Dentistry, Department of Microbiology and Immunology, Western University, Dental Sciences Building Room 3007, London, ON N6A 5C1, Canada
| | - Charles C. Berry
- Department of Family Medicine and Public Health, University of California San Diego, La Jolla, CA 92093, USA
| | - Eric J. Arts
- Schulich School of Medicine and Dentistry, Department of Microbiology and Immunology, Western University, Dental Sciences Building Room 3007, London, ON N6A 5C1, Canada
| | - Stephen D. Barr
- Schulich School of Medicine and Dentistry, Department of Microbiology and Immunology, Western University, Dental Sciences Building Room 3007, London, ON N6A 5C1, Canada
| |
|
32
|
Du H, Jolly A, Grochowski CM, Yuan B, Dawood M, Jhangiani SN, Li H, Muzny D, Fatih JM, Coban-Akdemir Z, Carlin ME, Scheuerle AE, Witzl K, Posey JE, Pendleton M, Harrington E, Juul S, Hastings PJ, Bi W, Gibbs RA, Sedlazeck FJ, Lupski JR, Carvalho CMB, Liu P. The multiple de novo copy number variant (MdnCNV) phenomenon presents with peri-zygotic DNA mutational signatures and multilocus pathogenic variation. Genome Med 2022; 14:122. [PMID: 36303224 PMCID: PMC9609164 DOI: 10.1186/s13073-022-01123-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2022] [Accepted: 10/10/2022] [Indexed: 11/30/2022] Open
Abstract
BACKGROUND The multiple de novo copy number variant (MdnCNV) phenotype is described by having four or more constitutional de novo CNVs (dnCNVs) arising independently throughout the human genome within one generation. It is a rare peri-zygotic mutational event, previously reported to be seen once in every 12,000 individuals referred for genome-wide chromosomal microarray analysis due to congenital abnormalities. These rare families provide a unique opportunity to understand the genetic factors of peri-zygotic genome instability and the impact of dnCNV on human diseases. METHODS Chromosomal microarray analysis (CMA), array-based comparative genomic hybridization, short- and long-read genome sequencing (GS) were performed on the newly identified MdnCNV family to identify de novo mutations including dnCNVs, de novo single-nucleotide variants (dnSNVs), and indels. Short-read GS was performed on four previously published MdnCNV families for dnSNV analysis. Trio-based rare variant analysis was performed on the newly identified individual and four previously published MdnCNV families to identify potential genetic etiologies contributing to the peri-zygotic genomic instability. Lin semantic similarity scores informed quantitative human phenotype ontology analysis on three MdnCNV families to identify gene(s) driving or contributing to the clinical phenotype. RESULTS In the newly identified MdnCNV case, we revealed eight de novo tandem duplications, each ~1 Mb, with microhomology at 6/8 breakpoint junctions. Enrichment of de novo single-nucleotide variants (SNV; 6/79) and de novo indels (1/12) was found within 4 Mb of the dnCNV genomic regions. An elevated post-zygotic SNV mutation rate was observed in MdnCNV families. Maternal rare variant analyses identified three genes in distinct families that may contribute to the MdnCNV phenomenon. Phenotype analysis suggests that gene(s) within dnCNV regions contribute to the observed proband phenotype in 3/3 cases. CNVs in two cases, a contiguous gene duplication encompassing PMP22 and RAI1 and another duplication affecting NSD1 and SMARCC2, contribute to the clinically observed phenotypic manifestations. CONCLUSIONS Characteristic features of dnCNVs reported here are consistent with a microhomology-mediated break-induced replication (MMBIR)-driven mechanism during the peri-zygotic period. Maternal genetic variants in DNA repair genes potentially contribute to peri-zygotic genomic instability. Variable phenotypic features were observed across a cohort of three MdnCNV probands, and computational quantitative phenotyping revealed that two out of three had evidence for the contribution of more than one genetic locus to the proband's phenotype, supporting the hypothesis of de novo multilocus pathogenic variation (MPV) in those families.
Affiliation(s)
- Haowei Du
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
| | - Angad Jolly
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Medical Scientist Training Program, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Christopher M Grochowski
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
| | - Bo Yuan
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Baylor Genetics Laboratory, Houston, TX, 77021, USA
- Seattle Children's Hospital, Seattle, WA, 98105, USA
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Moez Dawood
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Medical Scientist Training Program, Baylor College of Medicine, Houston, TX, 77030, USA
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Shalini N Jhangiani
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
| | - He Li
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Donna Muzny
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Jawid M Fatih
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
| | - Zeynep Coban-Akdemir
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
| | - Mary Esther Carlin
- Division of Genetics and Metabolism, Department of Pediatrics, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
| | - Angela E Scheuerle
- Division of Genetics and Metabolism, Department of Pediatrics, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
- Division of Genetics Diagnostics, Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
- McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
| | - Karin Witzl
- Clinical Institute of Medical Genetics, University Medical Centre Ljubljana, 1000, Ljubljana, Slovenia
- Medical Faculty, University of Ljubljana, 1000, Ljubljana, Slovenia
| | - Jennifer E Posey
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
| | | | | | - Sissel Juul
- Oxford Nanopore Technologies Inc, New York, NY, 10013, USA
| | - P J Hastings
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Dan L. Duncan Comprehensive Cancer Center, BCM, Houston, TX, 77030, USA
| | - Weimin Bi
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Baylor Genetics Laboratory, Houston, TX, 77021, USA
| | - Richard A Gibbs
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Fritz J Sedlazeck
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
| | - James R Lupski
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA.
- Department of Pediatrics, Baylor College of Medicine, Houston, TX, 77030, USA.
- Texas Children's Hospital, Houston, TX, 77030, USA.
| | - Claudia M B Carvalho
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
- Pacific Northwest Research Institute, 720 Broadway, Seattle, WA, 98122, USA.
| | - Pengfei Liu
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
- Baylor Genetics Laboratory, Houston, TX, 77021, USA.
| |
|
33
|
Shi X, Teng H, Sun Z. An updated overview of experimental and computational approaches to identify non-canonical DNA/RNA structures with emphasis on G-quadruplexes and R-loops. Brief Bioinform 2022; 23:6751149. [PMID: 36208174 PMCID: PMC9677470 DOI: 10.1093/bib/bbac441] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2022] [Revised: 08/22/2022] [Accepted: 09/13/2022] [Indexed: 12/14/2022] Open
Abstract
Multiple types of non-canonical nucleic acid structures play essential roles in DNA recombination and replication, transcription, and genomic instability and have been associated with several human diseases. Thus, an increasing number of experimental and bioinformatics methods have been developed to identify these structures. To date, most reviews have focused on the features of non-canonical DNA/RNA structure formation, experimental approaches to mapping these structures, and the association of these structures with diseases. In addition, two reviews of computational algorithms for the prediction of non-canonical nucleic acid structures have been published. One of these reviews focused only on computational approaches for G4 detection until 2020. The other mainly summarized the computational tools for predicting cruciform, H-DNA and Z-DNA, in which the algorithms discussed were published before 2012. Since then, several experimental and computational methods have been developed. However, a systematic review including the conformation, sequencing mapping methods and computational prediction strategies for these structures has not yet been published. The purpose of this review is to provide an updated overview of conformation, current sequencing technologies and computational identification methods for non-canonical nucleic acid structures, as well as their strengths and weaknesses. We expect that this review will aid in understanding how these structures are characterised and how they contribute to related biological processes and diseases.
Affiliation(s)
- Xiaohui Shi
- Key Laboratory of Clinical Laboratory Diagnosis and Translational Research of Zhejiang Province, The first Affiliated Hospital of WMU; Beijing Institutes of Life Science, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Ouhai District, Wenzhou 325000, China
| | - Huajing Teng
- Department of Radiation Oncology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education) at Peking University Cancer Hospital and Institute, Ouhai District, Wenzhou 325000, China
| | - Zhongsheng Sun
- Corresponding author: Zhongsheng Sun, Key Laboratory of Clinical Laboratory Diagnosis and Translational Research of Zhejiang Province, The 1st Affiliated Hospital of WMU, Nanbaixiang Wenyi Yiyuan Xinyuan District, Ouhai District, Wenzhou 325000, China. E-mail:
| |
|
34
|
Guilbaud G, Murat P, Wilkes HS, Lerner LK, Sale JE, Krude T. Determination of human DNA replication origin position and efficiency reveals principles of initiation zone organisation. Nucleic Acids Res 2022; 50:7436-7450. [PMID: 35801867 PMCID: PMC9303276 DOI: 10.1093/nar/gkac555] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 11/30/2021] [Revised: 06/14/2022] [Accepted: 06/20/2022] [Indexed: 12/16/2022] Open
Abstract
Replication of the human genome initiates within broad zones of ∼150 kb. The extent to which firing of individual DNA replication origins within initiation zones is spatially stochastic or localised at defined sites remains a matter of debate. A thorough characterisation of the dynamic activation of origins within initiation zones is hampered by the lack of a high-resolution map of both their position and efficiency. To address this shortcoming, we describe a modification of initiation site sequencing (ini-seq), based on density substitution. Newly replicated DNA is rendered 'heavy-light' (HL) by incorporation of BrdUTP while unreplicated DNA remains 'light-light' (LL). Replicated HL-DNA is separated from unreplicated LL-DNA by equilibrium density gradient centrifugation, then both fractions are subjected to massive parallel sequencing. This allows precise mapping of 23,905 replication origins simultaneously with an assignment of a replication initiation efficiency score to each. We show that origin firing within early initiation zones is not randomly distributed. Rather, origins are arranged hierarchically with a set of very highly efficient origins marking zone boundaries. We propose that these origins explain much of the early firing activity arising within initiation zones, helping to unify the concept of replication initiation zones with the identification of discrete replication origin sites.
Affiliation(s)
- Guillaume Guilbaud
- Division of Protein and Nucleic Acid Chemistry, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK
| | - Pierre Murat
- Division of Protein and Nucleic Acid Chemistry, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK
| | - Helen S Wilkes
- Department of Zoology, University of Cambridge, Downing Street, Cambridge, CB2 3EJ, UK
| | - Leticia Koch Lerner
- Division of Protein and Nucleic Acid Chemistry, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK
| | - Julian E Sale
- Division of Protein and Nucleic Acid Chemistry, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK
| | - Torsten Krude
- Department of Zoology, University of Cambridge, Downing Street, Cambridge, CB2 3EJ, UK
| |
|
35
|
APOBEC mutagenesis is low in most types of non-B DNA structures. iScience 2022; 25:104535. [PMID: 35754742 PMCID: PMC9213766 DOI: 10.1016/j.isci.2022.104535] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2022] [Revised: 04/22/2022] [Accepted: 06/01/2022] [Indexed: 11/22/2022] Open
Abstract
While somatic mutations are known to be enriched in genome regions with non-canonical DNA secondary structure, the impact of particular mutagens still needs to be elucidated. Here, we demonstrate that in human cancers, the APOBEC mutagenesis is not enriched in direct repeats, mirror repeats, short tandem repeats, and G-quadruplexes, and even decreased below its level in B-DNA for cancer samples with very high APOBEC activity. In contrast, we observe that the APOBEC-induced mutational density is positively associated with APOBEC activity in inverted repeats (cruciform structures), where the impact of cytosine at the 3’-end of the hairpin loop is substantial. Surprisingly, the APOBEC-signature mutation density per TC motif in the single-stranded DNA of a G-quadruplex (G4) is lower than in the four-stranded part of G4 and in B-DNA. The APOBEC mutagenesis, as well as the UV-mutagenesis in melanoma samples, are absent in Z-DNA regions, owing to the depletion of their mutational signature motifs. APOBEC mutagenesis is not enriched in most non-canonical DNA structures. Inverted repeats (cruciform structures) show increased APOBEC mutagenesis. G-quadruplex’s unstructured strand has low APOBEC-induced mutation density. Decrease of APOBEC mutagenesis in non-B DNA possibly associated with PrimPol.
|
36
|
Georgakopoulos-Soares I, Parada GE, Wong HY, Medhi R, Furlan G, Munita R, Miska EA, Kwok CK, Hemberg M. Alternative splicing modulation by G-quadruplexes. Nat Commun 2022; 13:2404. [PMID: 35504902 PMCID: PMC9065059 DOI: 10.1038/s41467-022-30071-7] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 07/23/2019] [Accepted: 03/30/2022] [Indexed: 12/14/2022] Open
Abstract
Alternative splicing is central to metazoan gene regulation, but the regulatory mechanisms are incompletely understood. Here, we show that G-quadruplex (G4) motifs are enriched ~3-fold near splice junctions. The importance of G4s in RNA is emphasised by a higher enrichment for the non-template strand. RNA-seq data from mouse and human neurons reveal an enrichment of G4s at exons that were skipped following depolarisation induced by potassium chloride. We validate the formation of stable RNA G4s for three candidate splice sites by circular dichroism spectroscopy, UV-melting and fluorescence measurements. Moreover, we find that sQTLs are enriched at G4s, and a minigene experiment provides further support for their role in promoting exon inclusion. Analysis of >1,800 high-throughput experiments reveals multiple RNA binding proteins associated with G4s. Finally, exploration of G4 motifs across eleven species shows strong enrichment at splice sites in mammals and birds, suggesting an evolutionarily conserved splice regulatory mechanism. Here the authors show that G-quadruplexes, non-canonical DNA/RNA structures, can have a direct impact on alternative splicing and that binding of splicing regulators is affected by their presence.
Affiliation(s)
- Ilias Georgakopoulos-Soares
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, 94158, USA
| | - Guillermo E Parada
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Wellcome Cancer Research UK Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QN, UK
- Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, M5S 3E1, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, M5A 1A8, Canada
| | - Hei Yuen Wong
- Department of Chemistry and State Key Laboratory of Marine Pollution, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China
| | - Ragini Medhi
- Wellcome Cancer Research UK Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QN, UK
- Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK
| | - Giulia Furlan
- Wellcome Cancer Research UK Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QN, UK
- Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK
| | - Roberto Munita
- Division of Molecular Hematology, Department of Laboratory Medicine, Lund Stem Cell Center, Faculty of Medicine, Lund University, Lund, Sweden
| | - Eric A Miska
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Wellcome Cancer Research UK Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QN, UK
- Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK
| | - Chun Kit Kwok
- Department of Chemistry and State Key Laboratory of Marine Pollution, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China
- Shenzhen Research Institute of City University of Hong Kong, Shenzhen, China
| | - Martin Hemberg
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK.
- Wellcome Cancer Research UK Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QN, UK.
- Evergrande Center for Immunologic Diseases, Harvard Medical School and Brigham and Women's Hospital, Boston, MA, 02115, USA.
| |
|
37
|
Georgakopoulos-Soares I, Victorino J, Parada GE, Agarwal V, Zhao J, Wong HY, Umar MI, Elor O, Muhwezi A, An JY, Sanders SJ, Kwok CK, Inoue F, Hemberg M, Ahituv N. High-throughput characterization of the role of non-B DNA motifs on promoter function. Cell Genomics 2022; 2:100111. [PMID: 35573091 PMCID: PMC9105345 DOI: 10.1016/j.xgen.2022.100111] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 03/23/2021] [Revised: 10/21/2021] [Accepted: 02/18/2022] [Indexed: 12/24/2022]
Abstract
Alternative DNA conformations, termed non-B DNA structures, can affect transcription, but the underlying mechanisms and their functional impact have not been systematically characterized. Here, we used computational genomic analyses coupled with massively parallel reporter assays (MPRAs) to show that certain non-B DNA structures have a substantial effect on gene expression. Genomic analyses found that non-B DNA structures at promoters harbor an excess of germline variants. Analysis of multiple MPRAs, including a promoter library specifically designed to perturb non-B DNA structures, functionally validated that Z-DNA can significantly affect promoter activity. We also observed that biophysical properties of non-B DNA motifs, such as the length of Z-DNA motifs and the orientation of G-quadruplex structures relative to transcriptional direction, have a significant effect on promoter activity. Combined, their higher mutation rate and functional effect on transcription implicate a subset of non-B DNA motifs as major drivers of human gene-expression-associated phenotypes.
Affiliation(s)
- Ilias Georgakopoulos-Soares
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
| | - Jesus Victorino
- Centro Nacional de Investigaciones Cardiovasculares Carlos III (CNIC), 28029 Madrid, Spain
- Departamento de Bioquímica, Facultad de Medicina, Universidad Autónoma de Madrid (UAM), 28029 Madrid, Spain
| | - Guillermo E. Parada
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK
- Wellcome Trust Cancer Research UK Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge CB2 1QN, UK
| | | | - Jingjing Zhao
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
| | - Hei Yuen Wong
- Department of Chemistry and State Key Laboratory of Marine Pollution, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China
| | - Mubarak Ishaq Umar
- Department of Chemistry and State Key Laboratory of Marine Pollution, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China
| | - Orry Elor
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
| | - Allan Muhwezi
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK
| | - Joon-Yong An
- Department of Psychiatry, UCSF Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA, USA
- School of Biosystem and Biomedical Science, College of Health Science, Korea University, Seoul, Republic of Korea
| | - Stephan J. Sanders
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
- Department of Psychiatry, UCSF Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA, USA
| | - Chun Kit Kwok
- Department of Chemistry and State Key Laboratory of Marine Pollution, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China
- Shenzhen Research Institute of City University of Hong Kong, Shenzhen, China
| | - Fumitaka Inoue
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
| | - Martin Hemberg
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK
- Wellcome Trust Cancer Research UK Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge CB2 1QN, UK
| | - Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
| |
|
38
|
Vanaja A, Yella VR. Delineation of the DNA Structural Features of Eukaryotic Core Promoter Classes. ACS Omega 2022; 7:5657-5669. [PMID: 35224327 PMCID: PMC8867553 DOI: 10.1021/acsomega.1c04603] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/23/2021] [Accepted: 01/27/2022] [Indexed: 05/02/2023]
Abstract
Eukaryotic transcription is orchestrated from a DNA region termed the core promoter. Diverse and precisely positioned core promoter signals, viz., TATA-box, Inr, BREs, and Pause Button, are associated with a subset of genes and regulate their spatiotemporal expression. However, the core promoter architecture linked with these signals has not been investigated exhaustively for several species. In this study, we attempted to envisage the adaptive binding landscape of the transcription initiation machinery as a function of DNA structure. To this end, we deployed a set of k-mer based DNA structural estimates and regular expression models derived from experiments, molecular dynamic simulations, and theoretical frameworks, and high-throughput promoter data sets retrieved from the eukaryotic promoter database. We categorized protein-coding gene core promoters based on characteristic motifs at precise locations and analyzed the B-DNA structural properties and non-B-DNA structural motifs for 15 different eukaryotic genomes. We observed that Inr, BREd, and no-motif classes display common patterns of DNA sequence and structural environment. TATA-containing, BREu, and Pause Button classes show a deviant behavior, with the TATA class displaying varied axial and twisting flexibility while BREu and Pause Button leaned toward G-quadruplex motif enrichment. Intriguingly, DNA meltability and shape signals are conserved irrespective of the presence or absence of distinct core promoter motifs in the majority of species. Altogether, here we delineated the conserved DNA structural signals associated with several promoter classes that may contribute to the chromatin configuration, orchestration of transcription machinery, and DNA duplex melting during the transcription process.
Affiliation(s)
- Akkinepally Vanaja
- Department of Biotechnology, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur 522502, Andhra Pradesh, India
- KL College of Pharmacy, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur 522502, Andhra Pradesh, India
| | - Venkata Rajesh Yella
- Department of Biotechnology, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur 522502, Andhra Pradesh, India
- Tel: +91-863-2399999, Extn-1021. Website: https://www.kluniversity.in/bt/faculty-list.aspx
| |
|
39
|
Lyu R, Wu T, Zhu AC, West-Szymanski DC, Weng X, Chen M, He C. KAS-seq: genome-wide sequencing of single-stranded DNA by N3-kethoxal-assisted labeling. Nat Protoc 2022; 17:402-420. [PMID: 35013616 PMCID: PMC8923001 DOI: 10.1038/s41596-021-00647-6] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 04/09/2021] [Accepted: 09/22/2021] [Indexed: 02/03/2023]
Abstract
Transcription and its dynamics are crucial for gene expression regulation. However, very few methods can directly read out transcriptional activity with low-input material and high temporal resolution. This protocol describes KAS-seq, a robust and sensitive approach for capturing genome-wide single-stranded DNA (ssDNA) profiles using N3-kethoxal-assisted labeling. We developed N3-kethoxal, an azido derivative of kethoxal that reacts with deoxyguanosine bases of ssDNA in live cells within 5-10 min at 37 °C, allowing the capture of dynamic changes. Downstream biotinylation of labeled DNA occurs via copper-free click chemistry. Altogether, the KAS-seq procedure involves N3-kethoxal labeling, DNA isolation, biotinylation, fragmentation, affinity pull-down, library preparation, sequencing and bioinformatics analysis. The pre-library construction labeling and enrichment can be completed in as little as 3-4 h and is applicable to both animal tissue and as few as 1,000 cultured cells. Our recent study shows that ssDNA signals measured by KAS-seq simultaneously reveal the dynamics of transcriptionally engaged RNA polymerase (Pol) II, transcribing enhancers, RNA Pol I and Pol III activities and potentially non-canonical DNA structures with high analytical sensitivity. In addition to the experimental protocol, we also introduce here KAS-pipe, a user-friendly integrative data analysis pipeline for KAS-seq.
Affiliation(s)
- Ruitu Lyu
- Department of Chemistry, Department of Biochemistry and Molecular Biology, Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL, USA
- Howard Hughes Medical Institute, The University of Chicago, Chicago, IL, USA
| | - Tong Wu
- Department of Chemistry, Department of Biochemistry and Molecular Biology, Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL, USA
- Howard Hughes Medical Institute, The University of Chicago, Chicago, IL, USA
| | - Allen C Zhu
- Department of Chemistry, Department of Biochemistry and Molecular Biology, Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL, USA
- Howard Hughes Medical Institute, The University of Chicago, Chicago, IL, USA
- Medical Scientist Training Program, The University of Chicago, Chicago, IL, USA
| | | | - Xiaocheng Weng
- College of Chemistry and Molecular Sciences, Hubei Province Key Laboratory of Allergy and Immunology, Wuhan University, Wuhan, Hubei, China
| | - Mengjie Chen
- Department of Medicine, The University of Chicago, Chicago, IL, USA
- Department of Human Genetics, The University of Chicago, Chicago, IL, USA
| | - Chuan He
- Department of Chemistry, Department of Biochemistry and Molecular Biology, Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL, USA.
- Howard Hughes Medical Institute, The University of Chicago, Chicago, IL, USA.
| |
|
40
|
Stefos GC, Theodorou G, Politis I. Genomic landscape, polymorphism and possible LINE-associated delivery of G-quadruplex motifs in the bovine genes. Genomics 2022; 114:110272. [PMID: 35092818 DOI: 10.1016/j.ygeno.2022.110272] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/02/2021] [Revised: 01/12/2022] [Accepted: 01/18/2022] [Indexed: 12/29/2022]
Abstract
G-quadruplex structures are non-B DNA structures that occur in regions carrying short runs of guanines. They are implicated in several biological processes, including transcription, translation, replication and telomere maintenance, as well as in several pathological conditions such as cancer, and have thus gained the attention of the scientific community. The rise of the -omics era has significantly affected G-quadruplex research, and genome-wide characterization of G-quadruplexes has become a necessary first step towards applying genomics approaches to their study. While a considerable number of studies have examined, genome-wide, the DNA motifs with potential to form G-quadruplexes (G4-motifs) in human and several model organisms, similar studies are entirely absent for livestock animals. The objectives of the present study were to provide a detailed characterization of the distribution and properties of bovine genic G4-motifs and to suggest a possible mechanism for the delivery of G4 motifs into genes. Our data indicate that the distribution of G4-motifs within bovine genes and the annotation of these genes to Gene Ontology terms are similar to what has already been shown for other organisms. Investigation of their structural characteristics and polymorphism shows that the overall stability of the putative quadruplex structures is in line with the current notion in the G4 field. As in human, the bovine G4-motifs are overrepresented in specific LINE repeat elements, the L1_BTs in the case of cattle. We highlight the potential role of these elements as vehicles for delivery of G4 motifs into the introns of bovine genes. Lastly, a basis appears to exist for connecting traits of agricultural importance to the genetic variation of G4 motifs; thus, the value of cattle as a new model organism for G4-related genetic studies might be worth investigating.
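As an illustration of what a genome-wide scan for G4-motifs can look like, the sketch below uses a commonly applied regular-expression definition of a putative quadruplex: four runs of at least three guanines separated by loops of 1 to 7 nucleotides. The abstract does not state which motif definition or tool the study used, so the pattern, the loop limits, and the function names `reverse_complement` and `find_g4_motifs` are assumptions made only for this example.

```python
import re

# Assumed motif definition (not taken from the study): four G-runs of
# length >= 3 separated by three loops of 1-7 nt each.
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_g4_motifs(seq):
    """Return sorted (start, end, strand) tuples of non-overlapping matches.

    Coordinates are 0-based, half-open, and always given on the forward
    strand; '-' hits are G-rich on the reverse strand (C-rich as written).
    """
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), m.end(), "+") for m in G4_PATTERN.finditer(seq)]
    for m in G4_PATTERN.finditer(reverse_complement(seq)):
        # Map reverse-strand coordinates back onto the forward strand.
        hits.append((n - m.end(), n - m.start(), "-"))
    return sorted(hits)


if __name__ == "__main__":
    print(find_g4_motifs("TTGGGAGGGTGGGAGGGTT"))
```

Because `finditer` reports non-overlapping matches only, densely G-rich regions are collapsed into fewer hits than an overlap-aware scanner would report; which behaviour is appropriate depends on the analysis.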
Affiliation(s)
- Georgios C Stefos
- Agricultural University of Athens, Department of Animal Science, Laboratory of Animal Breeding & Husbandry, 75 Iera Odos, 118 55, Athens, Greece.
| | - Georgios Theodorou
- Agricultural University of Athens, Department of Animal Science, Laboratory of Animal Breeding & Husbandry, 75 Iera Odos, 118 55, Athens, Greece.
| | - Ioannis Politis
- Agricultural University of Athens, Department of Animal Science, Laboratory of Animal Breeding & Husbandry, 75 Iera Odos, 118 55, Athens, Greece
| |
|
41
|
Searching for New Z-DNA/Z-RNA Binding Proteins Based on Structural Similarity to Experimentally Validated Zα Domain. Int J Mol Sci 2022; 23:ijms23020768. [PMID: 35054954 PMCID: PMC8775963 DOI: 10.3390/ijms23020768] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 11/30/2021] [Revised: 01/03/2022] [Accepted: 01/05/2022] [Indexed: 11/17/2022] Open
Abstract
Z-DNA and Z-RNA are functionally important left-handed structures of nucleic acids, which play a significant role in several molecular and biological processes including DNA replication, gene expression regulation and viral nucleic acid sensing. Most proteins that have been proven to interact with Z-DNA/Z-RNA contain the so-called Zα domain, which is structurally well conserved. To date, only eight proteins with Zα domain have been described within a few organisms (including human, mouse, Danio rerio, Trypanosoma brucei and some viruses). Therefore, this paper aimed to search for new Z-DNA/Z-RNA binding proteins in the complete PDB structures database and from the AlphaFold2 protein models. A structure-based similarity search found 14 proteins with highly similar Zα domain structure in experimentally-defined proteins and 185 proteins with a putative Zα domain using the AlphaFold2 models. Structure-based alignment and molecular docking confirmed high functional conservation of amino acids involved in Z-DNA/Z-RNA, suggesting that Z-DNA/Z-RNA recognition may play an important role in a variety of cellular processes.
|
42
|
Bacolla A, Tainer JA. Robust Computational Approaches to Defining Insights on the Interface of DNA Repair with Replication and Transcription in Cancer. Methods Mol Biol 2022; 2444:1-13. [PMID: 35290628 PMCID: PMC9377048 DOI: 10.1007/978-1-0716-2063-2_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2023]
Abstract
The massive amount of experimental DNA and RNA sequence information provides an encyclopedia for cell biology that requires computational tools for efficient interpretation. The ability to write and apply simple computing scripts propels the investigator beyond the boundaries of online analysis tools to more broadly interrogate laboratory experimental data and to integrate them with all available datasets to test and challenge hypotheses. Here we describe robust prototypic bash and C++ scripts with metrics and methods for validation that we have made publicly available to address the roles of non-B DNA-forming motifs in eliciting genetic instability and to query The Cancer Genome Atlas. Importantly, the methods presented provide practical data interpretation tools to examine fundamental relationships and to enable insights and correlations between alterations in gene expression patterns and patient outcome. The exemplary source codes described are simple and can be efficiently modified, elaborated, and applied to other relationships and areas of investigation.
Affiliation(s)
- Albino Bacolla: Departments of Cancer Biology and of Molecular and Cellular Oncology, The University of Texas M.D. Anderson Cancer Center, Houston, TX, USA
- John A Tainer: Departments of Cancer Biology and of Molecular and Cellular Oncology, The University of Texas M.D. Anderson Cancer Center, Houston, TX, USA
|
43
|
Qi M, Stenson PD, Ball EV, Tainer JA, Bacolla A, Kehrer-Sawatzki H, Cooper DN, Zhao H. Distinct sequence features underlie microdeletions and gross deletions in the human genome. Hum Mutat 2021; 43:328-346. [PMID: 34918412 PMCID: PMC9069542 DOI: 10.1002/humu.24314] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/10/2021] [Revised: 11/02/2021] [Accepted: 12/14/2021] [Indexed: 11/18/2022]
Abstract
Microdeletions and gross deletions are important causes (~20%) of human inherited disease and their genomic locations are strongly influenced by the local DNA sequence environment. This notwithstanding, no study has systematically examined their underlying generative mechanisms. Here, we obtained 42,098 pathogenic microdeletions and gross deletions from the Human Gene Mutation Database (HGMD) that together form a continuum of germline deletions ranging in size from 1 to 28,394,429 bp. We analyzed the DNA sequence within 1 kb of the breakpoint junctions and found that the frequencies of non-B DNA-forming repeats, GC-content, and the presence of seven of 78 specific sequence motifs in the vicinity of pathogenic deletions correlated with deletion length for deletions of length ≤30 bp. Further, we found that the presence of DR, GQ, and STR repeats is important for the formation of longer deletions (>30 bp) but not for the formation of shorter deletions (≤30 bp), while significantly (χ², p < 2E-16) more microhomologies were identified flanking short deletions than long deletions (>30 bp). We provide evidence to support a functional distinction between microdeletions and gross deletions. Finally, we propose that a deletion length cut-off of 25-30 bp may serve as an objective means to functionally distinguish microdeletions from gross deletions.
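As an illustration of what "microhomology flanking a deletion" means in practice, the following is a minimal Python sketch. It is not the authors' pipeline: the function name `microhomology_length`, the 0-based half-open coordinates, and the rule of capping the result at the deletion length are all assumptions made for this example. It counts how far the deleted segment could be shifted right or left along the reference without changing the resulting junction sequence.

```python
def microhomology_length(ref: str, start: int, end: int) -> int:
    """Microhomology (in bp) at a deletion that removes ref[start:end].

    Coordinates are 0-based, half-open (an assumption of this sketch).
    The value is the number of positions by which the deletion could be
    shifted without altering the post-deletion sequence.
    """
    if not 0 <= start < end <= len(ref):
        raise ValueError("deletion must lie within the reference")
    del_len = end - start

    # Rightward shift: the first deleted bases repeat just after the deletion.
    right = 0
    while (right < del_len and end + right < len(ref)
           and ref[start + right] == ref[end + right]):
        right += 1

    # Leftward shift: bases just before the deletion repeat at its end.
    left = 0
    while (left < del_len - right and start - left - 1 >= 0
           and ref[start - left - 1] == ref[end - left - 1]):
        left += 1

    return left + right


if __name__ == "__main__":
    # "ACT" occurs at the start of the deleted segment and again right after it.
    print(microhomology_length("GGGACTCCCCACTTTT", 3, 10))
```

Applied to a set of deletions grouped by a length cut-off such as 30 bp, per-deletion values like this could be tabulated and compared with a χ² test, which is the kind of comparison the abstract reports.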
Affiliation(s)
- Mengling Qi: Department of Medical Research Center, Sun Yat-sen Memorial Hospital; Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangzhou, China
- Peter D Stenson: Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
- Edward V Ball: Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
- John A Tainer: Departments of Cancer Biology and of Molecular and Cellular Oncology, University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- Albino Bacolla: Departments of Cancer Biology and of Molecular and Cellular Oncology, University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- David N Cooper: Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
- Huiying Zhao: Department of Medical Research Center, Sun Yat-sen Memorial Hospital; Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangzhou, China
|
44
|
Medium levels of transcription and replication related chromosomal instability are associated with poor clinical outcome. Sci Rep 2021; 11:23429. [PMID: 34873180 PMCID: PMC8648741 DOI: 10.1038/s41598-021-02787-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2021] [Accepted: 11/08/2021] [Indexed: 11/09/2022] [Open access]
Abstract
Genomic instability (GI) influences treatment efficacy and resistance, and an accurate measure of it is lacking. Current measures of GI are based on counts of specific structural variation (SV) and mutational signatures. Here, we present a holistic approach to measuring GI based on the quantification of the steady-state equilibrium between DNA damage and repair as assessed by the residual breakpoints (BP) remaining after repair, irrespective of SV type. We use the notion of Hscore, a BP "hotspotness" magnitude scale, to measure the propensity of genomic structural or functional DNA elements to break more than expected by chance. We then derived new measures of transcription- and replication-associated GI that we call iTRAC (transcription-associated chromosomal instability index) and iRACIN (replication-associated chromosomal instability index). We show that iTRAC and iRACIN are predictive of metastatic relapse in Leiomyosarcoma (LMS) and that they may be combined to form a new classifier called MAGIC (mixed transcription- and replication-associated genomic instability classifier). MAGIC outperforms the gold standards FNCLCC and CINSARC in stratifying metastatic risk in LMS. Furthermore, iTRAC stratifies chemotherapeutic response in LMS. We finally show that this approach is applicable to other cancers.
|
45
|
Translesion polymerase eta both facilitates DNA replication and promotes increased human genetic variation at common fragile sites. Proc Natl Acad Sci U S A 2021; 118:2106477118. [PMID: 34815340 PMCID: PMC8640788 DOI: 10.1073/pnas.2106477118] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Accepted: 10/08/2021] [Indexed: 01/23/2023] [Open access]
Abstract
Common fragile sites (CFSs) are difficult-to-replicate genomic regions that form gaps and breaks on metaphase chromosomes under replication stress. They are hotspots for chromosomal instability in cancer. Repetitive sequences located at CFS loci are inefficiently copied by replicative DNA polymerase (Pol) delta. However, translesion synthesis Pol eta has been shown to efficiently polymerize CFS-associated repetitive sequences in vitro and facilitate CFS stability by a mechanism that is not fully understood. Here, by locus-specific, single-molecule replication analysis, we identified a crucial role for Pol eta (encoded by the gene POLH) in the in vivo replication of CFSs, even without exogenous stress. We find that Pol eta deficiency induces replication pausing, increases initiation events, and alters the direction of replication-fork progression at CFS-FRA16D in both lymphoblasts and fibroblasts. Furthermore, certain replication pause sites at CFS-FRA16D were associated with the presence of non-B DNA-forming motifs, implying that non-B DNA structures could increase replication hindrance in the absence of Pol eta. Further, in Pol eta-deficient fibroblasts, there was an increase in fork pausing at fibroblast-specific CFSs. Importantly, while not all pause sites were associated with non-B DNA structures, they were embedded within regions of increased genetic variation in the healthy human population, with mutational spectra consistent with Pol eta activity. From these findings, we propose that Pol eta replicating through CFSs may result in genetic variations found in the human population at these sites.
|
46
|
He YH, Yeh MH, Chen HF, Wang TS, Wong RH, Wei YL, Huynh TK, Hu DW, Cheng FJ, Chen JY, Hu SW, Huang CC, Chen Y, Yu J, Cheng WC, Shen PC, Liu LC, Huang CH, Chang YJ, Huang WC. ERα determines the chemo-resistant function of mutant p53 involving the switch between lincRNA-p21 and DDB2 expressions. Mol Ther Nucleic Acids 2021; 25:536-553. [PMID: 34589276 PMCID: PMC8463322 DOI: 10.1016/j.omtn.2021.07.022] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/12/2021] [Accepted: 07/30/2021] [Indexed: 12/16/2022]
Abstract
Mutant p53 (mutp53) commonly loses its DNA binding affinity to p53 response elements (p53REs) and fails to induce apoptosis fully. However, the p53 mutation does not predict chemoresistance in all subtypes of breast cancers, and the critical determinants remain to be identified. In this study, mutp53 was found to mediate chemotherapy-induced long intergenic noncoding RNA-p21 (lincRNA-p21) expression by targeting the G-quadruplex structure rather than the p53RE on its promoter to promote chemosensitivity. However, estrogen receptor alpha (ERα) suppressed mutp53-mediated lincRNA-p21 expression by hijacking mutp53 to upregulate damaged DNA binding protein 2 (DDB2) transcription for subsequent DNA repair and chemoresistance. Levels of lincRNA-p21 positively correlated with the clinical responses of breast cancer patients to neoadjuvant chemotherapy and had an inverse correlation with the ER status and DDB2 level. In contrast, the carboplatin-induced DDB2 expression was higher in ER-positive breast tumor tissues. These results demonstrated that ER status determines the oncogenic function of mutp53 in chemoresistance by switching its target gene preference from lincRNA-p21 to DDB2 and suggest that induction of lincRNA-p21 and targeting DDB2 would be effective strategies to increase the chemosensitivity of mutp53 breast cancer patients.
Affiliation(s)
- Yu-Hao He: The PhD Program for Cancer Biology and Drug Discovery, China Medical University and Academia Sinica, Taichung 40402, Taiwan; Center for Molecular Medicine, China Medical University Hospital, Taichung 40402, Taiwan
- Ming-Hsin Yeh: Department of Surgery, Chung Shan Medical University Hospital, Taichung 40201, Taiwan; Institute of Medicine, School of Medicine, Chung Shan Medical University, Taichung 40201, Taiwan
- Hsiao-Fan Chen: Center for Molecular Medicine, China Medical University Hospital, Taichung 40402, Taiwan; Drug Development Center, China Medical University, Taichung 40402, Taiwan
- Tsu-Shing Wang: Department of Biomedical Sciences, Chung Shan Medical University, Taichung 40201, Taiwan
- Ruey-Hong Wong: Department of Public Health, Chung Shan Medical University, Taichung 40201, Taiwan; Department of Occupational Medicine, Chung Shan Medical University Hospital, Taichung 40201, Taiwan
- Ya-Ling Wei: Center for Molecular Medicine, China Medical University Hospital, Taichung 40402, Taiwan
- Thanh Kieu Huynh: Center for Molecular Medicine, China Medical University Hospital, Taichung 40402, Taiwan; Graduate Institute of Biomedical Sciences, China Medical University, Taichung 40402, Taiwan
- Dai-Wei Hu: Center for Molecular Medicine, China Medical University Hospital, Taichung 40402, Taiwan; Graduate Institute of Biomedical Sciences, China Medical University, Taichung 40402, Taiwan
- Fang-Ju Cheng: Center for Molecular Medicine, China Medical University Hospital, Taichung 40402, Taiwan; Graduate Institute of Basic Medical Sciences, China Medical University, Taichung 40402, Taiwan
- Jhen-Yu Chen: Center for Molecular Medicine, China Medical University Hospital, Taichung 40402, Taiwan
- Shu-Wei Hu: Center for Molecular Medicine, China Medical University Hospital, Taichung 40402, Taiwan; Graduate Institute of Biomedical Sciences, China Medical University, Taichung 40402, Taiwan
- Chia-Chen Huang: Department of Public Health, Chung Shan Medical University, Taichung 40201, Taiwan
- Yeh Chen: Drug Development Center, China Medical University, Taichung 40402, Taiwan; Institute of New Drug Development, China Medical University, Taichung 40402, Taiwan
- Jiaxin Yu: AI Innovation Center, China Medical University Hospital, Taiwan 40402, Taiwan
- Wei-Chung Cheng: The PhD Program for Cancer Biology and Drug Discovery, China Medical University and Academia Sinica, Taichung 40402, Taiwan; Research Center for Cancer Biology, China Medical University, Taichung 40402, Taiwan
- Pei-Chun Shen: Research Center for Cancer Biology, China Medical University, Taichung 40402, Taiwan
- Liang-Chih Liu: Division of Breast Surgery, China Medical University Hospital, Taichung 40402, Taiwan
- Chih-Hao Huang: Division of Breast Surgery, China Medical University Hospital, Taichung 40402, Taiwan
- Ya-Jen Chang: The PhD Program for Cancer Biology and Drug Discovery, China Medical University and Academia Sinica, Taichung 40402, Taiwan; Institute of Biomedical Sciences, Academia Sinica, Taipei 11529, Taiwan
- Wei-Chien Huang: The PhD Program for Cancer Biology and Drug Discovery, China Medical University and Academia Sinica, Taichung 40402, Taiwan; Center for Molecular Medicine, China Medical University Hospital, Taichung 40402, Taiwan; Drug Development Center, China Medical University, Taichung 40402, Taiwan; Graduate Institute of Biomedical Sciences, China Medical University, Taichung 40402, Taiwan; Research Center for Cancer Biology, China Medical University, Taichung 40402, Taiwan; Department of Medical Laboratory Science and Biotechnology, Asia University, Taichung 41354, Taiwan
|
47
|
Dahal S, Siddiqua H, Katapadi VK, Iyer D, Raghavan SC. Characterization of G4 DNA formation in mitochondrial DNA and their potential role in mitochondrial genome instability. FEBS J 2021; 289:163-182. [PMID: 34228888 DOI: 10.1111/febs.16113] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 04/06/2021] [Revised: 05/29/2021] [Accepted: 07/06/2021] [Indexed: 12/16/2022]
Abstract
Mitochondria possess their own genome, which can be replicated independently of nuclear DNA. Mitochondria, being the powerhouse of the cell, produce reactive oxygen species, due to which the mitochondrial genome is frequently exposed to oxidative damage. Previous studies have demonstrated an association of mitochondrial deletions with aging and human disorders. Many of these deletions were present adjacent to non-B DNA structures. Thus, we investigate noncanonical structures associated with instability in the mitochondrial genome. In silico studies revealed the presence of >100 G-quadruplex motifs (of which 5 have the potential to form 3-plate G4 DNA), 23 inverted repeats, and 3 mirror repeats in the mitochondrial DNA (mtDNA). Further analysis revealed that among the deletion breakpoints from patients with mitochondrial disorders, the majority are located at G4 DNA motifs. Interestingly, ~50% of the deletions were at base-pair positions 8271-8281, ~35% were due to deletion at 12362-12384, and ~12% due to deletion at 15516-15545. Formation of 3-plate G-quadruplex DNA structures at mitochondrial fragile regions was characterized using electromobility shift assay, circular dichroism (CD), and Taq polymerase stop assay. All 5 regions could fold into both intramolecular and intermolecular G-quadruplex structures in a KCl-dependent manner. G4 DNA formation was in parallel orientation, which was abolished in the presence of LiCl. The formation of G4 DNA affected both replication and transcription. Finally, immunolocalization of BG4 with MitoTracker confirmed the formation of G-quadruplexes in the mitochondrial genome. Thus, we characterize the formation of 5 different G-quadruplex structures in the human mitochondrial genome, which may contribute toward formation of mitochondrial deletions.
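The abstract does not say which tool or pattern was used for the in silico G-quadruplex search. As a hedged illustration only, the sketch below uses the widely used regular-expression definition of a putative quadruplex sequence: four runs of at least three guanines separated by loops of 1 to 7 nucleotides. The loop bounds, the function names `find_g4_motifs` and `reverse_complement`, and the reading of "3-plate" as runs of three guanines are assumptions of this example, not details taken from the paper.

```python
import re

# Assumed pattern: four G-runs (>=3 G each) separated by loops of 1-7 nt.
G4_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_g4_motifs(seq: str):
    """Return sorted (start, end, strand, motif) tuples for putative G4 motifs.

    Coordinates are 0-based, half-open, and always given on the forward
    strand. Only non-overlapping matches per strand are reported.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in G4_PATTERN.finditer(s):
            start, end = m.span()
            if strand == "-":
                # Map reverse-strand coordinates back to the forward strand.
                start, end = len(seq) - end, len(seq) - start
            hits.append((start, end, strand, m.group()))
    return sorted(hits)


if __name__ == "__main__":
    for hit in find_g4_motifs("TTGGGAGGGTGGGAGGGAA"):
        print(hit)
```

A pattern match only indicates the potential to fold; as the abstract describes, structure formation itself was assessed experimentally (gel shift, CD, and polymerase stop assays).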
Affiliation(s)
- Sumedha Dahal: Department of Biochemistry, Indian Institute of Science, Bangalore, India
- Humaira Siddiqua: Department of Biochemistry, Indian Institute of Science, Bangalore, India
- Vijeth K Katapadi: Department of Biochemistry, Indian Institute of Science, Bangalore, India
- Divyaanka Iyer: Department of Biochemistry, Indian Institute of Science, Bangalore, India
- Sathees C Raghavan: Department of Biochemistry, Indian Institute of Science, Bangalore, India
|
48
|
Radiotherapy is associated with a deletion signature that contributes to poor outcomes in patients with cancer. Nat Genet 2021; 53:1088-1096. [PMID: 34045764 PMCID: PMC8483261 DOI: 10.1038/s41588-021-00874-3] [Citation(s) in RCA: 76] [Impact Index Per Article: 25.3] [Received: 09/18/2020] [Accepted: 04/21/2021] [Indexed: 02/04/2023]
Abstract
Ionizing radiation causes DNA damage and is a mainstay for cancer treatment, but understanding of its genomic impact is limited. We analyzed mutational spectra following radiotherapy in 190 paired primary and recurrent gliomas from the Glioma Longitudinal Analysis Consortium and 3,693 post-treatment metastatic tumors from the Hartwig Medical Foundation. We identified radiotherapy-associated significant increases in the burden of small deletions (5-15 bp) and large deletions (20+ bp to chromosome-arm length). Small deletions had a larger span size, lacked breakpoint microhomology, and were genomically more dispersed when compared to pre-existing deletions and deletions in non-irradiated tumors. Mutational signature analysis implicated classical non-homologous end-joining-mediated DNA damage repair and APOBEC mutagenesis following radiotherapy. A high radiation-associated deletion burden was associated with worse clinical outcomes, suggesting that effective repair of radiation-induced DNA damage is detrimental to patient survival. These results may be leveraged to predict sensitivity to radiation therapy in recurrent cancer.
|
49
|
Dhaka B, Sabarinathan R. Differential chromatin accessibility landscape of gain-of-function mutant p53 tumours. BMC Cancer 2021; 21:669. [PMID: 34090364 PMCID: PMC8180165 DOI: 10.1186/s12885-021-08362-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/13/2020] [Accepted: 05/13/2021] [Indexed: 11/10/2022] [Open access]
Abstract
BACKGROUND Mutations in TP53 not only affect its tumour suppressor activity but also exert oncogenic gain-of-function activity. While the genome-wide mutant p53 binding sites have been identified in cancer cell lines, the chromatin accessibility landscape driven by mutant p53 in primary tumours is unknown. Here, we leveraged the chromatin accessibility data of primary tumours from The Cancer Genome Atlas (TCGA) to identify differentially accessible regions in mutant p53 tumours compared to wild-type p53 tumours, especially in breast and colon cancers. RESULTS We identified 1587 lost and 984 gained accessible chromatin regions in breast, and 1143 lost and 640 gained regions in colon cancers. However, fewer than half of those regions in both cancer types contain sequence motifs for wild-type or mutant p53 binding. The remaining regions showed enrichment for master transcriptional regulators, such as FOX-family TFs and NF-kB in lost and SMAD and KLF TFs in gained regions of breast. In colon, ATF3 and FOS/JUN TFs were enriched in lost, and CDX family TFs and HNF4A in gained regions. By integrating the gene expression data, we identified known and novel target genes regulated by the mutant p53. CONCLUSION This study reveals the direct and indirect mechanisms by which gain-of-function mutant p53 targets the chromatin and subsequent gene expression patterns in a tumour-type specific manner. This furthers our understanding of the impact of mutant p53 in cancer development.
Affiliation(s)
- Bhavya Dhaka: National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bengaluru, 560065, India
- Radhakrishnan Sabarinathan: National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bengaluru, 560065, India
|
50
|
Gajos M, Jasnovidova O, van Bömmel A, Freier S, Vingron M, Mayer A. Conserved DNA sequence features underlie pervasive RNA polymerase pausing. Nucleic Acids Res 2021; 49:4402-4420. [PMID: 33788942 PMCID: PMC8096220 DOI: 10.1093/nar/gkab208] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 12/11/2020] [Revised: 03/05/2021] [Accepted: 03/15/2021] [Indexed: 12/17/2022] [Open access]
Abstract
Pausing of transcribing RNA polymerase is regulated and creates opportunities to control gene expression. Research in metazoans has so far mainly focused on RNA polymerase II (Pol II) promoter-proximal pausing, leaving the pervasive nature of pausing and its regulatory potential in mammalian cells unclear. Here, we developed a pause-detecting algorithm (PDA) for nucleotide-resolution occupancy data and a new native elongating transcript sequencing approach, termed nested NET-seq, that strongly reduces artifactual peaks commonly misinterpreted as pausing sites. Leveraging PDA and nested NET-seq reveals widespread genome-wide Pol II pausing at single-nucleotide resolution in human cells. Notably, the majority of Pol II pauses occur outside of promoter-proximal gene regions, primarily along the gene body of transcribed genes. Sequence analysis combined with machine learning modeling reveals DNA sequence properties underlying widespread transcriptional pausing, including a new pause motif. Interestingly, key sequence determinants of RNA polymerase pausing are conserved between human cells and bacteria. These studies indicate pervasive sequence-induced transcriptional pausing in human cells, and the knowledge of exact pause locations implies potential functional roles in gene expression.
Affiliation(s)
- Martyna Gajos: Otto-Warburg-Laboratory, Max Planck Institute for Molecular Genetics, Berlin 14195, Germany; Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin 14195, Germany
- Olga Jasnovidova: Otto-Warburg-Laboratory, Max Planck Institute for Molecular Genetics, Berlin 14195, Germany
- Alena van Bömmel: Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin 14195, Germany; Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin 14195, Germany
- Susanne Freier: Otto-Warburg-Laboratory, Max Planck Institute for Molecular Genetics, Berlin 14195, Germany
- Martin Vingron: Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin 14195, Germany
- Andreas Mayer: Otto-Warburg-Laboratory, Max Planck Institute for Molecular Genetics, Berlin 14195, Germany
|