1
Murata MM, Giuliano AE, Tanaka H. Genome-Wide Analysis of Palindrome Formation with Next-Generation Sequencing (GAPF-Seq) and a Bioinformatics Pipeline for Assessing De Novo Palindromes in Cancer Genomes. Methods Mol Biol 2023; 2660:13-22. [PMID: 37191787 DOI: 10.1007/978-1-0716-3163-8_2] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/17/2023]
Abstract
DNA palindromes are a type of chromosomal aberration that appears frequently during tumorigenesis. They are characterized by sequences of nucleotides that are identical to their reverse complements and often arise due to illegitimate repair of DNA double-strand breaks, fusion of telomeres, or stalled replication forks, all of which are common adverse early events in cancer. Here, we describe the protocol for enriching palindromes from genomic DNA sources with low-input DNA amounts and detail a bioinformatics tool for assessing the enrichment and location of de novo palindrome formation from low-coverage whole-genome sequencing data.
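The definition used in this abstract (a sequence identical to its reverse complement) can be illustrated with a minimal Python sketch. The function names are hypothetical and are not part of the GAPF-Seq pipeline; the sketch assumes uppercase or lowercase A/C/G/T input with no ambiguity codes.

```python
# Complement table for the four canonical bases (assumption: A/C/G/T only).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_dna_palindrome(seq: str) -> bool:
    """True if a non-empty sequence equals its own reverse complement."""
    s = seq.upper()
    return len(s) > 0 and s == reverse_complement(s)


if __name__ == "__main__":
    # GAATTC (the EcoRI recognition site) reads the same on both strands.
    print(is_dna_palindrome("GAATTC"))
    print(is_dna_palindrome("GATTACA"))
```

This checks only short, perfect palindromes in a string; it does not model the enrichment or sequencing-based detection that the protocol describes.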
Affiliation(s)
- Michael M Murata
  - Department of Surgery, Cedars-Sinai Medical Center, West Hollywood, CA, USA
- Armando E Giuliano
  - Department of Surgery, Cedars-Sinai Medical Center, West Hollywood, CA, USA
  - Department of Surgery, Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, West Hollywood, CA, USA
- Hisashi Tanaka
  - Department of Surgery, Cedars-Sinai Medical Center, West Hollywood, CA, USA
  - Department of Surgery, Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, West Hollywood, CA, USA
  - Departments of Surgery and Biomedical Sciences, Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, West Hollywood, CA, USA
2
Svetec Miklenić M, Svetec IK. Palindromes in DNA-A Risk for Genome Stability and Implications in Cancer. Int J Mol Sci 2021; 22:2840. [PMID: 33799581 PMCID: PMC7999016 DOI: 10.3390/ijms22062840] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 02/15/2021] [Revised: 03/04/2021] [Accepted: 03/08/2021] [Indexed: 02/07/2023] Open
Abstract
A palindrome in DNA consists of two closely spaced or adjacent inverted repeats. Certain palindromes have important biological functions as parts of various cis-acting elements and protein binding sites. However, many palindromes are known as fragile sites in the genome, sites prone to chromosome breakage, which can lead to various genetic rearrangements or even cell death. The ability of certain palindromes to initiate genetic recombination lies in their ability to form secondary structures in DNA, which can cause replication stalling and double-strand breaks. Given their recombinogenic nature, it is not surprising that palindromes in the human genome are involved in genetic rearrangements in cancer cells as well as in other known recurrent translocations and deletions associated with certain syndromes in humans. Here, we provide an overview of the current understanding of the molecular mechanisms of palindrome recombinogenicity and discuss possible implications of DNA palindromes in carcinogenesis. Furthermore, we review the data on known palindromic sequences in the human genome and efforts to estimate their number and distribution, as well as the mechanisms underlying the genetic rearrangements that specific palindromic sequences cause.
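The structural definition above (two inverted repeats that are adjacent or separated by a short spacer) can be made concrete with a small Python sketch. It is a naive scan for illustration only, not a method from the review; the function name and the `arm_len` and `max_spacer` parameters are hypothetical, and perfect arms over A/C/G/T are assumed.

```python
# Complement table for the four canonical bases (assumption: A/C/G/T only).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_inverted_repeats(seq, arm_len=4, max_spacer=3):
    """Find perfect inverted repeats with a fixed arm length.

    Returns a list of (start, arm_len, spacer) tuples, where the arm
    beginning at `start` is followed, after `spacer` bases, by its
    reverse complement. A spacer of 0 means the arms are adjacent.
    """
    s = seq.upper()
    hits = []
    for i in range(len(s) - 2 * arm_len + 1):
        left = s[i:i + arm_len]
        # The right arm must equal the reverse complement of the left arm.
        target = left.translate(_COMPLEMENT)[::-1]
        for spacer in range(max_spacer + 1):
            j = i + arm_len + spacer
            if j + arm_len > len(s):
                break  # right arm would run past the end of the sequence
            if s[j:j + arm_len] == target:
                hits.append((i, arm_len, spacer))
    return hits


if __name__ == "__main__":
    # ACGG ... CCGT are inverted repeats separated by a 2-base spacer.
    print(find_inverted_repeats("TTACGGAACCGTTT", arm_len=4, max_spacer=3))
```

The scan runs in time proportional to sequence length times `max_spacer` and reports every fixed-length arm separately, so a long palindrome appears as several overlapping hits; genome-scale surveys such as those discussed in the review would need a more efficient approach that also tolerates mismatches.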
Affiliation(s)
- Ivan Krešimir Svetec
  - Faculty of Food Technology and Biotechnology, University of Zagreb, Pierottijeva 6, 10000 Zagreb, Croatia
3
Mikhailov KV, Efeykin BD, Panchin AY, Knorre DA, Logacheva MD, Penin AA, Muntyan MS, Nikitin MA, Popova OV, Zanegina ON, Vyssokikh MY, Spiridonov SE, Aleoshin VV, Panchin YV. Coding palindromes in mitochondrial genes of Nematomorpha. Nucleic Acids Res 2020; 47:6858-6870. [PMID: 31194871 PMCID: PMC6649704 DOI: 10.1093/nar/gkz517] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/28/2019] [Revised: 05/29/2019] [Accepted: 06/01/2019] [Indexed: 12/11/2022] Open
Abstract
Inverted repeats are common DNA elements, but they rarely overlap with protein-coding sequences due to the ensuing conflict with the structure and function of the encoded protein. We discovered numerous perfect inverted repeats of considerable length (up to 284 bp) embedded within the protein-coding genes in mitochondrial genomes of four Nematomorpha species. Strikingly, both arms of the inverted repeats encode conserved regions of the amino acid sequence. We confirmed enzymatic activity of the respiratory complex I encoded by inverted repeat-containing genes. The nucleotide composition of inverted repeats suggests strong selection at the amino acid level in these regions. We conclude that the inverted repeat-containing genes are transcribed and translated into functional proteins. The survey of available mitochondrial genomes reveals that several other organisms possess similar albeit shorter embedded repeats. Mitochondrial genomes of Nematomorpha demonstrate an extraordinary evolutionary compromise where protein function and stringent secondary structure elements within the coding regions are preserved simultaneously.
Affiliation(s)
- Kirill V Mikhailov
  - Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
  - Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127994, Russian Federation
- Boris D Efeykin
  - Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127994, Russian Federation
  - Severtsov Institute of Ecology and Evolution, Moscow 119071, Russian Federation
- Alexander Y Panchin
  - Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127994, Russian Federation
- Dmitry A Knorre
  - Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
  - Institute of Molecular Medicine, Sechenov First Moscow State Medical University, Moscow 119991, Russian Federation
- Maria D Logacheva
  - Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
  - Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127994, Russian Federation
  - Center for Data-Intensive Biomedicine and Biotechnology, Skolkovo Institute of Science and Technology, Moscow 143028, Russian Federation
- Aleksey A Penin
  - Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
  - Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127994, Russian Federation
- Maria S Muntyan
  - Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Mikhail A Nikitin
  - Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
  - Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127994, Russian Federation
- Olga V Popova
  - Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Olga N Zanegina
  - Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Mikhail Y Vyssokikh
  - Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Sergei E Spiridonov
  - Severtsov Institute of Ecology and Evolution, Moscow 119071, Russian Federation
- Vladimir V Aleoshin
  - Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
  - Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127994, Russian Federation
- Yuri V Panchin
  - Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
  - Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127994, Russian Federation
4
Abstract
Animal and plant centromeres are embedded in repetitive "satellite" DNA, but are thought to be epigenetically specified. To define genetic characteristics of centromeres, we surveyed satellite DNA from diverse eukaryotes and identified variation in <10-bp dyad symmetries predicted to adopt non-B-form conformations. Organisms lacking centromeric dyad symmetries had binding sites for sequence-specific DNA-binding proteins with DNA-bending activity. For example, human and mouse centromeres are depleted for dyad symmetries, but are enriched for non-B-form DNA and are associated with binding sites for the conserved DNA-binding protein CENP-B, which is required for artificial centromere function but is paradoxically nonessential. We also detected dyad symmetries and predicted non-B-form DNA structures at neocentromeres, which form at ectopic loci. We propose that centromeres form at non-B-form DNA because of dyad symmetries or are strengthened by sequence-specific DNA binding proteins. This may resolve the CENP-B paradox and provide a general basis for centromere specification.
Affiliation(s)
- Sivakanthan Kasinathan
  - Medical Scientist Training Program, University of Washington School of Medicine, Seattle, WA
  - Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA
- Steven Henikoff
  - Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA
  - Howard Hughes Medical Institute, Seattle, WA
5
McGurk MP, Barbash DA. Double insertion of transposable elements provides a substrate for the evolution of satellite DNA. Genome Res 2018; 28:714-725. [PMID: 29588362 PMCID: PMC5932611 DOI: 10.1101/gr.231472.117] [Citation(s) in RCA: 46] [Impact Index Per Article: 7.7] [Received: 10/20/2017] [Accepted: 03/22/2018] [Indexed: 02/06/2023]
Abstract
Eukaryotic genomes are replete with repeated sequences in the form of transposable elements (TEs) dispersed across the genome or as satellite arrays, large stretches of tandemly repeated sequences. Many satellites clearly originated as TEs, but it is unclear how mobile genetic parasites can transform into megabase-sized tandem arrays. Comprehensive population genomic sampling is needed to determine the frequency and generative mechanisms of tandem TEs, at all stages from their initial formation to their subsequent expansion and maintenance as satellites. The best available population resources, short-read DNA sequences, are often considered to be of limited utility for analyzing repetitive DNA due to the challenge of mapping individual repeats to unique genomic locations. Here we develop a new pipeline called ConTExt that demonstrates that paired-end Illumina data can be successfully leveraged to identify a wide range of structural variation within repetitive sequence, including tandem elements. By analyzing 85 genomes from five populations of Drosophila melanogaster, we discover that TEs commonly form tandem dimers. Our results further suggest that insertion site preference is the major mechanism by which dimers arise and that, consequently, dimers form rapidly during periods of active transposition. This abundance of TE dimers has the potential to provide source material for future expansion into satellite arrays, and we discover one such copy number expansion of the DNA transposon hobo to approximately 16 tandem copies in a single line. The very process that defines TEs—transposition—thus regularly generates sequences from which new satellites can arise.
Affiliation(s)
- Michael P McGurk
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA
- Daniel A Barbash
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA
6
Subramanian S, Chaparala S, Avali V, Ganapathiraju MK. A pilot study on the prevalence of DNA palindromes in breast cancer genomes. BMC Med Genomics 2016; 9:73. [PMID: 28117658 PMCID: PMC5260791 DOI: 10.1186/s12920-016-0232-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 11/10/2022] Open
Abstract
Background: DNA palindromes are a unique pattern of repeat sequences present in the human genome. A palindrome consists of a sequence of nucleotides in which the second half is the complement of the first half but appears in reverse order. These palindromic sequences may have a significant role in DNA replication, transcription and gene regulation processes. They occur frequently in human cancers, clustering at specific locations of the genome that undergo gene amplification and tumorigenesis. Moreover, some studies have shown that palindromes are clustered in amplified regions of breast cancer genomes, especially in chromosomes (chr) 8 and 11. With the large number of personal genomes and cancer genomes becoming available, it is now possible to study their association with diseases using computational methods. Here, we conducted a pilot study on chromosomes 8 and 11 of cancer genomes to computationally identify differentially occurring palindromes. Methods: We processed 69 breast cancer genomes from The Cancer Genome Atlas, including serum-normal and tumor genomes, and genomes from the 1000 Genomes project to serve as a control group. The Biological Language Modelling Toolkit (BLMT) computes palindromes in whole genomes. We developed a computational pipeline integrating BLMT to compute and compare the prevalence of palindromes in personal genomes. Results: We carried out a pilot study on chr 8 and chr 11, taking into account single nucleotide polymorphisms, insertions and deletions. Of all the palindromes that showed any variation in cancer genomes, 38% of those near breast cancer genes were among the most differentiated palindromes in tumors (i.e., they ranked among the top 25% by our heuristic measure). Conclusions: These results will shed light on the prevalence of palindromes in oncogenes and on the mutations in palindromic regions that could contribute to genomic rearrangements and breast cancer progression.
Affiliation(s)
- Sandeep Subramanian
  - Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
- Srilakshmi Chaparala
  - Department of Biomedical Informatics, University of Pittsburgh, 5607 Baum Blvd, Suite 522, Pittsburgh, PA, 15206, USA
- Viji Avali
  - Department of Biomedical Informatics, University of Pittsburgh, 5607 Baum Blvd, Suite 522, Pittsburgh, PA, 15206, USA
- Madhavi K Ganapathiraju
  - Department of Biomedical Informatics, University of Pittsburgh, 5607 Baum Blvd, Suite 522, Pittsburgh, PA, 15206, USA
  - Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
7
Brewer BJ, Payen C, Di Rienzi SC, Higgins MM, Ong G, Dunham MJ, Raghuraman MK. Origin-Dependent Inverted-Repeat Amplification: Tests of a Model for Inverted DNA Amplification. PLoS Genet 2015; 11:e1005699. [PMID: 26700858 PMCID: PMC4689423 DOI: 10.1371/journal.pgen.1005699] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 08/14/2015] [Accepted: 11/03/2015] [Indexed: 01/20/2023] Open
Abstract
DNA replication errors are a major driver of evolution—from single nucleotide polymorphisms to large-scale copy number variations (CNVs). Here we test a specific replication-based model to explain the generation of interstitial, inverted triplications. While no genetic information is lost, the novel inversion junctions and increased copy number of the included sequences create the potential for adaptive phenotypes. The model—Origin-Dependent Inverted-Repeat Amplification (ODIRA)—proposes that a replication error at pre-existing short, interrupted, inverted repeats in genomic sequences generates an extrachromosomal, inverted dimeric, autonomously replicating intermediate; subsequent genomic integration of the dimer yields this class of CNV without loss of distal chromosomal sequences. We used a combination of in vitro and in vivo approaches to test the feasibility of the proposed replication error and its downstream consequences on chromosome structure in the yeast Saccharomyces cerevisiae. We show that the proposed replication error—the ligation of leading and lagging nascent strands to create “closed” forks—can occur in vitro at short, interrupted inverted repeats. The removal of molecules with two closed forks results in a hairpin-capped linear duplex that we show replicates in vivo to create an inverted, dimeric plasmid that subsequently integrates into the genome by homologous recombination, creating an inverted triplication. While other models have been proposed to explain inverted triplications and their derivatives, our model can also explain the generation of human, de novo, inverted amplicons that have a 2:1 mixture of sequences from both homologues of a single parent—a feature readily explained by a plasmid intermediate that arises from one homologue and integrates into the other homologue prior to meiosis. 
Our tests of key features of ODIRA lend support to this mechanism and suggest further avenues of enquiry to unravel the origins of interstitial, inverted CNVs pivotal in human health and evolution.

Chromosomal aberrations such as gene amplification are common in human diseases and are often selected during the adaptation of microorganisms to stress. We proposed a replication-based model to explain the formation of a particular type of genomic aberration: internal inverted DNA amplification with retention of the distal end of the chromosome. In this study, using yeast as a model, we test the feasibility of several steps in the formation of an inverted amplification: (1) a specific DNA replication anomaly, (2) leading to the formation of a palindromic extrachromosomal circular molecule, (3) followed by the homologous reintegration of this molecule into the genome. A significant feature of this mode of amplification is that the amplified sequences contain one or more replication origins. The instability of the inverted junctions can lead, through homology-driven processes, to more complex genomic structures that contain a partial triplication within a duplicated segment, a structure commonly found associated with human disease.
Affiliation(s)
- Bonita J. Brewer
  - Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Celia Payen
  - Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Sara C. Di Rienzi
  - Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Megan M. Higgins
  - Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Giang Ong
  - Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Maitreya J. Dunham
  - Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- M. K. Raghuraman
  - Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America