1
|
Wesp V, Theißen G, Schuster S. Statistical analysis of synonymous and stop codons in pseudo-random and real sequences as a function of GC content. Sci Rep 2023; 13:22996. [PMID: 38151539 PMCID: PMC10752896 DOI: 10.1038/s41598-023-49626-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Accepted: 12/10/2023] [Indexed: 12/29/2023] Open
Abstract
Knowledge of the frequencies of synonymous triplets in protein-coding and non-coding DNA stretches can be used in gene finding. These frequencies depend on the GC content of the genome or parts of it. An example of interest is provided by stop codons. This is relevant for the definition of Open Reading Frames. A generic case is provided by pseudo-random sequences, especially when they code for complex proteins or when they are non-coding and not subject to selection pressure. Here, we calculate, for such sequences and for all 25 known genetic codes, the frequency of each amino acid and stop codon based on their set of codons and as a function of GC content. The amino acids can be classified into five groups according to the GC content where their expected frequency reaches its maximum. We determine the overall Shannon information based on groups of synonymous codons and show that it becomes maximum at a percent GC of 43.3% (for the standard code). This is in line with the observation that in most fungi, plants, and animals, this genomic parameter is in the range from 35 to 50%. By analysing natural sequences, we show that there is a clear bias for triplets corresponding to stop codons near the 5'- and 3'-splice sites in the introns of various clades.
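As a rough illustration of how stop-codon frequency can depend on GC content in a pseudo-random sequence, the sketch below assumes independent nucleotides with P(G) = P(C) = gc/2 and P(A) = P(T) = (1 - gc)/2, and uses the standard-code stop codons TAA, TAG, and TGA. The function names and the independence model are assumptions made for illustration; they are not taken from the paper.

```python
# Assumption: nucleotides are drawn independently, with G and C equally
# likely and A and T equally likely. This is an illustrative model only.
STANDARD_STOPS = ("TAA", "TAG", "TGA")  # stop codons of the standard code


def base_probs(gc):
    """Return per-base probabilities for a given GC fraction in [0, 1]."""
    if not 0.0 <= gc <= 1.0:
        raise ValueError("gc must be a fraction between 0 and 1")
    return {"G": gc / 2, "C": gc / 2, "A": (1 - gc) / 2, "T": (1 - gc) / 2}


def codon_prob(codon, gc):
    """Probability of one triplet under the independence assumption."""
    probs = base_probs(gc)
    p = 1.0
    for base in codon:
        p *= probs[base]
    return p


def stop_codon_freq(gc, stops=STANDARD_STOPS):
    """Expected fraction of triplets that are stop codons."""
    return sum(codon_prob(codon, gc) for codon in stops)
```

Under this model the expected stop frequency is 3/64 at 50% GC and increases as GC content falls, because all three standard stop codons are AT-rich.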
Affiliation(s)
- Valentin Wesp
- Department of Bioinformatics, Matthias Schleiden Institute, Friedrich Schiller University Jena, Ernst-Abbe-Platz 2, 07743, Jena, Germany
| | - Günter Theißen
- Department of Genetics, Matthias Schleiden Institute, Friedrich Schiller University Jena, Philosophenweg 12, 07743, Jena, Germany
| | - Stefan Schuster
- Department of Bioinformatics, Matthias Schleiden Institute, Friedrich Schiller University Jena, Ernst-Abbe-Platz 2, 07743, Jena, Germany.
| |
|
2
|
Pflughaupt P, Sahakyan AB. Generalised interrelations among mutation rates drive the genomic compliance of Chargaff's second parity rule. Nucleic Acids Res 2023; 51:7409-7423. [PMID: 37293966 PMCID: PMC10415130 DOI: 10.1093/nar/gkad477] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/23/2022] [Revised: 05/05/2023] [Accepted: 05/17/2023] [Indexed: 06/10/2023] Open
Abstract
Chargaff's second parity rule (PR-2), where the complementary base and k-mer contents are matching within the same strand of a double stranded DNA (dsDNA), is a phenomenon that invited many explanations. The strict compliance of nearly all nuclear dsDNA to PR-2 implies that the explanation should also be similarly adamant. In this work, we revisited the possibility of mutation rates driving PR-2 compliance. Starting from the assumption-free approach, we constructed kinetic equations for unconstrained simulations. The results were analysed for their PR-2 compliance by employing symbolic regression and machine learning techniques. We arrived at a generalised set of mutation rate interrelations in place in most species that allow for their full PR-2 compliance. Importantly, our constraints explain PR-2 in genomes out of the scope of the prior explanations based on the equilibration under mutation rates with simpler no-strand-bias constraints. We thus reinstate the role of mutation rates in PR-2 through its molecular core, now shown, under our formulation, to be tolerant to previously noted strand biases and incomplete compositional equilibration. We further investigate the time for any genome to reach PR-2, showing that it is generally earlier than the compositional equilibrium, and well within the age of life on Earth.
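To make the PR-2 statement concrete at the single-base level, the following minimal sketch measures how far one strand departs from A = T and G = C. The function name and the choice of a relative-difference measure are assumptions for illustration; the paper's kinetic and machine-learning analysis is not reproduced here.

```python
from collections import Counter


def pr2_deviation(strand):
    """Return (|A-T|/(A+T), |G-C|/(G+C)) for a single DNA strand.

    Values near 0 indicate compliance with PR-2 at the mononucleotide
    level. A pair with no occurrences is reported as 0.0 (assumption).
    """
    counts = Counter(strand.upper())

    def rel_diff(x, y):
        total = counts[x] + counts[y]
        return abs(counts[x] - counts[y]) / total if total else 0.0

    return rel_diff("A", "T"), rel_diff("G", "C")
```

For example, `pr2_deviation("ATGC")` gives `(0.0, 0.0)`, while a strand with three A and one T gives an A/T deviation of 0.5.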
Affiliation(s)
- Patrick Pflughaupt
- MRC WIMM Centre for Computational Biology, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, OX3 9DS, UK
| | - Aleksandr B Sahakyan
- MRC WIMM Centre for Computational Biology, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, OX3 9DS, UK
| |
|
3
|
Moeckel C, Zaravinos A, Georgakopoulos-Soares I. Strand Asymmetries Across Genomic Processes. Comput Struct Biotechnol J 2023; 21:2036-2047. [PMID: 36968020 PMCID: PMC10030826 DOI: 10.1016/j.csbj.2023.03.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2023] [Revised: 03/08/2023] [Accepted: 03/08/2023] [Indexed: 03/12/2023] Open
Abstract
Across biological systems, a number of genomic processes, including transcription, replication, DNA repair, and transcription factor binding, display intrinsic directionalities. These directionalities are reflected in the asymmetric distribution of nucleotides, motifs, genes, transposon integration sites, and other functional elements across the two complementary strands. Strand asymmetries, including GC skews and mutational biases, have shaped the nucleotide composition of diverse organisms. The investigation of strand asymmetries often serves as a method to understand underlying biological mechanisms, including protein binding preferences, transcription factor interactions, retrotransposition, DNA damage and repair preferences, transcription-replication collisions, and mutagenesis mechanisms. Research into this subject also enables the identification of functional genomic sites, such as replication origins and transcription start sites. Improvements in our ability to detect and quantify DNA strand asymmetries will provide insights into diverse functionalities of the genome, the contribution of different mutational mechanisms in germline and somatic mutagenesis, and our knowledge of genome instability and evolution, which all have significant clinical implications in human disease, including cancer. In this review, we describe key developments that have been made across the field of genomic strand asymmetries, as well as the discovery of associated mechanisms.
Affiliation(s)
- Camille Moeckel
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
| | - Apostolos Zaravinos
- Department of Life Sciences, European University Cyprus, Diogenis Str., 6, Nicosia 2404, Cyprus
- Cancer Genetics, Genomics and Systems Biology laboratory, Basic and Translational Cancer Research Center (BTCRC), Nicosia 1516, Cyprus
- Corresponding author at: Department of Life Sciences, European University Cyprus, Diogenis Str., 6, Nicosia 2404, Cyprus.
| | - Ilias Georgakopoulos-Soares
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Corresponding author.
| |
|
4
|
Rosandić M, Vlahović I, Pilaš I, Glunčić M, Paar V. An Explanation of Exceptions from Chargaff's Second Parity Rule/Strand Symmetry of DNA Molecules. Genes (Basel) 2022; 13:1929. [PMID: 36360166 PMCID: PMC9689577 DOI: 10.3390/genes13111929] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/18/2022] [Revised: 10/12/2022] [Accepted: 10/17/2022] [Indexed: 11/04/2022] Open
Abstract
In this article, we show that mono/oligonucleotide quadruplets, as basic structures of DNA, along with our classification of trinucleotides, disclose an organization of genomes based on purine-pyrimidine symmetry. Moreover, the structure and stability of DNA are influenced by the Watson-Crick pairing and the natural law of DNA creation and conservation, according to which the same mono- or oligonucleotide insertion must be inserted simultaneously into both strands of DNA. Taken together, they lead to quadruplets with central mirror symmetry and bidirectional DNA strand orientation and are incorporated into Chargaff's second parity rule (CSPR). Performing our quadruplet frequency analysis of all human chromosomes and of Neuroblastoma BreakPoint Family (NBPF) genes, which code Olduvai protein domains in the human genome, we show that the coding part of DNA violates CSPR. This may shed new light and give rise to a novel hypothesis on DNA creation and its evolution. In this framework, the logarithmic relationship between oligonucleotide order and minimal DNA sequence length, to establish the validity of CSPR, automatically follows from the quadruplet structure of the genomic sequence. The problem of the violation of CSPR in rare symbionts is discussed.
Affiliation(s)
- Marija Rosandić
- University Hospital Centre Zagreb (Ret.), 10000 Zagreb, Croatia
- Croatian Academy of Sciences and Arts, 10000 Zagreb, Croatia
| | - Ines Vlahović
- Faculty of Science, Algebra University College, 10000 Zagreb, Croatia
| | - Ivan Pilaš
- Forest Research Institute, 10450 Jastrebarsko, Croatia
| | - Matko Glunčić
- Physics Department, Faculty of Science, University of Zagreb, 10000 Zagreb, Croatia
| | - Vladimir Paar
- Croatian Academy of Sciences and Arts, 10000 Zagreb, Croatia
- Physics Department, Faculty of Science, University of Zagreb, 10000 Zagreb, Croatia
| |
|
5
|
Almirantis Y, Provata A, Li W. Noether's Theorem as a Metaphor for Chargaff's 2nd Parity Rule in Genomics. J Mol Evol 2022; 90:231-238. [PMID: 35704064 DOI: 10.1007/s00239-022-10062-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2022] [Accepted: 05/18/2022] [Indexed: 10/18/2022]
Abstract
In the present note, the genomic compositional rule largely known as 'Chargaff's 2nd parity rule' (asserting equimolarity between Adenine-Thymine and Guanine-Cytosine in any of the two DNA strands) is regarded in association with Noether's theorem linking symmetries with conservation laws in physics. In the case of the genome, the strict physical and mathematical prerequisites of Noether's theorem do not hold. However, we conclude that a metaphor can be established with Noether's theorem, as inter-strand symmetry concerning DNA functionality engenders specific features in genome composition. Inversely, when inter-strand symmetry does not hold, the corresponding quantitative relations fail to appear. This association is also considered from the point of view of the existence of emergent laws and properties in evolutionary genomics.
Affiliation(s)
- Yannis Almirantis
- Theoretical Biology and Computational Genomics Laboratory, Institute of Bioscience and Applications, National Center for Scientific Research "Demokritos", 15341, Athens, Greece.
| | - Astero Provata
- Statistical Mechanics and Dynamical Systems Laboratory, Institute of Nanoscience and Nanotechnology, National Center for Scientific Research, "Demokritos", 15341, Athens, Greece
| | - Wentian Li
- The Robert S. Boas Center for Genomics and Human Genetics, The Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, USA
| |
|
6
|
Tikhomirova TS, Matyunin MA, Lobanov MY, Galzitskaya OV. In-depth analysis of amino acid and nucleotide sequences of Hsp60: how conserved is this protein? Proteins 2021; 90:1119-1141. [PMID: 34964171 DOI: 10.1002/prot.26294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2021] [Revised: 12/21/2021] [Accepted: 12/23/2021] [Indexed: 11/07/2022]
Abstract
Chaperonin Hsp60, as a protein found in all organisms, is of great interest in medicine, since it is present in many tissues and can be used both as a drug and as an object of targeted therapy. Hence, Hsp60 deserves a fundamental comparative analysis to assess its evolutionary characteristics. It was found that the percent identity of Hsp60 amino acid sequences both within and between phyla was not high enough to identify Hsp60s as highly conserved proteins. However, their ATP binding sites are largely conserved. The amino acid composition of Hsp60s remained relatively constant. At the same time, the analysis of the nucleotide sequences showed that GC content in the Hsp60 genes was comparable to or greater than the genomic values, which may indicate a high resistance to mutations due to tight control of the nucleotide composition by DNA repair systems. Natural selection plays a dominant role in the evolution of Hsp60 genes. The degree of mutational pressure affecting the Hsp60 genes is quite low, and its direction does not depend on taxonomy. Interestingly, for the Hsp60 genes from Chordata, Arthropoda, and Proteobacteria the exact direction of mutational pressure could not be determined. However, upon further division into classes, it was found that the direction of the mutational pressure for Hsp60 genes from Fish differs from that for other chordates. The direction of the mutational pressure affects the synonymous codon usage bias. The number of high and low represented codons increases with increasing GC content, which can improve codon usage. A special server has been created for bioinformatics analysis of Hsp60: http://oka.protres.ru:4202/.
Affiliation(s)
- Tatyana S Tikhomirova
- Institute for Biological Instrumentation of the Russian Academy of Sciences, Federal Research Center "Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences", Pushchino, Moscow Region, Russia
| | - Maxim A Matyunin
- Institute of Protein Research, Russian Academy of Sciences, Pushchino, Moscow Region, Russia
| | - Michail Yu Lobanov
- Institute of Protein Research, Russian Academy of Sciences, Pushchino, Moscow Region, Russia
| | - Oxana V Galzitskaya
- Institute of Protein Research, Russian Academy of Sciences, Pushchino, Moscow Region, Russia
- Institute of Theoretical and Experimental Biophysics, Russian Academy of Sciences, Pushchino, Moscow Region, Russia
| |
|
7
|
Rue CR, Selwyn JD, Cockett PM, Gillis B, Gurski L, Jose P, Kutil BL, Magnuson SF, Ángela López de Mesa L, Overath RD, Smee DL, Bird CE. Genetic diversity across the mitochondrial genome of eastern oysters (Crassostrea virginica) in the northern Gulf of Mexico. PeerJ 2021; 9:e12205. [PMID: 34692250 PMCID: PMC8485835 DOI: 10.7717/peerj.12205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2021] [Accepted: 09/03/2021] [Indexed: 11/20/2022] Open
Abstract
The eastern oyster, Crassostrea virginica, is divided into four populations along the western North Atlantic; however, the only published mitochondrial genome sequence was assembled using one individual in Delaware. This study aimed to (1) assemble C. virginica mitochondrial genomes from Texas with pooled restriction-site-associated DNA sequencing (ezRAD), (2) evaluate the validity of the mitochondrial genome assemblies including comparison with Sanger sequencing data, and (3) evaluate genetic differentiation both between the Delaware and Texas genomes, as well as among three bays in Texas. The pooled-genome-assembled-genomes (PAGs) from Texas exhibited several characteristics indicating that they were valid, including elevated nucleotide diversity in non-coding and the third position of codons, placement as the sister haplotype of the genome from Delaware in a phylogenetic reconstruction of Crassostrea mitochondrial genomes, and a lack of genetic structure in the ND4 gene among the three Texas bays as was found with Sanger amplicons in samples from the same bays several years prior. In the comparison between the Delaware and Texas genomes, 27 of 38 coding regions exhibited variability between the two populations, which were differentiated by 273 mutations, versus 1-13 mutations among the Texas samples. Using the full PAGs, there was no additional evidence for population structure among the three Texas bays. While population genetics is rapidly moving towards larger high-density datasets, studies of mitochondrial DNA (and genomes) can be particularly useful for comparing historic data prior to the modern era of genomics. As such, being able to reliably compile mitochondrial genomes from genomic data can improve the ability to compare results across studies.
Affiliation(s)
- Chani R Rue
- Department of Life Sciences, Texas A&M University-Corpus Christi, Corpus Christi, TX, United States of America
| | - Jason D Selwyn
- Department of Life Sciences, Texas A&M University-Corpus Christi, Corpus Christi, TX, United States of America
| | - Patricia M Cockett
- Harte Research Institute, Texas A&M University-Corpus Christi, Corpus Christi, TX, United States of America
| | - Bryan Gillis
- Conrad Blucher Institute, Texas A&M University-Corpus Christi, Corpus Christi, TX, United States of America
| | - Lauren Gurski
- Department of Life Sciences, Texas A&M University-Corpus Christi, Corpus Christi, TX, United States of America
| | - Philip Jose
- Department of Life Sciences, Texas A&M University-Corpus Christi, Corpus Christi, TX, United States of America
| | - Brandi L Kutil
- Department of Undergraduate Studies, Texas A&M University-Corpus Christi, Corpus Christi, TX, United States of America
| | - Sharon F Magnuson
- Department of Life Sciences, Texas A&M University-Corpus Christi, Corpus Christi, TX, United States of America
| | - Luz Ángela López de Mesa
- Department of Life Sciences, Texas A&M University-Corpus Christi, Corpus Christi, TX, United States of America
| | - R Deborah Overath
- Department of Mathematics and Sciences, Texas Southmost College, Brownsville, TX, United States of America
| | - Delbert Lee Smee
- Dauphin Island Sea Lab, Dauphin Island, AL, United States of America.,Marine Sciences, University of South Alabama, Mobile, AL, United States of America
| | - Christopher E Bird
- Department of Life Sciences, Texas A&M University-Corpus Christi, Corpus Christi, TX, United States of America.,Hawai'i Institute of Marine Biology, University of Hawaii at Mānoa, Kāne'ohe, Hawai'i, United States of America
| |
|
8
|
Khoshbin Z, Abnous K, Taghdisi SM, Verdian A. A novel liquid crystal-based aptasensor for ultra-low detection of ochratoxin A using a π-shaped DNA structure: Promising for future on-site detection test strips. Biosens Bioelectron 2021; 191:113457. [PMID: 34175647 DOI: 10.1016/j.bios.2021.113457] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 03/20/2021] [Revised: 05/22/2021] [Accepted: 06/17/2021] [Indexed: 12/19/2022]
Abstract
Ochratoxin A (OTA), the most dangerous mycotoxin, is produced by Aspergillus ochraceus and Penicillium verrucosum. OTA can be found in beverages and foodstuffs and induces teratogenic, nephrotoxic, carcinogenic, and immunosuppressive effects in humans. Hence, developing highly sensitive methods for its detection is of great importance. Herein, a novel aptasensor was designed for the label-free monitoring of ultra-low OTA levels by combining the advantages of aptamers with the long-range orientational order of liquid crystals (LCs). The aptasensing strategy was based on the conformational switch of the immobilized π-shaped DNA structure on the glass substrate in the presence of the target. A shift in the orientation of LCs from the random to the homeotropic state led to an apparent alteration of the optical appearance of the aptasensor platform from bright to dark. The LC-based aptasensor detects OTA at ultra-trace levels as low as 0.63 aM with comparable selectivity. The aptasensor could detect OTA successfully in grape juice, coffee, and human serum samples. The LC-based aptasensor paves the way for developing portable and real-time sensing probes with high performance for food safety control and clinical application.
Affiliation(s)
- Zahra Khoshbin
- Pharmaceutical Research Center, Pharmaceutical Technology Institute, Mashhad University of Medical Sciences, Mashhad, Iran; Department of Medicinal Chemistry, School of Pharmacy, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Khalil Abnous
- Pharmaceutical Research Center, Pharmaceutical Technology Institute, Mashhad University of Medical Sciences, Mashhad, Iran; Department of Medicinal Chemistry, School of Pharmacy, Mashhad University of Medical Sciences, Mashhad, Iran.
| | - Seyed Mohammad Taghdisi
- Targeted Drug Delivery Research Center, Pharmaceutical Technology Institute, Mashhad University of Medical Sciences, Mashhad, Iran.
| | - Asma Verdian
- Department of Food Safety and Quality Control, Research Institute of Food Science and Technology (RIFST), Mashhad, Iran
| |
|
9
|
Fariselli P, Taccioli C, Pagani L, Maritan A. DNA sequence symmetries from randomness: the origin of the Chargaff's second parity rule. Brief Bioinform 2021; 22:2172-2181. [PMID: 32266404 PMCID: PMC7986665 DOI: 10.1093/bib/bbaa041] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 12/04/2019] [Revised: 02/27/2020] [Accepted: 03/05/2020] [Indexed: 01/13/2023] Open
Abstract
Most living organisms rely on double-stranded DNA (dsDNA) to store their genetic information and perpetuate themselves. This biological information has been considered as the main target of evolution. However, here we show that symmetries and patterns in the dsDNA sequence can emerge from the physical peculiarities of the dsDNA molecule itself and the maximum entropy principle alone, rather than from biological or environmental evolutionary pressure. The randomness justifies the human codon biases and context-dependent mutation patterns in human populations. Thus, the DNA 'exceptional symmetries,' emerged from the randomness, have to be taken into account when looking for the DNA encoded information. Our results suggest that the double helix energy constraints and, more generally, the physical properties of the dsDNA are the hard drivers of the overall DNA sequence architecture, whereas the selective biological processes act as soft drivers, which only under extraordinary circumstances overtake the overall entropy content of the genome.
Affiliation(s)
- Piero Fariselli
- Department of Medical Sciences of the University of Turin, Italy
| | | | - Luca Pagani
- Department of Biology of the University of Padova, Italy
| | - Amos Maritan
- Department of Physics of the University of Padova, Italy
| |
|
10
|
Revisiting the Relationships Between Genomic G + C Content, RNA Secondary Structures, and Optimal Growth Temperature. J Mol Evol 2020; 89:165-171. [PMID: 33216148 DOI: 10.1007/s00239-020-09974-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/16/2020] [Accepted: 11/09/2020] [Indexed: 10/23/2022]
Abstract
Over twenty years ago Galtier and Lobry published a manuscript entitled "Relationships between Genomic G + C Content, RNA Secondary Structure, and Optimal Growth Temperature" in the Journal of Molecular Evolution that showcased the lack of a relationship between genomic G + C content and optimal growth temperature (OGT) in a set of about 200 prokaryotes. Galtier and Lobry also assessed the relationship between RNA secondary structures (rRNA stems, tRNAs) and OGT, and in this case a clear relationship emerged. Increasing structured RNA G + C content (particularly in regions that are double-stranded) correlates with increased OGT. Both of these fundamental relationships have withstood the test of many additional sequences and spawned a variety of different applications that include prediction of OGT from rRNA sequence and computational ncRNA identification approaches. In this work, I present the motivation behind Galtier and Lobry's original paper and the larger questions addressed by the work, how these questions have evolved over the last two decades, and the impact of Galtier and Lobry's manuscript in fields beyond these questions.
|
11
|
Pentamers with Non-redundant Frames: Bias for Natural Circular Code Codons. J Mol Evol 2020; 88:194-201. [DOI: 10.1007/s00239-019-09925-0] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 07/09/2019] [Accepted: 12/17/2019] [Indexed: 02/06/2023]
|
12
|
Rosandić M, Vlahović I, Paar V. Novel look at DNA and life-Symmetry as evolutionary forcing. J Theor Biol 2019; 483:109985. [PMID: 31469987 DOI: 10.1016/j.jtbi.2019.08.016] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 02/08/2018] [Revised: 06/21/2018] [Accepted: 08/22/2019] [Indexed: 11/20/2022]
Abstract
Although Chargaff's first parity rule has been explained in terms of the Watson-Crick base-pairing between the two DNA strands, Chargaff's second parity rule for each strand of DNA (also named strand symmetry), which cannot be explained by Watson-Crick base-pairing alone, has remained a challenging issue for fifty years. We show that during evolution DNA preserves its identity in the form of quadruplet A+T and C+G rich matrices based on purine-pyrimidine mirror symmetries of trinucleotides. Identical symmetries are present in our classification of trinucleotides and the genetic code table. All eukaryotes and almost all prokaryotes (bacteria and archaea) have quadruplet mirror symmetries in structural form and frequencies following the principle of Chargaff's second parity rule and Natural symmetry law of DNA creation and conservation. Some rare symbionts have mirror symmetry only in their structural form within each DNA strand. Based on our matrix analysis of closely related species, humans and Neanderthals, we find that the circular cycle of inverse proportionality between trinucleotides preserves identical relative frequencies of trinucleotides in each quadruplet and in the whole genome. According to our calculations, a change in frequencies in quadruplet matrices could lead to the creation of new species. Violation of quadruplet symmetries is practically inconsistent with life. DNA symmetries provide a key for understanding the restriction of disorder (entropy) due to mutations in the evolution of DNA.
Affiliation(s)
- Marija Rosandić
- Croatian Academy of Sciences and Arts, 10000 Zagreb, Croatia; University hospital centre Zagreb (ret.), Zagreb, Croatia.
| | - Ines Vlahović
- Department of Physics, Faculty of Science, University of Zagreb, 10000 Zagreb, Croatia; Algebra University College, 10000 Zagreb, Croatia.
| | - Vladimir Paar
- Croatian Academy of Sciences and Arts, 10000 Zagreb, Croatia; Department of Physics, Faculty of Science, University of Zagreb, 10000 Zagreb, Croatia.
| |
|
13
|
Fimmel E, Gumbel M, Karpuzoglu A, Petoukhov S. On comparing composition principles of long DNA sequences with those of random ones. Biosystems 2019; 180:101-108. [DOI: 10.1016/j.biosystems.2019.04.003] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 02/11/2019] [Revised: 04/05/2019] [Accepted: 04/06/2019] [Indexed: 11/25/2022]
|
14
|
Weerts MJA, Sleijfer S, Martens JWM. The role of mitochondrial DNA in breast tumors. Drug Discov Today 2019; 24:1202-1208. [PMID: 30910739 DOI: 10.1016/j.drudis.2019.03.019] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/01/2018] [Revised: 03/08/2019] [Accepted: 03/18/2019] [Indexed: 12/29/2022]
Abstract
Somatic variation in mitochondrial DNA (mtDNA) has been described in primary breast tumors, including single-nucleotide variants and variation in the number of mtDNA molecules per cell (mtDNA content). However, there is currently a gap in the knowledge on the link between mitochondrial variation in breast cancer cells and their phenotypic behavior (i.e., tumorigenesis) or outcome. This review focuses on recent findings on mtDNA content and mtDNA somatic mutations in breast cancer and the potential biological impact and clinical relevance.
Affiliation(s)
- Marjolein J A Weerts
- Department of Medical Oncology and Cancer Genomics Netherlands, Erasmus MC Cancer Institute, Erasmus University Medical Center, Rotterdam, The Netherlands.
| | - Stefan Sleijfer
- Department of Medical Oncology and Cancer Genomics Netherlands, Erasmus MC Cancer Institute, Erasmus University Medical Center, Rotterdam, The Netherlands
| | - John W M Martens
- Department of Medical Oncology and Cancer Genomics Netherlands, Erasmus MC Cancer Institute, Erasmus University Medical Center, Rotterdam, The Netherlands
| |
|
15
|
Kumar V, Tyagi K, Kundu S, Chakraborty R, Singha D, Chandra K. The first complete mitochondrial genome of marigold pest thrips, Neohydatothrips samayunkur (Sericothripinae) and comparative analysis. Sci Rep 2019; 9:191. [PMID: 30655597 PMCID: PMC6336932 DOI: 10.1038/s41598-018-37889-6] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 01/31/2018] [Accepted: 12/10/2018] [Indexed: 11/16/2022] Open
Abstract
Complete mitogenomes from the order Thysanoptera are limited to representatives of the subfamily Thripinae. Therefore, in the present study, we sequenced the mitochondrial genome of Neohydatothrips samayunkur (15,295 bp), a member of subfamily Sericothripinae. The genome possesses the canonical 13 protein-coding genes (PCGs), 22 transfer RNA genes (tRNAs), and two ribosomal RNA genes (rRNAs) as well as two putative control regions (CRs). The majority strand was 77.42% A + T content, and 22.58% G + C with weakly positive AT skew (0.04) and negative GC skew (-0.03). The majority of PCGs start with ATN codons as observed in other insect mitochondrial genomes. The GCG codon (Alanine) was not used in N. samayunkur. Most tRNAs have the typical cloverleaf secondary structure; however, the DHU stem and loop were absent in trnV and trnS1, while the TΨC loop was absent in trnR and trnT. The two putative control regions (CR1 and CR2) show 99% sequence similarity, indicating a possible duplication, and shared 57 bp repeats were identified. N. samayunkur showed extensive gene rearrangements, with 11 PCGs, 22 tRNAs, and two rRNAs translocated when compared to the ancestral insect. The gene trnL2 was separated from the 'trnL2-cox2' gene block, which is a conserved, ancestral gene order found in all previously sequenced thrips mitogenomes. Both maximum likelihood (ML) and Bayesian inference (BI) phylogenetic trees resulted in similar topologies. The phylogenetic position of N. samayunkur indicates that subfamily Sericothripinae is sister to subfamily Thripinae. More molecular data from different taxonomic groups is needed to understand thrips phylogeny and evolution.
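The AT and GC skew values quoted in the abstract above are commonly computed as (A - T)/(A + T) and (G - C)/(G + C) on a single strand. The sketch below uses that common definition; the abstract does not state the formula, so treat both the definition and the function names as assumptions.

```python
def skew(seq, x, y):
    """Return (x - y) / (x + y) for the counts of bases x and y in seq."""
    seq = seq.upper()
    nx, ny = seq.count(x), seq.count(y)
    if nx + ny == 0:
        raise ValueError(f"sequence contains neither {x} nor {y}")
    return (nx - ny) / (nx + ny)


def at_skew(seq):
    """AT skew: positive when A outnumbers T on this strand."""
    return skew(seq, "A", "T")


def gc_skew(seq):
    """GC skew: negative when C outnumbers G on this strand."""
    return skew(seq, "G", "C")
```

A weakly positive AT skew and a negative GC skew, as reported for the majority strand, correspond to slightly more A than T and slightly more C than G on that strand.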
Affiliation(s)
- Vikas Kumar
- Centre for DNA Taxonomy, Molecular Systematics Division, Zoological Survey of India, M- Block, New Alipore, Kolkata, 700 053, West Bengal, India
| | - Kaomud Tyagi
- Centre for DNA Taxonomy, Molecular Systematics Division, Zoological Survey of India, M- Block, New Alipore, Kolkata, 700 053, West Bengal, India.
| | - Shantanu Kundu
- Centre for DNA Taxonomy, Molecular Systematics Division, Zoological Survey of India, M- Block, New Alipore, Kolkata, 700 053, West Bengal, India
| | - Rajasree Chakraborty
- Centre for DNA Taxonomy, Molecular Systematics Division, Zoological Survey of India, M- Block, New Alipore, Kolkata, 700 053, West Bengal, India
| | - Devkant Singha
- Centre for DNA Taxonomy, Molecular Systematics Division, Zoological Survey of India, M- Block, New Alipore, Kolkata, 700 053, West Bengal, India
| | - Kailash Chandra
- Centre for DNA Taxonomy, Molecular Systematics Division, Zoological Survey of India, M- Block, New Alipore, Kolkata, 700 053, West Bengal, India
|
16
|
Li W, Freudenberg J, Freudenberg J. Alignment-free approaches for predicting novel Nuclear Mitochondrial Segments (NUMTs) in the human genome. Gene 2019; 691:141-152. [PMID: 30630097 DOI: 10.1016/j.gene.2018.12.040] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 09/05/2018] [Revised: 12/07/2018] [Accepted: 12/14/2018] [Indexed: 10/27/2022]
Abstract
The nuclear human genome harbors sequences of mitochondrial origin, indicating an ancestral transfer of DNA from the mitogenome. Several Nuclear Mitochondrial Segments (NUMTs) have been detected by alignment-based sequence similarity search, as implemented in the Basic Local Alignment Search Tool (BLAST). Identifying NUMTs is important for the comprehensive annotation and understanding of the human genome. Here we explore the possibility of detecting NUMTs in the human genome by alignment-free sequence similarity search, such as k-mers (k-tuples, k-grams, oligos of length k) distributions. We find that when k=6 or larger, the k-mer approach and BLAST search produce almost identical results, e.g., detect the same set of NUMTs longer than 3 kb. However, when k=5 or k=4, certain signals are only detected by the alignment-free approach, and these may indicate yet unrecognized, and potentially more ancestral NUMTs. We introduce a "Manhattan plot" style representation of NUMT predictions across the genome, which are calculated based on the reciprocal of the Jensen-Shannon divergence between the nuclear and mitochondrial k-mer frequencies. The further inspection of the k-mer-based NUMT predictions however shows that most of them contain long-terminal-repeat (LTR) annotations, whereas BLAST-based NUMT predictions do not. Thus, similarity of the mitogenome to LTR sequences is recognized, which we validate by finding the mitochondrial k-mer distribution closer to those for transposable sequences and specifically, close to some types of LTR.
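The alignment-free comparison described in this abstract rests on two ingredients: a k-mer frequency distribution per sequence and the Jensen-Shannon divergence between two such distributions. The following is a minimal sketch of both, assuming overlapping k-mers over the alphabet A, C, G, T and base-2 logarithms; the function names are hypothetical and this is not the authors' implementation.

```python
from collections import Counter
from itertools import product
from math import log2


def kmer_freqs(seq: str, k: int) -> dict[str, float]:
    """Relative frequencies of all 4**k k-mers (overlapping windows)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Windows containing N or other symbols are ignored by construction.
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}


def js_divergence(p: dict[str, float], q: dict[str, float]) -> float:
    """Jensen-Shannon divergence (base 2, so the value lies in [0, 1])."""
    mix = {w: 0.5 * (p[w] + q[w]) for w in p}

    def kl(x: dict[str, float]) -> float:
        # Terms with x[w] == 0 contribute nothing to the sum.
        return sum(x[w] * log2(x[w] / mix[w]) for w in x if x[w] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


if __name__ == "__main__":
    window = kmer_freqs("ACGTACGTTTGACA", 2)  # toy nuclear window
    mito = kmer_freqs("ACGTTTGACAACGT", 2)    # toy mitochondrial sequence
    print(js_divergence(window, mito))
```

The abstract scores candidate regions by the reciprocal of this divergence, so a caller would need to decide how to handle a divergence of exactly zero.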
Affiliation(s)
- Wentian Li
- The Robert S. Boas Center for Genomics and Human Genetics, The Feinstein Institute for Medical Research, Northwell Health, Manhasset, NY, USA.
| | - Jerome Freudenberg
- The Robert S. Boas Center for Genomics and Human Genetics, The Feinstein Institute for Medical Research, Northwell Health, Manhasset, NY, USA
| | - Jan Freudenberg
- Regeneron Genetics Center, Regeneron Pharmaceuticals, Inc., Tarrytown, NY, USA
|
17
|
Chakraborty R, Tyagi K, Kundu S, Rahaman I, Singha D, Chandra K, Patnaik S, Kumar V. The complete mitochondrial genome of Melon thrips, Thrips palmi (Thripinae): Comparative analysis. PLoS One 2018; 13:e0199404. [PMID: 30379813 PMCID: PMC6209132 DOI: 10.1371/journal.pone.0199404] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 05/30/2018] [Accepted: 09/21/2018] [Indexed: 11/19/2022] Open
Abstract
The melon thrips, Thrips palmi, is a serious pest and vector for plant viruses on a wide range of economically important crops. DNA barcoding has provided evidence of cryptic diversity in T. palmi, which warrants exhaustive molecular studies. The present study decodes the first complete mitochondrial genome of T. palmi (15,333 bp) through next-generation sequencing (NGS). The T. palmi mt genome contains 37 genes, including 13 protein-coding genes (PCGs), two ribosomal RNAs (rRNAs), and 22 transfer RNAs (tRNAs), as well as two control regions (CRs). The majority strand of T. palmi has 78.29% A+T content and 21.72% G+C content, with positive AT skew (0.09) and negative GC skew (-0.06). ATN initiation codons were observed in 12 PCGs; the exception is cox1, which has a unique start codon (TTG). The relative synonymous codon usage (RSCU) analysis revealed that Phe, Leu, Ile, Tyr, Asn, Lys and Met were the most frequently used amino acids in all PCGs. The codon CGG, which is assigned to arginine in most insects, is absent in T. palmi. The Ka/Ks ratio ranges from 0.078 in cox1 to 0.913 in atp8. We observed the typical cloverleaf secondary structure in most of the tRNA genes, with a few exceptions: absence of the DHU stem and loop in trnV and trnS, absence of the DHU loop in trnE, and lack of the T-arm and loop in trnN. The T. palmi gene order (GO) was compared with the ancestral GO, and extensive gene rearrangement was observed in PCGs, tRNAs and rRNAs. The cox2 gene was separated from the gene block 'cox2-trnL2' in T. palmi as compared with the other thrips mt genomes, including the ancestral GO. Further, nad1, trnQ, trnC, trnL1, trnV, trnF, rrnS, and rrnL were inversely transposed in the T. palmi GO. The gene blocks 'trnQ-trnS2-trnD' and 'trnN-trnE-trnS1-trnL1' seem to be genus specific. The T. palmi mt genome contained 24 intergenic spacer regions and 12 overlapping regions. A 62 bp stretch of CR2 shows similarity with CR1, indicating a possible duplication. The occurrence of multiple CRs in thrips mt genomes seems to be a derived trait which needs further investigation. Although the study depicted extensive gene rearrangements in the T. palmi mt genome, the negative GC skew reflects only strand asymmetry. Both the ML and BI phylogenetic trees revealed a close relationship of Thrips with Scirtothrips as compared to Frankliniella. Thus, more mt genomes of diverse thrips species are required to understand the in-depth phylogenetic and evolutionary relationships.
Affiliation(s)
- Rajasree Chakraborty
- Centre for DNA Taxonomy, Molecular Systematics Division, Zoological Survey of India, New Alipore, Kolkata, West Bengal, India
- School of Biotechnology, Kalinga Institute of Industrial Technology (KIIT), Deemed to be University, Bhubaneswar, Odisha, India
| | - Kaomud Tyagi
- Centre for DNA Taxonomy, Molecular Systematics Division, Zoological Survey of India, New Alipore, Kolkata, West Bengal, India
| | - Shantanu Kundu
- Centre for DNA Taxonomy, Molecular Systematics Division, Zoological Survey of India, New Alipore, Kolkata, West Bengal, India
| | - Iftikar Rahaman
- Centre for DNA Taxonomy, Molecular Systematics Division, Zoological Survey of India, New Alipore, Kolkata, West Bengal, India
| | - Devkant Singha
- Centre for DNA Taxonomy, Molecular Systematics Division, Zoological Survey of India, New Alipore, Kolkata, West Bengal, India
| | - Kailash Chandra
- Centre for DNA Taxonomy, Molecular Systematics Division, Zoological Survey of India, New Alipore, Kolkata, West Bengal, India
| | - Srinivas Patnaik
- School of Biotechnology, Kalinga Institute of Industrial Technology (KIIT), Deemed to be University, Bhubaneswar, Odisha, India
| | - Vikas Kumar
- Centre for DNA Taxonomy, Molecular Systematics Division, Zoological Survey of India, New Alipore, Kolkata, West Bengal, India
|
18
|
Cristadoro G, Degli Esposti M, Altmann EG. The common origin of symmetry and structure in genetic sequences. Sci Rep 2018; 8:15817. [PMID: 30361485 PMCID: PMC6202410 DOI: 10.1038/s41598-018-34136-w] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 04/17/2018] [Accepted: 10/09/2018] [Indexed: 12/20/2022] Open
Abstract
Biologists have long sought a way to explain how statistical properties of genetic sequences emerged and are maintained through evolution. On the one hand, non-random structures at different scales indicate a complex genome organisation. On the other hand, single-strand symmetry has been scrutinised using neutral models in which correlations are not considered or irrelevant, contrary to empirical evidence. Different studies investigated these two statistical features separately, reaching minimal consensus despite sustained efforts. Here we unravel previously unknown symmetries in genetic sequences, which are organized hierarchically through scales in which non-random structures are known to be present. These observations are confirmed through the statistical analysis of the human genome and explained through a simple domain model. These results suggest that domain models which account for the cumulative action of mobile elements can explain simultaneously non-random structures and symmetries in genetic sequences.
Affiliation(s)
- Giampaolo Cristadoro
- Dipartimento di Matematica e Applicazioni, Università di Milano-Bicocca, 20125, Milano, Italy.
| | | | - Eduardo G Altmann
- School of Mathematics and Statistics, University of Sydney, Sydney, 2006, NSW, Australia
|
19
|
Sinha DK, Atray I, Agarrwal R, Bentur JS, Nair S. Genomics of the Asian rice gall midge and its interactions with rice. Current Opinion in Insect Science 2017; 19:76-81. [PMID: 28521946 DOI: 10.1016/j.cois.2017.03.004] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 08/31/2016] [Revised: 10/04/2016] [Accepted: 03/13/2017] [Indexed: 05/28/2023]
Abstract
Understanding virulence and manipulative strategies of gall formers will reveal new facets of plant defense and insect counter defense. Among the gall midges, the Asian rice gall midge (AGM) has emerged as a model for studies on plant-insect interactions. Data from several genomics, transcriptomics and metabolomics studies have revealed diverse strategies adopted by AGM to successfully invade the host while overcoming its defense. Adaptive skills of AGM transcend from its genomic and transcriptomic make-up. Information arising from studies on genetics, mitochondrial genome and miRNAs, amongst other parameters, highlights AGM's capacity to maneuver the host defense, reorient host metabolome and redirect its morphogenesis.
Affiliation(s)
- Deepak Kumar Sinha
- Plant-Insect Interaction Group, International Centre for Genetic Engineering and Biotechnology, Aruna Asaf Ali Marg, New Delhi 110 067, India
| | - Isha Atray
- Plant-Insect Interaction Group, International Centre for Genetic Engineering and Biotechnology, Aruna Asaf Ali Marg, New Delhi 110 067, India
| | - Ruchi Agarrwal
- Plant-Insect Interaction Group, International Centre for Genetic Engineering and Biotechnology, Aruna Asaf Ali Marg, New Delhi 110 067, India
| | | | - Suresh Nair
- Plant-Insect Interaction Group, International Centre for Genetic Engineering and Biotechnology, Aruna Asaf Ali Marg, New Delhi 110 067, India.
|
20
|
Gouveia S, Scotto MG, Weiß CH, Ferreira PJSG. Binary auto-regressive geometric modelling in a DNA context. J R Stat Soc Ser C Appl Stat 2016. [DOI: 10.1111/rssc.12172] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/28/2022]
|
21
|
Apostolou-Karampelis K, Nikolaou C, Almirantis Y. A novel skew analysis reveals substitution asymmetries linked to genetic code GC-biases and PolIII α-subunit isoforms. DNA Res 2016; 23:353-63. [PMID: 27345720 PMCID: PMC4991834 DOI: 10.1093/dnares/dsw021] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 02/11/2016] [Accepted: 05/09/2016] [Indexed: 11/30/2022] Open
Abstract
Strand biases reflect deviations from a null expectation of DNA evolution that assumes strand-symmetric substitution rates. Here, we present strong evidence that nearest-neighbour preferences are a strand-biased feature of bacterial genomes, indicating neighbour-dependent substitution asymmetries. To detect such asymmetries we introduce an alignment free index (relative abundance skews). The profiles of relative abundance skews along coding sequences can trace the phylogenetic relations of bacteria, suggesting that the patterns of neighbour-dependent substitution strand-biases are not common among different lineages, but are rather species-specific. Analysis of neighbour-dependent and codon-site skews sheds light on the origins of substitution asymmetries. Via a simple model we argue that the structure of the genetic code imposes position-dependent substitution strand-biases along coding sequences, as a response to GC mutation pressure. Thus, the organization of the genetic code per se can lead to an uneven distribution of nucleotides among different codon sites, even when requirements for specific codons and amino-acids are not accounted for. Moreover, our results suggest that strand-biases in replication fidelity of PolIII α-subunit induce substitution asymmetries, both neighbour-dependent and independent, on a genome scale. The role of DNA repair systems, such as transcription-coupled repair, is also considered.
Affiliation(s)
| | - Christoforos Nikolaou
- Computational Genomics Group, Department of Biology, University of Crete, 71409 Heraklion, Greece
| | - Yannis Almirantis
- Institute of Biosciences and Applications, National Center for Scientific Research "Demokritos", 15310 Athens, Greece
|
22
|
Rosandić M, Vlahović I, Glunčić M, Paar V. Trinucleotide's quadruplet symmetries and natural symmetry law of DNA creation ensuing Chargaff's second parity rule. J Biomol Struct Dyn 2016; 34:1383-94. [PMID: 26524490 DOI: 10.1080/07391102.2015.1080628] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 10/22/2022]
Abstract
For almost 50 years the conclusive explanation of Chargaff's second parity rule (CSPR), the equality of frequencies of nucleotides A=T and C=G or the equality of direct and reverse complement trinucleotides in the same DNA strand, has not been determined yet. Here, we relate CSPR to the interstrand mirror symmetry in 20 symbolic quadruplets of trinucleotides (direct, reverse complement, complement, and reverse) mapped to double-stranded genome. The symmetries of Q-box corresponding to quadruplets can be obtained as a consequence of Watson-Crick base pairing and CSPR together. Alternatively, assuming Natural symmetry law for DNA creation that each trinucleotide in one strand of DNA must simultaneously appear also in the opposite strand automatically leads to Q-box direct-reverse mirror symmetry which in conjunction with Watson-Crick base pairing generates CSPR. We demonstrate quadruplet's symmetries in chromosomes of wide range of organisms, from Escherichia coli to Neanderthal and human genomes, introducing novel quadruplet-frequency histograms and 3D-diagrams with combined interstrand frequencies. These "landscapes" are mutually similar in all mammals, including extinct Neanderthals, and somewhat different in most of older species. In human chromosomes 1-12, and X, Y the "landscapes" are almost identical and slightly different in the remaining smaller and telocentric chromosomes. Quadruplet frequencies could provide a new robust tool for characterization and classification of genomes and their evolutionary trajectories.
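The grouping of trinucleotides into quadruplets (direct, reverse complement, complement, reverse) can be reproduced by brute force. The sketch below, which assumes only the four operations named in the abstract, collects each trinucleotide with its three images and counts the distinct groups; the count of 20 agrees with the abstract, although whether the authors construct their Q-box in exactly this way is an assumption. Function and variable names are illustrative.

```python
from itertools import product

# Watson-Crick complement for unambiguous bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def quadruplet(tri: str) -> frozenset[str]:
    """Return the set {direct, reverse, complement, reverse complement}."""
    comp = tri.translate(COMPLEMENT)
    return frozenset({tri, tri[::-1], comp, comp[::-1]})


def all_quadruplets() -> set[frozenset[str]]:
    """Group the 64 trinucleotides into their symmetry classes."""
    return {quadruplet("".join(p)) for p in product("ACGT", repeat=3)}


if __name__ == "__main__":
    groups = all_quadruplets()
    print(len(groups))                 # number of distinct groups
    print(sorted(quadruplet("ACG")))   # one full quadruplet
    print(sorted(quadruplet("ACA")))   # a degenerate group: reverse equals direct
```

Trinucleotides whose first and last bases are equal coincide with their own reverse, so their groups contain two members instead of four.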
Affiliation(s)
- Marija Rosandić
- Croatian Academy of Sciences and Arts, HAZU, Bioinformatics and Biological Physics, Zrinski trg 11, 10000 Zagreb, Croatia
| | - Ines Vlahović
- Faculty of Science, University of Zagreb, Bijenicka 32, 10000 Zagreb, Croatia
| | - Matko Glunčić
- Faculty of Science, University of Zagreb, Bijenicka 32, 10000 Zagreb, Croatia
| | - Vladimir Paar
- Croatian Academy of Sciences and Arts, HAZU, Bioinformatics and Biological Physics, Zrinski trg 11, 10000 Zagreb, Croatia
- Faculty of Science, University of Zagreb, Bijenicka 32, 10000 Zagreb, Croatia
|
23
|
Zhang SH. Persistence and breakdown of strand symmetry in the human genome. J Theor Biol 2015; 370:202-4. [PMID: 25576243 DOI: 10.1016/j.jtbi.2014.12.014] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/01/2014] [Revised: 12/26/2014] [Accepted: 12/29/2014] [Indexed: 10/24/2022]
Abstract
Afreixo, V., Bastos, C.A.C., Garcia, S.P., Rodrigues, J.M.O.S., Pinho, A.J., Ferreira, P.J.S.G., 2013. The breakdown of the word symmetry in the human genome. J. Theor. Biol. 335, 153-159 analyzed the word symmetry (strand symmetry or the second parity rule) in the human genome. They concluded that strand symmetry holds for oligonucleotides up to 6 nt and is no longer statistically significant for oligonucleotides of higher orders. However, although they provided some new results for the issue, their interpretation would not be fully justified. Also, their conclusion needs to be further evaluated. Further analysis of their results, especially those of equivalence tests and word symmetry distance, shows that strand symmetry would persist for higher-order oligonucleotides up to 9 nt in the human genome, at least for its overall frequency framework (oligonucleotide frequency pattern).
Affiliation(s)
- Shang-Hong Zhang
- Key Laboratory of Gene Engineering of Ministry of Education, and Biotechnology Research Center, Sun Yat-sen University, Guangzhou 510275, China.
|
24
|
The mitochondrial genome of Dastarcus helophoroides (Coleoptera: Bothrideridae) and related phylogenetic analyses. Gene 2014; 560:15-24. [PMID: 25523091 DOI: 10.1016/j.gene.2014.12.026] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Received: 06/19/2014] [Revised: 11/21/2014] [Accepted: 12/12/2014] [Indexed: 11/23/2022]
Abstract
The complete mitochondrial genome of Dastarcus helophoroides (Coleoptera: Bothrideridae), which consists of 13 PCGs, 22 tRNA genes, two rRNA genes and a non-coding region (D-loop), was sequenced and has a length of 15,878 bp (GenBank: KF811054.1). The genome has a typical gene order which is identical to that of other Coleoptera species. Except for the COI gene, which generally starts with a non-canonical initiation codon, all protein-coding genes start with an ATN codon and terminate with the stop codon TA(A) or TAG. The secondary structures of rrnL and rrnS consist of 48 helices (including four newly proposed helices) and 35 helices (including two newly proposed helices), respectively. All 22 tRNAs in D. helophoroides are predicted to fold into the typical cloverleaf secondary structure, except trnS1 (AGN), in which the dihydrouracil arm (DHU arm) could not form a stable stem-loop structure. Thirteen protein-coding genes (nucleotide dataset and nucleic acid dataset) of the available species (29 taxa) were used to infer the phylogenetic relationships among these orders. Tenebrionoidea and Cucujoidea form a sister group, and D. helophoroides is classified into Cucujoidea (Bothrideridae). This study is the first phylogenetic analysis involving the D. helophoroides mitogenome, and the results strongly bolster the current morphology-based hypothesis.
|
25
|
Ju YS, Alexandrov LB, Gerstung M, Martincorena I, Nik-Zainal S, Ramakrishna M, Davies HR, Papaemmanuil E, Gundem G, Shlien A, Bolli N, Behjati S, Tarpey PS, Nangalia J, Massie CE, Butler AP, Teague JW, Vassiliou GS, Green AR, Du MQ, Unnikrishnan A, Pimanda JE, Teh BT, Munshi N, Greaves M, Vyas P, El-Naggar AK, Santarius T, Collins VP, Grundy R, Taylor JA, Hayes DN, Malkin D, Foster CS, Warren AY, Whitaker HC, Brewer D, Eeles R, Cooper C, Neal D, Visakorpi T, Isaacs WB, Bova GS, Flanagan AM, Futreal PA, Lynch AG, Chinnery PF, McDermott U, Stratton MR, Campbell PJ. Origins and functional consequences of somatic mitochondrial DNA mutations in human cancer. eLife 2014; 3:e02935. [PMID: 25271376 PMCID: PMC4371858 DOI: 10.7554/elife.02935] [Citation(s) in RCA: 284] [Impact Index Per Article: 28.4] [Received: 03/28/2014] [Accepted: 09/26/2014] [Indexed: 01/04/2023] Open
Abstract
Recent sequencing studies have extensively explored the somatic alterations present in the nuclear genomes of cancers. Although mitochondria control energy metabolism and apoptosis, the origins and impact of cancer-associated mutations in mtDNA are unclear. In this study, we analyzed somatic alterations in mtDNA from 1675 tumors. We identified 1907 somatic substitutions, which exhibited dramatic replicative strand bias, predominantly C > T and A > G on the mitochondrial heavy strand. This strand-asymmetric signature differs from those found in nuclear cancer genomes but matches the inferred germline process shaping primate mtDNA sequence content. A number of mtDNA mutations showed considerable heterogeneity across tumor types. Missense mutations were selectively neutral and often gradually drifted towards homoplasmy over time. In contrast, mutations resulting in protein truncation undergo negative selection and were almost exclusively heteroplasmic. Our findings indicate that the endogenous mutational mechanism has far greater impact than any other external mutagens in mitochondria and is fundamentally linked to mtDNA replication.
Affiliation(s)
- Young Seok Ju
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, United Kingdom
| | - Ludmil B Alexandrov
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, United Kingdom
| | - Moritz Gerstung
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, United Kingdom
| | - Inigo Martincorena
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, United Kingdom
| | - Serena Nik-Zainal
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, United Kingdom
| | - Manasa Ramakrishna
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, United Kingdom
| | - Helen R Davies
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, United Kingdom
| | - Elli Papaemmanuil
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, United Kingdom
| | - Gunes Gundem
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, United Kingdom
| | - Adam Shlien
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, United Kingdom
| | - Niccolo Bolli
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, United Kingdom
| | - Sam Behjati
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, United Kingdom
| | - Patrick S Tarpey
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, United Kingdom
| | - Jyoti Nangalia
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, United Kingdom
- Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
- Department of Haematology, University of Cambridge, Cambridge, United Kingdom
| | - Charles E Massie
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, United Kingdom
- Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
- Department of Haematology, University of Cambridge, Cambridge, United Kingdom
| | - Adam P Butler
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, United Kingdom
| | - Jon W Teague
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, United Kingdom
| | - George S Vassiliou
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, United Kingdom
- Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
- Department of Haematology, University of Cambridge, Cambridge, United Kingdom
| | - Anthony R Green
- Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
- Department of Haematology, University of Cambridge, Cambridge, United Kingdom
| | - Ming-Qing Du
- Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
| | - Ashwin Unnikrishnan
- Lowy Cancer Research Centre, University of New South Wales, Sydney, Australia
| | - John E Pimanda
- Lowy Cancer Research Centre, University of New South Wales, Sydney, Australia
| | - Bin Tean Teh
- Laboratory of Cancer Epigenome, National Cancer Centre, Singapore, Singapore
- Duke-NUS Graduate Medical School, Singapore, Singapore
| | - Nikhil Munshi
- Department of Hematologic Oncology, Dana-Farber Cancer Institute, Boston, United States
| | - Mel Greaves
- Institute of Cancer Research, Sutton, London, United Kingdom
| | - Paresh Vyas
- Weatherall Institute for Molecular Medicine, University of Oxford, Oxford, United Kingdom
| | - Adel K El-Naggar
- Department of Pathology, MD Anderson Cancer Center, Houston, United States
| | - Tom Santarius
- Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
| | - V Peter Collins
- Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
| | - Richard Grundy
- Children's Brain Tumour Research Centre, University of Nottingham, Nottingham, United Kingdom
| | - Jack A Taylor
- National Institute of Environmental Health Sciences, National Institute of Health, Triangle, North Carolina, United States
| | - D Neil Hayes
- Department of Internal Medicine, University of North Carolina, Chapel Hill, United States
| | - David Malkin
- Hospital for Sick Children, University of Toronto, Toronto, Canada
| | - Christopher S Foster
- Department of Molecular and Clinical Cancer Medicine, University of Liverpool, London, United Kingdom
- HCA Pathology Laboratories, London, United Kingdom
| | - Anne Y Warren
- Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
| | - Hayley C Whitaker
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
| | - Daniel Brewer
- Institute of Cancer Research, Sutton, London, United Kingdom
- School of Biological Sciences, University of East Anglia, Norwich, United Kingdom
| | - Rosalind Eeles
- Institute of Cancer Research, Sutton, London, United Kingdom
| | - Colin Cooper
- Institute of Cancer Research, Sutton, London, United Kingdom
- School of Biological Sciences, University of East Anglia, Norwich, United Kingdom
| | - David Neal
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
| | - Tapio Visakorpi
- Institute of Biosciences and Medical Technology - BioMediTech and Fimlab Laboratories, University of Tampere and Tampere University Hospital, Tampere, Finland
| | - William B Isaacs
- Department of Oncology, Johns Hopkins University, Baltimore, United States
| | - G Steven Bova
- Institute of Biosciences and Medical Technology - BioMediTech and Fimlab Laboratories, University of Tampere and Tampere University Hospital, Tampere, Finland
| | - Adrienne M Flanagan
- Department of Histopathology, Royal National Orthopaedic Hospital, Middlesex, United Kingdom
- University College London Cancer Institute, University College London, London, United Kingdom
| | - P Andrew Futreal
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, United Kingdom
- Department of Genomic Medicine, The University of Texas, MD Anderson Cancer Center, Houston, Texas, United States
| | - Andy G Lynch
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
| | - Patrick F Chinnery
- Wellcome Trust Centre for Mitochondrial Research, Institute of Genetic Medicine, Newcastle University, Newcastle-upon-tyne, United Kingdom
| | - Ultan McDermott
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, United Kingdom
- Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
| | - Michael R Stratton
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, United Kingdom
| | - Peter J Campbell
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, United Kingdom
- Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
- Department of Haematology, University of Cambridge, Cambridge, United Kingdom
|
26
|
Wang S, Tu J, Jia Z, Lu Z. High order intra-strand partial symmetry increases with organismal complexity in animal evolution. Sci Rep 2014; 4:6400. [PMID: 25263801 PMCID: PMC4178289 DOI: 10.1038/srep06400] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 06/17/2014] [Accepted: 08/28/2014] [Indexed: 12/02/2022] Open
Abstract
For a sufficiently long genomic sequence, the frequency of any short nucleotide fragment on one strand is approximately equal to the frequency of its reverse complement on the same strand. Despite being studied for over two decades, the precise mechanism involved has not yet been made clear. In this study, we calculated the high-order intra-strand partial symmetry (IPS) for 14 animal species by using a fixed sliding window method to scan each genome sequence. The study showed that the IPS was positively associated with organismal complexity, measured by the number of distinct cell types. The results indicated that the IPS might result from the increase in functional non-coding DNA and might play an important role in the evolution of complex body plans.
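The symmetry stated in the first sentence of this abstract can be checked directly by counting each k-mer and its reverse complement on a single strand. The sketch below computes a simple normalised distance between the two sets of counts (0 for perfect symmetry, 1 when no k-mer has a matching reverse complement). It is an illustrative measure only and is not the IPS statistic or the sliding-window procedure used in the paper; all names are hypothetical.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word: str) -> str:
    """Reverse complement of an unambiguous DNA word."""
    return word.translate(COMPLEMENT)[::-1]


def symmetry_distance(seq: str, k: int) -> float:
    """Normalised mismatch between k-mer and reverse-complement counts.

    Returns a value in [0, 1]; 0 means every k-mer occurs exactly as
    often as its reverse complement on the same strand.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Include reverse complements that never occur, so both directions count.
    words = set(counts) | {reverse_complement(w) for w in counts}
    mismatch = sum(abs(counts[w] - counts[reverse_complement(w)]) for w in words)
    return mismatch / (2 * total)


if __name__ == "__main__":
    print(symmetry_distance("ACGTACGT", 2))  # symmetric toy sequence
    print(symmetry_distance("AAAA", 2))      # maximally asymmetric toy sequence
```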
Affiliation(s)
- Shengqin Wang
- State Key Lab of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, 210096, China
| | - Jing Tu
- State Key Lab of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, 210096, China
| | - Zhongwei Jia
- National Institute of Drug Dependence, Peking University, Beijing 100191, China
| | - Zuhong Lu
- State Key Lab of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, 210096, China
- Department of Biomedical Engineering, College of Engineering, Peking University, Beijing, 100781, China
|
27
|
Seligmann H. Species radiation by DNA replication that systematically exchanges nucleotides? J Theor Biol 2014; 363:216-22. [PMID: 25192628 DOI: 10.1016/j.jtbi.2014.08.036] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 05/16/2014] [Revised: 08/14/2014] [Accepted: 08/19/2014] [Indexed: 11/28/2022]
Abstract
RNA and DNA syntheses share many properties. Therefore, the existence of 'swinger' RNAs, presumed 'orphan' transcripts matching genomic sequences only if transcription systematically exchanged nucleotides, suggests replication producing swinger DNA. Transcripts occur in many short-lived copies, the few cellular DNA molecules are long-lived. Hence pressures for functional swinger DNAs are greater than for swinger RNAs. Protein coding properties of swinger sequences differ from original sequences, suggesting rarity of corresponding swinger DNA. For genes producing structural RNAs, such as tRNAs and rRNAs, three exchanges (A<->T, C<->G and A<->T+C<->G) conserve self-hybridization properties. All nuclear eukaryote swinger DNA sequences detected in GenBank are for rRNA genes assuming A<->T+C<->G exchanges. In brachyuran crabs, 25 species had A<->T+C<->G swinger 18S rDNA, all matching the reverse-exchanged version of regular 18S rDNA of a related species. In this taxon, swinger replication of 18S rDNA apparently associated with, or even resulted in species radiation. A<->T+C<->G transformation doesn't invert sequence direction, differing from inverted repeats. Swinger repeats (detectable only assuming swinger transformations, A<->T+C<->G swinger repeats most frequent) within regular human rRNAs, independently confirm swinger polymerizations for most swinger types. Swinger replication might be an unsuspected molecular mechanism for ultrafast speciation.
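The three symmetric exchanges named in the abstract are plain character substitutions, and the remark that A<->T+C<->G does not invert sequence direction can be made concrete by comparing it with the reverse complement. A minimal sketch, assuming uppercase unambiguous DNA; the dictionary keys reuse the abstract's notation and the function names are hypothetical.

```python
# Translation tables for the three symmetric exchanges named in the abstract.
EXCHANGES = {
    "A<->T": str.maketrans("AT", "TA"),
    "C<->G": str.maketrans("CG", "GC"),
    "A<->T+C<->G": str.maketrans("ACGT", "TGCA"),
}


def swinger(seq: str, exchange: str) -> str:
    """Apply a systematic nucleotide exchange without reversing the sequence."""
    return seq.upper().translate(EXCHANGES[exchange])


def reverse_complement(seq: str) -> str:
    """Ordinary reverse complement, for contrast with the swinger transform."""
    return swinger(seq, "A<->T+C<->G")[::-1]


if __name__ == "__main__":
    toy = "AACG"  # toy sequence for illustration
    print(swinger(toy, "A<->T+C<->G"))  # same direction, bases exchanged
    print(reverse_complement(toy))      # exchanged and reversed
```

Each exchange is its own inverse, so applying the same exchange twice restores the original sequence.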
Affiliation(s)
- Hervé Seligmann
- Unité de Recherche sur les Maladies Infectieuses et Tropicales Émergentes, Faculté de Médecine, URMITE CNRS-IRD 198 UMER 6236, Université de la Méditerranée, Marseille, France.
28
Nikolaou C, Bermúdez I, Manichanh C, García-Martinez J, Guigó R, Pérez-Ortín JE, Roca J. Topoisomerase II regulates yeast genes with singular chromatin architectures. Nucleic Acids Res 2013; 41:9243-56. [PMID: 23935120 PMCID: PMC3814376 DOI: 10.1093/nar/gkt707] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 12/14/2022] Open
Abstract
Eukaryotic topoisomerase II (topo II) is the essential decatenase of newly replicated chromosomes and the main relaxase of nucleosomal DNA. Apart from these general tasks, topo II participates in more specialized functions. In mammals, topo IIα interacts with specific RNA polymerases and chromatin-remodeling complexes, whereas topo IIβ regulates developmental genes in conjunction with chromatin remodeling and heterochromatin transitions. Here we show that in budding yeast, topo II regulates the expression of specific gene subsets. To uncover this, we carried out a genomic transcription run-on shortly after the thermal inactivation of topo II. We identified a modest number of genes not involved in the general stress response but strictly dependent on topo II. These genes present distinctive functional and structural traits in comparison with the genome average. Yeast topo II is a positive regulator of genes with well-defined promoter architecture that associates with chromatin remodeling complexes; it is a negative regulator of genes that are extremely hypo-acetylated, with complex promoters and undefined nucleosome positioning, many of which are involved in polyamine transport. These findings indicate that yeast topo II operates on singular chromatin architectures to activate or repress DNA transcription and that this activity produces functional responses to ensure chromatin stability.
Affiliation(s)
- Christoforos Nikolaou
- Molecular Biology Institute of Barcelona, CSIC, 08028 Barcelona, Spain, Department of Biology, University of Crete, 71409 Heraklion, Greece, Department of Genetics and ERI Biotecmed, University of Valencia, 46100 Burjassot, Spain, Centre for Genomic Regulation (CRG), 08003 Barcelona, Spain and Department of Biochemistry and Molecular Biology and ERI Biotecmed, University of Valencia, 46100 Burjassot, Spain
29
Patterns of nucleotide asymmetries in plant and animal genomes. Biosystems 2013; 111:181-9. [PMID: 23438636 DOI: 10.1016/j.biosystems.2013.02.001] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 09/28/2012] [Revised: 11/29/2012] [Accepted: 02/07/2013] [Indexed: 11/20/2022]
Abstract
Symmetry in biology provides many intriguing puzzles to the scientist's mind. Chargaff's second parity rule states a symmetric distribution of oligonucleotides within a single strand of double-stranded DNA. While this rule has been verified in a wide range of microbial genomes, it still awaits explanation. In our study, we inquired into patterns of mono- and trinucleotide intra-strand parity in complex plant genomic sequences that became available during the last few years, and compared these to equally complex animal genomes. The degree and patterns of deviation from Chargaff's second rule were different between plant and animal species. We observed a universal inter-chromosomal homogeneity of mononucleotide skews in coding sequences of plant chromosomes, while the base composition of animal coding sequences differed between chromosomes even within a single species. We also found differences in the base composition of dicot introns in comparison to those of monocots. These genome-wide patterns were limited to genic regions and were not encountered in inter-genic sequences. We discuss the implications of our findings in relation to hypotheses about functional correlations of intra-strand parity which have hitherto been put forward. Furthermore, we propose more recent polyploidization and subsequent homogenization of homoeologues as a possible reason for more homogeneous skew patterns in plants.
30
Zhang SH, Wang L. Two common profiles exist for genomic oligonucleotide frequencies. BMC Res Notes 2012; 5:639. [PMID: 23158698 PMCID: PMC3532236 DOI: 10.1186/1756-0500-5-639] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 01/24/2012] [Accepted: 11/14/2012] [Indexed: 11/19/2022] Open
Abstract
Background: It was reported that there is a majority profile for trinucleotide frequencies among genomes. Further study has revealed that two common profiles, rather than one majority profile, exist for genomic trinucleotide frequencies. However, the origins of the common/majority profile remain elusive. Moreover, it is not clear whether the features of the common profile extend to oligonucleotides other than trinucleotides. Findings: We analyzed 571 prokaryotic genomes (chromosomes) and some selected eukaryotic nuclear genomes as well as other genetic systems to study their compositional features. We found that there are also two common profiles for genomic oligonucleotide frequencies: one is from low-GC content genomes, and the other is from high-GC content genomes. Furthermore, each common profile is highly correlated to the average profile of random sequences with corresponding GC content and generated according to first-order symmetry. Conclusions: The causes for the existence of two common profiles would mainly be GC content variations and strand symmetry of genomic sequences. Therefore, both GC content and strand symmetry would play important roles in genome evolution.
Affiliation(s)
- Shang-Hong Zhang
- Key Laboratory of Gene Engineering of Ministry of Education, and Biotechnology Research Center, Sun Yat-sen University, Guangzhou, 510275, China.
31
Xia X. DNA replication and strand asymmetry in prokaryotic and mitochondrial genomes. Curr Genomics 2012; 13:16-27. [PMID: 22942672 PMCID: PMC3269012 DOI: 10.2174/138920212799034776] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Received: 07/07/2011] [Revised: 09/26/2011] [Accepted: 10/02/2011] [Indexed: 11/22/2022] Open
Abstract
Different patterns of strand asymmetry have been documented in a variety of prokaryotic genomes as well as mitochondrial genomes. Because different replication mechanisms often lead to different patterns of strand asymmetry, much can be learned about replication mechanisms by examining strand asymmetry. Here I summarize the diverse patterns of strand asymmetry among different taxonomic groups to suggest that (1) single-origin replication may not be universal among bacterial species, as the endosymbionts Wigglesworthia glossinidia, Wolbachia species, cyanobacterium Synechocystis 6803 and Mycoplasma pulmonis genomes all exhibit strand asymmetry patterns consistent with multiple origins of replication, (2) different replication origins in some archaeal genomes leave quite different patterns of strand asymmetry, suggesting that different replication origins in the same genome may be differentially used, (3) mitochondrial genomes from representative vertebrate species share one strand asymmetry pattern consistent with the strand-displacement replication documented in mammalian mtDNA, suggesting that the mtDNA replication mechanism in mammals may be shared among all vertebrate species, and (4) mitochondrial genomes from primitive forms of metazoans such as the sponge and hydra (representing Porifera and Cnidaria, respectively), as well as those from plants, have strand asymmetry patterns similar to single-origin or multi-origin replications observed in prokaryotes and are drastically different from mitochondrial genomes from other metazoans. This may explain why sponge and hydra mitochondrial genomes, as well as plant mitochondrial genomes, evolve much more slowly than those from other metazoans.
Affiliation(s)
- Xuhua Xia
- Department of Biology and Center for Advanced Research in Environmental Genomics, University of Ottawa, 30 Marie Curie, P.O. Box 450, Station A, Ottawa, Ontario, Canada
32
Arakawa K, Tomita M. Measures of compositional strand bias related to replication machinery and its applications. Curr Genomics 2012; 13:4-15. [PMID: 22942671 PMCID: PMC3269016 DOI: 10.2174/138920212799034749] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 07/23/2011] [Revised: 09/10/2011] [Accepted: 09/20/2011] [Indexed: 11/22/2022] Open
Abstract
The compositional asymmetry of complementary bases in nucleotide sequences implies the existence of a mutational or selectional bias in the two strands of the DNA duplex, which is commonly shaped by strand-specific mechanisms in transcription or replication. Such strand bias in genomes, frequently visualized by GC skew graphs, is used for the computational prediction of transcription start sites and replication origins, as well as for comparative evolutionary genomics studies. The use of measures of compositional strand bias in order to quantify the degree of strand asymmetry is crucial, as it is the basis for determining the applicability of compositional analysis and comparing the strength of the mutational bias in different biological machineries in various species. Here, we review the measures of strand bias that have been proposed to date, including the ∆GC skew, the B1 index, the predictability score of linear discriminant analysis for gene orientation, the signal-to-noise ratio of the oligonucleotide bias, and the GC skew index. These measures have been predominantly designed for and applied to the analysis of replication-related mutational processes in prokaryotes, but we also give research examples in eukaryotes.
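As a rough illustration of the GC skew graphs mentioned in this abstract, the sketch below uses the conventional definition GC skew = (G − C)/(G + C), computed per window and then accumulated. It does not implement the specific measures reviewed in the paper (∆GC skew, B1 index, GC skew index); the function names and window parameters are assumptions made for the example.

```python
def gc_skew(seq: str) -> float:
    """Conventional GC skew (G - C) / (G + C); 0.0 if the window has no G or C."""
    s = seq.upper()
    g, c = s.count("G"), s.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def windowed_gc_skew(seq: str, window: int, step: int) -> list[float]:
    """GC skew in consecutive windows of the given size along one strand."""
    return [gc_skew(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


def cumulative_skew(values: list[float]) -> list[float]:
    """Running sum of window skews; extrema are commonly read as origin/terminus candidates."""
    total, out = 0.0, []
    for v in values:
        total += v
        out.append(total)
    return out


if __name__ == "__main__":
    toy = "GGGGCCCC"  # toy sequence, not real genomic data
    skews = windowed_gc_skew(toy, window=4, step=4)
    print(skews)                   # [1.0, -1.0]
    print(cumulative_skew(skews))  # [1.0, 0.0]
```

In the toy sequence the sign of the skew flips between the two halves, which is the kind of switch a skew graph makes visible along a chromosome.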
Affiliation(s)
- Kazuharu Arakawa
- Institute for Advanced Biosciences, Keio University, Fujisawa 252-8520, Japan
33
Selva Kumar C, Nair RR, Sivaramakrishnan KG, Ganesh D, Janarthanan S, Arunachalam M, Sivaruban T. Influence of certain forces on evolution of synonymous codon usage bias in certain species of three basal orders of aquatic insects. Mitochondrial DNA 2012; 23:447-60. [PMID: 22943112 DOI: 10.3109/19401736.2012.710203] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022]
Abstract
Forces that influence the evolution of synonymous codon usage bias are analyzed in six species of three basal orders of aquatic insects. The rationale behind choosing six species of aquatic insects (three from Ephemeroptera, one from Plecoptera, and two from Odonata) for the present analysis is based on phylogenetic position at the basal clades of the Order Insecta facilitating the understanding of the evolution of codon bias and of factors shaping codon usage patterns in primitive clades of insect lineages and their subtle differences in some of their ecological and environmental requirements in terms of habitat-microhabitat requirements, altitudinal preferences, temperature tolerance ranges, and consequent responses to climate change impacts. The present analysis focuses on open reading frames of the 13 protein-coding genes in the mitochondrial genome of six carefully chosen insect species to get a comprehensive picture of the evolutionary intricacies of codon bias. In all the six species, A and T contents are observed to be significantly higher than G and C, and are used roughly equally. Since transcription hypothesis on codon usage demands A richness and T poorness, it is quite likely that mutation pressure may be the key factor associated with synonymous codon usage (SCU) variations in these species because the mutation hypothesis predicts AT richness and GC poorness in the mitochondrial DNA. Thus, AT-biased mutation pressure seems to be an important factor in framing the SCU variation in all the selected species of aquatic insects, which in turn explains the predominance of A and T ending codons in these species. This study does not find any association between microhabitats and codon usage variations in the mitochondria of selected aquatic insects. However, this study has identified major forces, such as compositional constraints and mutation pressure, which shape patterns of codon usage in mitochondrial genes in the primitive clades of insect lineages.
Affiliation(s)
- C Selva Kumar
- Department of Zoology, University of Madras, Chennai 600 025, Tamil Nadu, India
34
Kensche PR, Duarte I, Huynen MA. A three-dimensional topology of complex I inferred from evolutionary correlations. BMC Struct Biol 2012; 12:19. [PMID: 22857522 PMCID: PMC3436739 DOI: 10.1186/1472-6807-12-19] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 02/13/2012] [Accepted: 06/28/2012] [Indexed: 11/22/2022]
Abstract
Background: The quaternary structure of eukaryotic NADH:ubiquinone oxidoreductase (complex I), the largest complex of the oxidative phosphorylation, is still mostly unresolved. Furthermore, it is unknown where transiently bound assembly factors interact with complex I. We therefore asked whether the evolution of complex I contains information about its 3D topology and the binding positions of its assembly factors. We approached these questions by correlating the evolutionary rates of eukaryotic complex I subunits using the mirror-tree method and mapping the results into a 3D representation by multidimensional scaling. Results: More than 60% of the evolutionary correlation among the conserved seven subunits of the complex I matrix arm can be explained by the physical distance between the subunits. The three-dimensional evolutionary model of the eukaryotic conserved matrix arm has a striking similarity to the matrix arm quaternary structure in the bacterium Thermus thermophilus (rmsd = 19 Å) and supports the previous finding that in eukaryotes the N-module is turned relative to the Q-module when compared to bacteria. By contrast, the evolutionary rates contained little information about the structure of the membrane arm. A large evolutionary model of 45 subunits and assembly factors allows prediction of subunit positions and interactions (rmsd = 52.6 Å). The model supports an interaction of NDUFAF3, C8orf38 and C2orf56 during the assembly of the proximal matrix arm and the membrane arm. The model further suggests a tight relationship between the assembly factor NUBPL and NDUFA2, which both have been linked to iron-sulfur cluster assembly, as well as between NDUFA12 and its paralog, the assembly factor NDUFAF2. Conclusions: The physical distance between subunits of complex I is a major correlate of the rate of protein evolution in the complex I matrix arm and is sufficient to infer parts of the complex's structure with high accuracy. The resulting evolutionary model predicts the positions of a number of subunits and assembly factors.
Affiliation(s)
- Philip R Kensche
- Center for Molecular and Biomolecular Informatics/Nijmegen Center for Molecular Life Sciences, Radboud University Medical Center, PO Box 9101, Nijmegen, HB, 6500, The Netherlands.
35
Mutational bias plays an important role in shaping longevity-related amino acid content in mammalian mtDNA-encoded proteins. J Mol Evol 2012; 74:332-41. [PMID: 22752047 DOI: 10.1007/s00239-012-9510-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 03/11/2012] [Accepted: 06/12/2012] [Indexed: 10/28/2022]
Abstract
During the course of evolution, amino acid shifts might have resulted in mitochondrial proteomes better endowed to resist oxidative stress. However, owing to the problem of distinguishing between functional constraints/adaptations in protein sequences and mutation-driven biases in the composition of these sequences, the adaptive value of such amino acid shifts remains under discussion. We have analyzed the coding sequences of mtDNA from 173 mammalian species, dissecting the effect of nucleotide composition on amino acid usages. We found remarkable cysteine avoidance in mtDNA-encoded proteins. However, no effect of longevity on cysteine content could be detected. On the other hand, nucleotide compositional shifts fully accounted for threonine usages. In spite of a strong effect of mutational bias on methionine abundances, our results suggest a role of selection in determining the composition of methionine. Whether this selective effect is linked or not to protection against oxidative stress is still a subject of debate.
36
Wei SJ, Shi M, Chen XX, Sharkey MJ, van Achterberg C, Ye GY, He JH. New views on strand asymmetry in insect mitochondrial genomes. PLoS One 2010; 5:e12708. [PMID: 20856815 PMCID: PMC2939890 DOI: 10.1371/journal.pone.0012708] [Citation(s) in RCA: 198] [Impact Index Per Article: 14.1] [Received: 10/05/2009] [Accepted: 08/20/2010] [Indexed: 01/16/2023] Open
Abstract
Strand asymmetry in nucleotide composition is a remarkable feature of animal mitochondrial genomes. Understanding the mutation processes that shape strand asymmetry is essential for comprehensive knowledge of genome evolution, demographical population history and accurate phylogenetic inference. Previous studies found that the relative contributions of different substitution types to strand asymmetry are associated with replication alone or both replication and transcription. However, the relative contributions of replication and transcription to strand asymmetry remain unclear. Here we conducted a broad survey of strand asymmetry across 120 insect mitochondrial genomes, with special reference to the correlation between the signs of skew values and replication orientation/gene direction. The results show that the sign of GC skew on entire mitochondrial genomes is reversed in all species of three distantly related families of insects, Philopteridae (Phthiraptera), Aleyrodidae (Hemiptera) and Braconidae (Hymenoptera); the replication-related elements in the A+T-rich regions of these species are inverted, confirming that reversal of strand asymmetry (GC skew) was caused by inversion of replication origin; and finally, the sign of GC skew value is associated with replication orientation but not with gene direction, while that of AT skew value varies with gene direction, replication and codon positions used in analyses. These findings show that deaminations during replication and other mutations contribute more than selection on amino acid sequences to strand compositions of G and C, and that the replication process has a stronger effect on A and T content than does transcription. Our results may contribute to genome-wide studies of replication and transcription mechanisms.
Affiliation(s)
- Shu-Jun Wei
- State Key Laboratory of Rice Biology and Ministry of Agriculture Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University, Hangzhou, China
- Institute of Plant and Environmental Protection, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
- Min Shi
- State Key Laboratory of Rice Biology and Ministry of Agriculture Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University, Hangzhou, China
- Xue-Xin Chen
- State Key Laboratory of Rice Biology and Ministry of Agriculture Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University, Hangzhou, China
- Michael J. Sharkey
- Department of Entomology, University of Kentucky, Lexington, Kentucky, United States of America
- Gong-Yin Ye
- State Key Laboratory of Rice Biology and Ministry of Agriculture Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University, Hangzhou, China
- Jun-Hua He
- State Key Laboratory of Rice Biology and Ministry of Agriculture Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University, Hangzhou, China
37
Zhang SH, Huang YZ. Limited contribution of stem-loop potential to symmetry of single-stranded genomic DNA. Bioinformatics 2009; 26:478-85. [PMID: 20031973 DOI: 10.1093/bioinformatics/btp703] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Indexed: 01/09/2023]
Abstract
MOTIVATION The phenomenon of strand symmetry, which may provide clues to genome evolution, exists in all prokaryotic and eukaryotic genomes studied. Several possible mechanisms for its origins have been proposed, including: no strand biases for mutation and selection, strand inversion and selection of stem-loop structures. However, the relative contributions of these mechanisms to strand symmetry are not clear. In this article, we studied specifically the role of stem-loop potential of single-stranded DNA in strand symmetry. RESULTS We analyzed the complete genomes of 90 prokaryotes. We found that most oligonucleotides (pentanucleotides and higher) do not have a reverse complement in close proximity in the genomic sequences. Combined with further analysis, we conclude that the contribution of the widespread stem-loop potential of single-stranded genomic DNA to the formation and maintenance of strand symmetry would be very limited, at least for higher-order oligonucleotides. Therefore, other possible causes for strand symmetry must be taken into account to a deeper degree.
Affiliation(s)
- Shang-Hong Zhang
- The Key Laboratory of Gene Engineering of Ministry of Education, and Biotechnology Research Center, Sun Yat-sen University, Guangzhou 510275, China.
38
Powdel BR, Satapathy SS, Kumar A, Jha PK, Buragohain AK, Borah M, Ray SK. A study in entire chromosomes of violations of the intra-strand parity of complementary nucleotides (Chargaff's second parity rule). DNA Res 2009; 16:325-43. [PMID: 19861381 PMCID: PMC2780954 DOI: 10.1093/dnares/dsp021] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Indexed: 01/05/2023] Open
Abstract
Chargaff's rule of intra-strand parity (ISP) between complementary mono/oligonucleotides in chromosomes is well established in the scientific literature. Although a large number of papers citing works and discussions on ISP have been published in the genomic era, scientists are yet to find all the factors responsible for such a universal phenomenon in the chromosomes. In the present work, we have tried to address the issue from a new perspective, which is a parallel feature to ISP. The compositional abundance values of mono/oligonucleotides were determined in all non-overlapping sub-chromosomal regions of specific size. The frequency distributions of the mono/oligonucleotides among the regions were also compared using the Kolmogorov–Smirnov test. Interestingly, the frequency distributions between the complementary mono/oligonucleotides revealed statistical similarity, which we named intra-strand frequency distribution parity (ISFDP). ISFDP was observed as a general feature in chromosomes of bacteria, archaea and eukaryotes. Violation of ISFDP was also observed in several chromosomes. Chromosomes of different strains belonging to a species in bacteria/archaea (Haemophilus influenzae, Xylella fastidiosa etc.) and chromosomes of a eukaryote are found to differ from one another with respect to ISFDP violation. ISFDP correlates weakly with ISP in chromosomes, suggesting that the latter is not entirely responsible for the former. Asymmetry of replication topography and composition of forward-encoded sequences between the strands in chromosomes are found to be insufficient to explain the ISFDP feature in all chromosomes. This suggests that multiple factors in chromosomes are responsible for establishing ISFDP.
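The comparison described in this abstract (counting a nucleotide in non-overlapping regions and comparing the resulting distributions for complementary nucleotides) can be sketched as follows. This is only an illustration under stated assumptions: the function names and window size are hypothetical, only the two-sample Kolmogorov–Smirnov statistic D is computed (no p-value), and the toy sequence is not real data.

```python
from bisect import bisect_right


def window_counts(seq: str, base: str, window: int) -> list[int]:
    """Count `base` in consecutive non-overlapping windows of one strand."""
    s = seq.upper()
    return [s[i:i + window].count(base)
            for i in range(0, len(s) - window + 1, window)]


def ks_statistic(xs: list[int], ys: list[int]) -> float:
    """Two-sample KS statistic D: largest gap between the two empirical CDFs.

    Assumes both samples are non-empty.
    """
    xs, ys = sorted(xs), sorted(ys)
    d = 0.0
    for v in set(xs) | set(ys):
        fx = bisect_right(xs, v) / len(xs)  # fraction of xs <= v
        fy = bisect_right(ys, v) / len(ys)  # fraction of ys <= v
        d = max(d, abs(fx - fy))
    return d


if __name__ == "__main__":
    toy = "AATTAAAT"  # toy sequence, not real genomic data
    a_counts = window_counts(toy, "A", 4)
    t_counts = window_counts(toy, "T", 4)
    print(a_counts, t_counts)                # [2, 3] [2, 1]
    print(ks_statistic(a_counts, t_counts))  # 0.5
```

A small D for a complementary pair such as A and T would correspond to what the abstract calls ISFDP; a statistical test would additionally need the p-value for D, which is omitted here.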
Affiliation(s)
- B R Powdel
- Department of Mathematical Sciences, Tezpur University, Tezpur, Assam 784 028, India
39
Krishnan NM, Rao BJ. A comparative approach to elucidate chloroplast genome replication. BMC Genomics 2009; 10:237. [PMID: 19457260 PMCID: PMC2695485 DOI: 10.1186/1471-2164-10-237] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 10/22/2008] [Accepted: 05/20/2009] [Indexed: 11/30/2022] Open
Abstract
BACKGROUND Electron microscopy analyses of replicating chloroplast molecules earlier predicted bidirectional Cairns replication as the prevalent mechanism, perhaps followed by rounds of a rolling circle mechanism. This standard model is being challenged by the recent proposition of homologous recombination-mediated replication in chloroplasts. RESULTS We address this issue in our current study by analyzing nucleotide composition in genome regions between known replication origins, with an aim to reveal any adenine to guanine deamination gradients. These gradual linear gradients typically result from the accumulation of deaminations over the time spent single-stranded by one of the strands of the circular molecule during replication and can, therefore, be used to model the course of replication. Our linear regression analyses on the nucleotide compositions of the non-coding regions and the synonymous third codon position of coding regions, between pairs of replication origins, reveal the existence of significant adenine to guanine deamination gradients in portions overlapping the Small Single Copy (SSC) and the Large Single Copy (LSC) regions between inverted repeats. These gradients increase bi-directionally from the center of each region towards the respective ends, suggesting that both the strands were left single-stranded during replication. CONCLUSION Single-stranded regions of the genome and gradients in time that these regions are left single-stranded, as revealed by our nucleotide composition analyses, appear to converge with the original bi-directional dual displacement loop model and restore evidence for its existence as the primary mechanism. Other proposed faster modes such as homologous recombination and rolling circle initiation could exist in addition to this primary mechanism to facilitate homoplasmy among the intra-cellular chloroplast population.
Affiliation(s)
- Neeraja M Krishnan
- B-202, Department of Biological Sciences, Tata Institute of Fundamental Research, 1 Homi Bhabha road, Colaba, Mumbai 400 005, India
- Current address: Molecular Reproduction, Development and Genetics, Indian Institute of Science, Bangalore 560 012, India
- Basuthkar J Rao
- B-202, Department of Biological Sciences, Tata Institute of Fundamental Research, 1 Homi Bhabha road, Colaba, Mumbai 400 005, India
40
Jobson RW, Qiu YL. Did RNA editing in plant organellar genomes originate under natural selection or through genetic drift? Biol Direct 2008; 3:43. [PMID: 18939975 PMCID: PMC2584032 DOI: 10.1186/1745-6150-3-43] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.5] [Received: 10/07/2008] [Accepted: 10/21/2008] [Indexed: 11/15/2022] Open
Abstract
Background: The C↔U substitution types of RNA editing have been observed frequently in organellar genomes of land plants. Although various attempts have been made to explain why such a seemingly inefficient genetic mechanism would have evolved, no satisfactory explanation exists in our view. In this study, we examined editing patterns in chloroplast genomes of the hornwort Anthoceros formosae and the fern Adiantum capillus-veneris and in mitochondrial genomes of the angiosperms Arabidopsis thaliana, Beta vulgaris and Oryza sativa, to gain an understanding of how RNA editing originated. Results: We found that 1) most editing sites were distributed at the 2nd and 1st codon positions, 2) editing affected codons that resulted in larger hydrophobicity and molecular size changes much more frequently than those with little change involved, 3) editing uniformly increased protein hydrophobicity, 4) editing occurred more frequently in ancestrally T-rich sequences, which were more abundant in genes encoding membrane-bound proteins with many hydrophobic amino acids than in genes encoding soluble proteins, and 5) editing occurred most often in genes found to be under strong selective constraint. Conclusion: These analyses show that editing mostly affects functionally important and evolutionarily conserved codon positions, codons and genes encoding membrane-bound proteins. In particular, abundance of RNA editing in plant organellar genomes may be associated with disproportionately large percentages of genes in these two genomes that encode membrane-bound proteins, which are rich in hydrophobic amino acids and selectively constrained. These data support a hypothesis that natural selection imposed by protein functional constraints has contributed to selective fixation of certain editing sites and maintenance of the editing activity in plant organelles over a period of more than four hundred million years. The retention of genes encoding RNA editing activity may be driven by forces that shape nucleotide composition equilibrium in two organellar genomes of these plants. Nevertheless, the causes of lineage-specific occurrence of a large portion of RNA editing sites remain to be determined. Reviewers: This article was reviewed by Michael Gray (nominated by Laurence Hurst), Kirsten Krause (nominated by Martin Lercher), and Jeffery Mower (nominated by David Ardell).
Affiliation(s)
- Richard W Jobson
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan 48109-1048, USA.
41
Sorimachi K, Okayasu T. Codon evolution is governed by linear formulas. Amino Acids 2008; 34:661-8. [PMID: 18180868 DOI: 10.1007/s00726-007-0024-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 11/26/2007] [Accepted: 12/17/2007] [Indexed: 10/22/2022]
Abstract
When nucleotide (G, C, T and A) contents were plotted against each nucleotide, their relationships were clearly expressed by a linear formula, y = αx + β, in the coding and non-coding regions. This linear relationship was obtained from the complete single-stranded DNA. Similarly, nucleotide contents at all three codon positions were expressed by linear regression lines based on the content of each nucleotide. In addition, the usages of all 64 codons were also expressed by linear formulas against nucleotide content. Thus, the nucleotide content not only in coding sequence but also in non-coding sequence can be expressed by a linear formula, y = αx + β, in 145 organisms (112 bacteria, 15 archaea and 18 eukaryotes). Based on these results, from the ratio C/T, G/T, C/A or G/A one can essentially estimate all four nucleotide contents in the complete single-stranded DNA, and determining any ratio of two kinds of nucleotides essentially suffices to estimate the four nucleotide contents, the nucleotide contents at the three different codon positions and the codon distributions over the 64 codons in the coding region. The maximum and minimum values of G content were approximately 0.35 and approximately 0.15, respectively, among the various organisms examined. Codon evolution occurs according to linear formulas between these two values.
Affiliation(s)
- K Sorimachi
- Educational Support Center, Dokkyo Medical University, Mibu, Tochigi 321-0293, Japan.
42
Evolutionary implications of inversions that have caused intra-strand parity in DNA. BMC Genomics 2007; 8:160. [PMID: 17562011 PMCID: PMC1913523 DOI: 10.1186/1471-2164-8-160] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 01/25/2007] [Accepted: 06/11/2007] [Indexed: 11/22/2022] Open
Abstract
Background: Chargaff's rule of DNA base composition, stating that DNA comprises equal amounts of adenine and thymine (%A = %T) and of guanine and cytosine (%C = %G), is well known because it was fundamental to the conception of the Watson-Crick model of DNA structure. His second parity rule, stating that the base proportions of double-stranded DNA are also reflected in single-stranded DNA (%A = %T, %C = %G), is more obscure, likely because its biological basis and significance are still unresolved. Within each strand, the symmetry of single nucleotide composition extends even further, being demonstrated in the balance of di-, tri-, and multi-nucleotides with their respective complementary oligonucleotides. Results: Here, we propose that inversions are sufficient to account for the symmetry within each single-stranded DNA. Human mitochondrial DNA does not demonstrate such intra-strand parity, and we consider how its different functional drivers may relate to our theory. This concept is supported by the recent observation that inversions occur frequently. Conclusion: Along with chromosomal duplications, inversions must have been shaping the architecture of genomes since the origin of life.
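Why an inversion moves a single strand toward %A = %T and %C = %G can be seen in a small sketch: on one strand, an inversion replaces a segment by its reverse complement, so every A in the segment becomes a T and every C becomes a G (and vice versa). The function names and the toy strand below are assumptions for illustration and are not taken from the paper.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ATCG", "TAGC")


def invert_segment(seq: str, start: int, end: int) -> str:
    """Replace seq[start:end] by its reverse complement, as an inversion does on one strand."""
    segment = seq[start:end].translate(COMPLEMENT)[::-1]
    return seq[:start] + segment + seq[end:]


def parity_gap(seq: str) -> float:
    """(|A - T| + |C - G|) / length; 0.0 means exact mononucleotide intra-strand parity."""
    n = Counter(seq)
    return (abs(n["A"] - n["T"]) + abs(n["C"] - n["G"])) / len(seq)


if __name__ == "__main__":
    strand = "AAAACCCC"  # toy strand with no T or G at all
    inverted = invert_segment(strand, 2, 6)
    print(inverted)              # AAGGTTCC
    print(parity_gap(strand))    # 1.0
    print(parity_gap(inverted))  # 0.0
```

In this toy case a single inversion of the middle half is enough to reach exact parity; in a real genome many inversions of varying size would each shift the composition toward parity only partially.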