1
Biró B, Gál Z, Fekete Z, Klecska E, Hoffmann OI. Mitochondrial genome plasticity of mammalian species. BMC Genomics 2024; 25:278. [PMID: 38486136 PMCID: PMC10941376 DOI: 10.1186/s12864-024-10201-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Accepted: 03/08/2024] [Indexed: 03/17/2024]
Abstract
There is an ongoing process in which mitochondrial sequences are being integrated into the nuclear genome. The importance of these sequences has already been revealed in cancer biology, forensic, phylogenetic studies and in the evolution of the eukaryotic genetic information. Human and numerous model organisms' genomes were described from those sequences point of view. Furthermore, recent studies were published on the patterns of these nuclear localised mitochondrial sequences in different taxa. However, the results of the previously released studies are difficult to compare due to the lack of standardised methods and/or using few numbers of genomes. Therefore, in this paper our primary goal is to establish a uniform mining pipeline to explore these nuclear localised mitochondrial sequences. Our results show that the frequency of several repetitive elements is higher in the flanking regions of these sequences than expected. A machine learning model reveals that the flanking regions' repetitive elements and different structural characteristics are highly influential during the integration process. In this paper, we introduce a general mining pipeline for all mammalian genomes. The workflow is publicly available and is believed to serve as a validated baseline for future research in this field. We confirm the widespread opinion, on - as to our current knowledge - the largest dataset, that structural circumstances and events corresponding to repetitive elements are highly significant. An accurate model has also been trained to predict these sequences and their corresponding flanking regions.
Affiliation(s)
- Bálint Biró
  - Agribiotechnology and Precision Breeding for Food Security National Laboratory, Department of Animal Biotechnology, Institute of Genetics and Biotechnology, Hungarian University of Agriculture and Life Sciences, Szent-Györgyi Albert str. 4, 2100, Gödöllő, Hungary.
  - Group BM, Data Insights Team, _VOIS, Kerepesi str. 35, 1087, Budapest, Hungary.
- Zoltán Gál
  - Agribiotechnology and Precision Breeding for Food Security National Laboratory, Department of Animal Biotechnology, Institute of Genetics and Biotechnology, Hungarian University of Agriculture and Life Sciences, Szent-Györgyi Albert str. 4, 2100, Gödöllő, Hungary
- Zsófia Fekete
  - Department of Genetics and Genomics, Institute of Genetics and Biotechnology, Hungarian University of Agriculture and Life Sciences, Szent-Györgyi Albert str. 4, 2100, Gödöllő, Hungary
- Eszter Klecska
  - FamiCord Group, Krio Institute, Kelemen László str, 1026, Budapest, Hungary
- Orsolya Ivett Hoffmann
  - Agribiotechnology and Precision Breeding for Food Security National Laboratory, Department of Animal Biotechnology, Institute of Genetics and Biotechnology, Hungarian University of Agriculture and Life Sciences, Szent-Györgyi Albert str. 4, 2100, Gödöllő, Hungary.
2
Mikhailova AG, Mikhailova AA, Ushakova K, Tretiakov EO, Iliushchenko D, Shamansky V, Lobanova V, Kozenkov I, Efimenko B, Yurchenko AA, Kozenkova E, Zdobnov EM, Makeev V, Yurov V, Tanaka M, Gostimskaya I, Fleischmann Z, Annis S, Franco M, Wasko K, Denisov S, Kunz WS, Knorre D, Mazunin I, Nikolaev S, Fellay J, Reymond A, Khrapko K, Gunbin K, Popadin K. A mitochondria-specific mutational signature of aging: increased rate of A > G substitutions on the heavy strand. Nucleic Acids Res 2022; 50:10264-10277. [PMID: 36130228 PMCID: PMC9561281 DOI: 10.1093/nar/gkac779] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2022] [Revised: 08/02/2022] [Accepted: 09/07/2022] [Indexed: 11/21/2022]
Abstract
The mutational spectrum of the mitochondrial DNA (mtDNA) does not resemble any of the known mutational signatures of the nuclear genome and variation in mtDNA mutational spectra between different organisms is still incomprehensible. Since mitochondria are responsible for aerobic respiration, it is expected that mtDNA mutational spectrum is affected by oxidative damage. Assuming that oxidative damage increases with age, we analyse mtDNA mutagenesis of different species in regards to their generation length. Analysing, (i) dozens of thousands of somatic mtDNA mutations in samples of different ages (ii) 70053 polymorphic synonymous mtDNA substitutions reconstructed in 424 mammalian species with different generation lengths and (iii) synonymous nucleotide content of 650 complete mitochondrial genomes of mammalian species we observed that the frequency of AH > GH substitutions (H: heavy strand notation) is twice bigger in species with high versus low generation length making their mtDNA more AH poor and GH rich. Considering that AH > GH substitutions are also sensitive to the time spent single-stranded (TSSS) during asynchronous mtDNA replication we demonstrated that AH > GH substitution rate is a function of both species-specific generation length and position-specific TSSS. We propose that AH > GH is a mitochondria-specific signature of oxidative damage associated with both aging and TSSS.
Affiliation(s)
- Alina G Mikhailova
  - Center for Mitochondrial Functional Genomics, Immanuel Kant Baltic Federal University, Kaliningrad, Russian Federation
  - Vavilov Institute of General Genetics RAS, Moscow, Russia
- Alina A Mikhailova
  - Center for Mitochondrial Functional Genomics, Immanuel Kant Baltic Federal University, Kaliningrad, Russian Federation
- Kristina Ushakova
  - Center for Mitochondrial Functional Genomics, Immanuel Kant Baltic Federal University, Kaliningrad, Russian Federation
- Evgeny O Tretiakov
  - Center for Mitochondrial Functional Genomics, Immanuel Kant Baltic Federal University, Kaliningrad, Russian Federation
  - Department of Molecular Neurosciences, Center for Brain Research, Medical University of Vienna, Vienna, Austria
- Dmitrii Iliushchenko
  - Center for Mitochondrial Functional Genomics, Immanuel Kant Baltic Federal University, Kaliningrad, Russian Federation
- Victor Shamansky
  - Center for Mitochondrial Functional Genomics, Immanuel Kant Baltic Federal University, Kaliningrad, Russian Federation
- Valeria Lobanova
  - Center for Mitochondrial Functional Genomics, Immanuel Kant Baltic Federal University, Kaliningrad, Russian Federation
- Ivan Kozenkov
  - Center for Mitochondrial Functional Genomics, Immanuel Kant Baltic Federal University, Kaliningrad, Russian Federation
- Bogdan Efimenko
  - Center for Mitochondrial Functional Genomics, Immanuel Kant Baltic Federal University, Kaliningrad, Russian Federation
- Andrey A Yurchenko
  - INSERM U981, Gustave Roussy Cancer Campus, Université Paris Saclay, Villejuif, France
- Elena Kozenkova
  - Institute of Physics, Mathematics and Information Technology, Immanuel Kant Baltic Federal University, Kaliningrad, Russian Federation
- Evgeny M Zdobnov
  - Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Vsevolod Makeev
  - Vavilov Institute of General Genetics RAS, Moscow, Russia
  - Moscow Institute of Physics and Technology, Moscow, Russian Federation
- Valerian Yurov
  - Institute of Physics, Mathematics and Information Technology, Immanuel Kant Baltic Federal University, Kaliningrad, Russian Federation
- Masashi Tanaka
  - Department of Neurology, Juntendo University Graduate School of Medicine, Tokyo, Japan
- Irina Gostimskaya
  - Manchester Institute of Biotechnology, The University of Manchester, Manchester, United Kingdom
- Zoe Fleischmann
  - Department of Biology, Northeastern University, Boston, MA, USA
- Sofia Annis
  - Department of Biology, Northeastern University, Boston, MA, USA
- Melissa Franco
  - Department of Biology, Northeastern University, Boston, MA, USA
- Kevin Wasko
  - Department of Biology, Northeastern University, Boston, MA, USA
- Stepan Denisov
  - Center for Mitochondrial Functional Genomics, Immanuel Kant Baltic Federal University, Kaliningrad, Russian Federation
  - School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, United Kingdom
- Wolfram S Kunz
  - Department of Epileptology and Institute of Experimental Epileptology and Cognition Research, University Bonn, Bonn, Germany
- Dmitry Knorre
  - The A.N. Belozersky Institute Of Physico-Chemical Biology, Moscow State University, Moscow, Russian Federation
- Ilya Mazunin
  - Center for Molecular and Cellular Biology, Skolkovo Institute of Science and Technology (Skoltech), Skolkovo, Russian Federation
  - Fomin Clinic, Moscow, Russian Federation
  - Medical Genomics LLC, Moscow, Russian Federation
- Sergey Nikolaev
  - INSERM U981, Gustave Roussy Cancer Campus, Université Paris Saclay, Villejuif, France
- Jacques Fellay
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Alexandre Reymond
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
- Konstantin Gunbin
  - Center for Mitochondrial Functional Genomics, Immanuel Kant Baltic Federal University, Kaliningrad, Russian Federation
  - Institute of Molecular and Cellular Biology SB RAS, Novosibirsk, Russian Federation
- Konstantin Popadin
  - Center for Mitochondrial Functional Genomics, Immanuel Kant Baltic Federal University, Kaliningrad, Russian Federation
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
3
Montaña-Lozano P, Moreno-Carmona M, Ochoa-Capera M, Medina NS, Boore JL, Prada CF. Comparative genomic analysis of vertebrate mitochondrial reveals a differential of rearrangements rate between taxonomic class. Sci Rep 2022; 12:5479. [PMID: 35361853 PMCID: PMC8971445 DOI: 10.1038/s41598-022-09512-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 04/23/2021] [Accepted: 03/21/2022] [Indexed: 11/09/2022]
Abstract
Vertebrate mitochondrial genomes have been extensively studied for genetic and evolutionary purposes, these are normally believed to be extremely conserved, however, different cases of gene rearrangements have been reported. To verify the level of rearrangement and the mitogenome evolution, we performed a comparative genomic analysis of the 2831 vertebrate mitochondrial genomes representing 12 classes available in the NCBI database. Using a combination of bioinformatics methods, we determined there is a high number of errors in the annotation of mitochondrial genes, especially in tRNAs. We determined there is a large variation in the proportion of rearrangements per gene and per taxonomic class, with higher values observed in Actinopteri, Amphibia and Reptilia. We highlight that these are results for currently available vertebrate sequences, so an increase in sequence representativeness in some groups may alter the rearrangement rates, so in a few years it would be interesting to see if these rates are maintained or altered with the new mitogenome sequences. In addition, within each vertebrate class, different patterns in rearrangement proportion with distinct hotspots in the mitochondrial genome were found. We also determined that there are eleven convergence events in gene rearrangement, nine of which are new reports to the scientific community.
Affiliation(s)
- Paula Montaña-Lozano
  - Grupo de Investigación de Biología y Ecología de Artrópodos, Facultad de Ciencias, Universidad del Tolima, Ibague, Colombia
- Manuela Moreno-Carmona
  - Grupo de Investigación de Biología y Ecología de Artrópodos, Facultad de Ciencias, Universidad del Tolima, Ibague, Colombia
- Mauricio Ochoa-Capera
  - Grupo de Investigación de Biología y Ecología de Artrópodos, Facultad de Ciencias, Universidad del Tolima, Ibague, Colombia
- Natalia S Medina
  - Grupo de Investigación de Biología y Ecología de Artrópodos, Facultad de Ciencias, Universidad del Tolima, Ibague, Colombia
- Jeffrey L Boore
  - Providence St. Joseph Health and Institute for Systems Biology, 401 Terry Avenue N, Seattle, WA, 98109, USA
- Carlos F Prada
  - Grupo de Investigación de Biología y Ecología de Artrópodos, Facultad de Ciencias, Universidad del Tolima, Ibague, Colombia.
4
Sanchez-Contreras M, Sweetwyne MT, Kohrn BF, Tsantilas KA, Hipp MJ, Schmidt EK, Fredrickson J, Whitson JA, Campbell MD, Rabinovitch PS, Marcinek DJ, Kennedy SR. A replication-linked mutational gradient drives somatic mutation accumulation and influences germline polymorphisms and genome composition in mitochondrial DNA. Nucleic Acids Res 2021; 49:11103-11118. [PMID: 34614167 PMCID: PMC8565317 DOI: 10.1093/nar/gkab901] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 07/23/2021] [Revised: 09/10/2021] [Accepted: 09/22/2021] [Indexed: 11/22/2022]
Abstract
Mutations in mitochondrial DNA (mtDNA) cause maternally inherited diseases, while somatic mutations are linked to common diseases of aging. Although mtDNA mutations impact health, the processes that give rise to them are under considerable debate. To investigate the mechanism by which de novo mutations arise, we analyzed the distribution of naturally occurring somatic mutations across the mouse and human mtDNA obtained by Duplex Sequencing. We observe distinct mutational gradients in G→A and T→C transitions delimited by the light-strand origin and the mitochondrial Control Region (mCR). The gradient increases unequally across the mtDNA with age and is lost in the absence of DNA polymerase γ proofreading activity. In addition, high-resolution analysis of the mCR shows that important regulatory elements exhibit considerable variability in mutation frequency, consistent with them being mutational ‘hot-spots’ or ‘cold-spots’. Collectively, these patterns support genome replication via a deamination prone asymmetric strand-displacement mechanism as the fundamental driver of mutagenesis in mammalian DNA. Moreover, the distribution of mtDNA single nucleotide polymorphisms in humans and the distribution of bases in the mtDNA across vertebrate species mirror this gradient, indicating that replication-linked mutations are likely the primary source of inherited polymorphisms that, over evolutionary timescales, influences genome composition during speciation.
Affiliation(s)
- Monica Sanchez-Contreras
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA 98195, USA
- Mariya T Sweetwyne
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA 98195, USA
- Brendan F Kohrn
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA 98195, USA
- Michael J Hipp
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA 98195, USA
- Elizabeth K Schmidt
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA 98195, USA
- Jeanne Fredrickson
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA 98195, USA
- Jeremy A Whitson
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA 98195, USA
- Matthew D Campbell
  - Department of Radiology, University of Washington, Seattle, WA 98195, USA
- Peter S Rabinovitch
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA 98195, USA
- David J Marcinek
  - Department of Radiology, University of Washington, Seattle, WA 98195, USA
- Scott R Kennedy
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA 98195, USA
5
Wan X, Tan X. A Simple Protein Evolutionary Classification Method Based on the Mutual Relations Between Protein Sequences. Curr Bioinform 2021. [DOI: 10.2174/1574893615666200305090055] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/22/2022]
Abstract
Background: Protein is a kind of important organics in life. It is varied with its sequences, structures and functions. Protein evolutionary classification is one of the popular research topics in computational bioinformatics. Many studies have used protein sequence information to classify the evolutionary relationships of proteins. As the amount of protein sequence data increases, efficient computational tools are needed to make efficient protein evolutionary classifications with high accuracies in the big data paradigm.
Methods: In this study, we propose a new simple and efficient computational approach based on the normalized mutual information rates to compute the relationship between protein sequences, we then use the “distances” defined on the relationships to perform the evolutionary classifications of proteins. The new method is computational efficient, model-free and unsupervised, which does not require training data when performing classifications.
Result: Simulation studies on various examples demonstrate the efficiency of the new method. We use precision-recall curves to compare the efficiency of our new method with traditional methods, results show that the new method outperforms the traditional methods in most of the cases when performing evolutionary classifications.
Conclusion: The new method is simple and proved to be efficient in protein evolutionary classifications, which is useful in future evolutionary analysis particularly in the big data paradigm.
Affiliation(s)
- Xiaogeng Wan
  - Department of Mathematics, College of Mathematics and Physics, Beijing University of Chemical Technology, Beijing, 100029, China
- Xinying Tan
  - The Fourth Center of PLA General Hospital, Beijing, 100037, China
6
Demongeot J, Seligmann H. Deamination gradients within codons after 1<->2 position swap predict amino acid hydrophobicity and parallel β-sheet conformational preference. Biosystems 2020; 191-192:104116. [PMID: 32081715 DOI: 10.1016/j.biosystems.2020.104116] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/16/2019] [Revised: 12/04/2019] [Accepted: 02/10/2020] [Indexed: 12/16/2022]
Abstract
Deaminations C->T and A->G are frequent mutations producing nucleotide content gradients across genomes proportional to singlestrandedness during replication/transcription. Hence, within single codons, deamination risks increase from first to third codon positions, while second codon positions are functionally most crucial. Here genetic codes are analyzed assuming that after anticodons protected codons from deaminations, first and second codon positions swapped (N2N1N3->N1N2N3), with lowest deamination risks for N2 in presumed primitive N2N1N3 codons. N2N1N3, not standard N1N2N3, codon structure minimizes deaminations inversely proportionally to cognate amino acid hydrophobicity and parallel betasheet conformational preference. For N1N2N3, deamination minimization increases with genetic code integration order of cognate amino acids: during the presumed N2N1N3->N1N2N3 codon structure transition, protein synthesis combined direct codon-amino acid interactions for late amino acids and tRNA-based translation for early amino acids. Hence N2N1N3 codons would correspond to tRNA-free translation by spontaneous codon-amino acid affinities, and tRNA-mediated translation presumably caused N2N1N3->N1N2N3 swaps. Results show that rational, not arbitrary rules link codon and amino acid structures. Some analyses detect mitochondrial RNAs and peptides in public data corresponding to systematic position swaps, suggesting occasional swapping polymerase activity.
Affiliation(s)
- Jacques Demongeot
  - Université Grenoble Alpes, Faculty of Medicine, Laboratory AGEIS EA 7407, Team Tools for e-Gnosis Medical, F-38700, La Tronche, France.
- Hervé Seligmann
  - Université Grenoble Alpes, Faculty of Medicine, Laboratory AGEIS EA 7407, Team Tools for e-Gnosis Medical, F-38700, La Tronche, France; The National Natural History Collections, The Hebrew University of Jerusalem, 91404, Jerusalem, Israel.
7
Demongeot J, Seligmann H. Theoretical minimal RNA rings designed according to coding constraints mimic deamination gradients. The Science of Nature - Naturwissenschaften 2019; 106:44. [DOI: 10.1007/s00114-019-1638-5] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 12/15/2018] [Revised: 06/18/2019] [Accepted: 06/19/2019] [Indexed: 11/27/2022]
8
Gu X, Kang X, Liu J. Mutation signatures in germline mitochondrial genome provide insights into human mitochondrial evolution and disease. Hum Genet 2019; 138:613-624. [PMID: 30968252 DOI: 10.1007/s00439-019-02009-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 12/04/2018] [Accepted: 04/02/2019] [Indexed: 01/06/2023]
Abstract
Variations in mitochondrial DNA (mtDNA) have been fundamental for understanding human evolution and are causative for a plethora of inherited mitochondrial diseases, but the mutation signatures of germline mtDNA and their value in understanding mitochondrial pathogenicity remain unknown. Here, we carried out a systematic analysis of mutation patterns in germline mtDNA based on 97,566 mtDNA variants from 45,494 full-length sequences and revealed a highly non-stochastic and replication-coupled mutation signature characterized by nucleotide-specific mutation pressure (G > T > A > C) and position-specific selection pressure, suggesting the existence of an intensive mutation-selection interplay in germline mtDNA. We provide evidence that this mutation-selection interplay has strongly shaped the mtDNA sequence during evolution, which not only manifests as an oriented alteration of amino acid compositions of mitochondrial encoded proteins, but also explains the long-lasting mystery of CpG depletion in mitochondrial genome. Finally, we demonstrated that these insights may be integrated to better understand the pathogenicity of disease-implicated mitochondrial variants.
Affiliation(s)
- Xiwen Gu
  - Key Laboratory of Shaanxi Province for Craniofacial Precision Medicine Research, Clinical Research Center of Shaanxi Province for Dental and Maxillofacial Diseases, College of Stomatology, Xi'an Jiaotong University, Xi'an, 710004, China.
- Xinyun Kang
  - Key Laboratory of Shaanxi Province for Craniofacial Precision Medicine Research, Clinical Research Center of Shaanxi Province for Dental and Maxillofacial Diseases, College of Stomatology, Xi'an Jiaotong University, Xi'an, 710004, China
- Jiankang Liu
  - Center for Mitochondrial Biology and Medicine & Douglas C. Wallace Institute for Mitochondrial and Epigenetic Information Sciences, The Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, 710049, China.
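The abstract of entry 8 refers to CpG depletion in the mitochondrial genome. As a rough illustration of how such depletion is commonly quantified, the sketch below computes an observed/expected CpG ratio for a single sequence. This is a generic, widely used ratio and is not claimed to be the statistic used by Gu et al.; the function name `cpg_observed_expected` and the toy sequences are assumptions made for illustration only.

```python
def cpg_observed_expected(sequence):
    """Observed/expected CpG ratio: (CpG count * length) / (C count * G count).

    Values well below 1 suggest CpG depletion. The sequence is treated as
    linear, so the wrap-around dinucleotide of a circular genome is ignored
    (a simplifying assumption of this sketch).
    """
    seq = sequence.upper()
    length = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if c_count == 0 or g_count == 0:
        return 0.0  # ratio is undefined without both bases; report 0 by convention
    return cpg_count * length / (c_count * g_count)


# Toy sequences (hypothetical, not real mtDNA)
print(cpg_observed_expected("CCGG"))      # one CpG where one is expected
print(cpg_observed_expected("CCAGGTCA"))  # C and G present, but no CpG
```

For real data the same function could be applied to a full mitogenome string read from a FASTA file; the toy inputs here only show the behaviour at the extremes.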
9
Jones CT, Youssef N, Susko E, Bielawski JP. Phenomenological Load on Model Parameters Can Lead to False Biological Conclusions. Mol Biol Evol 2019; 35:1473-1488. [PMID: 29596684 DOI: 10.1093/molbev/msy049] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 12/15/2022]
Abstract
When a substitution model is fitted to an alignment using maximum likelihood, its parameters are adjusted to account for as much site-pattern variation as possible. A parameter might therefore absorb a substantial quantity of the total variance in an alignment (or more formally, bring about a substantial reduction in the deviance of the fitted model) even if the process it represents played no role in the generation of the data. When this occurs, we say that the parameter estimate carries phenomenological load (PL). Large PL in a parameter estimate is a concern because it not only invalidates its mechanistic interpretation (if it has one) but also increases the likelihood that it will be found to be statistically significant. The problem of PL was not identified in the past because most off-the-shelf substitution models make simplifying assumptions that preclude the generation of realistic levels of variation. In this study, we use the more realistic mutation-selection framework as the basis of a generating model formulated to produce data that mimic an alignment of mammalian mitochondrial DNA. We show that a parameter estimate can carry PL when 1) the substitution model is underspecified and 2) the parameter represents a process that is confounded with other processes represented in the data-generating model. We then provide a method that can be used to identify signal for the process that a given parameter represents despite the existence of PL.
Affiliation(s)
- Christopher T Jones
  - Department of Mathematics and Statistics, Dalhousie University, Halifax, NS, Canada
- Noor Youssef
  - Department of Biology, Dalhousie University, Halifax, NS, Canada
- Edward Susko
  - Department of Mathematics and Statistics, Dalhousie University, Halifax, NS, Canada
10
Qian L, Wang H, Yan J, Pan T, Jiang S, Rao D, Zhang B. Multiple independent structural dynamic events in the evolution of snake mitochondrial genomes. BMC Genomics 2018; 19:354. [PMID: 29747572 PMCID: PMC5946542 DOI: 10.1186/s12864-018-4717-7] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 01/07/2017] [Accepted: 04/24/2018] [Indexed: 11/10/2022]
Abstract
BACKGROUND Mitochondrial DNA sequences have long been used in phylogenetic studies. However, little attention has been paid to the changes in gene arrangement patterns in the snake's mitogenome. Here, we analyzed the complete mitogenome sequences and structures of 65 snake species from 14 families and examined their structural patterns, organization and evolution. Our purpose was to further investigate the evolutionary implications and possible rearrangement mechanisms of the mitogenome within snakes. RESULTS In total, eleven types of mitochondrial gene arrangement patterns were detected (Type I, II, III, III-A, III-B, III-B1, III-C, III-D, III-E, III-F, III-G), with mitochondrial genome rearrangements being a major trend in snakes, especially in Alethinophidia. In snake mitogenomes, the rearrangements mainly involved three processes, gene loss, translocation and duplication. Within Scolecophidia, the OL was lost several times in Typhlopidae and Leptotyphlopidae, but persisted as a plesiomorphy in the Alethinophidia. Duplication of the control region and translocation of the tRNALeu gene are two visible features in Alethinophidian mitochondrial genomes. Independently and stochastically, the duplication of pseudo-Pro (P*) emerged in seven different lineages of unequal size in three families, indicating that the presence of P* was a polytopic event in the mitogenome. CONCLUSIONS The WANCY tRNA gene cluster and the control regions and their adjacent segments were hotspots for mitogenome rearrangement. Maintenance of duplicate control regions may be the source for snake mitogenome structural diversity.
Affiliation(s)
- Lifu Qian
  - Anhui Key Laboratory of Eco-engineering and Bio-technique, School of Life Sciences, Anhui University, Hefei, 230601, China.
  - Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China.
  - Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing, 210046, China
- Hui Wang
  - Anhui Key Laboratory of Eco-engineering and Bio-technique, School of Life Sciences, Anhui University, Hefei, 230601, China
- Jie Yan
  - Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing, 210046, China
- Tao Pan
  - Anhui Key Laboratory of Eco-engineering and Bio-technique, School of Life Sciences, Anhui University, Hefei, 230601, China
- Shanqun Jiang
  - Anhui Key Laboratory of Eco-engineering and Bio-technique, School of Life Sciences, Anhui University, Hefei, 230601, China
- Dingqi Rao
  - Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China.
- Baowei Zhang
  - Anhui Key Laboratory of Eco-engineering and Bio-technique, School of Life Sciences, Anhui University, Hefei, 230601, China.
11
Wan X, Zhao X, Yau SST. An information-based network approach for protein classification. PLoS One 2017; 12:e0174386. [PMID: 28350835 PMCID: PMC5370107 DOI: 10.1371/journal.pone.0174386] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 11/23/2016] [Accepted: 03/08/2017] [Indexed: 11/25/2022]
Abstract
Protein classification is one of the critical problems in bioinformatics. Early studies used geometric distances and polygenetic-tree to classify proteins. These methods use binary trees to present protein classification. In this paper, we propose a new protein classification method, whereby theories of information and networks are used to classify the multivariate relationships of proteins. In this study, protein universe is modeled as an undirected network, where proteins are classified according to their connections. Our method is unsupervised, multivariate, and alignment-free. It can be applied to the classification of both protein sequences and structures. Nine examples are used to demonstrate the efficiency of our new method.
Affiliation(s)
- Xiaogeng Wan
  - Department of Mathematical Sciences, Tsinghua University, Beijing, China
- Xin Zhao
  - Department of Mathematical Sciences, Tsinghua University, Beijing, China
- Stephen S. T. Yau
  - Department of Mathematical Sciences, Tsinghua University, Beijing, China
12
Seligmann H. Systematic exchanges between nucleotides: Genomic swinger repeats and swinger transcription in human mitochondria. J Theor Biol 2015; 384:70-7. [PMID: 26297891 DOI: 10.1016/j.jtbi.2015.07.036] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 10/09/2014] [Revised: 07/11/2015] [Accepted: 07/24/2015] [Indexed: 10/23/2022]
Abstract
Chargaff's second parity rule, quasi-equal single strand frequencies for complementary nucleotides, presumably results from insertion of repeats and inverted repeats during sequence genesis. Vertebrate mitogenomes escape this rule because repeats are counterselected: their hybridization produces loop bulges whose deletion is deleterious. Some DNA/RNA sequences match mitogenomes only after assuming one among 23 systematic nucleotide exchanges (swinger DNA/RNA: nine symmetric, e.g. A ↔ C; and 14 asymmetric, e.g. A → C → G → A). Swinger-transformed repeats do not hybridize, escaping selection against deletions due to bulge formation. Blast analyses of the human mitogenome detect swinger repeats for all 23 swinger types, more than in randomized sequences with identical length and nucleotide contents. Mean genomic swinger repeat lengths increase with observed human swinger RNA frequencies: swinger repeat and swinger RNA productions appear linked, perhaps by swinger RNA retrotranscription. Mean swinger repeat lengths are proportional to reading frame retrievability, post-swinger transformation, by the natural circular code. Genomic swinger repeats confirm at genomic level, independently of swinger RNA detection, occurrence of swinger polymerizations. They suggest that repeats, and swinger repeats in particular, contribute to genome genesis.
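The counts quoted in this abstract (nine symmetric and 14 asymmetric exchanges, 23 in total) can be reproduced with a minimal sketch, assuming that a systematic exchange is any non-identity permutation of the four nucleotides and that "symmetric" means the permutation undoes itself when applied twice. That reading, and the function names `exchange_rules`, `is_symmetric` and `apply_rule`, are illustrative assumptions and are not taken from the cited paper.

```python
from itertools import permutations

BASES = "ACGT"

def exchange_rules():
    """Yield every non-identity permutation of the four bases as a dict.

    Assumption: each systematic exchange maps every base to exactly one base.
    """
    for perm in permutations(BASES):
        rule = dict(zip(BASES, perm))
        if any(src != dst for src, dst in rule.items()):
            yield rule

def is_symmetric(rule):
    """A rule is symmetric when applying it twice restores every base."""
    return all(rule[rule[base]] == base for base in BASES)

def apply_rule(seq, rule):
    """Transform a sequence; characters outside ACGT are left unchanged."""
    return "".join(rule.get(ch, ch) for ch in seq)

rules = list(exchange_rules())
symmetric = [r for r in rules if is_symmetric(r)]
asymmetric = [r for r in rules if not is_symmetric(r)]

# Example: the symmetric exchange A <-> C, with G and T untouched.
a_c_swap = {"A": "C", "C": "A", "G": "G", "T": "T"}
print(len(rules), len(symmetric), len(asymmetric))
print(apply_rule("ACGTAC", a_c_swap))
```

Under these assumptions the symmetric class contains the six single-pair swaps plus the three double-pair swaps, and the asymmetric class contains the eight three-base cycles plus the six four-base cycles.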
Affiliation(s)
- Hervé Seligmann
- Unité de Recherche sur les Maladies Infectieuses et Tropicales Émergentes, Faculté de Médecine, URMITE CNRS-IRD 198 UMER 6236, Université Aix-Marseille, Marseille, France.
| |
|
13
|
Levin L, Mishmar D. A Genetic View of the Mitochondrial Role in Ageing: Killing Us Softly. Adv Exp Med Biol 2015; 847:89-106. [DOI: 10.1007/978-1-4939-2404-2_4] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 01/01/2023]
|
14
|
Wen J, Zhang Y, Yau SS. k-mer Sparse matrix model for genetic sequence and its applications in sequence comparison. J Theor Biol 2014; 363:145-50. [DOI: 10.1016/j.jtbi.2014.08.028] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 04/06/2014] [Revised: 07/14/2014] [Accepted: 08/17/2014] [Indexed: 10/24/2022]
|
15
|
Fonseca MM, Harris DJ, Posada D. The inversion of the Control Region in three mitogenomes provides further evidence for an asymmetric model of vertebrate mtDNA replication. PLoS One 2014; 9:e106654. [PMID: 25268704 PMCID: PMC4182315 DOI: 10.1371/journal.pone.0106654] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 03/11/2014] [Accepted: 08/04/2014] [Indexed: 11/29/2022] Open
Abstract
Mitochondrial genomes are known to have a strong strand-specific compositional bias that is more pronounced at fourfold redundant sites of mtDNA protein-coding genes. This observation suggests that strand asymmetries, to a large extent, are caused by mutational asymmetric mechanisms. In vertebrate mitogenomes, replication and not transcription seems to play a major role in shaping compositional bias. Hence, one can better understand how mtDNA is replicated – a debated issue – through a detailed picture of mitochondrial genome evolution. Here, we analyzed the compositional bias (AT and GC skews) in protein-coding genes of almost 2,500 complete vertebrate mitogenomes. We were able to identify three fish mitogenomes with inverted AT/GC skew coupled with an inversion of the Control Region. These findings suggest that the vertebrate mitochondrial replication mechanism is asymmetric and may invert its polarity, with the leading-strand becoming the lagging-strand and vice-versa, without compromising mtDNA maintenance and expression. The inversion of the strand-specific compositional bias through the inversion of the Control Region is in agreement with the strand-displacement model but it is also compatible with the RITOLS model of mtDNA replication.
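For readers unfamiliar with the skew statistics mentioned in this abstract, the following is a minimal sketch using the widely used definitions GC skew = (G - C)/(G + C) and AT skew = (A - T)/(A + T), computed on a single strand. The abstract does not spell out its formulas, so these definitions, the helper names, and the choice to return 0.0 when neither base is present are assumptions made for illustration.

```python
def skew(seq, first, second):
    """Return (n_first - n_second) / (n_first + n_second) for one strand.

    Returns 0.0 when neither base occurs (an illustrative convention).
    """
    seq = seq.upper()
    n_first = seq.count(first)
    n_second = seq.count(second)
    total = n_first + n_second
    if total == 0:
        return 0.0
    return (n_first - n_second) / total

def gc_skew(seq):
    """GC skew of the given strand: positive when G outnumbers C."""
    return skew(seq, "G", "C")

def at_skew(seq):
    """AT skew of the given strand: positive when A outnumbers T."""
    return skew(seq, "A", "T")

example = "GGGCAATT"
print(gc_skew(example), at_skew(example))
```

With these definitions, reading the complementary strand swaps G with C and A with T, so both skews change sign. That is why an inverted skew is informative about which strand plays which role in replication.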
Affiliation(s)
- Miguel M. Fonseca
- Department of Biochemistry, Genetics and Immunology, University of Vigo, Vigo, Spain
- CIBIO/InBIO, Research Center in Biodiversity and Genetic Resources, University of Porto, Vairão, Portugal
| | - D. James Harris
- CIBIO/InBIO, Research Center in Biodiversity and Genetic Resources, University of Porto, Vairão, Portugal
| | - David Posada
- Department of Biochemistry, Genetics and Immunology, University of Vigo, Vigo, Spain
| |
|
16
|
Pozzi L, Hodgson JA, Burrell AS, Sterner KN, Raaum RL, Disotell TR. Primate phylogenetic relationships and divergence dates inferred from complete mitochondrial genomes. Mol Phylogenet Evol 2014; 75:165-83. [PMID: 24583291 PMCID: PMC4059600 DOI: 10.1016/j.ympev.2014.02.023] [Citation(s) in RCA: 153] [Impact Index Per Article: 15.3] [Received: 02/09/2013] [Revised: 02/17/2014] [Accepted: 02/19/2014] [Indexed: 01/23/2023]
Abstract
The origins and the divergence times of the most basal lineages within primates have been difficult to resolve mainly due to the incomplete sampling of early fossil taxa. The main source of contention is related to the discordance between molecular and fossil estimates: while there are no crown primate fossils older than 56Ma, most molecule-based estimates extend the origins of crown primates into the Cretaceous. Here we present a comprehensive mitogenomic study of primates. We assembled 87 mammalian mitochondrial genomes, including 62 primate species representing all the families of the order. We newly sequenced eleven mitochondrial genomes, including eight Old World monkeys and three strepsirrhines. Phylogenetic analyses support a strong topology, confirming the monophyly for all the major primate clades. In contrast to previous mitogenomic studies, the positions of tarsiers and colugos relative to strepsirrhines and anthropoids are well resolved. In order to improve our understanding of how fossil calibrations affect age estimates within primates, we explore the effect of seventeen fossil calibrations across primates and other mammalian groups and we select a subset of calibrations to date our mitogenomic tree. The divergence date estimates of the Strepsirrhine/Haplorhine split support an origin of crown primates in the Late Cretaceous, at around 74Ma. This result supports a short-fuse model of primate origins, whereby relatively little time passed between the origin of the order and the diversification of its major clades. It also suggests that the early primate fossil record is likely poorly sampled.
Affiliation(s)
- Luca Pozzi
- Department of Anthropology, Center for the Study of Human Origins, New York University, New York, NY, United States; New York Consortium in Evolutionary Primatology, United States; Behavioral Ecology and Sociobiology Unit, German Primate Center, Göttingen, Germany.
| | - Jason A Hodgson
- Department of Anthropology, Center for the Study of Human Origins, New York University, New York, NY, United States; New York Consortium in Evolutionary Primatology, United States; Department of Life Sciences, Imperial College London, London, United Kingdom.
| | - Andrew S Burrell
- Department of Anthropology, Center for the Study of Human Origins, New York University, New York, NY, United States.
| | - Kirstin N Sterner
- Department of Anthropology, University of Oregon, Eugene, OR, United States.
| | - Ryan L Raaum
- New York Consortium in Evolutionary Primatology, United States; Department of Anthropology, Lehman College & The Graduate Center, City University of New York, Bronx, NY, United States.
| | - Todd R Disotell
- Department of Anthropology, Center for the Study of Human Origins, New York University, New York, NY, United States; New York Consortium in Evolutionary Primatology, United States.
| |
|
17
|
K-mer natural vector and its application to the phylogenetic analysis of genetic sequences. Gene 2014; 546:25-34. [PMID: 24858075 DOI: 10.1016/j.gene.2014.05.043] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 02/08/2014] [Revised: 05/04/2014] [Accepted: 05/20/2014] [Indexed: 11/21/2022]
Abstract
Based on the well-known k-mer model, we propose a k-mer natural vector model for representing a genetic sequence based on the numbers and distributions of k-mers in the sequence. We show that there exists a one-to-one correspondence between a genetic sequence and its associated k-mer natural vector. The k-mer natural vector method can be easily and quickly used to perform phylogenetic analysis of genetic sequences without requiring evolutionary models or human intervention. Whole or partial genomes can be handled more effectively with our proposed method. It is applied to the phylogenetic analysis of genetic sequences, and the results obtained demonstrate that the k-mer natural vector method is a very powerful tool for analysing and annotating genetic sequences and determining evolutionary relationships, in terms of both accuracy and efficiency.
|
18
|
Seligmann H. Polymerization of non-complementary RNA: systematic symmetric nucleotide exchanges mainly involving uracil produce mitochondrial RNA transcripts coding for cryptic overlapping genes. Biosystems 2013; 111:156-74. [PMID: 23410796 DOI: 10.1016/j.biosystems.2013.01.011] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Received: 10/24/2012] [Revised: 01/24/2013] [Accepted: 01/29/2013] [Indexed: 12/23/2022]
Abstract
Usual DNA→RNA transcription exchanges T→U. Assuming different systematic symmetric nucleotide exchanges during translation, some GenBank RNAs match exactly human mitochondrial sequences (exchange rules listed in decreasing transcript frequencies): C↔U, A↔U, A↔U+C↔G (two nucleotide pairs exchanged), G↔U, A↔G, C↔G, none for A↔C, A↔G+C↔U, and A↔C+G↔U. Most unusual transcripts involve exchanging uracil. Independent measures of rates of rare replicational enzymatic DNA nucleotide misinsertions predict frequencies of RNA transcripts systematically exchanging the corresponding misinserted nucleotides. Exchange transcripts self-hybridize less than other gene regions, self-hybridization increases with length, suggesting endoribonuclease-limited elongation. Blast detects stop codon depleted putative protein coding overlapping genes within exchange-transcribed mitochondrial genes. These align with existing GenBank proteins (mainly metazoan origins, prokaryotic and viral origins underrepresented). These GenBank proteins frequently interact with RNA/DNA, are membrane transporters, or are typical of mitochondrial metabolism. Nucleotide exchange transcript frequencies increase with overlapping gene densities and stop densities, indicating finely tuned counterbalancing regulation of expression of systematic symmetric nucleotide exchange-encrypted proteins. Such expression necessitates combined activities of suppressor tRNAs matching stops, and nucleotide exchange transcription. Two independent properties confirm predicted exchanged overlap coding genes: discrepancy of third codon nucleotide contents from replicational deamination gradients, and codon usage according to circular code predictions. Predictions from both properties converge, especially for frequent nucleotide exchange types. Nucleotide exchanging transcription apparently increases coding densities of protein coding genes without lengthening genomes, revealing unsuspected functional DNA coding potential.
Affiliation(s)
- Hervé Seligmann
- National Natural History Museum Collections, The Hebrew University of Jerusalem, 91904 Jerusalem, Israel.
| |
|
19
|
Powell AF, Barker FK, Lanyon SM. Empirical evaluation of partitioning schemes for phylogenetic analyses of mitogenomic data: An avian case study. Mol Phylogenet Evol 2013; 66:69-79. [DOI: 10.1016/j.ympev.2012.09.006] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.1] [Received: 05/29/2012] [Revised: 09/08/2012] [Accepted: 09/08/2012] [Indexed: 10/27/2022]
|
20
|
Yamashita A, Fuchs E, Taira M, Yamamoto T, Hayashi M. Somatostatin-immunoreactive senile plaque-like structures in the frontal cortex and nucleus accumbens of aged tree shrews and Japanese macaques. J Med Primatol 2012; 41:147-57. [PMID: 22512242 DOI: 10.1111/j.1600-0684.2012.00540.x] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Indexed: 12/01/2022]
Abstract
BACKGROUND Previously, we demonstrated decreased expression of somatostatin mRNA in aged macaque brain, particularly in the prefrontal cortex. To investigate whether or not this age-dependent decrease in mRNA is related to morphological changes, we analyzed somatostatin cells in the cerebra of aged Japanese macaques and compared them with those in rats and tree shrews, the latter of which are closely related to primates. METHODS Brains of aged macaques, tree shrews, and rats were investigated by immunohistochemistry with special emphasis on somatostatin. RESULTS We observed degenerating somatostatin-immunoreactive cells in the cortices of aged macaques and tree shrews. Somatostatin-immunoreactive senile plaque-like structures were found in areas 6 and 8 and in the nucleus accumbens of macaques, as well as in the nucleus accumbens and the cortex of aged tree shrews, where amyloid accumulations were observed. CONCLUSIONS Somatostatin degenerations may be related to amyloid accumulations and may play roles in impairments of cognitive functions during aging.
Affiliation(s)
- Akiko Yamashita
- Division of Applied System Neuroscience, Nihon University School of Medicine, Tokyo, Japan.
| | | | | | | | | |
|
21
|
Alignment-free comparison of genome sequences by a new numerical characterization. J Theor Biol 2011; 281:107-12. [PMID: 21536050 DOI: 10.1016/j.jtbi.2011.04.003] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 12/04/2010] [Revised: 04/01/2011] [Accepted: 04/02/2011] [Indexed: 01/29/2023]
Abstract
In order to compare different genome sequences, an alignment-free method has been proposed. First, we presented a new graphical representation of DNA sequences without degeneracy, which is conducive to intuitive comparison of sequences. Then, a new numerical characterization based on the representation was introduced to quantitatively depict the intrinsic nature of genome sequences, and was treated as a 10-dimensional vector in the mathematical space. Alignment-free comparison of sequences was performed by computing the distances between vectors of the corresponding numerical characterizations, which define the evolutionary relationship. Two data sets of DNA sequences were constructed to assess the performance on sequence comparison. The results illustrate the validity of the method well. The new numerical characterization provides a powerful tool for genome comparison.
|
22
|
A new distribution vector and its application in genome clustering. Mol Phylogenet Evol 2011; 59:438-43. [PMID: 21385621 DOI: 10.1016/j.ympev.2011.02.020] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Received: 10/01/2010] [Revised: 01/20/2011] [Accepted: 02/28/2011] [Indexed: 11/20/2022]
Abstract
In this paper we report a novel mathematical method to transform the DNA sequences into the distribution vectors which correspond to points in the sixty dimensional space. Each component of the distribution vector represents the distribution of one kind of nucleotide in k segments of the DNA sequences. The mathematical and statistical properties of the distribution vectors are demonstrated and examined with huge datasets of human DNA sequences and random sequences. The determined expectation and standard deviation can make the mapping stable and practicable. Moreover, we apply the distribution vectors to the clustering of the Haemagglutinin (HA) gene of 60 H1N1 viruses from Human, Swine and Avian, the complete mitochondrial genomes from 80 placental mammals and the complete genomes from 50 bacteria. The 60 H1N1 viruses, 80 placental mammals and 50 bacteria are classified accurately and rapidly compared to the multiple sequence alignment methods. The results indicate that the distribution vectors can reveal the similarity and evolutionary relationship among homologous DNA sequences based on the distances between any two of these distribution vectors. The advantage of fast computation offers the distribution vectors the opportunity to deal with a huge amount of DNA sequences efficiently.
|
23
|
Deng M, Yu C, Liang Q, He RL, Yau SST. A novel method of characterizing genetic sequences: genome space with biological distance and applications. PLoS One 2011; 6:e17293. [PMID: 21399690 PMCID: PMC3047556 DOI: 10.1371/journal.pone.0017293] [Citation(s) in RCA: 72] [Impact Index Per Article: 5.5] [Received: 09/27/2010] [Accepted: 01/28/2011] [Indexed: 11/18/2022] Open
Abstract
BACKGROUND Most existing methods for phylogenetic analysis involve developing an evolutionary model and then using some type of computational algorithm to perform multiple sequence alignment. There are two problems with this approach: (1) different evolutionary models can lead to different results, and (2) the computation time required for multiple alignments makes it impossible to analyse the phylogeny of a whole genome. This motivates us to create a new approach to characterize genetic sequences. METHODOLOGY To each DNA sequence, we associate a natural vector based on the distributions of nucleotides. This produces a one-to-one correspondence between the DNA sequence and its natural vector. We define the distance between two DNA sequences to be the distance between their associated natural vectors. This creates a genome space with a biological distance which makes global comparison of genomes with the same topology possible. We use our proposed method to analyze the genomes of the new influenza A (H1N1) virus, human rhinoviruses (HRV) and mammalian mitochondria. The result shows that a triple-reassortant swine virus circulating in North America and the Eurasian swine virus belong to the lineage of the influenza A (H1N1) virus. For the HRV and mammalian mitochondrial genomes, the results coincide with biologists' analyses. CONCLUSIONS Our approach provides a powerful new tool for analyzing and annotating genomes and their phylogenetic relationships. Whole or partial genomes can be handled more easily and more quickly than using multiple alignment methods. Once a genome space has been constructed, it can be stored in a database. There is no need to reconstruct the genome space for subsequent applications, whereas in multiple alignment methods, realignment is needed to add new sequences. Furthermore, one can make a global comparison of all genomes simultaneously, which no other existing method can achieve.
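The methodology described in this abstract can be sketched as follows, under stated assumptions. The abstract only says that the vector is "based on the distributions of nucleotides", so the concrete choice here (for each of A, C, G, T: the count, the mean 1-based position, and a second central moment of the positions scaled by count times sequence length) is one commonly described 12-dimensional formulation and should be checked against the paper itself. The function names and the use of Euclidean distance are likewise illustrative assumptions.

```python
import math

BASES = "ACGT"

def natural_vector(seq):
    """Return a 12-dimensional vector: (count, mean position, scaled
    second central moment) for each of A, C, G, T.

    Assumptions: positions are 1-based, and the second moment is divided
    by (count * sequence length). Bases that never occur contribute zeros.
    """
    seq = seq.upper()
    length = len(seq)
    vector = []
    for base in BASES:
        positions = [i + 1 for i, ch in enumerate(seq) if ch == base]
        count = len(positions)
        if count == 0:
            vector.extend([0.0, 0.0, 0.0])
            continue
        mean = sum(positions) / count
        moment = sum((p - mean) ** 2 for p in positions) / (count * length)
        vector.extend([float(count), mean, moment])
    return vector

def sequence_distance(seq_a, seq_b):
    """Euclidean distance between the natural vectors of two sequences."""
    return math.dist(natural_vector(seq_a), natural_vector(seq_b))

print(natural_vector("ACGT"))
print(sequence_distance("ACGTACGT", "ACGTTGCA"))
```

Because each sequence is reduced to a fixed-length vector once, adding a new sequence only requires computing one more vector and its distances, which is the practical advantage over realignment that the abstract highlights.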
Affiliation(s)
- Mo Deng
- Department of Mathematics, Statistics and Computer Science, University of Illinois at Chicago, Chicago, Illinois, United States of America
| | - Chenglong Yu
- The Institute of Mathematical Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong, People's Republic of China
| | - Qian Liang
- Department of Mathematics, Statistics and Computer Science, University of Illinois at Chicago, Chicago, Illinois, United States of America
| | - Rong L. He
- Department of Biological Sciences, Chicago State University, Chicago, Illinois, United States of America
| | - Stephen S.-T. Yau
- Department of Mathematics, Statistics and Computer Science, University of Illinois at Chicago, Chicago, Illinois, United States of America
| |
|
24
|
Wei SJ, Shi M, Chen XX, Sharkey MJ, van Achterberg C, Ye GY, He JH. New views on strand asymmetry in insect mitochondrial genomes. PLoS One 2010; 5:e12708. [PMID: 20856815 PMCID: PMC2939890 DOI: 10.1371/journal.pone.0012708] [Citation(s) in RCA: 198] [Impact Index Per Article: 14.1] [Received: 10/05/2009] [Accepted: 08/20/2010] [Indexed: 01/16/2023] Open
Abstract
Strand asymmetry in nucleotide composition is a remarkable feature of animal mitochondrial genomes. Understanding the mutation processes that shape strand asymmetry is essential for comprehensive knowledge of genome evolution, demographical population history and accurate phylogenetic inference. Previous studies found that the relative contributions of different substitution types to strand asymmetry are associated with replication alone or both replication and transcription. However, the relative contributions of replication and transcription to strand asymmetry remain unclear. Here we conducted a broad survey of strand asymmetry across 120 insect mitochondrial genomes, with special reference to the correlation between the signs of skew values and replication orientation/gene direction. The results show that the sign of GC skew on entire mitochondrial genomes is reversed in all species of three distantly related families of insects, Philopteridae (Phthiraptera), Aleyrodidae (Hemiptera) and Braconidae (Hymenoptera); the replication-related elements in the A+T-rich regions of these species are inverted, confirming that reversal of strand asymmetry (GC skew) was caused by inversion of replication origin; and finally, the sign of GC skew value is associated with replication orientation but not with gene direction, while that of AT skew value varies with gene direction, replication and codon positions used in analyses. These findings show that deaminations during replication and other mutations contribute more than selection on amino acid sequences to strand compositions of G and C, and that the replication process has a stronger effect on A and T content than does transcription. Our results may contribute to genome-wide studies of replication and transcription mechanisms.
Affiliation(s)
- Shu-Jun Wei
- State Key Laboratory of Rice Biology and Ministry of Agriculture Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University, Hangzhou, China
- Institute of Plant and Environmental Protection, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
| | - Min Shi
- State Key Laboratory of Rice Biology and Ministry of Agriculture Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University, Hangzhou, China
| | - Xue-Xin Chen
- State Key Laboratory of Rice Biology and Ministry of Agriculture Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University, Hangzhou, China
| | - Michael J. Sharkey
- Department of Entomology, University of Kentucky, Lexington, Kentucky, United States of America
| | | | - Gong-Yin Ye
- State Key Laboratory of Rice Biology and Ministry of Agriculture Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University, Hangzhou, China
| | - Jun-Hua He
- State Key Laboratory of Rice Biology and Ministry of Agriculture Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University, Hangzhou, China
| |
|
25
|
Seligmann H. Positive correlations between molecular and morphological rates of evolution. J Theor Biol 2010; 264:799-807. [DOI: 10.1016/j.jtbi.2010.03.019] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Received: 06/26/2009] [Revised: 02/17/2010] [Accepted: 03/10/2010] [Indexed: 11/30/2022]
|
26
|
Yu C, Liang Q, Yin C, He RL, Yau SST. A novel construction of genome space with biological geometry. DNA Res 2010; 17:155-68. [PMID: 20360268 PMCID: PMC2885272 DOI: 10.1093/dnares/dsq008] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.5] [Indexed: 12/30/2022] Open
Abstract
A genome space is a moduli space of genomes. In this space, each point corresponds to a genome. The natural distance between two points in the genome space reflects the biological distance between these two genomes. Currently, there is no method to represent genomes by a point in a space without losing biological information. Here, we propose a new graphical representation for DNA sequences. The breakthrough of the subject is that we can construct the moment vectors from DNA sequences using this new graphical method and prove that the correspondence between moment vectors and DNA sequences is one-to-one. Using these moment vectors, we have constructed a novel genome space as a subspace in R^N. It allows us to show that the SARS-CoV is most closely related to a coronavirus from the palm civet, not from a bird as initially suspected, and the newly discovered human coronavirus HCoV-HKU1 is more closely related to SARS than to any other known member of group 2 coronavirus. Furthermore, we reconstructed the phylogenetic tree for 34 lentiviruses (including human immunodeficiency virus) based on their whole genome sequences. Our genome space will provide a new powerful tool for analyzing the classification of genomes and their phylogenetic relationships.
Affiliation(s)
- Chenglong Yu
- Department of Mathematics, The Chinese University of Hong Kong, Shatin, Hong Kong
| | | | | | | | | |
|
27
|
Castoe TA, Gu W, de Koning APJ, Daza JM, Jiang ZJ, Parkinson CL, Pollock DD. Dynamic nucleotide mutation gradients and control region usage in squamate reptile mitochondrial genomes. Cytogenet Genome Res 2010; 127:112-27. [PMID: 20215734 DOI: 10.1159/000295342] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Indexed: 11/19/2022] Open
Abstract
Gradients of nucleotide bias and substitution rates occur in vertebrate mitochondrial genomes due to the asymmetric nature of the replication process. The evolution of these gradients has previously been studied in detail in primates, but not in other vertebrate groups. From the primate study, the strengths of these gradients are known to evolve in ways that can substantially alter the substitution process, but it is unclear how rapidly they evolve over evolutionary time or how different they may be in different lineages or groups of vertebrates. Given the importance of mitochondrial genomes in phylogenetics and molecular evolutionary research, a better understanding of how asymmetric mitochondrial substitution gradients evolve would contribute key insights into how this gradient evolution may mislead evolutionary inferences, and how it may also be incorporated into new evolutionary models. Most snake mitochondrial genomes have an additional interesting feature, 2 nearly identical control regions (CRs), which vary among different species in the extent to which they are used as origins of replication. Given the expanded sampling of complete snake genomes currently available, together with 2 additional snakes sequenced in this study, we reexamined gradient strength and CR usage in alethinophidian snakes as well as several lizards that possess dual CRs. Our results suggest that nucleotide substitution gradients (and corresponding nucleotide bias) and CR usage are highly labile over the approximately 200 m.y. of squamate evolution, and demonstrate greater overall variability than previously shown in primates. The evidence for the existence of such gradients, and their ability to evolve rapidly and converge among unrelated species, suggests that gradient dynamics could easily mislead phylogenetic and molecular evolutionary inferences, and argues strongly that these dynamics should be incorporated into phylogenetic models.
Affiliation(s)
- T A Castoe
- Consortium for Comparative Genomics, Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO 80045, USA
| | | | | | | | | | | | | |
|
28
|
Seligmann H. Mitochondrial tRNAs as light strand replication origins: Similarity between anticodon loops and the loop of the light strand replication origin predicts initiation of DNA replication. Biosystems 2010; 99:85-93. [DOI: 10.1016/j.biosystems.2009.09.003] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.8] [Received: 07/21/2009] [Revised: 09/07/2009] [Accepted: 09/08/2009] [Indexed: 10/20/2022]
|
29
|
Xu S, Luosang J, Hua S, He J, Ciren A, Wang W, Tong X, Liang Y, Wang J, Zheng X. High altitude adaptation and phylogenetic analysis of Tibetan horse based on the mitochondrial genome. J Genet Genomics 2009; 34:720-9. [PMID: 17707216 DOI: 10.1016/s1673-8527(07)60081-2] [Citation(s) in RCA: 71] [Impact Index Per Article: 4.7] [Received: 11/07/2006] [Accepted: 12/31/2006] [Indexed: 11/18/2022]
Abstract
To investigate genetic mechanisms of high altitude adaptations of animals living in the Tibetan Plateau, three mitochondrial genomes (mt-genomes) of Tibetan horses living in Naqu (4,500 m) of Tibet, and Zhongdian (3,300 m) and Deqin (3,100 m) of Yunnan province, were sequenced. The structures and lengths of these three mt-genomes are similar to those of the Cheju horse, which is related to Tibetan horses, but a little shorter than that of the Swedish horse. The pair-wise identity of these three horses at the nucleotide level is more than 99.3%. When the genes encoding the mitochondrial proteins of Tibetan horses were analyzed, we found that NADH6 has a higher non-synonymous mutation rate in all three Tibetan horses. This implies that NADH6 may play a role in Tibetan horses' high altitude adaptation. NADH6 is one of the subunits of complex I in the respiratory chain. Furthermore, 7 D-loop sequences of Tibetan horses from different areas were sequenced, and a phylogenetic tree was constructed to study the origin and evolutionary history of Tibetan horses. The result showed that the genetic diversity was high among Tibetan horses. All Tibetan horses from Naqu were clustered into one clade, and Tibetan horses from Zhongdian and Deqin were clustered into other clades. The first molecular evidence on Tibetan horses presented in this study indicates that the Tibetan horse population might have multiple origins.
Affiliation(s)
- Shuqing Xu
- Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
| | | | | | | | | | | | | | | | | | | |
|
30
|
Alter SE, Palumbi SR. Comparing evolutionary patterns and variability in the mitochondrial control region and cytochrome B in three species of baleen whales. J Mol Evol 2008; 68:97-111. [PMID: 19116685 DOI: 10.1007/s00239-008-9193-2] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.1] [Received: 10/17/2007] [Revised: 10/29/2008] [Accepted: 12/09/2008] [Indexed: 11/26/2022]
Abstract
The rapidly evolving mitochondrial control region remains an important source of information on phylogeography and demographic history for cetaceans and other vertebrates, despite great uncertainty in the rate of nucleotide substitution across both nucleotide positions and lineages. Patterns of variation in linked markers with slower rates of evolution can potentially be used to calibrate the rate of nucleotide substitution in the control region and to better understand the interplay of evolutionary and demographic forces across the mitochondrial genome above and below the species level. We have examined patterns of diversity within and between three baleen whale species (gray, humpback, and Antarctic minke whales) in order to determine how patterns of molecular evolution differ between cytochrome b and the control region. Our results show that cytochrome b is less variable than expected given the diversity in the control region for gray and humpback whales, even after functional differences are taken into account, but more variable than expected for minke whales. Differences in the frequency distributions of polymorphic sites and in best-fit models of nucleotide substitution indicate that these patterns may be the result of hypervariability in the control region in gray and humpback whales but, in minke whales, may result from a large, stable or expanding population size coupled with saturation at the control region. Using paired cytochrome b and control region data across individuals, we show that the average rate of nucleotide substitution in the control region may be on average 2.6 times higher than phylogenetically derived estimates in cetaceans. These results highlight the complexity of making inferences from control region data alone and suggest that applying simple rules of DNA sequence analyses across species may be difficult.
Affiliation(s)
- S Elizabeth Alter
- Department of Biological Sciences, Hopkins Marine Station, Stanford University, 120 Oceanview Boulevard, Pacific Grove, CA, 93950, USA.
|
31
|
Gissi C, Iannelli F, Pesole G. Evolution of the mitochondrial genome of Metazoa as exemplified by comparison of congeneric species. Heredity (Edinb) 2008; 101:301-20. [PMID: 18612321 DOI: 10.1038/hdy.2008.62] [Citation(s) in RCA: 425] [Impact Index Per Article: 26.6] [Indexed: 11/09/2022] Open
Abstract
The mitochondrial genome (mtDNA) of Metazoa is a good model system for evolutionary genomic studies and the availability of more than 1000 sequences provides an almost unique opportunity to decode the mechanisms of genome evolution over a large phylogenetic range. In this paper, we review several structural features of the metazoan mtDNA, such as gene content, genome size, genome architecture and the new parameter of gene strand asymmetry in a phylogenetic framework. The data reviewed here show that: (1) the plasticity of Metazoa mtDNA is higher than previously thought and mainly due to variation in number and location of tRNA genes; (2) an exceptional trend towards stabilization of genomic features occurred in deuterostomes and was exacerbated in vertebrates, where gene content, genome architecture and gene strand asymmetry are almost invariant. Only tunicates exhibit a very high degree of genome variability comparable to that found outside deuterostomes. In order to analyse the genomic evolutionary process at short evolutionary distances, we have also compared mtDNAs of species belonging to the same genus: the variability observed in congeneric species significantly recapitulates the evolutionary dynamics observed at higher taxonomic ranks, especially for taxa showing high levels of genome plasticity and/or fast nucleotide substitution rates. Thus, the analysis of congeneric species promises to be a valuable approach for the assessment of the mtDNA evolutionary trend in poorly or not yet sampled metazoan groups.
Affiliation(s)
- C Gissi
- Dipartimento di Scienze Biomolecolari e Biotecnologie, Università di Milano, Milano, Italy.
|
32
|
Castoe TA, Jiang ZJ, Gu W, Wang ZO, Pollock DD. Adaptive evolution and functional redesign of core metabolic proteins in snakes. PLoS One 2008; 3:e2201. [PMID: 18493604 PMCID: PMC2376058 DOI: 10.1371/journal.pone.0002201] [Citation(s) in RCA: 94] [Impact Index Per Article: 5.9] [Received: 12/12/2007] [Accepted: 04/01/2008] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND Adaptive evolutionary episodes in core metabolic proteins are uncommon, and are even more rarely linked to major macroevolutionary shifts. METHODOLOGY/PRINCIPAL FINDINGS We conducted extensive molecular evolutionary analyses on snake mitochondrial proteins and discovered multiple lines of evidence suggesting that the proteins at the core of aerobic metabolism in snakes have undergone remarkably large episodic bursts of adaptive change. We show that snake mitochondrial proteins experienced unprecedented levels of positive selection, coevolution, convergence, and reversion at functionally critical residues. We examined Cytochrome C oxidase subunit I (COI) in detail, and show that it experienced extensive modification of normally conserved residues involved in proton transport and delivery of electrons and oxygen. Thus, adaptive changes likely altered the flow of protons and other aspects of function in CO, thereby influencing fundamental characteristics of aerobic metabolism. We refer to these processes as "evolutionary redesign" because of the magnitude of the episodic bursts and the degree to which they affected core functional residues. CONCLUSIONS/SIGNIFICANCE The evolutionary redesign of snake COI coincided with adaptive bursts in other mitochondrial proteins and substantial changes in mitochondrial genome structure. It also generally coincided with or preceded major shifts in ecological niche and the evolution of extensive physiological adaptations related to lung reduction, large prey consumption, and venom evolution. The parallel timing of these major evolutionary events suggests that evolutionary redesign of metabolic and mitochondrial function may be related to, or underlie, the extreme changes in physiological and metabolic efficiency, flexibility, and innovation observed in snake evolution.
Affiliation(s)
- Todd A. Castoe
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, Colorado, United States of America
- Zhi J. Jiang
- Department of Biological Sciences, Biological Computation and Visualization Center, Louisiana State University, Baton Rouge, Louisiana, United States of America
- Wanjun Gu
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, Colorado, United States of America
- Zhengyuan O. Wang
- Department of Biological Sciences, Biological Computation and Visualization Center, Louisiana State University, Baton Rouge, Louisiana, United States of America
- David D. Pollock
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, Colorado, United States of America
|
33
|
Fonseca MM, Harris DJ. Relationship between mitochondrial gene rearrangements and stability of the origin of light strand replication. Genet Mol Biol 2008. [DOI: 10.1590/s1415-47572008000300027] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 11/21/2022] Open
Affiliation(s)
- Miguel M. Fonseca
- Instituto de Ciências e Tecnologias Agrárias e Agro-Alimentares, Portugal; Universidade do Porto, Portugal
- D. James Harris
- Instituto de Ciências e Tecnologias Agrárias e Agro-Alimentares, Portugal; Universidade do Porto, Portugal
|
34
|
Comparative mitochondrial genomics of snakes: extraordinary substitution rate dynamics and functionality of the duplicate control region. BMC Evol Biol 2007; 7:123. [PMID: 17655768 PMCID: PMC1950710 DOI: 10.1186/1471-2148-7-123] [Citation(s) in RCA: 76] [Impact Index Per Article: 4.5] [Received: 09/06/2006] [Accepted: 07/26/2007] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The mitochondrial genomes of snakes are characterized by an overall evolutionary rate that appears to be one of the most accelerated among vertebrates. They also possess other unusual features, including short tRNAs and other genes, and a duplicated control region that has been stably maintained since it originated more than 70 million years ago. Here, we provide a detailed analysis of evolutionary dynamics in snake mitochondrial genomes to better understand the basis of these extreme characteristics, and to explore the relationship between mitochondrial genome molecular evolution, genome architecture, and molecular function. We sequenced complete mitochondrial genomes from Slowinski's corn snake (Pantherophis slowinskii) and two cottonmouths (Agkistrodon piscivorus) to complement previously existing mitochondrial genomes, and to provide an improved comparative view of how genome architecture affects molecular evolution at contrasting levels of divergence. RESULTS We present a Bayesian genetic approach that suggests that the duplicated control region can function as an additional origin of heavy strand replication. The two control regions also appear to have different intra-specific versus inter-specific evolutionary dynamics that may be associated with complex modes of concerted evolution. We find that different genomic regions have experienced substantial accelerated evolution along early branches in snakes, with different genes having experienced dramatic accelerations along specific branches. Some of these accelerations appear to coincide with, or subsequent to, the shortening of various mitochondrial genes and the duplication of the control region and flanking tRNAs. CONCLUSION Fluctuations in the strength and pattern of selection during snake evolution have had widely varying gene-specific effects on substitution rates, and these rate accelerations may have been functionally related to unusual changes in genomic architecture. 
The among-lineage and among-gene variation in rate dynamics observed in snakes is the most extreme thus far observed in animal genomes, and provides an important study system for further evaluating the biochemical and physiological basis of evolutionary pressures in vertebrate mitochondria.
|
35
|
Rodakis GC, Cao L, Mizi A, Kenchington ELR, Zouros E. Nucleotide Content Gradients in Maternally and Paternally Inherited Mitochondrial Genomes of the Mussel Mytilus. J Mol Evol 2007; 65:124-36. [PMID: 17632681 DOI: 10.1007/s00239-005-0298-6] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 12/10/2005] [Accepted: 07/26/2006] [Indexed: 10/23/2022]
Abstract
Several studies have shown that in vertebrate mtDNAs the nucleotide content at fourfold degenerate sites is well correlated with the site's time of exposure to the single-strand state, as predicted from the asymmetrical model of mtDNA replication. Here we examine whether the same explanation may hold for the regional variation in nucleotide content in the maternal and paternal mtDNAs of the mussel Mytilus galloprovincialis. The origin of replication of the heavy strand (O(H)) of these genomes has been previously established. A systematic search of the two genomes for sequences that are likely to act as the origin of replication of the light strand (O(L)) suggested that the most probable site lies within the ND3 gene. By adopting this O(L) position we calculated times of exposure for 0(FD) (nondegenerate), 2(FD) (twofold degenerate), and 4(FD) (fourfold degenerate) sites of the protein-coding part of the genome and for the rRNA, tRNA and noncoding parts. The presence of thymine and absence of guanine at 4(FD) sites was highly correlated with the presumed time of exposure. Such an effect was not found for the 2(FD) sites, the rRNA, the tRNA, or the noncoding parts. There was a trend for a small increase in cytosine at 0(FD) sites with exposure time, which is explicable as the result of biased usage of 4(FD) codons. The same analysis was applied to a recently sequenced mitochondrial genome of Mytilus trossulus and produced similar results. These results are consistent with the asymmetrical model of replication and suggest that guanine oxidation due to single-strand exposure is the main cause of regional variation of nucleotide content in Mytilus mitochondrial genomes.
Affiliation(s)
- George C Rodakis
- Department of Biochemistry and Molecular Biology, National and Kapodistrian University of Athens, Panepistimioupolis, 15701 Athens, Greece
|
36
|
Xia X, Huang H, Carullo M, Betrán E, Moriyama EN. Conflict between translation initiation and elongation in vertebrate mitochondrial genomes. PLoS One 2007; 2:e227. [PMID: 17311091 PMCID: PMC1794132 DOI: 10.1371/journal.pone.0000227] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Received: 11/24/2006] [Accepted: 01/25/2007] [Indexed: 11/18/2022] Open
Abstract
The strand-biased mutation spectrum in vertebrate mitochondrial genomes results in an AC-rich L-strand and a GT-rich H-strand. Because the L-strand is the sense strand of 12 protein-coding genes out of the 13, the third codon position is overall strongly AC-biased. The wobble site of the anticodon of the 22 mitochondrial tRNAs is either U or G to pair with the most abundant synonymous codon, with only one exception. The wobble site of Met-tRNA is C instead of U, forming the Watson-Crick match with AUG instead of AUA, the latter being much more frequent than the former. This has been attributed to a compromise between translation initiation and elongation; i.e., AUG is not only a methionine codon, but also an initiation codon, and an anticodon matching AUG will increase the initiation rate. However, such an anticodon would impose selection against the use of AUA codons because AUA needs to be wobble-translated. According to this translation conflict hypothesis, AUA should be used relatively less frequently compared to UUA in the UUR codon family. A comprehensive analysis of mitochondrial genomes from a variety of vertebrate species revealed a general deficiency of AUA codons relative to UUA codons. In contrast, urochordate mitochondrial genomes with two tRNA(Met) genes with CAU and UAU anticodons exhibit increased AUA codon usage. Furthermore, six bivalve mitochondrial genomes with both of their tRNA-Met genes with a CAU anticodon have reduced AUA usage relative to three other bivalve mitochondrial genomes with one of their two tRNA-Met genes having a CAU anticodon and the other having a UAU anticodon. We conclude that the translation conflict hypothesis is empirically supported, and our results highlight the fine details of selection in shaping molecular evolution.
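The codon-usage comparison described in this abstract can be sketched as follows. This is a minimal illustration, assuming the quantity of interest is the share of the A-ending codon within each R-ending pair (AUA among AUA+AUG, and UUA among UUA+UUG); the function names are hypothetical and the paper's exact statistic may differ.

```python
from collections import Counter


def codon_counts(cds: str) -> Counter:
    """Count in-frame codons of a coding sequence, reported in the RNA alphabet."""
    seq = cds.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def a_ending_share(counts: Counter, a_codon: str, g_codon: str) -> float:
    """Share of the A-ending codon within an A/G-ending codon pair."""
    total = counts[a_codon] + counts[g_codon]
    if total == 0:
        raise ValueError(f"neither {a_codon} nor {g_codon} occurs")
    return counts[a_codon] / total


def aua_vs_uua(cds: str) -> tuple[float, float]:
    """Return (AUA share of AUR codons, UUA share of UUR codons)."""
    counts = codon_counts(cds)
    return (
        a_ending_share(counts, "AUA", "AUG"),
        a_ending_share(counts, "UUA", "UUG"),
    )
```

Under the translation conflict hypothesis summarized above, the first value would be expected to fall below the second in vertebrate mitochondrial genes; a real analysis would pool codons across all genes encoded on the same strand rather than use a single sequence.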
MESH Headings
- Animals
- Anticodon/genetics
- Bivalvia/genetics
- Codon/genetics
- Codon, Initiator/genetics
- DNA, Mitochondrial/genetics
- Evolution, Molecular
- Genome, Mitochondrial
- Models, Genetic
- Peptide Chain Elongation, Translational
- Peptide Chain Initiation, Translational
- RNA, Transfer, Met/genetics
- Selection, Genetic
- Species Specificity
- Urochordata/genetics
- Vertebrates/genetics
Affiliation(s)
- Xuhua Xia
- Department of Biology, University of Ottawa, Ottawa, Ontario, Canada.
|
37
|
Fonseca MM, Froufe E, Harris DJ. Mitochondrial gene rearrangements and partial genome duplications detected by multigene asymmetric compositional bias analysis. J Mol Evol 2006; 63:654-61. [PMID: 17075699 DOI: 10.1007/s00239-005-0242-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Received: 10/10/2005] [Accepted: 05/30/2006] [Indexed: 11/30/2022]
Abstract
Asymmetric compositional and mutation bias between the two strands occurs in mitochondrial genomes, and an asymmetric mechanism of mtDNA replication is a potential source of this bias. Some evidence indicates that during replication the heavy strand is subject to a gradient of time spent in a single-stranded state (D(ssH)) and a gradient of mutational damage. The nucleotide composition bias among genes varies with D(ssH). Consequently, partial genome duplications (PGD) will alter the skew for genes located downstream of the duplication, relative to nascent light strand synthesis, and in the same way, gene rearrangements (GRr) will affect genes by changing their skews. We examined cases where there had been PGD or GRr and determined whether this left a trace in the form of unusual patterns of base composition. We compared the skew of genes located at different positions in previously published whole mtDNA genomes from amphibians, a group that shows considerable levels of both GRr and PGD. After observing a significant correlation between AT and GC skew with D(ssH) at fourfold redundant sites, we ran our analysis and detected 31.3% of the species with GRr and/or PGD. By comparing the nucleotide composition at fourfold redundant sites in normal and "abnormal" species, we found that A/C variation occurs and is associated with GRr/PGD. These results show that by analyzing the nucleotide skews of only three genes, it may be possible to predict some mitochondrial GRr and/or PGD without knowing the complete mtDNA genome sequence.
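The skew measures mentioned in this abstract can be illustrated with a short sketch. It assumes the commonly used definitions AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C), and it treats the third position of the eight codon families CTN, GTN, TCN, CCN, ACN, GCN, CGN and GGN as fourfold redundant, which holds for the vertebrate mitochondrial code. The function names are hypothetical, and this is not the authors' pipeline.

```python
import math

# First two bases of codon families whose third position is fourfold
# degenerate in the vertebrate mitochondrial genetic code.
FOURFOLD_PREFIXES = frozenset({"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"})


def fourfold_third_bases(cds: str) -> str:
    """Collect third-position bases of fourfold-degenerate codons in a CDS."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(
        seq[i + 2]
        for i in range(0, usable, 3)
        if seq[i:i + 2] in FOURFOLD_PREFIXES and seq[i + 2] in "ACGT"
    )


def skew(x: int, y: int) -> float:
    """Generic skew (x - y) / (x + y); NaN when both counts are zero."""
    return (x - y) / (x + y) if (x + y) else math.nan


def at_gc_skew(cds: str) -> tuple[float, float]:
    """Return (AT skew, GC skew) at fourfold-degenerate third positions."""
    bases = fourfold_third_bases(cds)
    return (
        skew(bases.count("A"), bases.count("T")),
        skew(bases.count("G"), bases.count("C")),
    )
```

The sign of each skew depends on which strand the input sequence is read from, so genes encoded on opposite strands should be brought to a common strand before their skews are compared.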
Affiliation(s)
- Miguel M Fonseca
- Centro de Investigação em Biodiversidade e Recursos Genéticos (CIBIO/UP), ICETA-UP, Campus Agrário de Vairão, Rua Padre Armando, 4485-661 Vairão, Portugal
|
38
|
Xu W, Jameson D, Tang B, Higgs PG. The relationship between the rate of molecular evolution and the rate of genome rearrangement in animal mitochondrial genomes. J Mol Evol 2006; 63:375-92. [PMID: 16838214 DOI: 10.1007/s00239-005-0246-5] [Citation(s) in RCA: 82] [Impact Index Per Article: 4.6] [Received: 10/13/2005] [Accepted: 04/17/2006] [Indexed: 10/24/2022]
Abstract
Evolution of mitochondrial genes is far from clock-like. The substitution rate varies considerably between species, and there are many species that have a significantly increased rate with respect to their close relatives. There is also considerable variation among species in the rate of gene order rearrangement. Using a set of 55 complete arthropod mitochondrial genomes, we estimate the evolutionary distance from the common ancestor to each species using protein sequences, tRNA sequences, and breakpoint distances (a measure of the degree of genome rearrangement). All these distance measures are correlated. We use relative rate tests to compare pairs of related species in several animal phyla. In the majority of cases, the species with the more highly rearranged genome also has a significantly higher rate of sequence evolution. Species with higher amino acid substitution rates in mitochondria also have more variable amino acid composition in response to mutation pressure. We discuss the possible causes of variation in rates of sequence evolution and gene rearrangement among species and the possible reasons for the observed correlation between the two rates.
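The breakpoint distance mentioned in this abstract can be illustrated with a simplified sketch. It assumes circular genomes with identical gene content, each gene present once, and it ignores gene orientation (strand); the measure used in the paper may treat orientation and unequal gene content differently. The function names are hypothetical.

```python
def adjacencies(order: list[str]) -> set[frozenset]:
    """Unordered neighbouring gene pairs of a circular gene order."""
    n = len(order)
    return {frozenset((order[i], order[(i + 1) % n])) for i in range(n)}


def breakpoint_distance(order_a: list[str], order_b: list[str]) -> int:
    """Number of adjacencies in order_a that are absent from order_b.

    Simplifying assumptions for this sketch: both genomes are circular,
    contain the same genes exactly once, and strand is ignored.
    """
    if len(set(order_a)) != len(order_a) or len(set(order_b)) != len(order_b):
        raise ValueError("each gene must occur exactly once")
    if set(order_a) != set(order_b):
        raise ValueError("gene content must be identical")
    return len(adjacencies(order_a) - adjacencies(order_b))
```

For example, swapping two neighbouring genes in a five-gene circle breaks two adjacencies, while rotating the starting point of the same circular order gives a distance of zero.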
Affiliation(s)
- Wei Xu
- Department of Physics and Astronomy, McMaster University, Main St. West, Hamilton, Ontario, L8S 4M1, Canada
|
39
|
Seligmann H, Krishnan NM, Rao BJ. Possible multiple origins of replication in primate mitochondria: Alternative role of tRNA sequences. J Theor Biol 2006; 241:321-32. [PMID: 16430924 DOI: 10.1016/j.jtbi.2005.11.035] [Citation(s) in RCA: 67] [Impact Index Per Article: 3.7] [Received: 09/12/2005] [Revised: 11/29/2005] [Accepted: 11/30/2005] [Indexed: 11/20/2022]
Abstract
DNA replication in vertebrate mitochondria is usually directional, leaving different portions of the genome single-stranded for different periods of time. During this time, mutations resulting from deaminations of cytosines to thymines and adenines to guanines accumulate on the heavy strand. Therefore, T/C and G/A ratios increase along mitochondrial genomes, proportionally to the time spent single-stranded during replication. Such trends exist at third codon positions for base ratios averaged across genes in individual genomes as well as for gene-specific and site-specific substitution frequencies estimated using phylogenetic methods. We use multiple regressions to test for the potential functioning of all 12 tRNA clusters in 19 primate mitochondrial genomes as alternative origins of light strand replication (OL). We provide a general algorithm for calculating time spent single stranded by a given site for any possible locations of the site and OL. For codon positions 1, 2, and 3, respectively, 23%, 9% and 35% of tRNA gene clusters have significant (p < 0.05) deamination gradients originating from them. The strength of the deamination gradient originating from tRNA gene clusters varies among species, and for five clusters, correlates with the tendency of tRNA genes in each of these clusters to form secondary structures that resemble the OL's structure. This is notably true for all codon positions for tRNA-Lys, which in absence of nuclear regulation, forms secondary structures resembling the hairpin structure of OL. For two tRNA gene clusters, correlations were statistically significant, but opposite to the direction expected by the known unidirectional replication, putatively compatible with bi-directional replication. Few substitutions in tRNA sequences can be neutral at the level of cloverleaf structure and function, yet significantly alter capacities to form OL-like structures, causing sudden evolution of genome-wide nucleotide contents.
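The idea of "time spent single-stranded" can be sketched under the classical strand-displacement model. The sketch below is not the algorithm from the paper; it assumes that both replication forks move at the same constant speed, that light-strand synthesis starts as soon as the heavy-strand fork exposes OL, and that positions are given as distances from the heavy-strand origin measured in the direction of nascent heavy-strand synthesis. The function name and parameters are hypothetical.

```python
def time_single_stranded(d_site: int, d_ol: int, genome_len: int) -> int:
    """Time a site on the parental heavy strand stays single-stranded.

    Units are nucleotides traversed by a fork. `d_site` and `d_ol` are the
    distances of the site and of OL from the heavy-strand origin, measured
    along the direction of nascent heavy-strand synthesis.
    """
    if not (0 <= d_site < genome_len and 0 < d_ol < genome_len):
        raise ValueError("positions must lie within the genome")
    if d_site <= d_ol:
        # Displaced at time d_site; light-strand synthesis starts at time
        # d_ol and runs back over (d_ol - d_site) nucleotides to the site.
        return 2 * (d_ol - d_site)
    # Displaced at time d_site; light-strand synthesis must first run back
    # to the heavy-strand origin (d_ol) and then on to the site
    # (genome_len - d_site), arriving at time 2*d_ol + genome_len - d_site.
    return genome_len - 2 * (d_site - d_ol)
```

Evaluating the function for each candidate OL location (for instance each tRNA cluster) gives one predicted exposure gradient per candidate, which could then be compared with observed base ratios by regression, in the spirit of the analysis described above.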
Affiliation(s)
- Hervé Seligmann
- Department of Evolution, Systematics and Ecology, The Hebrew University of Jerusalem, 91904, Israel.
|
40
|
Broughton RE, Reneau PC. Spatial Covariation of Mutation and Nonsynonymous Substitution Rates in Vertebrate Mitochondrial Genomes. Mol Biol Evol 2006; 23:1516-24. [PMID: 16705079 DOI: 10.1093/molbev/msl013] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.2] [Indexed: 12/18/2022] Open
Abstract
Mitochondrial genomes encode fundamental subunits of the basic energy producing machinery of eukaryotic cells that are under strong functional constraint. Paradoxically, these genes evolve rapidly in general, and there is substantial variation in evolutionary rates among genes within genomes. In order to investigate spatial variation in selection intensity, we conducted tests of neutrality using ratios of nonsynonymous to synonymous substitutions (dN/dS = omega) on numerous protein gene segments from fishes and mammals. Values of omega were very low for nearly all genomic regions. However, values of both omega and dN varied in a clinal pattern with increasing distance from the light-strand origin of replication. Spatial heterogeneity of nonsynonymous substitution rates exhibits a significantly positive correlation with variation in mutation rates that are related to the mode of mitochondrial DNA replication. The finding that nonsynonymous substitution rates are proportional to mutation rates is expected if a majority of substitutions are selectively neutral or slightly deleterious. Spatial patterns of among-gene variation in nonsynonymous rates were highly similar between fishes and mammals, suggesting that forces governing mitochondrial gene evolution have remained relatively constant over 450 Myr of vertebrate evolution. Conservation of substitution patterns despite major shifts in thermal habit and metabolic demands among taxa implicates a conserved replication mechanism controlling relative mutation rates as a major determinant of mitochondrial protein evolution.
Affiliation(s)
- Richard E Broughton
- Oklahoma Biological Survey and Department of Zoology, University of Oklahoma, USA.
|
41
|
Seligmann H. Error propagation across levels of organization: from chemical stability of ribosomal RNA to developmental stability. J Theor Biol 2006; 242:69-80. [PMID: 16584749 DOI: 10.1016/j.jtbi.2006.02.004] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.3] [Received: 12/01/2005] [Revised: 01/30/2006] [Accepted: 02/02/2006] [Indexed: 11/19/2022]
Abstract
My hypothesis integrates molecular and whole-organism levels of development. A physico-chemical property of nucleotides (their dipole moment), confers structural thermostability on double-stranded sequences, and decreases chemical stability of single-stranded sequences. According to this approach, low ribosomal RNA stability should decrease the precision of protein synthesis and whole-organism developmental stability. Indeed, substitution frequencies in pseudogenes are proportional to the subtraction of the dipole moment of the substituting nucleotide from that of the substituted one, and developmental instability, estimated by morphological fluctuating asymmetry (FA), correlates with mammal 12s rRNA base content of loop (but not stem) regions. In lizards, fit to the single-strand rationale of sequence chemical stability decreases with the level of poikilothermy of the investigated lizard family, suggesting interactions between changes in body temperature, ribosomal structure and developmental instability. Results confirm the hypothesis (less than for 12s rRNA) in: third codon positions of cytochrome B, probably because, unlike rRNAs, specific mRNAs affect only the protein they code; and 16s rRNA, apparently because its base composition is more affected by genome-wide mutational biases than that of 12s rRNA.
Affiliation(s)
- Hervé Seligmann
- Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA.
|
42
|
Urbina D, Tang B, Higgs PG. The response of amino acid frequencies to directional mutation pressure in mitochondrial genome sequences is related to the physical properties of the amino acids and to the structure of the genetic code. J Mol Evol 2006; 62:340-61. [PMID: 16477524 DOI: 10.1007/s00239-005-0051-1] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 03/07/2005] [Accepted: 10/01/2005] [Indexed: 11/29/2022]
Abstract
The frequencies of A, C, G, and T in mitochondrial DNA vary among species due to unequal rates of mutation between the bases. The frequencies of bases at fourfold degenerate sites respond directly to mutation pressure. At first and second positions, selection reduces the degree of frequency variation. Using a simple evolutionary model, we show that first position sites are less constrained by selection than second position sites and, therefore, that the frequencies of bases at first position are more responsive to mutation pressure than those at second position. We define a measure of distance between amino acids that is dependent on eight measured physical properties and a similarity measure that is the inverse of this distance. Columns 1, 2, 3, and 4 of the genetic code correspond to codons with U, C, A, and G in their second position, respectively. The similarity of amino acids in the four columns decreases systematically from column 1 to column 2 to column 3 to column 4. We then show that the responsiveness of first position bases to mutation pressure is dependent on the second position base and follows the same decreasing trend through the four columns. Again, this shows the correlation between physical properties and responsiveness. We determine a proximity measure for each amino acid, which is the average similarity between an amino acid and all others that are accessible via single point mutations in the mitochondrial genetic code structure. We also define a responsiveness for each amino acid, which measures how rapidly an amino acid frequency changes as a result of mutation pressure acting on the base frequencies. We show that there is a strong correlation between responsiveness and proximity, and that both these quantities are also correlated with the mutability of amino acids estimated from the mtREV substitution rate matrix. We also consider the variation of base frequencies between strands and between genes on a strand. 
These trends are consistent with the patterns expected from analysis of the variation among genomes.
Affiliation(s)
- Daniel Urbina
- Department of Physics and Astronomy, McMaster University, Hamilton, Ontario, Canada
|
43
|
Seligmann H, Krishnan NM. Mitochondrial replication origin stability and propensity of adjacent tRNA genes to form putative replication origins increase developmental stability in Lizards. J Exp Zool B Mol Dev Evol 2006; 306:433-49. [PMID: 16463378 DOI: 10.1002/jez.b.21095] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.7] [Indexed: 11/09/2022]
Abstract
Secondary structure stability of mitochondrial origins of light-strand replication (OL) presumably reduces delayed formation of light-strand initiating replication forks on the heavy strand. Delayed replication initiation prolongs single strandedness of the heavy strand. More mutations accumulate during the prolonged time spent single stranded. Presumably, delayed replication initiation and excess mutations affect mitochondrial biochemical processes and ultimately morphological outcomes of development at the whole-organism level. This predicts that developmental stability increases with OL secondary structure stability and with formation of OL-like structures by the five tRNA genes flanking recognized OLs. Stable OLs and high percentages of OL-resembling secondary structures of adjacent tRNA genes (predicted by Mfold) correlate positively with developmental stability in three lizard families (Anguidae, Amphisbaenidae, and Polychrotidae). Accounting for effects of the regular OL, Sfold-predicted OL-like propensity of the entire tRNA gene cluster (not of individual genes) correlates with increased developmental stability in Anguidae, also across the entire free-energy range of Boltzmann's distribution of secondary structures. In the fossorial Amphisbaenidae, the OL-like structure-forming propensity of tRNA genes correlates positively with developmental stability for the distribution's sub-optimally stable regions, and negatively for its optimally stable regions, suggesting the thermoregulated functioning of OL vs. flanking tRNA genes as replication origins. Results for polychrotid tRNA genes are intermediate. Anguid tRNA genes possibly function in addition to the regular OL. Mitochondrial tRNA genes may thus frequently acquire and lose the alternative OL function, without sequence (gene) duplication and loss of their primary function.
Affiliation(s)
- Hervé Seligmann
- Department of Evolution, Systematics and Ecology, The Hebrew University of Jerusalem, 91404, Israel.
|
44
|
Galtier N, Enard D, Radondy Y, Bazin E, Belkhir K. Mutation hot spots in mammalian mitochondrial DNA. Genome Res 2005; 16:215-22. [PMID: 16354751 PMCID: PMC1361717 DOI: 10.1101/gr.4305906] [Citation(s) in RCA: 102] [Impact Index Per Article: 5.4] [Indexed: 11/24/2022]
Abstract
Animal mitochondrial DNA is characterized by a remarkably high level of within-species homoplasy, that is, phylogenetic incongruence between sites of the molecule. Several investigators have invoked recombination to explain it, challenging the dogma of maternal, clonal mitochondrial inheritance in animals. Alternatively, a high level of homoplasy could be explained by the existence of mutation hot spots. By using an exhaustive mammalian data set, we test the hot spot hypothesis by comparing patterns of site-specific polymorphism and divergence in several groups of closely related species, including hominids. We detect significant co-occurrence of synonymous polymorphisms among closely related species in various mammalian groups, and a correlation between the site-specific levels of variability within humans (on one hand) and between Hominoidea species (on the other hand), indicating that mutation hot spots actually exist in mammalian mitochondrial coding regions. The whole data, however, cannot be explained by a simple mutation hot spots model. Rather, we show that the site-specific mutation rate quickly varies in time, so that the same sites are not hypermutable in distinct lineages. This study provides a plausible mutation model that potentially accounts for the peculiar distribution of mitochondrial sequence variation in mammals without the need for invoking recombination. It also gives hints about the proximal causes of mitochondrial site-specific hypermutability in humans.
Affiliation(s)
- Nicolas Galtier
- Centre National de la Recherche Scientifique, Unité Mixte de Recherche 5171-Génome, Populations, Interactions, Adaptation, Université Montpellier 2, 34095 Montpellier, France.
|