1
Barbhuiya PA, Uddin A, Chakraborty S. Understanding the codon usage patterns of mitochondrial CO genes among Amphibians. Gene 2021; 777:145462. [PMID: 33515725 DOI: 10.1016/j.gene.2021.145462] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/15/2020] [Revised: 12/18/2020] [Accepted: 01/20/2021] [Indexed: 11/17/2022]
Abstract
A universal phenomenon of using synonymous codons unequally in coding sequences, known as codon usage bias (CUB), is observed in all forms of life. Mutation and natural selection drive CUB in many species, but the relative role of these evolutionary forces varies across species, genes and genomes. We studied CUB in mitochondrial (mt) CO genes from three orders of Amphibia using a bioinformatics approach, as no such work had been reported. We observed that the CUB of mt CO genes of amphibians was weak across the different orders. Order Caudata had the highest CUB, followed by Gymnophiona and Anura, for all genes, and CUB also varied across genes. Nucleotide composition analysis showed that CO genes were AT-rich. The AT content in Caudata was higher than that in Gymnophiona, while Anura showed the lowest content. Multiple analyses, namely nucleotide composition, correspondence analysis and parity plot analysis, showed that the interplay of mutation pressure and natural selection caused CUB in these genes. The neutrality plot suggested that natural selection was more involved than mutation pressure. The contribution of natural selection was higher in Anura than in Gymnophiona and lowest in Caudata. The codons CGA, TGA and AAA were found to be highly favoured by nature across all genes and orders.
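As a rough illustration of the neutrality plot mentioned in this abstract, the sketch below regresses GC12 (mean GC fraction at the first and second codon positions) on GC3 (GC fraction at the third position) across a set of coding sequences. A slope near 1 is conventionally read as mutation pressure dominating, and a slope near 0 as selection dominating. The function names and the plain least-squares fit are assumptions made for illustration; the abstract does not describe the authors' actual pipeline.

```python
def gc_at_positions(cds):
    """Return (GC12, GC3) for one coding sequence.

    Assumes the sequence is in frame; trailing bases that do not
    complete a codon are ignored.
    """
    cds = cds.upper()
    n_codons = len(cds) // 3
    counts = [0, 0, 0]  # GC counts at codon positions 1, 2, 3
    for i in range(n_codons * 3):
        if cds[i] in "GC":
            counts[i % 3] += 1
    gc1, gc2, gc3 = (c / n_codons for c in counts)
    return (gc1 + gc2) / 2, gc3


def regression_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx  # undefined if every GC3 value is identical


def neutrality_slope(sequences):
    """Slope of GC12 on GC3 across a list of coding sequences."""
    points = [gc_at_positions(s) for s in sequences]
    gc12 = [p[0] for p in points]
    gc3 = [p[1] for p in points]
    return regression_slope(gc3, gc12)
```

Calling `neutrality_slope` on the CO gene sequences of one order at a time would give one slope per order, which is the kind of comparison the abstract reports between Anura, Gymnophiona and Caudata.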
Affiliation(s)
- Parvin A Barbhuiya
- Department of Biotechnology, Assam University, Silchar 788150, Assam, India
- Arif Uddin
- Department of Zoology, Moinul Hoque Choudhury Memorial Science College, Algapur, Hailakandi 788150, Assam, India
- Supriyo Chakraborty
- Department of Biotechnology, Assam University, Silchar 788150, Assam, India.
2
Dainat J, Pontarotti P. Methods to Identify and Study the Evolution of Pseudogenes Using a Phylogenetic Approach. Methods Mol Biol 2021; 2324:21-34. [PMID: 34165706 DOI: 10.1007/978-1-0716-1503-4_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
The discovery that pseudogenes are involved in important biological processes has generated enthusiasm and increased research interest in them. Accurate detection and analysis of pseudogenes can be achieved using comparative methods, but only phylogenetic tools can provide accurate information about their birth, their evolution and their death, and hence about the impact that they have on genes and genomes. Here, phylogenetic methods that allow for studying pseudogene history are described.
Affiliation(s)
- Jacques Dainat
- Department of Medical Biochemistry Microbiology and Genomics, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Uppsala University, Uppsala, Sweden.
- Pierre Pontarotti
- Aix Marseille Université, Institut de Recherche pour le Développement (IRD), Assistance Publique - Hôpitaux de Marseille (AP-HM), Microbes Evolution Phylogeny and Infections (MEPHI), IHU Méditerranée Infection, Marseille, France
- SNC5039 CNRS, Marseille, France
3
Mycobacterium lepromatosis genome exhibits unusually high CpG dinucleotide content and selection is key force in shaping codon usage. Infection, Genetics and Evolution 2020; 84:104399. [PMID: 32512206 DOI: 10.1016/j.meegid.2020.104399] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 09/12/2019] [Revised: 05/30/2020] [Accepted: 06/03/2020] [Indexed: 01/06/2023]
Abstract
Mycobacterium lepromatosis was identified as a causative agent of leprosy in 2008 in the United States, and more cases were later identified in Canada, Singapore, Brazil, and Myanmar. It is known to cause diffuse lepromatous leprosy in humans. Since it is invasive, the mortality rates are higher than for M. leprae. At the genomic level, there is 90.9% similarity between M. lepromatosis and M. leprae. Codon usage analysis of 228 coding sequences (CDSs) of M. lepromatosis revealed that the genome is GC rich. Among the 16 dinucleotides, CpG has the highest frequency in M. lepromatosis, which is a strikingly unexpected observation, since higher CpG content is associated with higher proinflammatory cytokine production and NF-κB activation, which eventually leads to high pathogenicity. To evade the immune response, CpG content is generally lower in pathogens. The unusually high CpG content can be explained by the fact that the nucleotide composition of M. lepromatosis is GC rich. Various forces interplay to shape the codon usage pattern of any organism, including selection, mutation, nucleotide composition and GC-biased gene conversion. To understand the interplay between these forces, neutrality, parity, Nc-GC3 (effective number of codons versus GC content at the third codon position), aromaticity (AROMO) and general average hydropathicity (GRAVY) analyses were carried out. The analyses revealed that selection is the major contributing force. Along with selection, mutation, nucleotide composition and GC-biased gene conversion also play a role in shaping codon usage bias in M. lepromatosis. This is the first report on codon usage in M. lepromatosis.
4
Zhou C, Sun Y, Yan R, Liu Y, Zuo E, Gu C, Han L, Wei Y, Hu X, Zeng R, Li Y, Zhou H, Guo F, Yang H. Off-target RNA mutation induced by DNA base editing and its elimination by mutagenesis. Nature 2019; 571:275-278. [PMID: 31181567 DOI: 10.1038/s41586-019-1314-0] [Citation(s) in RCA: 291] [Impact Index Per Article: 58.2] [Received: 04/05/2019] [Accepted: 05/30/2019] [Indexed: 12/21/2022]
Abstract
Recently developed DNA base editing methods enable the direct generation of desired point mutations in genomic DNA without generating any double-strand breaks [1-3], but the issue of off-target edits has limited the application of these methods. Although several previous studies have evaluated off-target mutations in genomic DNA [4-8], it is now clear that the deaminases that are integral to commonly used DNA base editors often bind to RNA [9-13]. For example, the cytosine deaminase APOBEC1, which is used in cytosine base editors (CBEs), targets both DNA and RNA [12], and the adenine deaminase TadA, which is used in adenine base editors (ABEs), induces site-specific inosine formation on RNA [9,11]. However, any potential RNA mutations caused by DNA base editors have not been evaluated. Adeno-associated viruses are the most common delivery system for gene therapies that involve DNA editing; these viruses can sustain long-term gene expression in vivo, so the extent of potential RNA mutations induced by DNA base editors is of great concern [14-16]. Here we quantitatively evaluated RNA single nucleotide variations (SNVs) that were induced by CBEs or ABEs. Both the cytosine base editor BE3 and the adenine base editor ABE7.10 generated tens of thousands of off-target RNA SNVs. Subsequently, by engineering deaminases, we found that three CBE variants and one ABE variant showed a reduction in off-target RNA SNVs to the baseline while maintaining efficient DNA on-target activity. This study reveals a previously overlooked aspect of off-target effects in DNA editing and also demonstrates that such effects can be eliminated by engineering deaminases.
Affiliation(s)
- Changyang Zhou
- Institute of Neuroscience, State Key Laboratory of Neuroscience, Key Laboratory of Primate Neurobiology, CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai Research Center for Brain Science and Brain-Inspired Intelligence, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Yidi Sun
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China; CAS Key Laboratory of Systems Biology, CAS Center for Excellence in Molecular Cell Science, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China; Bio-Med Big Data Center, Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Rui Yan
- Center for Translational Medicine, Ministry of Education Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Obstetrics and Gynecology, West China Second University Hospital, College of Life Sciences, Sichuan University, Chengdu, China
- Yajing Liu
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China; School of Life Science and Technology, Shanghai Tech University, Shanghai, China
- Erwei Zuo
- Institute of Neuroscience, State Key Laboratory of Neuroscience, Key Laboratory of Primate Neurobiology, CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai Research Center for Brain Science and Brain-Inspired Intelligence, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China; Center for Animal Genomics, Agricultural Genome Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Chan Gu
- Center for Translational Medicine, Ministry of Education Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Obstetrics and Gynecology, West China Second University Hospital, College of Life Sciences, Sichuan University, Chengdu, China
- Linxiao Han
- Institute of Neuroscience, State Key Laboratory of Neuroscience, Key Laboratory of Primate Neurobiology, CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai Research Center for Brain Science and Brain-Inspired Intelligence, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
- Yu Wei
- Institute of Neuroscience, State Key Laboratory of Neuroscience, Key Laboratory of Primate Neurobiology, CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai Research Center for Brain Science and Brain-Inspired Intelligence, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
- Xinde Hu
- Institute of Neuroscience, State Key Laboratory of Neuroscience, Key Laboratory of Primate Neurobiology, CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai Research Center for Brain Science and Brain-Inspired Intelligence, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Rong Zeng
- CAS Key Laboratory of Systems Biology, CAS Center for Excellence in Molecular Cell Science, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China; School of Life Science and Technology, Shanghai Tech University, Shanghai, China
- Yixue Li
- Center for Translational Medicine, Ministry of Education Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Obstetrics and Gynecology, West China Second University Hospital, College of Life Sciences, Sichuan University, Chengdu, China; School of Life Science and Technology, Shanghai Tech University, Shanghai, China; Shanghai Jiao Tong University, Fudan University, Shanghai Academy of Science & Technology, Shanghai, China.
- Haibo Zhou
- Institute of Neuroscience, State Key Laboratory of Neuroscience, Key Laboratory of Primate Neurobiology, CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai Research Center for Brain Science and Brain-Inspired Intelligence, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China.
- Fan Guo
- Center for Translational Medicine, Ministry of Education Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Obstetrics and Gynecology, West China Second University Hospital, College of Life Sciences, Sichuan University, Chengdu, China.
- Hui Yang
- Institute of Neuroscience, State Key Laboratory of Neuroscience, Key Laboratory of Primate Neurobiology, CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai Research Center for Brain Science and Brain-Inspired Intelligence, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China.
5
Almpanis A, Swain M, Gatherer D, McEwan N. Correlation between bacterial G+C content, genome size and the G+C content of associated plasmids and bacteriophages. Microb Genom 2018; 4:e000168. [PMID: 29633935 PMCID: PMC5989581 DOI: 10.1099/mgen.0.000168] [Citation(s) in RCA: 63] [Impact Index Per Article: 10.5] [Received: 11/27/2017] [Accepted: 03/06/2018] [Indexed: 02/06/2023] Open
Abstract
Based on complete bacterial genome sequence data, we demonstrate a correlation between bacterial chromosome length and the G+C content of the genome, with longer genomes having higher G+C contents. The correlation value decreases at shorter genome sizes, where there is a wider spread of G+C values. However, although significant (P<0.001), the correlation value (Pearson R=0.58) suggests that other factors also have a significant influence. A similar pattern was seen for plasmids; longer plasmids had higher G+C values, although the large number of shorter plasmids had a wide spread of G+C values. There was also a significant (P<0.0001) correlation between the G+C content of plasmids and the G+C content of their bacterial host. Conversely, the G+C content of bacteriophages tended to reduce with larger genome sizes, and although there was a correlation between host genome G+C content and that of the bacteriophage, it was not as strong as that seen between plasmids and their hosts.
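The kind of relationship described in this abstract can be sketched in a few lines: compute the G+C fraction of each genome, then the Pearson correlation between genome length and G+C fraction. The helper names below are hypothetical, and the handling of ambiguous bases (ignored in the denominator) is an assumption, since the abstract does not say how the authors treated them.

```python
import math


def gc_content(seq):
    """G+C fraction of a sequence, counting only unambiguous A/C/G/T bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def length_vs_gc(genomes):
    """Correlate sequence length with G+C fraction across genomes."""
    lengths = [len(g) for g in genomes]
    gcs = [gc_content(g) for g in genomes]
    return pearson_r(lengths, gcs)
```

The same two functions would apply unchanged to plasmid or bacteriophage sequences, or to paired host and plasmid G+C values passed directly to `pearson_r`.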
Affiliation(s)
- Apostolos Almpanis
- Aberystwyth University, Aberystwyth, UK
- Newcastle University, Newcastle-upon-Tyne, UK
- Neil McEwan
- Aberystwyth University, Aberystwyth, UK
- School of Pharmacy and Life Sciences, Robert Gordon University, Aberdeen, UK
6
Maharjan R, Ferenci T. Mutational signatures indicative of environmental stress in bacteria. Mol Biol Evol 2014; 32:380-91. [PMID: 25389207 DOI: 10.1093/molbev/msu306] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Indexed: 12/16/2022] Open
Abstract
Evolutionary innovations are dependent on mutations. Mutation rates are increased by adverse conditions in the laboratory, but there is no evidence that stressful environments that do not directly impact on DNA leave a mutational imprint on extant genomes. Mutational spectra in the laboratory are normally determined with unstressed cells but are unavailable for stressed bacteria. To bypass problems with viability, selection effects, and growth rate differences due to stressful environments, in this study we used a set of genetically engineered strains to identify the mutational spectrum associated with nutritional stress. The strain set members each had a fixed level of the master regulator protein, RpoS, which controls the general stress response of Escherichia coli. By assessing mutations in the cycA gene from 485 cycloserine-resistant mutants collected from as many independent cultures with three distinct perceived stress (RpoS) levels, we were able to establish a dose-dependent relationship between stress and mutational spectra. The altered mutational patterns included base pair substitutions, single base pair indels, longer indels, and transpositions of different insertion sequences. The mutational spectrum of low-RpoS cells closely matches the genome-wide spectrum previously generated in laboratory environments, while the spectra of high-RpoS, high perceived stress cells more closely match spectra found in comparisons of extant genomes. Our results offer an explanation of the uneven mutational profiles, such as the transition-transversion biases observed in extant genomes, and provide a framework for assessing the contribution of stress-induced mutagenesis to evolutionary transitions and the mutational emergence of antibiotic resistance and disease states.
Affiliation(s)
- Ram Maharjan
- School of Molecular Bioscience, University of Sydney, Sydney, NSW, Australia
- Thomas Ferenci
- School of Molecular Bioscience, University of Sydney, Sydney, NSW, Australia
7
Dainat J, Pontarotti P. Methods to study the occurrence and the evolution of pseudogenes through a phylogenetic approach. Methods Mol Biol 2014; 1167:87-99. [PMID: 24823773 DOI: 10.1007/978-1-4939-0835-6_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2023]
Abstract
During the last few years, the study of pseudogenes has excited enthusiasm, because it has been proven that at least some of them are involved in important biological processes. An accurate detection and analysis of pseudogenes can be achieved using comparative methods, but only the use of phylogenetic tools can provide accurate information about their birth, their evolution and their death, hence about the impact that they have on genes and genomes. Here, phylogenetic methods that allow studying pseudogene history are described.
Affiliation(s)
- Jacques Dainat
- Evolutionary Biology and Modeling Group, Aix-Marseille Université, LATP - UMR 7353, 3 Place Victor Hugo - Case 19, 13331, Marseille Cedex 3, France
8
Li Y, Jiao L, Yao YJ. Non-concerted ITS evolution in fungi, as revealed from the important medicinal fungus Ophiocordyceps sinensis. Mol Phylogenet Evol 2013; 68:373-9. [PMID: 23618625 DOI: 10.1016/j.ympev.2013.04.010] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.2] [Received: 09/10/2012] [Revised: 04/04/2013] [Accepted: 04/12/2013] [Indexed: 11/16/2022]
Abstract
The internal transcribed spacer (ITS) of nuclear ribosomal DNA (nrDNA) has been widely used as a molecular marker in phylogenetic studies and has been selected as a DNA barcode for fungi. It is generally believed that nrDNA conforms to concerted evolution in most eukaryotes; however, intraindividual-intraspecific polymorphisms of this region were reported in various organisms, suggesting a non-concerted evolutionary process. In Ophiocordyceps sinensis, one of the most valuable medicinal fungi, a remarkable variation of the ITS region has been revealed. Some highly divergent sequences were thought to represent cryptic species, different species or genotypes in previous studies. To clarify the unusual ITS polymorphisms observed in O. sinensis, specific primers were designed to amplify ITS paralogs from pure cultures of both single-ascospore and tissue isolates in this study. All of the available ITS sequences, including those generated by this group and those in GenBank, were analyzed. Several AT-biased ITS paralogs were classified as pseudogenes based on their nucleotide compositions, secondary structures and minimum free energies of their 5.8S rRNAs, substitution rates, phylogenetic positions and gene expression analyses. Furthermore, ITS pseudogenes were amplified with specific primers from 10 of the 28 strains tested, including eight single-ascospore and two tissue isolates. Divergent ITS paralogs were proved to coexist in individual genomes, suggesting a non-concerted mechanism of evolution in the ITS region of O. sinensis. The hypotheses that divergent ITS paralogs represent cryptic or other species or different genotypes were thus rejected.
Affiliation(s)
- Yi Li
- State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
9
Dainat J, Paganini J, Pontarotti P, Gouret P. GLADX: an automated approach to analyze the lineage-specific loss and pseudogenization of genes. PLoS One 2012; 7:e38792. [PMID: 22723889 PMCID: PMC3377690 DOI: 10.1371/journal.pone.0038792] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 03/26/2012] [Accepted: 05/10/2012] [Indexed: 11/23/2022] Open
Abstract
A well-established ancestral gene can usually be found, in one or multiple copies, in different descendant species. Sometimes during the course of evolution, all the representatives of a well-established ancestral gene disappear in specific lineages; such gene losses may occur in the genome by deletion of a DNA fragment or by pseudogenization. The loss of an entire gene family in a given lineage may reflect an important phenomenon, and could be due either to adaptation, or to a relaxation of selection that leads to neutral evolution. Therefore, the lineage-specific gene loss analyses are important to improve the understanding of the evolutionary history of genes and genomes. In order to perform this kind of study from the increasing number of complete genome sequences available, we developed a unique new software module called GLADX in the DAGOBAH framework, based on a comparative genomic approach. The software is able to automatically detect, for all the species of a phylum, the presence/absence of a representative of a well-established ancestral gene, and by systematic steps of re-annotation, confirm losses, detect and analyze pseudogenes and find novel genes. The approach is based on the use of highly reliable gene phylogenies, of protein predictions and on the analysis of genomic mutations. All the evidence associated to evolutionary approach provides accurate information for building an overall view of the evolution of a given gene in a selected phylum. The reliability of GLADX has been successfully tested on a benchmark analysis of 14 reported cases. It is the first tool that is able to fully automatically study the lineage-specific losses and pseudogenizations. GLADX is available at http://ioda.univ-provence.fr/IodaSite/gladx/.
Affiliation(s)
- Jacques Dainat
- Aix-Marseille Université, Laboratoire d'Analyse, Topologie, Probabilités (LATP), UMR-CNRS 7353, équipe Evolution Biologique & Modélisation, Marseille, France.
10
Hershberg R, Petrov DA. Evidence that mutation is universally biased towards AT in bacteria. PLoS Genet 2010; 6:e1001115. [PMID: 20838599 PMCID: PMC2936535 DOI: 10.1371/journal.pgen.1001115] [Citation(s) in RCA: 304] [Impact Index Per Article: 21.7] [Received: 02/09/2010] [Accepted: 08/09/2010] [Indexed: 11/19/2022] Open
Abstract
Mutation is the engine that drives evolution and adaptation forward in that it generates the variation on which natural selection acts. Mutation is a random process that nevertheless occurs according to certain biases. Elucidating mutational biases and the way they vary across species and within genomes is crucial to understanding evolution and adaptation. Here we demonstrate that clonal pathogens that evolve under severely relaxed selection are uniquely suitable for studying mutational biases in bacteria. We estimate mutational patterns using sequence datasets from five such clonal pathogens belonging to four diverse bacterial clades that span most of the range of genomic nucleotide content. We demonstrate that across different types of sites and in all four clades mutation is consistently biased towards AT. This is true even in clades that have high genomic GC content. In all studied cases the mutational bias towards AT is primarily due to the high rate of C/G to T/A transitions. These results suggest that bacterial mutational biases are far less variable than previously thought. They further demonstrate that variation in nucleotide content cannot stem entirely from variation in mutational biases and that natural selection and/or a natural selection-like process such as biased gene conversion strongly affect nucleotide content.
Affiliation(s)
- Ruth Hershberg
- Department of Biology, Stanford University, Stanford, California, United States of America.
11
Hildebrand F, Meyer A, Eyre-Walker A. Evidence of selection upon genomic GC-content in bacteria. PLoS Genet 2010; 6:e1001107. [PMID: 20838593 PMCID: PMC2936529 DOI: 10.1371/journal.pgen.1001107] [Citation(s) in RCA: 253] [Impact Index Per Article: 18.1] [Received: 02/11/2010] [Accepted: 08/02/2010] [Indexed: 01/14/2023] Open
Abstract
The genomic GC-content of bacteria varies dramatically, from less than 20% to more than 70%. This variation is generally ascribed to differences in the pattern of mutation between bacteria. Here we test this hypothesis by examining patterns of synonymous polymorphism using datasets from 149 bacterial species. We find a large excess of synonymous GC→AT mutations over AT→GC mutations segregating in all but the most AT-rich bacteria, across a broad range of phylogenetically diverse species. We show that the excess of GC→AT mutations is inconsistent with mutation bias, since it would imply that most GC-rich bacteria are declining in GC-content; such a pattern would be unsustainable. We also show that the patterns are probably not due to translational selection or biased gene conversion, because optimal codons tend to be AT-rich, and the excess of GC→AT SNPs is observed in datasets with no evidence of recombination. We therefore conclude that there is selection to increase synonymous GC-content in many species. Since synonymous GC-content is highly correlated to genomic GC-content, we further conclude that there is selection on genomic base composition in many bacteria.
Affiliation(s)
- Falk Hildebrand
- Centre for the Study of Evolution and School of Life Sciences, University of Sussex, Brighton, United Kingdom
- Department of Biology, University of Konstanz, Konstanz, Germany
- Axel Meyer
- Department of Biology, University of Konstanz, Konstanz, Germany
- Adam Eyre-Walker
- Centre for the Study of Evolution and School of Life Sciences, University of Sussex, Brighton, United Kingdom
12
Kondrashov FA, Kondrashov AS. Measurements of spontaneous rates of mutations in the recent past and the near future. Philos Trans R Soc Lond B Biol Sci 2010; 365:1169-76. [PMID: 20308091 PMCID: PMC2871817 DOI: 10.1098/rstb.2009.0286] [Citation(s) in RCA: 72] [Impact Index Per Article: 5.1] [Indexed: 12/13/2022] Open
Abstract
The rate of spontaneous mutation in natural populations is a fundamental parameter for many evolutionary phenomena. Because the rate of mutation is generally low, most of what is currently known about mutation has been obtained through indirect, complex and imprecise methodological approaches. However, in the past few years genome-wide sequencing of closely related individuals has made it possible to estimate the rates of mutation directly at the level of the DNA, avoiding most of the problems associated with using indirect methods. Here, we review the methods used in the past with an emphasis on next generation sequencing, which may soon make the accurate measurement of spontaneous mutation rates a matter of routine.
Affiliation(s)
- Fyodor A Kondrashov
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation, C/Dr. Aiguader 88, Barcelona Biomedical Research Park Building 08003, Barcelona, Spain.
13
Comparative genomic and phylogeographic analysis of Mycobacterium leprae. Nat Genet 2009; 41:1282-9. [PMID: 19881526 DOI: 10.1038/ng.477] [Citation(s) in RCA: 263] [Impact Index Per Article: 17.5] [Received: 06/19/2009] [Accepted: 09/01/2009] [Indexed: 11/08/2022]
Abstract
Reductive evolution and massive pseudogene formation have shaped the 3.31-Mb genome of Mycobacterium leprae, an unculturable obligate pathogen that causes leprosy in humans. The complete genome sequence of M. leprae strain Br4923 from Brazil was obtained by conventional methods (6x coverage), and Illumina resequencing technology was used to obtain the sequences of strains Thai53 (38x coverage) and NHDP63 (46x coverage) from Thailand and the United States, respectively. Whole-genome comparisons with the previously sequenced TN strain from India revealed that the four strains share 99.995% sequence identity and differ only in 215 polymorphic sites, mainly SNPs, and by 5 pseudogenes. Sixteen interrelated SNP subtypes were defined by genotyping both extant and extinct strains of M. leprae from around the world. The 16 SNP subtypes showed a strong geographical association that reflects the migration patterns of early humans and trade routes, with the Silk Road linking Europe to China having contributed to the spread of leprosy.
14
Morton RA, Morton BR. Separating the effects of mutation and selection in producing DNA skew in bacterial chromosomes. BMC Genomics 2007; 8:369. [PMID: 17935620 PMCID: PMC2099444 DOI: 10.1186/1471-2164-8-369] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.2] [Received: 04/11/2007] [Accepted: 10/12/2007] [Indexed: 01/01/2023] Open
Abstract
Background: Many bacterial chromosomes display nucleotide asymmetry, or skew, between the leading and lagging strands of replication. Mutational differences between these strands result in an overall pattern of skew that is centered about the origin of replication. Such a pattern could also arise from selection coupled with a bias for genes coded on the leading strand. The relative contributions of selection and mutation in producing compositional skew are largely unknown. Results: We describe a model to quantify the contribution of mutational differences between the leading and lagging strands in producing replication-induced skew. When the origin and terminus of replication are known, the model can be used to estimate the relative accumulation of G over C and of A over T on the leading strand due to replication effects in a chromosome with bidirectional replication arms. The model may also be implemented in a maximum likelihood framework to estimate the locations of origin and terminus. We find that our estimations for the origin and terminus agree very well with the location of genes that are thought to be associated with the replication origin. This indicates that our model provides an accurate, objective method of determining the replication arms and also provides support for the hypothesis that these genes represent an ancestral cluster of origin-associated genes. Conclusion: The model has several advantages over other methods of analyzing genome skew. First, it quantifies the role of mutation in generating skew so that its effect on composition, for example codon bias, can be assessed. Second, it provides an objective method for locating origin and terminus, one that is based on chromosome-wide accumulation of leading vs lagging strand nucleotide differences. Finally, the model has the potential to be utilized in a maximum likelihood framework in order to analyze the effect of chromosome rearrangements on nucleotide composition.
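The skew this abstract refers to is often visualised with a much simpler, purely descriptive technique than the model the authors propose: a cumulative GC-skew walk. The sketch below is that simpler technique, not the authors' maximum likelihood model. It computes (G - C)/(G + C) in non-overlapping windows, accumulates it along the chromosome, and takes the minimum and maximum of the cumulative curve as rough candidates for origin and terminus. The window size and function names are illustrative assumptions, and the sequence is assumed to be at least one window long.

```python
def cumulative_gc_skew(seq, window=1000):
    """Cumulative (G - C) / (G + C) over non-overlapping windows."""
    seq = seq.upper()
    running_total = 0.0
    curve = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        # Windows with no G or C contribute zero skew.
        skew = (g - c) / (g + c) if g + c else 0.0
        running_total += skew
        curve.append(running_total)
    return curve


def putative_origin_and_terminus(seq, window=1000):
    """Return (origin, terminus) as end coordinates of the windows where
    the cumulative skew is lowest and highest, respectively."""
    curve = cumulative_gc_skew(seq, window)
    origin_idx = curve.index(min(curve))
    terminus_idx = curve.index(max(curve))
    return (origin_idx + 1) * window, (terminus_idx + 1) * window
```

On a real circular chromosome the coordinates depend on where the sequence file happens to start, so the two values are best read as approximate positions to be checked against origin-associated genes, as the abstract itself does.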
Affiliation(s)
- Richard A Morton
- Department of Biology, McMaster University, 1280 Main Street West, Hamilton ON L8S 4K1, Canada.
15
Zheng T, Ichiba T, Morton BR. Assessing substitution variation across sites in grass chloroplast DNA. J Mol Evol 2007; 64:605-13. [PMID: 17541677 DOI: 10.1007/s00239-006-0076-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 03/30/2006] [Accepted: 02/28/2007] [Indexed: 11/24/2022]
Abstract
We assess the similarity of base substitution processes, described by empirically derived 4 × 4 matrices, using chi-square homogeneity tests. Such significance analyses allow us to assess variation in sequence evolution across sites and we apply them to matrices derived from noncoding sites in different contexts in grass chloroplast DNA. We show that there is statistically significant variation in rates and patterns of mutation among noncoding sites in different contexts and then demonstrate a similar and significant influence of context on substitutions at fourfold degenerate sites of coding regions from grass chloroplast DNA. These results show that context has the same general effect on substitution bias in coding and noncoding DNA: the A+T content of flanking bases is correlated with rate of substitution, transition bias, and GC→AT pressure, while the number of flanking pyrimidines on a single strand is correlated with a mutational bias, or skew, toward pyrimidines. Despite the similarity in general trends, however, when we compare coding and noncoding matrices we find that there is a statistically significant difference between them even when we control for context. Most noticeably, fourfold degenerate sites in coding sequences are undergoing substitution at a higher rate and there are also significant differences in the relationship between pyrimidine skew and the number of flanking pyrimidines. Possible reasons for the differences between coding and noncoding sites are discussed. Furthermore, our analysis illustrates a simple statistical way of comparing substitution processes across sites, allowing us to better study variation in evolutionary processes across a genome.
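A chi-square homogeneity test of the general kind named in this abstract can be sketched as follows. Two 4 × 4 substitution count matrices are flattened to their twelve off-diagonal cells and treated as two rows of a contingency table. Restricting the test to off-diagonal cells, and testing the whole matrix at once rather than row by row, are assumptions made here for illustration; the abstract does not give the authors' exact procedure.

```python
def flatten_off_diagonal(matrix):
    """Off-diagonal cells of a 4 x 4 count matrix (rows = original base,
    columns = new base), in row-major order."""
    return [matrix[i][j] for i in range(4) for j in range(4) if i != j]


def chi_square_homogeneity(counts_a, counts_b):
    """Pearson chi-square statistic and degrees of freedom for a
    2 x k table formed by two equal-length lists of category counts."""
    total_a, total_b = sum(counts_a), sum(counts_b)
    grand_total = total_a + total_b
    statistic = 0.0
    categories_used = 0
    for a, b in zip(counts_a, counts_b):
        column_total = a + b
        if column_total == 0:
            continue  # a category empty in both samples carries no information
        categories_used += 1
        for observed, row_total in ((a, total_a), (b, total_b)):
            expected = row_total * column_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic, categories_used - 1
```

The statistic would then be compared against a chi-square distribution with the returned degrees of freedom, for example with `scipy.stats.chi2.sf(statistic, df)` if SciPy is available, to judge whether the two substitution processes differ significantly.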
Affiliation(s)
- Tian Zheng
- Department of Statistics, Columbia University, New York, NY 10027, USA