1
Ailloud F, Gottschall W, Suerbaum S. Methylome evolution suggests lineage-dependent selection in the gastric pathogen Helicobacter pylori. Commun Biol 2023; 6:839. [PMID: 37573385 PMCID: PMC10423294 DOI: 10.1038/s42003-023-05218-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/09/2023] [Accepted: 08/04/2023] [Indexed: 08/14/2023] Open
Abstract
The bacterial pathogen Helicobacter pylori, the leading cause of gastric cancer, is genetically highly diverse and harbours a large and variable portfolio of restriction-modification systems. Our understanding of the evolution and function of DNA methylation in bacteria is limited. Here, we performed a comprehensive analysis of the methylome diversity in H. pylori, using a dataset of 541 genomes that included all known phylogeographic populations. The frequency of 96 methyltransferases and the abundance of their cognate recognition sequences were strongly influenced by phylogeographic structure and were inter-correlated, positively or negatively, for 20% of type II methyltransferases. Low density motifs were more likely to be affected by natural selection, as reflected by higher genomic instability and compositional bias. Importantly, direct correlation implied that methylation patterns can be actively enriched by positive selection and suggests that specific sites have important functions in methylation-dependent phenotypes. Finally, we identified lineage-specific selective pressures modulating the contraction and expansion of the motif ACGT, revealing that the genetic load of methylation could be dependent on local ecological factors. Taken together, natural selection may shape both the abundance and distribution of methyltransferases and their specific recognition sequences, likely permitting a fine-tuning of genome-encoded functions not achievable by genetic variation alone.
Affiliation(s)
- Florent Ailloud
- Medical Microbiology and Hospital Epidemiology, Max von Pettenkofer Institute, Faculty of Medicine, LMU Munich, Munich, Germany.
- German Center for Infection Research (DZIF), Partner Site Munich, Munich, Germany.
- Wilhelm Gottschall
- Medical Microbiology and Hospital Epidemiology, Max von Pettenkofer Institute, Faculty of Medicine, LMU Munich, Munich, Germany.
- Sebastian Suerbaum
- Medical Microbiology and Hospital Epidemiology, Max von Pettenkofer Institute, Faculty of Medicine, LMU Munich, Munich, Germany.
- German Center for Infection Research (DZIF), Partner Site Munich, Munich, Germany.
2
Semashko TA, Arzamasov AA, Evsyutina DV, Garanina IA, Matyushkina DS, Ladygina VG, Pobeguts OV, Fisunov GY, Govorun VM. Role of DNA modifications in Mycoplasma gallisepticum. PLoS One 2022; 17:e0277819. [PMID: 36413541 PMCID: PMC9681074 DOI: 10.1371/journal.pone.0277819] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2022] [Accepted: 11/03/2022] [Indexed: 11/23/2022] Open
Abstract
The epigenetics of bacteria, and bacteria with a reduced genome in particular, is of great interest, but is still poorly understood. Mycoplasma gallisepticum, a representative of the class Mollicutes, is an excellent model of a minimal cell because of its reduced genome size, lack of a cell wall, and primitive cell organization. In this study we investigated DNA modifications of the model organism Mycoplasma gallisepticum and their roles. We identified DNA modifications and methylation motifs in M. gallisepticum S6 at the genome level using single molecule real time (SMRT) sequencing. Only the ANCNNNNCCT methylation motif was found in the M. gallisepticum S6 genome. The studied bacteria have one functional system for DNA modifications, the Type I restriction-modification (RM) system, MgaS6I. We characterized its activity, affinity, protection and epigenetic functions. We demonstrated the protective effects of this RM system. The m6A modification we found is a common epigenetic signal in bacteria, which can cause changes in DNA-protein interactions and affect the cell phenotype. Native methylation sites are underrepresented in promoter regions and located only near the -35 box of the promoter, which does not have a significant effect on gene expression in mycoplasmas. To study the epigenetic effect of m6A in genome-reduced bacteria, we constructed a series of M. gallisepticum strains expressing EGFP under promoters with the methylation motifs in their different elements. We demonstrated that m6A modifications of the promoter located only in the -10 box affected gene expression and downregulated the expression of the corresponding gene.
Affiliation(s)
- Tatiana A. Semashko
- Federal Research and Clinical Center of Physical-Chemical Medicine, Moscow, Russian Federation
- Research Institute for Systems Biology and Medicine, Moscow, Russian Federation
- * E-mail:
- Alexander A. Arzamasov
- Federal Research and Clinical Center of Physical-Chemical Medicine, Moscow, Russian Federation
- Daria V. Evsyutina
- Federal Research and Clinical Center of Physical-Chemical Medicine, Moscow, Russian Federation
- Research Institute for Systems Biology and Medicine, Moscow, Russian Federation
- Irina A. Garanina
- Federal Research and Clinical Center of Physical-Chemical Medicine, Moscow, Russian Federation
- Daria S. Matyushkina
- Federal Research and Clinical Center of Physical-Chemical Medicine, Moscow, Russian Federation
- Research Institute for Systems Biology and Medicine, Moscow, Russian Federation
- Valentina G. Ladygina
- Federal Research and Clinical Center of Physical-Chemical Medicine, Moscow, Russian Federation
- Olga V. Pobeguts
- Federal Research and Clinical Center of Physical-Chemical Medicine, Moscow, Russian Federation
- Gleb Y. Fisunov
- Federal Research and Clinical Center of Physical-Chemical Medicine, Moscow, Russian Federation
- Research Institute for Systems Biology and Medicine, Moscow, Russian Federation
- Vadim M. Govorun
- Research Institute for Systems Biology and Medicine, Moscow, Russian Federation
3
Madival SD, Mishra DC, Sharma A, Kumar S, Maji AK, Budhlakoti N, Sinha D, Rai A. A Deep Clustering-based Novel Approach for Binning of Metagenomics Data. Curr Genomics 2022; 23:353-368. [PMID: 36778191 PMCID: PMC9878855 DOI: 10.2174/1389202923666220928150100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2022] [Revised: 08/30/2022] [Accepted: 09/02/2022] [Indexed: 11/22/2022] Open
Abstract
Background: One major challenge in binning metagenomics data is the limited availability of reference datasets, as only 1% of the total microbial population has been cultured so far. This has motivated the use of unsupervised methods for binning in the absence of any reference datasets. Objective: To develop a deep clustering-based binning approach for metagenomics data and to evaluate results with suitable measures. Methods: In this study, a deep learning-based approach has been taken for binning the metagenomics data. The results are validated on different datasets by considering features such as tetra-nucleotide frequency (TNF), hexa-nucleotide frequency (HNF) and GC content. A convolutional autoencoder is used for feature extraction, and the K-means clustering method is used for binning. Results: In most cases, evaluation parameters such as the Silhouette index and Rand index are more than 0.5 and 0.8, respectively, which indicates that the proposed approach gives satisfactory results. The performance of the developed approach is compared with current methods and tools using benchmarked low complexity simulated and real metagenomic datasets. It is found to be better than unsupervised methods and on par with semi-supervised methods. Conclusion: An unsupervised advanced learning-based approach for binning has been proposed, and the developed method shows promising results for various datasets. This is a novel approach to the problem of missing reference data in metagenomic binning.
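The composition features named in the Methods can be illustrated with a minimal sketch. It assumes plain Python, counts overlapping 4-mers on one strand only, and skips windows that contain ambiguous bases; the names `tnf`, `gc_content` and `TETRAMER_INDEX` are hypothetical, and the autoencoder and K-means steps described in the abstract are not reproduced here.

```python
from itertools import product

# All 256 tetranucleotides in a fixed order (hypothetical helper names).
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]
TETRAMER_INDEX = {t: i for i, t in enumerate(TETRAMERS)}


def tnf(seq):
    """Return a 256-dimensional tetranucleotide frequency vector for one contig.

    Overlapping windows are counted on the given strand only; windows that
    contain anything other than A, C, G or T are skipped.
    """
    seq = seq.upper()
    counts = [0] * len(TETRAMERS)
    total = 0
    for i in range(len(seq) - 3):
        j = TETRAMER_INDEX.get(seq[i:i + 4])
        if j is not None:
            counts[j] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [c / total for c in counts]


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases of a contig."""
    seq = seq.upper()
    acgt = sum(seq.count(b) for b in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt
```

A feature row for one contig could then be `tnf(contig) + [gc_content(contig)]`; whether the authors combine the features this way is not stated in the abstract.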
Affiliation(s)
- Anu Sharma
- Division of Agriculture Bioinformatics, ICAR-IASRI, New Delhi- 110012, India
- Sanjeev Kumar
- Division of Agriculture Bioinformatics, ICAR-IASRI, New Delhi- 110012, India
- Arpan Kumar Maji
- Division of Computer Applications, ICAR-IASRI, New Delhi- 110012, India
- Neeraj Budhlakoti
- Division of Agriculture Bioinformatics, ICAR-IASRI, New Delhi- 110012, India
- Dipro Sinha
- Division of Agriculture Bioinformatics, ICAR-IASRI, New Delhi- 110012, India
- Anil Rai
- Division of Agriculture Bioinformatics, ICAR-IASRI, New Delhi- 110012, India
4
Sinha D, Sharma A, Mishra DC, Rai A, Lal SB, Kumar S, Farooqi MS, Chaturvedi KK. MetaConClust - Unsupervised Binning of Metagenomics Data using Consensus Clustering. Curr Genomics 2022; 23:137-146. [PMID: 36778980 PMCID: PMC9878838 DOI: 10.2174/1389202923666220413114659] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/22/2021] [Revised: 01/18/2022] [Accepted: 02/21/2022] [Indexed: 11/22/2022] Open
Abstract
Background: Binning of metagenomic reads is an active area of research, and many unsupervised machine learning-based techniques have been used for taxonomy-independent binning of metagenomic reads. Objective: It is important to find the optimal number of clusters as well as to develop an efficient pipeline for deciphering the complexity of the microbial genome. Methods: Applying unsupervised clustering techniques for binning requires finding the optimal number of clusters beforehand, which is observed to be a difficult task. This paper describes a novel method, MetaConClust, which uses coverage information for grouping contigs and automatically finds the optimal number of clusters for binning of metagenomics data using a consensus-based clustering approach. The coverage of contigs in a metagenomics sample has been observed to be directly proportional to the abundance of species in the sample and is used for grouping of data in the first phase by MetaConClust. The Partitioning Around Medoids (PAM) method is used for clustering in the second phase for generating bins, with the initial number of clusters determined automatically through a consensus-based method. Results: Finally, the quality of the obtained bins is tested using silhouette index, Rand index, recall, precision, and accuracy. Performance of MetaConClust is compared with recent methods and tools using benchmarked low complexity simulated and real metagenomic datasets and is found to be better than unsupervised methods and comparable to hybrid methods. Conclusion: This suggests that the consensus-based clustering approach is a promising method for automatically finding the number of bins for metagenomics data.
Affiliation(s)
- Dipro Sinha
- These authors contributed equally to this work
- Anu Sharma
- Address correspondence to this author at the Division of Agriculture Bioinformatics, ICAR-IASRI, New Delhi- 110012, India; E-mail:
5
A highly specific aptamer probe targeting PD-L1 in tumor tissue sections: Mutation favors specificity. Anal Chim Acta 2021; 1185:339066. [PMID: 34711320 DOI: 10.1016/j.aca.2021.339066] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/06/2021] [Revised: 09/11/2021] [Accepted: 09/13/2021] [Indexed: 02/07/2023]
Abstract
Although DNA aptamers can show comparable affinity to antibodies and have the advantage of high batch-to-batch consistency, they often suffer from unsatisfactory specificity for complex samples. The limited library size used for aptamer in vitro isolation (SELEX) has been recognized as one of the major reasons. Programmed cell death-ligand 1 (PD-L1) is a key protein in both cancer diagnostics and immunotherapy. We report here a DNA aptamer that highly specifically binds PD-L1 expressed on the surface of various cancer cells and multiple types of tissue sections. The aptamers were selected from a DNA library containing a type II restriction endonuclease Alu I recognition site in the middle of the 40-nt random sequences, against recombinant PD-L1 rather than the whole cell or tissue section. The library enrichment was achieved by Alu I-mediated SELEX, named REase-SELEX, in which Alu I cut off the non-binders at the recognition site and, more importantly, induced library mutations to substantially increase the library diversity. 8-60, a representative aptamer with high affinity (KD = 1.4 nM determined by SPR), successfully detected four types of cancer cells with PD-L1 expression levels from low to high by flow cytometry, and normal human tonsil (gold standard for PD-L1 antibody evaluation), clinical non-small cell lung cancer (high PD-L1 expression level), and malignant melanoma (low PD-L1 expression level) tissue sections by fluorescence microscopy imaging, showing unprecedentedly high specificity. The results demonstrate that 8-60 is an advanced probe for PD-L1 cancer diagnostics and that mutations in SELEX greatly favor aptamer specificity.
6
Genomic and phenotypic comparison of two Salmonella Typhimurium strains responsible for consecutive salmonellosis outbreaks in New Zealand. Int J Med Microbiol 2021; 311:151534. [PMID: 34564018 DOI: 10.1016/j.ijmm.2021.151534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2018] [Revised: 03/20/2021] [Accepted: 08/16/2021] [Indexed: 11/20/2022] Open
Abstract
Salmonella enterica serovar Typhimurium DT160 was the predominant cause of notified human salmonellosis cases in New Zealand from 2000 to 2010, before it was superseded by another S. Typhimurium strain, DT56 variant (DT56v). Whole genome sequencing and phenotypic testing were used to compare 109 DT160 isolates with eight DT56v isolates from New Zealand animal and human sources. Phylogenetic analysis provided evidence that DT160 and DT56v strains were distantly related with an estimated date of common ancestor between 1769 and 1821. The strains replicated at different rates but had similar antimicrobial susceptibility profiles. Both strains were resistant to the phage expressed from the chromosome of the other strain, which may have contributed to the emergence of DT56v. DT160 contained the pSLT virulence plasmid, and the sseJ and sseK2 genes that may have contributed to the higher reported prevalence compared to DT56v. A linear pBSSB1-family plasmid was also found in one of the DT56v isolates, but there was no evidence that this plasmid affected bacterial replication or antimicrobial susceptibility. One of the DT56v isolates was also sequenced using long-read technology and found to contain an uncommon chromosome arrangement for a Typhimurium isolate. This study demonstrates how comparative genomics and phenotypic testing can help identify strain-specific elements and factors that may have influenced the emergence and supersession of bacterial strains of public health importance.
7
Mier P, Andrade-Navarro MA. Avoided motifs: short amino acid strings missing from protein datasets. Biol Chem 2021; 402:945-951. [PMID: 33660494 DOI: 10.1515/hsz-2020-0383] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/09/2020] [Accepted: 02/19/2021] [Indexed: 11/15/2022]
Abstract
According to the amino acid composition of natural proteins, it could be expected that all possible sequences of three or four amino acids will occur at least once in large protein datasets purely by chance. However, in some species or cellular context, specific short amino acid motifs are missing due to unknown reasons. We describe these as Avoided Motifs, short amino acid combinations missing from biological sequences. Here we identify 209 human and 154 bacterial Avoided Motifs of length four amino acids, and discuss their possible functionality according to their presence in other species. Furthermore, we determine two Avoided Motifs of length three amino acids in human proteins specifically located in the cytoplasm, and two more in secreted proteins. Our results support the hypothesis that the characterization of Avoided Motifs in particular contexts can provide us with information about functional motifs, pointing to a new approach in the use of molecular sequences for the discovery of protein function.
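The core operation behind the abstract above, listing short strings that never occur in a dataset, can be sketched as follows. This is only an illustration under stated assumptions: the function name `absent_kmers` is hypothetical, the 20-letter amino acid alphabet is assumed, and no statistical test of whether an absence is unexpected is included, so the output is "missing" k-mers rather than Avoided Motifs in the authors' sense.

```python
from itertools import product

# Standard 20 amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def absent_kmers(sequences, k, alphabet=AMINO_ACIDS):
    """Return the sorted list of length-k strings over `alphabet`
    that do not occur in any of the given sequences."""
    observed = set()
    for seq in sequences:
        # Collect every overlapping window of length k.
        for i in range(len(seq) - k + 1):
            observed.add(seq[i:i + k])
    all_kmers = ("".join(p) for p in product(alphabet, repeat=k))
    return sorted(m for m in all_kmers if m not in observed)
```

For k = 4 there are 20^4 = 160,000 candidate strings, so full enumeration is cheap at the motif lengths discussed in the abstract.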
Affiliation(s)
- Pablo Mier
- Faculty of Biology, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University Mainz, Hanns-Dieter-Hüsch-Weg 15, D-55128 Mainz, Germany
- Miguel A Andrade-Navarro
- Faculty of Biology, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University Mainz, Hanns-Dieter-Hüsch-Weg 15, D-55128 Mainz, Germany
8
Nutrient Loading and Viral Memory Drive Accumulation of Restriction Modification Systems in Bloom-Forming Cyanobacteria. mBio 2021; 12:e0087321. [PMID: 34060332 PMCID: PMC8262939 DOI: 10.1128/mbio.00873-21] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/13/2023] Open
Abstract
The mechanisms driving cyanobacterial harmful algal blooms (HABs) like those caused by Microcystis aeruginosa remain elusive, but improved defense against viral predation has been implicated for success in eutrophic environments. Our genus-level analyses of 139,023 genomes revealed that HAB-forming cyanobacteria carry vastly more restriction modification systems per genome (RMPG) than nearly all other prokaryotic genera, suggesting that viral defense is a cornerstone of their ecological success. In contrast, picocyanobacteria that numerically dominate nutrient-poor systems have the fewest RMPG within the phylum Cyanobacteria. We used classic resource competition models to explore the hypothesis that nutrient enrichments drive ecological selection for high RMPG due to increased host-phage contact rate. These classic models, agnostic to the mechanism of defense, explain how nutrient loading can select for increased RMPG but, importantly, fail to explain the extreme accumulation of these defense systems. However, extreme accumulation of RMPG can be achieved in a novel “memory” model that accounts for a unique activity of restriction modification systems: the accidental methylation of viral DNA by the methyltransferase. The methylated virus “remembers” the RM defenses of its former host and can evade these defenses if they are present in the next host. This viral memory leads to continual RM system devaluation; RMs accumulate extensively because the benefit of each addition is diminished. Our modeling leads to the hypothesis that nutrient loading and virion methylation drive the extreme accumulation of RMPG in HAB-forming cyanobacteria. Finally, our models suggest that hosts with different RMPG values can coexist when hosts have unique sets of RM systems.
9
Callens M, Pradier L, Finnegan M, Rose C, Bedhomme S. Read between the lines: Diversity of non-translational selection pressures on local codon usage. Genome Biol Evol 2021; 13:6263832. [PMID: 33944930 PMCID: PMC8410138 DOI: 10.1093/gbe/evab097] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Accepted: 04/28/2021] [Indexed: 12/14/2022] Open
Abstract
Protein coding genes can contain specific motifs within their nucleotide sequence that function as a signal for various biological pathways. The presence of such sequence motifs within a gene can have beneficial or detrimental effects on the phenotype and fitness of an organism, and this can lead to the enrichment or avoidance of this sequence motif. The degeneracy of the genetic code allows for the existence of alternative synonymous sequences that exclude or include these motifs, while keeping the encoded amino acid sequence intact. This implies that locally, there can be a selective pressure for preferentially using a codon over its synonymous alternative in order to avoid or enrich a specific sequence motif. This selective pressure could, in addition to mutation, drift and selection for translation efficiency and accuracy, contribute to shaping codon usage bias. In this review, we discuss patterns of avoidance of (or enrichment for) the various biological signals contained in specific nucleotide sequence motifs: transcription and translation initiation and termination signals, mRNA maturation signals, and antiviral immune system targets. Experimental data on the phenotypic or fitness effects of synonymous mutations in these sequence motifs confirm that they can be targets of local selection pressures on codon usage. We also formulate the hypothesis that transposable elements could have a similar impact on codon usage through their preferred integration sequences. Overall, selection on codon usage appears to be a combination of a global selection pressure imposed by the translation machinery, and a patchwork of local selection pressures related to biological signals contained in specific sequence motifs.
Affiliation(s)
- Martijn Callens
- Centre d'Ecologie Fonctionnelle et Evolutive, CNRS, Université de Montpellier, Université Paul Valéry Montpellier 3, Ecole Pratique des Hautes Etudes, Institut de Recherche pour le Développement, 34000 Montpellier, France
- Léa Pradier
- Centre d'Ecologie Fonctionnelle et Evolutive, CNRS, Université de Montpellier, Université Paul Valéry Montpellier 3, Ecole Pratique des Hautes Etudes, Institut de Recherche pour le Développement, 34000 Montpellier, France
- Michael Finnegan
- Centre d'Ecologie Fonctionnelle et Evolutive, CNRS, Université de Montpellier, Université Paul Valéry Montpellier 3, Ecole Pratique des Hautes Etudes, Institut de Recherche pour le Développement, 34000 Montpellier, France
- Caroline Rose
- Centre d'Ecologie Fonctionnelle et Evolutive, CNRS, Université de Montpellier, Université Paul Valéry Montpellier 3, Ecole Pratique des Hautes Etudes, Institut de Recherche pour le Développement, 34000 Montpellier, France
- Stéphanie Bedhomme
- Centre d'Ecologie Fonctionnelle et Evolutive, CNRS, Université de Montpellier, Université Paul Valéry Montpellier 3, Ecole Pratique des Hautes Etudes, Institut de Recherche pour le Développement, 34000 Montpellier, France
10
Structure of the space of taboo-free sequences. J Math Biol 2020; 81:1029-1057. [PMID: 32940748 PMCID: PMC7560954 DOI: 10.1007/s00285-020-01535-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2019] [Revised: 08/19/2020] [Indexed: 11/29/2022]
Abstract
Models of sequence evolution typically assume that all sequences are possible. However, restriction enzymes that cut DNA at specific recognition sites provide an example where carrying a recognition site can be lethal. Motivated by this observation, we studied the set of strings over a finite alphabet with taboos, that is, with prohibited substrings. The taboo-set is referred to as $\mathbb{T}$ and any allowed string as a taboo-free string. We consider the so-called Hamming graph $\Gamma_n(\mathbb{T})$, whose vertices are taboo-free strings of length $n$ and whose edges connect two taboo-free strings if their Hamming distance equals one. Any (random) walk on this graph describes the evolution of a DNA sequence that avoids taboos. We describe the construction of the vertex set of $\Gamma_n(\mathbb{T})$. Then we state conditions under which $\Gamma_n(\mathbb{T})$ and its suffix subgraphs are connected. Moreover, we provide an algorithm that determines if all these graphs are connected for an arbitrary $\mathbb{T}$. As an application of the algorithm, we show that about 87% of bacteria listed in REBASE have a taboo-set that induces connected taboo-free Hamming graphs, because they have less than four type II restriction enzymes. On the other hand, four properly chosen taboos are enough to disconnect one suffix subgraph, and consequently connectivity of taboo-free Hamming graphs could change depending on the composition of restriction sites.
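For small cases, the Hamming graph defined in the abstract above can be built and checked for connectivity by brute force. The sketch below is an illustration only and is not the algorithm of the paper: it enumerates all strings of length n, which is feasible only for short strings, and the function names are hypothetical.

```python
from collections import deque
from itertools import product


def taboo_free_strings(alphabet, n, taboos):
    """All length-n strings over `alphabet` that contain no taboo as a substring."""
    candidates = ("".join(p) for p in product(alphabet, repeat=n))
    return [s for s in candidates if not any(t in s for t in taboos)]


def hamming_neighbors(s, alphabet):
    """Yield every string at Hamming distance exactly one from `s`."""
    for i, c in enumerate(s):
        for a in alphabet:
            if a != c:
                yield s[:i] + a + s[i + 1:]


def is_connected(alphabet, n, taboos):
    """Breadth-first search over the taboo-free Hamming graph.

    An empty vertex set is treated as connected by convention.
    """
    vertices = set(taboo_free_strings(alphabet, n, taboos))
    if not vertices:
        return True
    start = next(iter(vertices))
    seen = {start}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        for t in hamming_neighbors(s, alphabet):
            if t in vertices and t not in seen:
                seen.add(t)
                queue.append(t)
    return len(seen) == len(vertices)
```

As a toy case, over the binary alphabet with taboos 01 and 10 the only taboo-free strings of length 2 are 00 and 11, which differ in two positions, so `is_connected("01", 2, {"01", "10"})` is false.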
11
Zarai Y, Zafrir Z, Siridechadilok B, Suphatrakul A, Roopin M, Julander J, Tuller T. Evolutionary selection against short nucleotide sequences in viruses and their related hosts. DNA Res 2020; 27:dsaa008. [PMID: 32339222 PMCID: PMC7320823 DOI: 10.1093/dnares/dsaa008] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/02/2020] [Accepted: 04/20/2020] [Indexed: 11/13/2022] Open
Abstract
Viruses are under constant evolutionary pressure to effectively interact with the host intracellular factors, while evading its immune system. Understanding how viruses co-evolve with their hosts is a fundamental topic in molecular evolution and may also aid in developing novel viral based applications such as vaccines, oncologic therapies, and anti-bacterial treatments. Here, based on a novel statistical framework and a large-scale genomic analysis of 2,625 viruses from all classes infecting 439 host organisms from all kingdoms of life, we identify short nucleotide sequences that are under-represented in the coding regions of viruses and their hosts. These sequences cannot be explained by the coding regions' amino acid content, codon, and dinucleotide frequencies. We specifically show that short homooligonucleotide and palindromic sequences tend to be under-represented in many viruses probably due to their effect on gene expression regulation and the interaction with the host immune system. In addition, we show that more sequences tend to be under-represented in dsDNA viruses than in other viral groups. Finally, we demonstrate, based on in vitro and in vivo experiments, how under-represented sequences can be used to attenuate Zika virus strains.
Affiliation(s)
- Yoram Zarai
- Biomedical Engineering Department, Tel Aviv University, Tel Aviv 69978, Israel
- Zohar Zafrir
- Biomedical Engineering Department, Tel Aviv University, Tel Aviv 69978, Israel
- SynVaccine Ltd., Ramat Hachayal, Tel Aviv, Israel
- Amporn Suphatrakul
- National Center for Genetic Engineering and Biotechnology, Pathumthani 12120, Thailand
- Modi Roopin
- Biomedical Engineering Department, Tel Aviv University, Tel Aviv 69978, Israel
- SynVaccine Ltd., Ramat Hachayal, Tel Aviv, Israel
- Justin Julander
- Institute for Antiviral Research, Utah State University, Logan, UT, USA
- Tamir Tuller
- Biomedical Engineering Department, Tel Aviv University, Tel Aviv 69978, Israel
- SynVaccine Ltd., Ramat Hachayal, Tel Aviv, Israel
12
Ruess J, Pleška M, Guet CC, Tkačik G. Molecular noise of innate immunity shapes bacteria-phage ecologies. PLoS Comput Biol 2019; 15:e1007168. [PMID: 31265463 PMCID: PMC6629147 DOI: 10.1371/journal.pcbi.1007168] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/31/2019] [Revised: 07/15/2019] [Accepted: 06/07/2019] [Indexed: 01/21/2023] Open
Abstract
Mathematical models have been used successfully at diverse scales of biological organization, ranging from ecology and population dynamics to stochastic reaction events occurring between individual molecules in single cells. Generally, many biological processes unfold across multiple scales, with mutations being the best studied example of how stochasticity at the molecular scale can influence outcomes at the population scale. In many other contexts, however, an analogous link between micro- and macro-scale remains elusive, primarily due to the challenges involved in setting up and analyzing multi-scale models. Here, we employ such a model to investigate how stochasticity propagates from individual biochemical reaction events in the bacterial innate immune system to the ecology of bacteria and bacterial viruses. We show analytically how the dynamics of bacterial populations are shaped by the activities of immunity-conferring enzymes in single cells and how the ecological consequences imply optimal bacterial defense strategies against viruses. Our results suggest that bacterial populations in the presence of viruses can either optimize their initial growth rate or their population size, with the first strategy favoring simple immunity featuring a single restriction modification system and the second strategy favoring complex bacterial innate immunity featuring several simultaneously active restriction modification systems.
Affiliation(s)
- Jakob Ruess
- Inria Saclay - Ile-de-France, 91120 Palaiseau, France
- Institut Pasteur, USR 3756 IP CNRS, 75015 Paris, France
- Maroš Pleška
- Institute of Science and Technology Austria, A-3400 Klosterneuburg, Austria
- Rockefeller University, New York, New York, United States of America
- Cǎlin C. Guet
- Institute of Science and Technology Austria, A-3400 Klosterneuburg, Austria
- Gašper Tkačik
- Institute of Science and Technology Austria, A-3400 Klosterneuburg, Austria
13
|
Barahona CJ, Basantes LE, Tompkins KJ, Heitman DM, Chukwu BI, Sanchez J, Sanchez JL, Ghadirian N, Park CK, Horton NC. The Need for Speed: Run-On Oligomer Filament Formation Provides Maximum Speed with Maximum Sequestration of Activity. J Virol 2019; 93:e01647-18. [PMID: 30518649 PMCID: PMC6384071 DOI: 10.1128/jvi.01647-18] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 09/17/2018] [Accepted: 11/26/2018] [Indexed: 01/29/2023] Open
Abstract
Here, we investigate an unusual antiviral mechanism developed in the bacterium Streptomyces griseus. SgrAI is a type II restriction endonuclease that forms run-on oligomer filaments when activated and possesses both accelerated DNA cleavage activity and expanded DNA sequence specificity. Mutations disrupting the run-on oligomer filament eliminate the robust antiphage activity of wild-type SgrAI, and the observation that even relatively modest disruptions completely abolish this antiviral activity shows that the greater speed imparted by the run-on oligomer filament mechanism is critical to its biological function. Simulations of DNA cleavage by SgrAI uncover the origins of the kinetic advantage of this newly described mechanism of enzyme regulation over more conventional mechanisms, as well as the origin of the sequestering effect responsible for the protection of the host genome against damaging DNA cleavage activity of activated SgrAI. IMPORTANCE This work is motivated by an interest in understanding the characteristics and advantages of a relatively newly discovered enzyme mechanism involving filament formation. SgrAI is an enzyme responsible for protecting against viral infections in its host bacterium and was one of the first such enzymes shown to utilize such a mechanism. In this work, filament formation by SgrAI is disrupted, and the effects on the speed of the purified enzyme as well as its function in cells are measured. It was found that even small disruptions, which weaken but do not destroy filament formation, eliminate the ability of SgrAI to protect cells from viral infection, its normal biological function. Simulations of enzyme activity were also performed and show how filament formation can greatly speed up an enzyme's activation compared to that of other known mechanisms, and how it can better localize the enzyme's action to molecules of interest, such as invading phage DNA.
Affiliation(s)
- Claudia J Barahona
  - Department of Molecular and Cellular Biology, University of Arizona, Tucson, Arizona, USA
- L Emilia Basantes
  - Department of Molecular and Cellular Biology, University of Arizona, Tucson, Arizona, USA
- Kassidy J Tompkins
  - Department of Molecular and Cellular Biology, University of Arizona, Tucson, Arizona, USA
- Desirae M Heitman
  - Department of Molecular and Cellular Biology, University of Arizona, Tucson, Arizona, USA
- Barbara I Chukwu
  - Department of Molecular and Cellular Biology, University of Arizona, Tucson, Arizona, USA
- Juan Sanchez
  - Department of Molecular and Cellular Biology, University of Arizona, Tucson, Arizona, USA
- Jonathan L Sanchez
  - Department of Molecular and Cellular Biology, University of Arizona, Tucson, Arizona, USA
- Niloofar Ghadirian
  - Department of Molecular and Cellular Biology, University of Arizona, Tucson, Arizona, USA
- Chad K Park
  - Department of Molecular and Cellular Biology, University of Arizona, Tucson, Arizona, USA
- N C Horton
  - Department of Molecular and Cellular Biology, University of Arizona, Tucson, Arizona, USA
|
14
|
Brownell D, King J, Caliando B, Sycheva L, Koeris M. Engineering Bacteriophage-Based Biosensors. Methods Mol Biol 2019; 1898:37-50. [PMID: 30570721 DOI: 10.1007/978-1-4939-8940-9_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Bacteriophages have been used for diagnostic purposes in the past, but a lack of parallelizable engineering methods had limited their applicability to a narrow subset of diagnostic settings. More recently, however, advances in DNA sequencing and the introduction of more sensitive reporter systems have enabled novel engineering methods, which in turn have broadened the scope of modern phage diagnostics. Here we describe advanced methods to engineer the genomes of bacteriophages in a modular and rapid fashion.
Affiliation(s)
- Daniel Brownell
  - Sample 6 Technologies, 15300 Bothell Way NE Lake Forest Park, WA, 98155, Woburn, MA, USA
- John King
  - Sample 6 Technologies, 15300 Bothell Way NE Lake Forest Park, WA, 98155, Woburn, MA, USA
- Brian Caliando
  - Sample 6 Technologies, 15300 Bothell Way NE Lake Forest Park, WA, 98155, Woburn, MA, USA
- Lada Sycheva
  - Sample 6 Technologies, 15300 Bothell Way NE Lake Forest Park, WA, 98155, Woburn, MA, USA
- Michael Koeris
  - Sample 6 Technologies, 15300 Bothell Way NE Lake Forest Park, WA, 98155, Woburn, MA, USA
|
15
|
Rusinov IS, Ershova AS, Karyagina AS, Spirin SA, Alexeevski AV. Avoidance of recognition sites of restriction-modification systems is a widespread but not universal anti-restriction strategy of prokaryotic viruses. BMC Genomics 2018; 19:885. [PMID: 30526500 PMCID: PMC6286503 DOI: 10.1186/s12864-018-5324-3] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 04/19/2018] [Accepted: 11/28/2018] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Restriction-modification (R-M) systems protect bacteria and archaea from attacks by bacteriophages and archaeal viruses. An R-M system specifically recognizes short sites in foreign DNA and cleaves it, while such sites in the host DNA are protected by methylation. Prokaryotic viruses have developed a number of strategies to overcome this host defense. The simplest anti-restriction strategy is the elimination of recognition sites in the viral genome: no sites, no DNA cleavage. Even a decrease of the number of recognition sites can help a virus to overcome this type of host defense. Recognition site avoidance has been a known anti-restriction strategy of prokaryotic viruses for decades. However, recognition site avoidance has not been systematically studied with the currently available sequence data. We analyzed the complete genomes of almost 4000 prokaryotic viruses with known host species and more than 17,000 restriction endonucleases with known specificities in terms of recognition site avoidance. RESULTS We observed considerable limitations of recognition site avoidance as an anti-restriction strategy. Namely, the avoidance of recognition sites is specific for dsDNA and ssDNA prokaryotic viruses. Avoidance is much more pronounced in the genomes of non-temperate bacteriophages than in the genomes of temperate ones. Avoidance is not observed for the sites of Type I and Type IIG systems and is very rarely observed for the sites of Type III systems. The vast majority of avoidance cases concern recognition sites of orthodox Type II restriction-modification systems. Even under these constraints, complete or almost complete elimination of sites is observed for approximately one-tenth of viral genomes and a significant under-representation for approximately one-fourth of them. CONCLUSIONS Avoidance of recognition sites of restriction-modification systems is a widespread but not universal anti-restriction strategy of prokaryotic viruses.
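The idea of recognition site avoidance described in this abstract can be illustrated with a minimal sketch. The abstract does not state which statistical model the authors used, so the null model here (independent bases drawn at the genome's own composition) is an assumption made for illustration only, and the function names `count_site`, `expected_site_count`, and `observed_expected_ratio` are hypothetical. Sequences are assumed to be uppercase strings over A, C, G, T.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase ACGT sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def count_overlapping(text: str, pattern: str) -> int:
    """Count occurrences of pattern in text, allowing overlaps."""
    count, start = 0, text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def count_site(genome: str, site: str) -> int:
    """Count a recognition site on both strands of a linear genome."""
    forward = count_overlapping(genome, site)
    rc = reverse_complement(site)
    if rc == site:
        # Palindromic site: the same positions are found on both strands.
        return forward
    return forward + count_overlapping(genome, rc)


def expected_site_count(genome: str, site: str) -> float:
    """Expected count under an ASSUMED zero-order (base composition) model."""
    n = len(genome)
    freqs = Counter(genome)

    def word_probability(word: str) -> float:
        p = 1.0
        for base in word:
            p *= freqs[base] / n
        return p

    positions = n - len(site) + 1
    rc = reverse_complement(site)
    if rc == site:
        return positions * word_probability(site)
    return positions * (word_probability(site) + word_probability(rc))


def observed_expected_ratio(genome: str, site: str) -> float:
    """Ratio below 1 suggests under-representation (possible avoidance)."""
    expected = expected_site_count(genome, site)
    return count_site(genome, site) / expected if expected > 0 else float("nan")
```

A ratio well below 1 for the site of a host restriction endonuclease would be consistent with avoidance, although a real analysis would need a better background model and a significance test.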
Affiliation(s)
- I S Rusinov
  - Belozersky Institute of Physical and Chemical Biology, Lomonosov Moscow State University, 119992, Moscow, Russia
- A S Ershova
  - Belozersky Institute of Physical and Chemical Biology, Lomonosov Moscow State University, 119992, Moscow, Russia
  - Gamaleya National Research Center of Epidemiology and Microbiology of the Ministry of Health of the Russian Federation, 123098, Moscow, Russia
  - All-Russia Research Institute of Agricultural Biotechnology, 127550, Moscow, Russia
- A S Karyagina
  - Belozersky Institute of Physical and Chemical Biology, Lomonosov Moscow State University, 119992, Moscow, Russia
  - Gamaleya National Research Center of Epidemiology and Microbiology of the Ministry of Health of the Russian Federation, 123098, Moscow, Russia
  - All-Russia Research Institute of Agricultural Biotechnology, 127550, Moscow, Russia
- S A Spirin
  - Belozersky Institute of Physical and Chemical Biology, Lomonosov Moscow State University, 119992, Moscow, Russia
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 119991, Moscow, Russia
  - National Research University Higher School of Economics, 101000, Moscow, Russia
  - Institute of System Studies, 117281, Moscow, Russia
- A V Alexeevski
  - Belozersky Institute of Physical and Chemical Biology, Lomonosov Moscow State University, 119992, Moscow, Russia
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 119991, Moscow, Russia
  - Institute of System Studies, 117281, Moscow, Russia
|
16
|
An open-source k-mer based machine learning tool for fast and accurate subtyping of HIV-1 genomes. PLoS One 2018; 13:e0206409. [PMID: 30427878 PMCID: PMC6235296 DOI: 10.1371/journal.pone.0206409] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 07/24/2018] [Accepted: 10/14/2018] [Indexed: 01/11/2023] Open
Abstract
For many disease-causing virus species, global diversity is clustered into a taxonomy of subtypes with clinical significance. In particular, the classification of infections among the subtypes of human immunodeficiency virus type 1 (HIV-1) is a routine component of clinical management, and there are now many classification algorithms available for this purpose. Although several of these algorithms are similar in accuracy and speed, the majority are proprietary and require laboratories to transmit HIV-1 sequence data over the network to remote servers. This potentially exposes sensitive patient data to unauthorized access, and makes it impossible to determine how classifications are made and to maintain the data provenance of clinical bioinformatic workflows. We propose an open-source supervised and alignment-free subtyping method (Kameris) that operates on k-mer frequencies in HIV-1 sequences. We performed a detailed study of the accuracy and performance of subtype classification in comparison to four state-of-the-art programs. Based on our testing data set of manually curated real-world HIV-1 sequences (n = 2,784), Kameris obtained an overall accuracy of 97%, which matches or exceeds all other tested software, with a processing rate of over 1,500 sequences per second. Furthermore, our fully standalone general-purpose software provides key advantages in terms of data security and privacy, transparency and reproducibility. Finally, we show that our method is readily adaptable to subtype classification of other viruses including dengue, influenza A, and hepatitis B and C virus.
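To make the notion of alignment-free classification on k-mer frequencies concrete, here is a small sketch. It is not the Kameris implementation: the abstract does not name the classifier, so a nearest-centroid rule is swapped in purely as an assumed stand-in, and `kmer_frequencies`, `centroid`, and `classify` are hypothetical names. Windows containing characters other than A, C, G, T are ignored, which is also an assumption.

```python
import math
from collections import Counter
from itertools import product


def kmer_frequencies(seq: str, k: int = 3) -> list[float]:
    """Return the normalized frequency of every possible k-mer, in fixed order."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only pure-ACGT windows contribute to the total (assumption).
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def centroid(vectors: list[list[float]]) -> list[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def classify(seq: str, centroids: dict[str, list[float]], k: int = 3) -> str:
    """Assign seq to the label whose centroid is nearest (Euclidean distance)."""
    vec = kmer_frequencies(seq, k)
    return min(centroids, key=lambda label: math.dist(vec, centroids[label]))


# Toy usage with made-up training sequences and labels.
training = {
    "subtype_X": ["ACACACACACAC", "ACACACACAC"],
    "subtype_Y": ["GTTGTTGTTGTT", "GTTGTTGTT"],
}
centroids = {
    label: centroid([kmer_frequencies(s) for s in seqs])
    for label, seqs in training.items()
}
predicted = classify("ACACACAC", centroids)
```

Because the feature vector has a fixed length of 4^k regardless of sequence length, no alignment is needed before comparing sequences.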
|
17
|
Rusinov IS, Ershova AS, Karyagina AS, Spirin SA, Alexeevski AV. Comparison of Methods of Detection of Exceptional Sequences in Prokaryotic Genomes. BIOCHEMISTRY (MOSCOW) 2018; 83:129-139. [PMID: 29618299 DOI: 10.1134/s0006297918020050] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/23/2022]
Abstract
Many proteins must recognize specific DNA sequences in order to function. The number of recognition sites and their distribution along the DNA might be of biological importance. For example, the number of restriction sites is often reduced in prokaryotic and phage genomes to decrease the probability of DNA cleavage by restriction endonucleases. We call a sequence exceptional if its frequency in a genome differs significantly from the one predicted by some mathematical model. An exceptional sequence could be either under- or over-represented, depending on its frequency in comparison with the predicted one. Exceptional sequences could be considered biologically meaningful, for example, as targets of DNA-binding proteins or as parts of abundant repetitive elements. Several methods are used to predict the frequency of a short sequence in a genome from the actual frequencies of certain of its subsequences. The most popular are methods based on Markov chain models. However, no rigorous comparison of these methods has previously been performed. We compared three methods for the prediction of short sequence frequencies: the maximum-order Markov chain model-based method, the method that uses the geometric mean of extended Markovian estimates, and the method that utilizes frequencies of all subsequences including discontiguous ones. We applied them to restriction sites in complete genomes of 2500 prokaryotic species and demonstrated that the results depend greatly on the method used: lists of 5% of the most under-represented sites differed by up to 50%. The method designed by Burge and coauthors in 1992, which utilizes all subsequences of the sequence, showed a higher precision than the other two methods both on prokaryotic genomes and on randomly generated sequences after computational imitation of selective pressure. We propose this method as the first choice for detection of exceptional sequences in prokaryotic genomes.
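Of the three methods named in this abstract, the maximum-order Markov chain estimate is the simplest to write down: the expected count of a word w1...wn is N(w1...wn-1) × N(w2...wn) / N(w2...wn-1), where N is the observed count of each subword. The sketch below implements only that estimate, on a single strand of a linear sequence; those simplifications and the function names are assumptions for illustration, not the authors' code.

```python
def count_overlapping(text: str, word: str) -> int:
    """Count occurrences of word in text, allowing overlaps."""
    count, start = 0, text.find(word)
    while start != -1:
        count += 1
        start = text.find(word, start + 1)
    return count


def markov_expected_count(genome: str, word: str) -> float:
    """Maximum-order Markov estimate of the expected count of word.

    expected = N(prefix) * N(suffix) / N(core), where prefix drops the last
    letter, suffix drops the first letter, and core drops both.
    """
    if len(word) < 3:
        raise ValueError("word must have at least 3 letters")
    prefix = count_overlapping(genome, word[:-1])
    suffix = count_overlapping(genome, word[1:])
    core = count_overlapping(genome, word[1:-1])
    return prefix * suffix / core if core else 0.0


def contrast_ratio(genome: str, word: str) -> float:
    """Observed / expected; values far from 1 flag a candidate exceptional word."""
    expected = markov_expected_count(genome, word)
    observed = count_overlapping(genome, word)
    return observed / expected if expected > 0 else float("nan")
```

A ratio near 1 means the word's frequency is explained by its subwords; a ratio well below 1 marks it as a candidate under-represented sequence, subject to a proper significance test.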
Affiliation(s)
- I S Rusinov
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, 119992, Russia
|
18
|
Abstract
Evolution of bacteria and archaea involves an incessant arms race against an enormous diversity of genetic parasites. Accordingly, a substantial fraction of the genes in most bacteria and archaea are dedicated to antiparasite defense. The functions of these defense systems follow several distinct strategies, including innate immunity; adaptive immunity; and dormancy induction, or programmed cell death. Recent comparative genomic studies taking advantage of the expanding database of microbial genomes and metagenomes, combined with direct experiments, resulted in the discovery of several previously unknown defense systems, including innate immunity centered on Argonaute proteins, bacteriophage exclusion, and new types of CRISPR-Cas systems of adaptive immunity. Some general principles of function and evolution of defense systems are starting to crystallize, in particular, extensive gain and loss of defense genes during the evolution of prokaryotes; formation of genomic defense islands; evolutionary connections between mobile genetic elements and defense, whereby genes of mobile elements are repeatedly recruited for defense functions; the partially selfish and addictive behavior of the defense systems; and coupling between immunity and dormancy induction/programmed cell death.
Affiliation(s)
- Eugene V Koonin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894
- Kira S Makarova
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894
- Yuri I Wolf
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894
|
19
|
Sadovsky M, Fontaine JF, Andrade-Navarro MA, Yakubailik Y, Rudenko N. Lost Strings in Genomes: What Sense Do They Make? BIOINFORMATICS AND BIOMEDICAL ENGINEERING 2017. [DOI: 10.1007/978-3-319-56154-7_3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/14/2022]
|
20
|
Ershova AS, Rusinov IS, Spirin SA, Karyagina AS, Alexeevski AV. Role of Restriction-Modification Systems in Prokaryotic Evolution and Ecology. BIOCHEMISTRY (MOSCOW) 2016; 80:1373-86. [PMID: 26567582 DOI: 10.1134/s0006297915100193] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Indexed: 01/05/2023]
Abstract
Restriction-modification (R-M) systems are able to methylate or cleave DNA depending on the methylation status of their recognition site. This allows them to protect bacterial cells from invasion by foreign DNA. Comparative analysis of a large number of available bacterial genomes and methylomes clearly demonstrates that the role of R-M systems in bacteria is wider than defense alone. R-M systems maintain heterogeneity of a bacterial population and are involved in the adaptation of bacteria to changes in their environmental conditions. R-M systems can be essential for host colonization by pathogenic bacteria. Phase variation and intragenomic recombination are sources of the fast evolution of the specificity of R-M systems. This review focuses on the influence of R-M systems on the evolution and ecology of prokaryotes.
Affiliation(s)
- A S Ershova
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, 119991, Russia.
|
21
|
Xu T, Qin S, Hu Y, Song Z, Ying J, Li P, Dong W, Zhao F, Yang H, Bao Q. Whole genomic DNA sequencing and comparative genomic analysis of Arthrospira platensis: high genome plasticity and genetic diversity. DNA Res 2016; 23:325-38. [PMID: 27330141 PMCID: PMC4991836 DOI: 10.1093/dnares/dsw023] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 11/28/2015] [Accepted: 05/12/2016] [Indexed: 11/13/2022] Open
Abstract
Arthrospira platensis is a multicellular, filamentous, non-N2-fixing cyanobacterium that is capable of performing oxygenic photosynthesis. In this study, we determined the nearly complete genome sequence of A. platensis YZ. The A. platensis YZ genome is a single, circular chromosome of 6.62 Mb in size. Phylogenetic and comparative genomic analyses revealed that A. platensis YZ was more closely related to A. platensis NIES-39 than to Arthrospira sp. PCC 8005 and A. platensis C1. Broad gene gains were identified between A. platensis YZ and three other Arthrospira species; some of the gained genes, such as those encoding restriction-modification systems, have previously been shown to be laterally transferable among different species. Moreover, unprecedented extensive chromosomal rearrangements among different strains were observed. The chromosomal rearrangements, particularly the chromosomal inversions, were analysed and estimated to be closely related to palindromes that involved long inverted repeat sequences and the extensively distributed type IIR restriction enzyme in the Arthrospira genome. In addition, species from the genus Arthrospira unanimously contained the highest rate of repetitive sequence compared with the other species of the order Oscillatoriales, suggesting that sequence duplication significantly contributed to Arthrospira genome phylogeny. These results provide in-depth views into the genomic phylogeny and structural variation of A. platensis, as well as a valuable resource for functional genomics studies.
Affiliation(s)
- Teng Xu
  - School of Laboratory Medicine and Life Science/Institute of Biomedical Informatics, Wenzhou Medical University, Wenzhou 325035, China
- Song Qin
  - Institute of Oceanology, Chinese Academy of Sciences, Qingdao 266071, China
- Yongwu Hu
  - School of Laboratory Medicine and Life Science/Institute of Biomedical Informatics, Wenzhou Medical University, Wenzhou 325035, China
  - BGI-Shenzhen, Shenzhen 518083, China
- Zhijian Song
  - School of Laboratory Medicine and Life Science/Institute of Biomedical Informatics, Wenzhou Medical University, Wenzhou 325035, China
- Jianchao Ying
  - School of Laboratory Medicine and Life Science/Institute of Biomedical Informatics, Wenzhou Medical University, Wenzhou 325035, China
- Peizhen Li
  - School of Laboratory Medicine and Life Science/Institute of Biomedical Informatics, Wenzhou Medical University, Wenzhou 325035, China
- Wei Dong
  - BGI-Shenzhen, Shenzhen 518083, China
- Fangqing Zhao
  - Institute of Oceanology, Chinese Academy of Sciences, Qingdao 266071, China
- Huanming Yang
  - BGI-Shenzhen, Shenzhen 518083, China
  - James D. Watson Institute of Genome Sciences, Zhejiang University, Hangzhou 310058, China
- Qiyu Bao
  - School of Laboratory Medicine and Life Science/Institute of Biomedical Informatics, Wenzhou Medical University, Wenzhou 325035, China
  - BGI-Shenzhen, Shenzhen 518083, China
|
22
|
Pleška M, Qian L, Okura R, Bergmiller T, Wakamoto Y, Kussell E, Guet C. Bacterial Autoimmunity Due to a Restriction-Modification System. Curr Biol 2016; 26:404-9. [DOI: 10.1016/j.cub.2015.12.041] [Citation(s) in RCA: 65] [Impact Index Per Article: 8.1] [Received: 08/22/2015] [Revised: 11/08/2015] [Accepted: 12/10/2015] [Indexed: 01/25/2023]
|
23
|
Ershova A, Rusinov I, Vasiliev M, Spirin S, Karyagina A. Restriction-Modification systems interplay causes avoidance of GATC site in prokaryotic genomes. J Bioinform Comput Biol 2016; 14:1641003. [PMID: 26972562 DOI: 10.1142/s0219720016410031] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 12/20/2022]
Abstract
Palindromes are frequently underrepresented in prokaryotic genomes. Palindromic 5′-GATC-3′ site is a recognition site of different Restriction-Modification (R-M) systems, as well as solitary methyltransferase Dam. Classical GATC-specific R-M systems methylate GATC and cleave unmethylated GATC. On the contrary, methyl-directed Type II restriction endonucleases cleave methylated GATC. Methylation of GATC by Dam methyltransferase is involved in the regulation of different cellular processes. The diversity of functions of GATC-recognizing proteins makes GATC sequence a good model for studying the reasons of palindrome avoidance in prokaryotic genomes. In this work, the influence of R-M systems and solitary proteins on the GATC site avoidance is described by a mathematical model. GATC avoidance is strongly associated with the presence of alternate (methyl-directed or classical Type II R-M system) genes in different strains of the same species, as we have shown for Streptococcus pneumoniae, Neisseria meningitidis, Eubacterium rectale, and Moraxella catarrhalis. We hypothesize that GATC avoidance can result from a DNA exchange between strains with different methylation status of GATC site within the process of natural transformation. If this hypothesis is correct, the GATC avoidance is a sign of a DNA exchange between bacteria with different methylation status in a mixed population.
Affiliation(s)
- Anna Ershova
  - Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow 119992, Russia
  - Gamaleya Center for Epidemiology and Microbiology, the Ministry of Health of the Russian Federation, Moscow 123098, Russia
  - Institute of Agricultural Biotechnology, the Russian Academy of Sciences, Moscow 127550, Russia
- Ivan Rusinov
  - Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow 119992, Russia
  - Lomonosov Moscow State University, Faculty of Bioengineering and Bioinformatics, Moscow 119992, Russia
- Mikhail Vasiliev
  - Moscow Institute of Physics and Technology, the Ministry of Education and Science of the Russian Federation, Dolgoprudny, Moscow Region, 141700, Russia
- Sergey Spirin
  - Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow 119992, Russia
  - Lomonosov Moscow State University, Faculty of Bioengineering and Bioinformatics, Moscow 119992, Russia
  - Scientific Research Institute for System Studies, the Russian Academy of Science (NIISI RAS), Moscow 117218, Russia
- Anna Karyagina
  - Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow 119992, Russia
  - Gamaleya Center for Epidemiology and Microbiology, the Ministry of Health of the Russian Federation, Moscow 123098, Russia
  - Institute of Agricultural Biotechnology, the Russian Academy of Sciences, Moscow 127550, Russia
|
24
|
Rusinov I, Ershova A, Karyagina A, Spirin S, Alexeevski A. Lifespan of restriction-modification systems critically affects avoidance of their recognition sites in host genomes. BMC Genomics 2015; 16:1084. [PMID: 26689194 PMCID: PMC4687349 DOI: 10.1186/s12864-015-2288-4] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 10/02/2015] [Accepted: 12/11/2015] [Indexed: 01/10/2023] Open
Abstract
Background Avoidance of palindromic recognition sites of Type II restriction-modification (R-M) systems was shown for many R-M systems in dozens of prokaryotic genomes. However, the phenomenon has not been investigated systematically for all presently available genomes and annotated R-M systems. We have studied all known recognition sites in thousands of prokaryotic genomes and found factors that influence their avoidance. Results Only Type II R-M systems consisting of independently acting endonuclease and methyltransferase (called ‘orthodox’ here) cause avoidance of their sites, both palindromic and asymmetric, in corresponding prokaryotic genomes; the avoidance takes place for ~50% of 1774 studied cases. It is known that prokaryotes can acquire and lose R-M systems. Thus it is possible to talk about the lifespan of an R-M system in a genome. We have shown that the recognition site avoidance correlates with the lifespan of R-M systems. The sites of orthodox R-M systems that are encoded in host genomes for a long time are avoided more often (up to 100% in certain cohorts) than the sites of recently acquired ones. We also found cases of site avoidance in the absence of the corresponding R-M systems in the genome. An analysis of closely related bacteria shows that such avoidance can be a trace of lost R-M systems. Sites of Type I, IIC/G, IIM, III, and IV R-M systems are not avoided in the vast majority of cases. Conclusions The avoidance of orthodox Type II R-M system recognition sites in prokaryotic genomes is a widespread phenomenon. Presence of an R-M system without an underrepresentation of its site may indicate that the R-M system was acquired recently. At the same time, a significant underrepresentation of a site may be a sign of the presence of the corresponding R-M system in this organism or in its ancestors for a long time. The drastic difference between site avoidance for orthodox Type II R-M systems and R-M systems of other types can be explained by a higher rate of specificity changes or lower self-toxicity of the latter. Electronic supplementary material The online version of this article (doi:10.1186/s12864-015-2288-4) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Ivan Rusinov
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, 119992, Russia
- Anna Ershova
  - Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, 119992, Russia
  - Gamaleya Center of Epidemiology and Microbiology, Moscow, 123098, Russia
  - Institute of Agricultural Biotechnology, the Russian Academy of Sciences, Moscow, 127550, Russia
- Anna Karyagina
  - Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, 119992, Russia
  - Gamaleya Center of Epidemiology and Microbiology, Moscow, 123098, Russia
  - Institute of Agricultural Biotechnology, the Russian Academy of Sciences, Moscow, 127550, Russia
- Sergey Spirin
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, 119992, Russia
  - Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, 119992, Russia
  - Scientific Research Institute for System Studies, the Russian Academy of Science (NIISI RAS), Moscow, 117281, Russia
- Andrei Alexeevski
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, 119992, Russia
  - Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, 119992, Russia
  - Scientific Research Institute for System Studies, the Russian Academy of Science (NIISI RAS), Moscow, 117281, Russia
|
25
|
Karamichalis R, Kari L, Konstantinidis S, Kopecki S. An investigation into inter- and intragenomic variations of graphic genomic signatures. BMC Bioinformatics 2015; 16:246. [PMID: 26249837 PMCID: PMC4527362 DOI: 10.1186/s12859-015-0655-4] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 12/19/2014] [Accepted: 06/30/2015] [Indexed: 11/30/2022] Open
Abstract
Background Motivated by the general need to identify and classify species based on molecular evidence, genome comparisons have been proposed that are based on measuring mostly Euclidean distances between Chaos Game Representation (CGR) patterns of genomic DNA sequences. Results We provide, on an extensive dataset and using several different distances, confirmation of the hypothesis that CGR patterns are preserved along a genomic DNA sequence, and are different for DNA sequences originating from genomes of different species. This finding lends support to the theory that CGRs of genomic sequences can act as graphic genomic signatures. In particular, we compare the CGR patterns of over five hundred different 150,000 bp genomic sequences spanning one complete chromosome from each of six organisms, representing all kingdoms of life: H. sapiens (Animalia; chromosome 21), S. cerevisiae (Fungi; chromosome 4), A. thaliana (Plantae; chromosome 1), P. falciparum (Protista; chromosome 14), E. coli (Bacteria - full genome), and P. furiosus (Archaea - full genome). To maximize the diversity within each species, we also analyze the interrelationships within a set of over five hundred 150,000 bp genomic sequences sampled from the entire aforementioned genomes. Lastly, we provide some preliminary evidence of this method’s ability to classify genomic DNA sequences at lower taxonomic levels by comparing sequences sampled from the entire genome of H. sapiens (class Mammalia, order Primates) and of M. musculus (class Mammalia, order Rodentia), for a total length of approximately 174 million basepairs analyzed. We compute pairwise distances between CGRs of these genomic sequences using six different distances, and construct Molecular Distance Maps, which visualize all sequences as points in a two-dimensional or three-dimensional space, to simultaneously display their interrelationships. 
Conclusion Our analysis confirms, for this dataset, that CGR patterns of DNA sequences from the same genome are in general quantitatively similar, while being different for DNA sequences from genomes of different species. Our assessment of the performance of the six distances analyzed uses three different quality measures and suggests that several distances outperform the Euclidean distance, which has so far been almost exclusively used for such studies.
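For readers unfamiliar with Chaos Game Representation, the following sketch shows the basic construction and the Euclidean comparison that the abstract uses as its baseline. The corner assignment (A, C, G, T at the four corners of the unit square), the grid resolution, the skipping of ambiguous bases, and all function names are assumptions chosen for illustration; the paper's own parameters and its five alternative distances are not reproduced here.

```python
import math

# Assumed corner layout of the unit square; conventions differ between papers.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq: str) -> list[tuple[float, float]]:
    """Start at the centre and move halfway toward each base's corner in turn."""
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N (assumption)
        corner_x, corner_y = CORNERS[base]
        x, y = (x + corner_x) / 2, (y + corner_y) / 2
        points.append((x, y))
    return points


def cgr_grid(seq: str, resolution: int = 4) -> list[list[int]]:
    """Bin CGR points into a 2^resolution x 2^resolution count matrix."""
    size = 2 ** resolution
    grid = [[0] * size for _ in range(size)]
    for x, y in cgr_points(seq):
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        grid[row][col] += 1
    return grid


def grid_distance(grid_a: list[list[int]], grid_b: list[list[int]]) -> float:
    """Euclidean distance between two grids after normalizing each to sum 1."""
    total_a = sum(map(sum, grid_a)) or 1
    total_b = sum(map(sum, grid_b)) or 1
    return math.sqrt(sum(
        (a / total_a - b / total_b) ** 2
        for row_a, row_b in zip(grid_a, grid_b)
        for a, b in zip(row_a, row_b)
    ))
```

Computing `grid_distance` for every pair of sequence fragments yields the kind of pairwise distance matrix from which a two- or three-dimensional map of sequences can then be built.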
Affiliation(s)
- Rallis Karamichalis
- Department of Computer Science, University of Western Ontario, London, ON, Canada.
- Lila Kari
- Department of Computer Science, University of Western Ontario, London, ON, Canada.
- Stavros Konstantinidis
- Department of Mathematics and Computing Science, Saint Mary's University, Halifax, NS, Canada.
- Steffen Kopecki
- Department of Computer Science, University of Western Ontario, London, ON, Canada; Department of Mathematics and Computing Science, Saint Mary's University, Halifax, NS, Canada.
|
26
|
Siranosian B, Perera S, Williams E, Ye C, de Graffenried C, Shank P. Tetranucleotide usage highlights genomic heterogeneity among mycobacteriophages. F1000Res 2015; 4:36. [PMID: 27134721 PMCID: PMC4841201 DOI: 10.12688/f1000research.6077.2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 10/28/2015] [Indexed: 02/02/2023] Open
Abstract
Background The genomic sequences of mycobacteriophages, phages infecting mycobacterial hosts, are diverse and mosaic. Mycobacteriophages often share little nucleotide similarity, but most of them have been grouped into lettered clusters and further into subclusters. Traditionally, mycobacteriophage genomes are analyzed based on sequence alignment or knowledge of gene content. However, these approaches are computationally expensive and can be ineffective for significantly diverged sequences. As an alternative to alignment-based genome analysis, we evaluated tetranucleotide usage in mycobacteriophage genomes. These methods make it easier to characterize features of the mycobacteriophage population at many scales. Description We computed tetranucleotide usage deviation (TUD), the ratio of observed counts of 4-mers in a genome to the expected count under a null model. TUD values are comparable between members of a phage subcluster and distinct between subclusters. With few exceptions, neighbor joining phylogenetic trees and hierarchical clustering dendrograms constructed using TUD values place phages in a monophyletic clade with members of the same subcluster. Regions in a genome with exceptional TUD values can point to interesting features of genomic architecture. Finally, we found that subcluster B3 mycobacteriophages contain significantly overrepresented 4-mers and 6-mers that are atypical of phage genomes. Conclusions Statistics based on tetranucleotide usage support established clustering of mycobacteriophages and can uncover interesting relationships within and between sequenced phage genomes. These methods are efficient to compute and do not require sequence alignment or knowledge of gene content. The code to download mycobacteriophage genome sequences and reproduce our analysis is freely available at
https://github.com/bsiranosian/tango_final.
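The ratio described in this abstract can be sketched as follows. The abstract does not say which null model was used, so this example assumes a zero-order model in which bases occur independently at their observed frequencies; the function name `tud` is hypothetical and is not taken from the linked repository.

```python
from collections import Counter
from itertools import product


def tud(seq, k=4):
    """Observed/expected ratio for every k-mer under an assumed zero-order model."""
    seq = seq.upper()
    n_windows = len(seq) - k + 1
    base_counts = Counter(b for b in seq if b in "ACGT")
    total_bases = sum(base_counts.values())
    if n_windows <= 0 or total_bases == 0:
        return {}
    observed = Counter(seq[i:i + k] for i in range(n_windows))
    ratios = {}
    for kmer in map("".join, product("ACGT", repeat=k)):
        # Expected count = number of windows x product of base frequencies.
        p = 1.0
        for base in kmer:
            p *= base_counts[base] / total_bases
        expected = p * n_windows
        if expected > 0:
            ratios[kmer] = observed[kmer] / expected
    return ratios
```

Values above 1 indicate overrepresented k-mers and values below 1 underrepresented ones. A higher-order Markov null model would give different expectations.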
Affiliation(s)
- Benjamin Siranosian
- Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA; Division of Biology and Medicine, Brown University, Providence, RI, 02912, USA
- Sudheesha Perera
- Division of Biology and Medicine, Brown University, Providence, RI, 02912, USA
- Edward Williams
- Division of Biology and Medicine, Brown University, Providence, RI, 02912, USA
- Chen Ye
- Division of Biology and Medicine, Brown University, Providence, RI, 02912, USA
- Peter Shank
- Department of Molecular Microbiology and Immunology, Brown University, Providence, RI, 02912, USA
|
27
|
Furuta Y, Namba-Fukuyo H, Shibata TF, Nishiyama T, Shigenobu S, Suzuki Y, Sugano S, Hasebe M, Kobayashi I. Methylome diversification through changes in DNA methyltransferase sequence specificity. PLoS Genet 2014; 10:e1004272. [PMID: 24722038 PMCID: PMC3983042 DOI: 10.1371/journal.pgen.1004272] [Citation(s) in RCA: 62] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2013] [Accepted: 02/13/2014] [Indexed: 12/20/2022] Open
Abstract
Epigenetic modifications such as DNA methylation have large effects on gene expression and genome maintenance. Helicobacter pylori, a human gastric pathogen, has a large number of DNA methyltransferase genes, with different strains having unique repertoires. Previous genome comparisons suggested that these methyltransferases often change DNA sequence specificity through domain movement, that is, the movement between and within genes of coding sequences of target recognition domains. Using single-molecule real-time sequencing technology, which detects N6-methyladenines and N4-methylcytosines with single-base resolution, we studied methylated DNA sites throughout the H. pylori genome for several closely related strains. Overall, the methylome was highly variable among closely related strains. Hypermethylated regions were found, for example, in the rpoB gene for RNA polymerase. We identified DNA sequence motifs for methylation and then assigned each of them to a specific homology group of the target recognition domains in the specificity-determining genes for Type I and other restriction-modification systems. These results supported proposed mechanisms for sequence-specificity changes in DNA methyltransferases. Knocking out one of the Type I specificity genes led to transcriptome changes, which suggested its role in gene expression. These results are consistent with the concept of evolution driven by DNA methylation, in which changes in the methylome lead to changes in the transcriptome and potentially to changes in phenotype, providing targets for natural or artificial selection.
Affiliation(s)
- Yoshikazu Furuta
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, University of Tokyo, Minato-ku, Tokyo, Japan
- Institute of Medical Science, University of Tokyo, Minato-ku, Tokyo, Japan
- Hiroe Namba-Fukuyo
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, University of Tokyo, Minato-ku, Tokyo, Japan
- Tomoaki Nishiyama
- Advanced Science Research Center, Kanazawa University, Kanazawa, Japan
- Shuji Shigenobu
- National Institute for Basic Biology, Okazaki, Japan
- Department of Basic Biology, School of Life Science, Graduate University for Advanced Studies, Okazaki, Japan
- Yutaka Suzuki
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, University of Tokyo, Minato-ku, Tokyo, Japan
- Sumio Sugano
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, University of Tokyo, Minato-ku, Tokyo, Japan
- Mitsuyasu Hasebe
- National Institute for Basic Biology, Okazaki, Japan
- Department of Basic Biology, School of Life Science, Graduate University for Advanced Studies, Okazaki, Japan
- Ichizo Kobayashi
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, University of Tokyo, Minato-ku, Tokyo, Japan
- Institute of Medical Science, University of Tokyo, Minato-ku, Tokyo, Japan
|
28
|
O'Neill PK, Forder R, Erill I. Informational requirements for transcriptional regulation. J Comput Biol 2014; 21:373-84. [PMID: 24689750 DOI: 10.1089/cmb.2014.0032] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open
Abstract
Transcription factors (TFs) regulate transcription by binding to specific sites in promoter regions. Information theory provides a useful mathematical framework to analyze the binding motifs associated with TFs but imposes several assumptions that limit their applicability to specific regulatory scenarios. Explicit simulations of the co-evolution of TFs and their binding motifs allow the study of the evolution of regulatory networks with a high degree of realism. In this work we analyze the impact of differential regulatory demands on the information content of TF-binding motifs by means of evolutionary simulations. We generalize a predictive index based on information theory, and we validate its applicability to regulatory scenarios in which the TF binds significantly to the genomic background. Our results show a logarithmic dependence of the evolved information content on the occupancy of target sites and indicate that TFs may actively exploit pseudo-sites to modulate their occupancy of target sites. In regulatory networks with differentially regulated targets, we observe that information content in TF-binding motifs is dictated primarily by the fraction of total probability mass that the TF assigns to its target sites, and we provide a predictive index to estimate the amount of information associated with arbitrarily complex regulatory systems. We observe that complex regulatory patterns can exert additional demands on evolved information content, but, given a total occupancy for target sites, we do not find conclusive evidence that this effect is because of the range of required binding affinities.
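For readers unfamiliar with the quantity discussed in this abstract, the sketch below computes the information content of a binding motif from a set of aligned sites. It assumes a uniform background (2 bits maximum per position) and applies no small-sample correction; the generalized index described in the abstract is not reproduced here, and the function name is hypothetical.

```python
import math
from collections import Counter


def information_content(sites):
    """Total information content (bits) of aligned, equal-length DNA sites.

    Assumes a uniform background over A, C, G, T.
    """
    if not sites:
        raise ValueError("at least one site is required")
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = 0.0
    for pos in range(length):
        counts = Counter(s[pos].upper() for s in sites)
        n = sum(counts.values())
        # Shannon entropy of the observed base distribution at this column.
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        total += 2.0 - entropy
    return total
```

A fully conserved column contributes 2 bits and a column with all four bases at equal frequency contributes 0 bits.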
Affiliation(s)
- Patrick K O'Neill
- Department of Biological Sciences, University of Maryland Baltimore County, Baltimore, Maryland
|
29
|
Bonham-Carter O, Ali H, Bastola D. A base composition analysis of natural patterns for the preprocessing of metagenome sequences. BMC Bioinformatics 2014; 14 Suppl 11:S5. [PMID: 24564274 PMCID: PMC3816298 DOI: 10.1186/1471-2105-14-s11-s5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/05/2022] Open
Abstract
Background On the premise that sequence reads and contigs often exhibit the same kinds of base usage observed in the sequences from which they are derived, we offer a base composition analysis tool. Our tool uses these natural patterns to determine relatedness across sequence data. We introduce spectrum sets (sets of motifs), which are permutations of bacterial restriction sites, and the base composition analysis framework to measure their proportional content in sequence data. We suggest that this framework will increase efficiency during the pre-processing stages of metagenome sequencing and assembly projects. Results Our method is able to differentiate organisms and their reads or contigs. The framework shows how to determine the relatedness between these reads or contigs by comparison of base composition. In particular, we show that two types of organismal-sequence data are fundamentally different by analyzing their spectrum set motif proportions (coverage). By the application of one of the four possible spectrum sets, encompassing all known restriction sites, we provide evidence that each set has a different ability to differentiate sequence data. Furthermore, we show that selecting a spectrum set relevant to one organism, but not to the others in the data set, greatly improves the performance of sequence differentiation even when the read, contig or sequence fragment is short. Conclusions We show the proof of concept of our method by its application to ten trials of two or three freshly selected sequence fragments (reads and contigs) for each experiment across the six organisms of our set. Here we describe a novel and computationally effective pre-processing step for metagenome sequencing and assembly tasks. Furthermore, our base composition method has applications in phylogeny, where it can be used to infer evolutionary distances between organisms based on the notion that related organisms often have highly conserved code.
|
30
|
Maldonado-Contreras A, Mane SP, Zhang XS, Pericchi L, Alarcón T, Contreras M, Linz B, Blaser MJ, Domínguez-Bello MG. Phylogeographic evidence of cognate recognition site patterns and transformation efficiency differences in H. pylori: theory of strain dominance. BMC Microbiol 2013; 13:211. [PMID: 24050390 PMCID: PMC3849833 DOI: 10.1186/1471-2180-13-211] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/20/2013] [Accepted: 08/28/2013] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND Helicobacter pylori has diverged in parallel to its human host, leading to distinct phylogeographic populations. Recent evidence suggests that in the current human mixing in Latin America, European H. pylori (hpEurope) are increasingly dominant at the expense of Amerindian haplotypes (hspAmerind). This phenomenon might occur via DNA recombination, modulated by restriction-modification systems (RMS), in which differences in cognate recognition sites (CRS) and in active methylases will determine direction and frequency of gene flow. We hypothesized that genomes from hspAmerind strains that evolved from a small founder population have lost CRS for RMS and active methylases, promoting hpEurope's DNA invasion. We determined the observed and expected frequencies of CRS for RMS in DNA from 7 H. pylori whole genomes and 110 multilocus sequences. We also measured the number of active methylases by resistance to in vitro digestion by 16 restriction enzymes of genomic DNA from 9 hpEurope and 9 hspAmerind strains, and determined the direction of DNA uptake in co-culture experiments of hspAmerind and hpEurope strains. RESULTS Most of the CRS were underrepresented, with consistency between whole genomes and multilocus sequences. Although neither the frequency of CRS nor the number of active methylases differed among the bacterial populations (average 8.6 ± 2.6), hspAmerind strains had a restriction profile distinct from that in hpEurope strains, with 15 recognition sites accounting for the differences. Amerindian strains also exhibited higher transformation rates than European strains, and were more susceptible to being subverted by larger hpEurope DNA fragments than vice versa. CONCLUSIONS The geographical variation in the pattern of CRS provides evidence for ancestral differences in RMS representation and function, and the transformation findings support the hypothesis of Europeanization of the Amerindian strains in Latin America via DNA recombination.
|
31
|
Roberts GA, Houston PJ, White JH, Chen K, Stephanou AS, Cooper LP, Dryden DTF, Lindsay JA. Impact of target site distribution for Type I restriction enzymes on the evolution of methicillin-resistant Staphylococcus aureus (MRSA) populations. Nucleic Acids Res 2013; 41:7472-84. [PMID: 23771140 PMCID: PMC3753647 DOI: 10.1093/nar/gkt535] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/16/2022] Open
Abstract
A limited number of Methicillin-resistant Staphylococcus aureus (MRSA) clones are responsible for MRSA infections worldwide, and those of different lineages carry unique Type I restriction-modification (RM) variants. We have identified the specific DNA sequence targets for the dominant MRSA lineages CC1, CC5, CC8 and ST239. We experimentally demonstrate that this RM system is sufficient to block horizontal gene transfer between clinically important MRSA, confirming the bioinformatic evidence that each lineage is evolving independently. Target sites are distributed randomly in S. aureus genomes, except in a set of large conjugative plasmids encoding resistance genes that show evidence of spreading between two successful MRSA lineages. This analysis of the identification and distribution of target sites explains evolutionary patterns in a pathogenic bacterium. We show that a lack of specific target sites enables plasmids to evade the Type I RM system thereby contributing to the evolution of increasingly resistant community and hospital MRSA.
Affiliation(s)
- Gareth A Roberts
- EaStCHEM School of Chemistry, University of Edinburgh, The King's Buildings, Edinburgh EH9 3JJ, UK and Division of Clinical Sciences, St. George's, University of London, Cranmer Terrace, London, SW17 0RE, UK
|
32
|
Savitskaya E, Semenova E, Dedkov V, Metlitskaya A, Severinov K. High-throughput analysis of type I-E CRISPR/Cas spacer acquisition in E. coli. RNA Biol 2013; 10:716-25. [PMID: 23619643 PMCID: PMC3737330 DOI: 10.4161/rna.24325] [Citation(s) in RCA: 91] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2013] [Revised: 03/15/2013] [Accepted: 03/15/2013] [Indexed: 12/26/2022] Open
Abstract
In Escherichia coli, the acquisition of new CRISPR spacers is strongly stimulated by a priming interaction between a spacer in CRISPR RNA and a protospacer in foreign DNA. Priming also leads to a pronounced bias in DNA strand from which new spacers are selected. Here, ca. 200,000 spacers acquired during E. coli type I-E CRISPR/Cas-driven plasmid elimination were analyzed. Analysis of positions of plasmid protospacers from which newly acquired spacers have been derived is inconsistent with spacer acquisition machinery sliding along the target DNA as the primary mechanism responsible for strand bias during primed spacer acquisition. Most protospacers that served as donors of newly acquired spacers during primed spacer acquisition had an AAG protospacer adjacent motif, PAM. Yet, the introduction of multiple AAG sequences in the target DNA had no effect on the choice of protospacers used for adaptation, which again is inconsistent with the sliding mechanism. Despite a strong preference for an AAG PAM during CRISPR adaptation, the AAG (and CTT) triplets do not appear to be avoided in known E. coli phages. Likewise, PAM sequences are not avoided in Streptococcus thermophilus phages, indicating that CRISPR/Cas systems may not have been a strong factor in shaping host-virus interactions.
Affiliation(s)
- Ekaterina Savitskaya
- Institute of Molecular Genetics of the Russian Academy of Sciences; Moscow, Russia
- Institute of Gene Biology of the Russian Academy of Sciences; Moscow, Russia
- Ekaterina Semenova
- Waksman Institute for Microbiology; Rutgers, The State University of New Jersey; Piscataway, NJ USA
- Vladimir Dedkov
- Central Research Institute of Epidemiology; Russian Inspectorate for Protection of Consumer Right and Human Welfare; Moscow, Russia
- Konstantin Severinov
- Institute of Molecular Genetics of the Russian Academy of Sciences; Moscow, Russia
- Institute of Gene Biology of the Russian Academy of Sciences; Moscow, Russia
- Waksman Institute for Microbiology; Rutgers, The State University of New Jersey; Piscataway, NJ USA
- Department of Molecular Biology and Biochemistry; Rutgers, The State University of New Jersey; Piscataway, NJ USA
|
33
|
Vasu K, Nagaraja V. Diverse functions of restriction-modification systems in addition to cellular defense. Microbiol Mol Biol Rev 2013; 77:53-72. [PMID: 23471617 PMCID: PMC3591985 DOI: 10.1128/mmbr.00044-12] [Citation(s) in RCA: 386] [Impact Index Per Article: 35.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open
Abstract
Restriction-modification (R-M) systems are ubiquitous and are often considered primitive immune systems in bacteria. Their diversity and prevalence across the prokaryotic kingdom are an indication of their success as a defense mechanism against invading genomes. However, their cellular defense function does not adequately explain the basis for their immaculate specificity in sequence recognition and nonuniform distribution, ranging from none to too many, in diverse species. The present review deals with new developments which provide insights into the roles of these enzymes in other aspects of cellular function. In this review, emphasis is placed on novel hypotheses and various findings that have not yet been dealt with in a critical review. Emerging studies indicate their role in various cellular processes other than host defense, virulence, and even controlling the rate of evolution of the organism. We also discuss how R-M systems could have successfully evolved and be involved in additional cellular portfolios, thereby increasing the relative fitness of their hosts in the population.
Affiliation(s)
- Kommireddy Vasu
- Department of Microbiology and Cell Biology, Indian Institute of Science, Bangalore
- Valakunja Nagaraja
- Department of Microbiology and Cell Biology, Indian Institute of Science, Bangalore
- Jawaharlal Nehru Centre for Advanced Scientific Research, Bangalore, India
|
34
|
CpG underrepresentation and the bacterial CpG-specific DNA methyltransferase M.MpeI. Proc Natl Acad Sci U S A 2012; 110:105-10. [PMID: 23248272 DOI: 10.1073/pnas.1207986110] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/08/2023] Open
Abstract
Cytosine methylation promotes deamination. In eukaryotes, CpG methylation is thought to account for CpG underrepresentation. Whether scarcity of CpGs in prokaryotic genomes is diagnostic for methylation is not clear. Here, we report that Mycoplasms tend to be CpG depleted and to harbor a family of constitutively expressed or phase variable CpG-specific DNA methyltransferases. The very CpG poor Mycoplasma penetrans and its constitutively active CpG-specific methyltransferase M.MpeI were chosen for further characterization. Genome-wide sequencing of bisulfite-converted DNA indicated that M.MpeI methylated CpG target sites both in vivo and in vitro in a locus-nonselective manner. A crystal structure of M.MpeI with DNA at 2.15-Å resolution showed that the substrate base was flipped and that its place in the DNA stack was taken by a glutamine residue. A phenylalanine residue was intercalated into the "weak" CpG step of the nonsubstrate strand, indicating mechanistic similarities in the recognition of the short CpG target sequence by prokaryotic and eukaryotic DNA methyltransferases.
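CpG underrepresentation of the kind discussed in this abstract is commonly quantified as an observed/expected ratio. The sketch below uses the widely used single-strand formula (CpG count × length) / (C count × G count); whether the authors used exactly this normalisation is not stated in the abstract, so treat it as an assumption, and the function name as hypothetical.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio of a DNA sequence (single strand)."""
    seq = seq.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    # "CG" cannot overlap itself, so str.count gives the exact number of CpGs.
    n_cpg = seq.count("CG")
    return n_cpg * len(seq) / (n_c * n_g)
```

Ratios well below 1 indicate CpG depletion relative to what base composition alone would predict.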
|
35
|
Du Y, Murani E, Ponsuksili S, Wimmers K. Flexible and efficient genome tiling design with penalized uniqueness score. BMC Bioinformatics 2012; 13:323. [PMID: 23216884 PMCID: PMC3583072 DOI: 10.1186/1471-2105-13-323] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/13/2012] [Accepted: 10/26/2012] [Indexed: 11/24/2022] Open
Abstract
Background As a powerful tool in whole genome analysis, the tiling array has been widely used to answer many genomic questions. It could now also serve as a capture device for library preparation in high-throughput sequencing experiments. Thus, a flexible and efficient tiling array design approach is still needed and could assist in various types and scales of transcriptomic experiments. Results In this paper, we address issues and challenges in designing probes suitable for tiling array applications and targeted sequencing. In particular, we define the penalized uniqueness score, which serves as a controlling criterion to eliminate potential cross-hybridization, and a flexible tiling array design pipeline. Unlike BLAST or simple suffix array based methods, computing and using our uniqueness measurement can be more efficient for large scale design and require less memory. The parameters provided could assist in various types of genomic tiling tasks. In addition, using both commercial array data and experimental data we show that, contrary to previous claims, palindromic sequences exhibit relatively lower uniqueness. Conclusions Our proposed penalized uniqueness score could serve as a better indicator for cross-hybridization with higher sensitivity and specificity, giving more control of expected array quality. The flexible tiling design algorithm incorporating the penalized uniqueness score was shown to give higher coverage and resolution. The package to calculate the penalized uniqueness score and the described probe selection algorithm are implemented as a Perl program, which is freely available at http://www1.fbn-dummerstorf.de/en/forschung/fbs/fb3/paper/2012-yang-1/OTAD.v1.1.tar.gz.
Affiliation(s)
- Yang Du
- Research Unit Molecular Biology, Leibniz Institute for Farm Animal Biology, Dummerstorf, Germany
|
36
|
Compositional bias is a major determinant of the distribution pattern and abundance of palindromes in Drosophila melanogaster. J Mol Evol 2012; 75:130-40. [PMID: 23138634 DOI: 10.1007/s00239-012-9527-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2012] [Accepted: 10/22/2012] [Indexed: 10/27/2022]
Abstract
Palindromic sequences are important DNA motifs related to gene regulation, DNA replication and recombination, and thus, investigating the evolutionary forces shaping the distribution pattern and abundance of palindromes in the genome is substantially important. In this article, we analyzed the abundance of palindromes in the genome, and then explored the possible effects of several genomic factors on the palindrome distribution and abundance in Drosophila melanogaster. Our results show that the palindrome abundance in D. melanogaster deviates from random expectation and the uneven distribution of palindromes across the genome is associated with local GC content, recombination rate, and coding exon density. Our data suggest that base composition is the major determinant of the distribution pattern and abundance of palindromes and the correlation between palindrome density and recombination is a side-product of the effect of compositional bias on the palindrome abundance.
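To make the comparison against random expectation concrete, the sketch below counts reverse-complement palindromes of a fixed even length and computes the fraction expected from base composition alone. It assumes independent bases; the window length and the function names are illustrative choices, not the authors' procedure.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(s):
    """Reverse complement of an unambiguous DNA string."""
    return s.translate(_COMPLEMENT)[::-1]


def count_palindromes(seq, k=6):
    """Count windows of length k that equal their own reverse complement."""
    seq = seq.upper()
    count = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if set(window) <= set("ACGT") and window == revcomp(window):
            count += 1
    return count


def expected_palindrome_fraction(seq, k=6):
    """Probability that a random k-mer is palindromic, given base composition.

    Each of the k/2 mirrored position pairs must be complementary, which
    happens with probability 2*pA*pT + 2*pC*pG under independence.
    """
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0 or k % 2:
        return 0.0
    p = {b: counts[b] / total for b in "ACGT"}
    pair = 2 * p["A"] * p["T"] + 2 * p["C"] * p["G"]
    return pair ** (k // 2)
```

Multiplying the expected fraction by the number of windows gives a composition-based expected count to set against the observed one.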
|
37
|
Transfer RNA gene numbers may not be completely responsible for the codon usage bias in asparagine, isoleucine, phenylalanine, and tyrosine in the high expression genes in bacteria. J Mol Evol 2012; 75:34-42. [PMID: 23053196 DOI: 10.1007/s00239-012-9524-1] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/19/2012] [Accepted: 09/24/2012] [Indexed: 10/27/2022]
Abstract
It is generally believed that the effect of translational selection on codon usage bias is related to the number of transfer RNA genes in bacteria, which is more with respect to the high expression genes than the whole genome. Keeping this in the background, we analyzed codon usage bias with respect to asparagine, isoleucine, phenylalanine, and tyrosine amino acids. Analysis was done in seventeen bacteria with the available gene expression data and information about the tRNA gene number. In most of the bacteria, it was observed that codon usage bias and tRNA gene number were not in agreement, which was unexpected. We extended the study further to 199 bacteria, limiting to the codon usage bias in the two highly expressed genes rpoB and rpoC which encode the RNA polymerase subunits β and β', respectively. In concordance with the result in the high expression genes, codon usage bias in rpoB and rpoC genes was also found to not be in agreement with tRNA gene number in many of these bacteria. Our study indicates that tRNA gene numbers may not be the sole determining factor for translational selection of codon usage bias in bacterial genomes.
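A simple way to look at codon usage bias for the four amino acids named in this abstract is to compute the fraction of each synonymous codon within a coding sequence. The sketch below does this under the standard genetic code; the function name and the choice of within-family fractions as the bias measure are assumptions, since the abstract does not specify the statistic used.

```python
from collections import Counter

# Synonymous codon families under the standard genetic code.
CODON_FAMILIES = {
    "Asn": ("AAT", "AAC"),
    "Ile": ("ATT", "ATC", "ATA"),
    "Phe": ("TTT", "TTC"),
    "Tyr": ("TAT", "TAC"),
}


def codon_fractions(cds):
    """Fraction of each synonymous codon for Asn, Ile, Phe and Tyr.

    `cds` is read in frame from its first base; trailing bases that do
    not complete a codon are ignored.
    """
    cds = cds.upper()
    counts = Counter(cds[i:i + 3] for i in range(0, len(cds) - 2, 3))
    fractions = {}
    for amino_acid, codons in CODON_FAMILIES.items():
        total = sum(counts[c] for c in codons)
        fractions[amino_acid] = {
            c: (counts[c] / total if total else 0.0) for c in codons
        }
    return fractions
```

These fractions could then be compared with the copy numbers of the matching tRNA genes, which is the kind of agreement the abstract reports as frequently absent.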
|
38
|
Elhai J, Liu H, Taton A. Detection of horizontal transfer of individual genes by anomalous oligomer frequencies. BMC Genomics 2012; 13:245. [PMID: 22702893 PMCID: PMC3497702 DOI: 10.1186/1471-2164-13-245] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/29/2011] [Accepted: 05/18/2012] [Indexed: 11/10/2022] Open
Abstract
Background Understanding the history of life requires that we understand the transfer of genetic material across phylogenetic boundaries. Detecting genes that were acquired by means other than vertical descent is a basic step in that process. Detection by discordant phylogenies is computationally expensive and not always definitive. Many have used easily computed compositional features as an alternative procedure. However, different compositional methods produce different predictions, and the effectiveness of any method is not well established. Results The ability of octamer frequency comparisons to detect genes artificially seeded in cyanobacterial genomes was markedly increased by using as a training set those genes that are highly conserved over all bacteria. Using a subset of octamer frequencies in such tests also increased effectiveness, but this depended on the specific target genome and the source of the contaminating genes. The presence of high frequency octamers and the GC content of the contaminating genes were important considerations. A method comprising best practices from these tests was devised, the Core Gene Similarity (CGS) method, and it performed better than simple octamer frequency analysis, codon bias, or GC contrasts in detecting seeded genes or naturally occurring transposons. From a comparison of predictions with phylogenetic trees, it appears that the effectiveness of the method is confined to horizontal transfer events that have occurred recently in evolutionary time. Conclusions The CGS method may be an improvement over existing surrogate methods to detect genes of foreign origin.
Affiliation(s)
- Jeff Elhai
- Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, VA 23284, USA.
|
39
|
Promiscuous restriction is a cellular defense strategy that confers fitness advantage to bacteria. Proc Natl Acad Sci U S A 2012; 109:E1287-93. [PMID: 22509013 DOI: 10.1073/pnas.1119226109] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Most bacterial genomes harbor restriction-modification systems, encoding a REase and its cognate MTase. On attack by a foreign DNA, the REase recognizes it as nonself and subjects it to restriction. Should REases be highly specific for targeting the invading foreign DNA? It is often considered to be the case. However, when bacteria harboring a promiscuous or high-fidelity variant of the REase were challenged with bacteriophages, fitness was maximal under conditions of catalytic promiscuity. We also delineate possible mechanisms by which the REase recognizes the chromosome as self at the noncanonical sites, thereby preventing lethal dsDNA breaks. This study provides a fundamental understanding of how bacteria exploit an existing defense system to gain fitness advantage during a host-parasite coevolutionary "arms race."
|
40
|
Qian L, Kussell E. Evolutionary dynamics of restriction site avoidance. Phys Rev Lett 2012; 108:158105. [PMID: 22587291 DOI: 10.1103/physrevlett.108.158105] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/16/2011] [Indexed: 05/31/2023]
Abstract
Molecular noise in bacterial restriction-modification systems can cause rare events of host DNA cleavage at restriction sites. Such noise-induced selective pressure may result in evolved sequences exhibiting restriction site avoidance. We identify a two-state regime of evolutionary dynamics, in which populations either develop avoidance or go extinct. Using perturbation theory, we show that equilibrium sequence statistics exhibit power-law scaling in the ratio of restriction strength to mutation rate. Noise levels comparable to mutation rates can be sufficient to evolve detectable avoidance.
Affiliation(s)
- Long Qian
- Department of Biology and Center for Genomics and Systems Biology, New York University, 12 Waverly Place, New York, New York 10003, USA
|
41
|
Dutta C, Paul S. Microbial lifestyle and genome signatures. Curr Genomics 2012; 13:153-62. [PMID: 23024607 PMCID: PMC3308326 DOI: 10.2174/138920212799860698] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.6] [Received: 08/18/2011] [Revised: 09/13/2011] [Accepted: 09/28/2011] [Indexed: 12/29/2022] Open
Abstract
Microbes are known for their unique ability to adapt to varying lifestyles and environments, even extreme or adverse ones. The genomic architecture of a microbe may bear the signatures not only of its phylogenetic position, but also of the kind of lifestyle to which it is adapted. The present review aims to provide an account of the specific genome signatures observed in microbes acclimatized to distinct lifestyles or ecological niches. Niche-specific signatures identified at different levels of microbial genome organization, such as base composition, GC-skew, purine-pyrimidine ratio, dinucleotide abundance, codon bias and oligonucleotide composition, are discussed. Among the specific cases highlighted in the review are the phenomena of genome shrinkage in obligatory host-restricted microbes, genome expansion in strictly intra-amoebal pathogens, strand-specific codon usage in intracellular species, acquisition of genome islands in pathogenic or symbiotic organisms, discriminatory genomic traits of marine microbes with distinct trophic strategies, and conspicuous sequence features of certain extremophiles like those adapted to high temperature or high salinity.
Affiliation(s)
- Chitra Dutta
- Structural Biology & Bioinformatics Division, CSIR-Indian Institute of Chemical Biology, 4, Raja S. C. Mullick Road, Kolkata 700032, India
|
42
|
Basu MK, Selengut JD, Haft DH. ProPhylo: partial phylogenetic profiling to guide protein family construction and assignment of biological process. BMC Bioinformatics 2011; 12:434. [PMID: 22070167 PMCID: PMC3226654 DOI: 10.1186/1471-2105-12-434] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Received: 07/21/2011] [Accepted: 11/09/2011] [Indexed: 12/02/2022] Open
Abstract
Background: Phylogenetic profiling is a technique of scoring co-occurrence between a protein family and some other trait, usually another protein family, across a set of taxonomic groups. In spite of several refinements in recent years, the technique still invites significant improvement. To be its most effective, a phylogenetic profiling algorithm must be able to examine co-occurrences among protein families whose boundaries are uncertain within large homologous protein superfamilies. Results: Partial Phylogenetic Profiling (PPP) is an iterative algorithm that scores a given taxonomic profile against the taxonomic distribution of families for all proteins in a genome. The method works through optimizing the boundary of each protein family, rather than by relying on prebuilt protein families or fixed sequence similarity thresholds. Double Partial Phylogenetic Profiling (DPPP) is a related procedure that begins with a single sequence and searches for optimal granularities for its surrounding protein family in order to generate the best query profiles for PPP. We present ProPhylo, a high-performance software package for phylogenetic profiling studies through creating individually optimized protein family boundaries. ProPhylo provides precomputed databases for immediate use and tools for manipulating the taxonomic profiles used as queries. Conclusion: ProPhylo results show universal markers of methanogenesis, a new DNA phosphorothioation-dependent restriction enzyme, and efficacy in guiding protein family construction. The software and the associated databases are freely available under the open source Perl Artistic License from ftp://ftp.jcvi.org/pub/data/ppp/.
Affiliation(s)
- Malay K Basu
- J. Craig Venter Institute, Rockville, MD 20850, USA.
|
44
|
Viral ancestors of antiviral systems. Viruses 2011; 3:1933-58. [PMID: 22069523 PMCID: PMC3205389 DOI: 10.3390/v3101933] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Received: 09/07/2011] [Revised: 10/01/2011] [Accepted: 10/10/2011] [Indexed: 02/06/2023] Open
Abstract
All life forms must survive their corresponding viruses. Thus, antiviral systems are essential in all living organisms. Remnants of virus-derived information are also found in all life forms but have historically been considered mostly as junk DNA. However, such virus-derived information can strongly affect host susceptibility to viruses. In this review, I evaluate the role viruses have had in the origin and evolution of host antiviral systems. From Archaea through bacteria and from simple to complex eukaryotes, I trace the viral components that became essential elements of antiviral immunity. I conclude with a reexamination of the ‘Big Bang’ theory for the emergence of the adaptive immune system in vertebrates by horizontal transfer and note how viruses could have and did provide crucial and coordinated features.
|
45
|
Lamprea-Burgunder E, Ludin P, Mäser P. Species-specific typing of DNA based on palindrome frequency patterns. DNA Res 2011; 18:117-24. [PMID: 21429991 PMCID: PMC3077040 DOI: 10.1093/dnares/dsr004] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 12/20/2022] Open
Abstract
DNA in its natural, double-stranded form may contain palindromes, sequences which read the same from either side because they are identical to their reverse complement on the sister strand. Short palindromes are underrepresented in all kinds of genomes. The frequency distribution of short palindromes exhibits more than twice the inter-species variance of non-palindromic sequences, which renders palindromes optimally suited for the typing of DNA. Here, we show that based on palindrome frequency, DNA sequences can be discriminated to the level of species of origin. By plotting the ratios of actual occurrence to expectancy, we generate palindrome frequency patterns that make it possible to cluster different sequences of the same genome and to assign plasmids, and in some cases even viruses, to their respective host genomes. This finding will be of use in the growing field of metagenomics.
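The ratio of actual occurrence to expectancy mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the expectancy model used here (independent bases drawn from the sequence's own base composition) and the function names `reverse_complement`, `is_palindrome` and `observed_expected_ratios` are assumptions made for illustration only.

```python
from itertools import product

# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindrome(word):
    """A DNA palindrome equals its own reverse complement."""
    return word == reverse_complement(word)


def count_overlapping(seq, word):
    """Count occurrences of word in seq, allowing overlaps."""
    k = len(word)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == word)


def observed_expected_ratios(seq, k=4):
    """Observed/expected ratio for every palindromic k-mer.

    ASSUMPTION: the expected count comes from a zero-order model in which
    bases are independent and follow the sequence's own base frequencies.
    The cited paper may use a different expectancy model.
    """
    seq = seq.upper()
    n = len(seq)
    base_freq = {b: seq.count(b) / n for b in "ACGT"}
    positions = n - k + 1
    ratios = {}
    for letters in product("ACGT", repeat=k):
        word = "".join(letters)
        if not is_palindrome(word):
            continue
        expected = positions
        for b in word:
            expected *= base_freq[b]
        if expected > 0:
            ratios[word] = count_overlapping(seq, word) / expected
    return ratios
```

Calling `observed_expected_ratios(genome, k=4)` on two sequences yields two vectors over the same 16 palindromic tetramers, which could then be compared with any distance measure. Ratios below 1 indicate underrepresentation relative to the assumed model.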
|
46
|
Abstract
Genomes encode multiple signals, raising the question of how these different codes are organized along the linear genome sequence. Within protein-coding regions, the redundancy of the genetic code can, in principle, allow for the overlapping encoding of signals in addition to the amino acid sequence, but it is not known to what extent genomes exploit this potential and, if so, for what purpose. Here, we systematically explore whether protein-coding regions accommodate overlapping codes, by comparing the number of occurrences of each possible short sequence within the protein-coding regions of over 700 species from viruses to plants, to the same number in randomizations that preserve amino acid sequence and codon bias. We find that coding regions across all phyla encode additional information, with bacteria carrying more information than eukaryotes. The detailed signals consist of both known and potentially novel codes, including position-dependent secondary RNA structure, bacteria-specific depletion of transcription and translation initiation signals, and eukaryote-specific enrichment of microRNA target sites. Our results suggest that genomes may have evolved to encode extensive overlapping information within protein-coding regions.
|
47
|
Davenport C, Ussery DW, Tümmler B. Comparative genomics of green sulfur bacteria. Photosynth Res 2010; 104:137-152. [PMID: 20099081 DOI: 10.1007/s11120-009-9515-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 08/20/2009] [Accepted: 12/07/2009] [Indexed: 05/28/2023]
Abstract
Eleven completely sequenced Chlorobi genomes were compared in oligonucleotide usage, gene contents, and synteny. The green sulfur bacteria (GSB) are equipped with a core genome that sustains their anoxygenic phototrophic lifestyle by photosynthesis, sulfur oxidation, and CO₂ fixation. Whole-genome gene family and single gene sequence comparisons yielded similar phylogenetic trees of the sequenced chromosomes indicating a concerted vertical evolution of large gene sets. Chromosomal synteny of genes is not preserved in the phylum Chlorobi. The accessory genome is characterized by anomalous oligonucleotide usage and endows the strains with individual features for transport, secretion, cell wall, extracellular constituents, and a few elements of the biosynthetic apparatus. Giant genes are a peculiar feature of the genera Chlorobium and Prosthecochloris. The predicted proteins have a huge molecular weight of 10⁶, and are probably instrumental for the bacteria to generate their own intimate (micro)environment.
Affiliation(s)
- Colin Davenport
- Klinische Forschergruppe, Klinik für Pädiatrische Pneumologie und Neonatologie, Medizinische Hochschule Hannover, Carl-Neuberg-Strasse 1, Hannover, Germany
|
48
|
Villarreal LP, Witzany G. Viruses are essential agents within the roots and stem of the tree of life. J Theor Biol 2009; 262:698-710. [PMID: 19833132 DOI: 10.1016/j.jtbi.2009.10.014] [Citation(s) in RCA: 109] [Impact Index Per Article: 7.3] [Received: 05/29/2009] [Revised: 09/28/2009] [Accepted: 10/08/2009] [Indexed: 02/06/2023]
Abstract
In contrast with former definitions of life, which were limited to membrane-bound cellular life forms that feed, grow, metabolise and replicate, (i) a role of viruses as genetic symbionts, (ii) peripheral phenomena such as cryptobiosis and (iii) the horizontal nature of genetic information acquisition and processing broaden our view of the tree of life. Some researchers insist on the traditional textbook conviction of what is part of the community of life. In a recent review [Moreira, D., Lopez-Garcia, P., 2009. Ten reasons to exclude viruses from the tree of life. Nat. Rev. Microbiol. 7, 306-311.] they assemble four main arguments which should exclude viruses from the tree of life because of their inability to self-sustain and self-replicate, their polyphyly, the cellular origin of their cell-like genes and the volatility of their genomes. In this article we will show that these features are not consistent with current knowledge about viruses but that viral agents play key roles within the roots and stem of the tree of life.
Affiliation(s)
- Luis P Villarreal
- Department of Molecular Biology and Biochemistry, University of California, Irvine, CA 92697, USA
|
49
|
Asakura Y, Kobayashi I. From damaged genome to cell surface: transcriptome changes during bacterial cell death triggered by loss of a restriction-modification gene complex. Nucleic Acids Res 2009; 37:3021-31. [PMID: 19304752 PMCID: PMC2685091 DOI: 10.1093/nar/gkp148] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.5] [Indexed: 11/12/2022] Open
Abstract
Genetically programmed cell deaths play important roles in unicellular prokaryotes. In postsegregational killing, loss of a gene complex from a cell leads to its descendants' deaths. With type II restriction-modification gene complexes, such death is triggered by the restriction endonuclease's attacks on under-methylated chromosomes. Here, we examined how the Escherichia coli transcriptome changes after loss of the PaeR7I gene complex. At earlier time points, activation of SOS genes and the σE regulon was noticeable. With time, more SOS genes, stress-response genes (including the σS regulon and osmotic-, oxidative- and periplasmic-stress genes), biofilm-related genes, and many hitherto uncharacterized genes were induced, and genes for energy metabolism, motility and outer membrane biogenesis were repressed. As expected from the activation of the σE regulon, the death was accompanied by cell lysis and release of cellular proteins. Expression of several σE regulon genes indeed led to cell lysis. We hypothesize that some signal was transduced, among multiple genes involved, from the damaged genome to the cell surface and led to its disintegration. These results are discussed in comparison with other forms of programmed deaths in bacteria and eukaryotes.
Affiliation(s)
- Yoko Asakura
- Ajinomoto Co., Inc., Kawasaki-shi, Kanagawa, Japan.
|
50
|
Pavlović-Lazetić GM, Mitić NS, Beljanski MV. n-Gram characterization of genomic islands in bacterial genomes. Comput Methods Programs Biomed 2009; 93:241-56. [PMID: 19101056 PMCID: PMC7185697 DOI: 10.1016/j.cmpb.2008.10.014] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 04/20/2008] [Revised: 09/10/2008] [Accepted: 10/21/2008] [Indexed: 05/27/2023]
Abstract
The paper presents a novel, n-gram-based method for analysis of bacterial genome segments known as genomic islands (GIs). Identification of GIs in bacterial genomes is an important task since many of them represent inserts that may contribute to bacterial evolution and pathogenesis. In order to characterize and distinguish GIs from the rest of the genome, binary classification of islands based on n-gram frequency distribution has been performed. It consists of testing the agreement of the islands' n-gram frequency distributions with those of the complete genome and the backbone sequence. In addition, a statistic based on the maximal order Markov model is used to identify significantly overrepresented and underrepresented n-grams in islands. The results may be used as a basis for Zipf-like analysis suggesting that some of the n-grams are overrepresented in a subset of islands and underrepresented in the backbone, or vice versa, thus complementing the binary classification. The method is applied to strain-specific regions in the Escherichia coli O157:H7 EDL933 genome (O-islands), resulting in two groups of O-islands with different n-gram characteristics. It refines a characterization based on other compositional features such as G+C content and codon usage, and may help in identification of GIs, and also in research and development of adequate drugs targeting virulence genes in them.
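As a rough illustration of the kind of statistic this abstract refers to, the sketch below uses the commonly used maximal-order Markov estimator: the expected count of an n-gram w is N(w[:-1]) · N(w[1:]) / N(w[1:-1]), where N counts overlapping occurrences. Whether the paper uses exactly this estimator, and how it assesses significance, is not stated in the abstract, so this is an assumption. The names `kmer_counts`, `markov_expected` and `observed_over_expected` are hypothetical.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count all overlapping k-mers in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def markov_expected(seq, word):
    """Expected count of word under a maximal-order Markov model.

    ASSUMPTION: uses the textbook estimator
        E[N(w)] = N(prefix) * N(suffix) / N(core),
    where prefix and suffix are the (k-1)-mers at either end of the word
    and core is the shared (k-2)-mer in the middle.
    """
    k = len(word)
    if k < 3:
        raise ValueError("word must have at least 3 letters")
    shorter = kmer_counts(seq, k - 1)
    core = kmer_counts(seq, k - 2)[word[1:-1]]
    if core == 0:
        return 0.0
    return shorter[word[:-1]] * shorter[word[1:]] / core


def observed_over_expected(seq, word):
    """Ratio above 1 suggests overrepresentation, below 1 underrepresentation."""
    expected = markov_expected(seq, word)
    if expected == 0:
        return None  # ratio undefined when the model expects no occurrences
    return kmer_counts(seq, len(word))[word] / expected
```

In practice one would compute this ratio for each n-gram separately in an island and in the backbone sequence and compare the two; a significance test would have to be added on top of this sketch.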
|