1
Lisitskaya L, Kropocheva E, Agapov A, Prostova M, Panteleev V, Yudin D, Ryazansky S, Kuzmenko A, Aravin A, Esyunina D, Kulbachinskiy A. Bacterial Argonaute nucleases reveal different modes of DNA targeting in vitro and in vivo. Nucleic Acids Res 2023; 51:5106-5124. [PMID: 37094066 PMCID: PMC10250240 DOI: 10.1093/nar/gkad290] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/19/2022] [Revised: 04/04/2023] [Accepted: 04/06/2023] [Indexed: 04/26/2023]
Abstract
Prokaryotic Argonaute proteins (pAgos) are homologs of eukaryotic Argonautes (eAgos) and are also thought to play a role in cell defense against invaders. However, pAgos are much more diverse than eAgos and little is known about their functional activities and target specificities in vivo. Here, we describe five pAgos from mesophilic bacteria that act as programmable DNA endonucleases and analyze their ability to target chromosomal and invader DNA. In vitro, the analyzed proteins use small guide DNAs for precise cleavage of single-stranded DNA at a wide range of temperatures. Upon their expression in Escherichia coli, all five pAgos are loaded with small DNAs preferentially produced from plasmids and chromosomal regions of replication termination. One of the tested pAgos, EmaAgo from Exiguobacterium marinum, can induce DNA interference between homologous sequences resulting in targeted processing of multicopy plasmid and genomic elements. EmaAgo also protects bacteria from bacteriophage infection, by loading phage-derived guide DNAs and decreasing phage DNA content and phage titers. Thus, the ability of pAgos to target multicopy elements may be crucial for their protective function. The wide spectrum of pAgo activities suggests that they may have diverse functions in vivo and paves the way for their use in biotechnology.
Affiliation(s)
- Lidiya Lisitskaya
  - Institute of Gene Biology, Russian Academy of Sciences, Moscow 119334, Russia
  - Institute of Molecular Genetics, National Research Center “Kurchatov Institute”, Moscow 123182, Russia
- Ekaterina Kropocheva
  - Institute of Gene Biology, Russian Academy of Sciences, Moscow 119334, Russia
  - Institute of Molecular Genetics, National Research Center “Kurchatov Institute”, Moscow 123182, Russia
- Aleksei Agapov
  - Institute of Molecular Genetics, National Research Center “Kurchatov Institute”, Moscow 123182, Russia
- Maria Prostova
  - Institute of Gene Biology, Russian Academy of Sciences, Moscow 119334, Russia
  - Institute of Molecular Genetics, National Research Center “Kurchatov Institute”, Moscow 123182, Russia
- Vladimir Panteleev
  - Institute of Gene Biology, Russian Academy of Sciences, Moscow 119334, Russia
  - Institute of Molecular Genetics, National Research Center “Kurchatov Institute”, Moscow 123182, Russia
  - Moscow Institute of Physics and Technology, Dolgoprudny 141700, Russia
- Denis Yudin
  - Institute of Molecular Genetics, National Research Center “Kurchatov Institute”, Moscow 123182, Russia
- Sergei Ryazansky
  - Institute of Molecular Genetics, National Research Center “Kurchatov Institute”, Moscow 123182, Russia
- Anton Kuzmenko
  - Institute of Molecular Genetics, National Research Center “Kurchatov Institute”, Moscow 123182, Russia
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Alexei A Aravin
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Daria Esyunina
  - Institute of Gene Biology, Russian Academy of Sciences, Moscow 119334, Russia
  - Institute of Molecular Genetics, National Research Center “Kurchatov Institute”, Moscow 123182, Russia
- Andrey Kulbachinskiy
  - Institute of Gene Biology, Russian Academy of Sciences, Moscow 119334, Russia
  - Institute of Molecular Genetics, National Research Center “Kurchatov Institute”, Moscow 123182, Russia
2
Wilkinson M, Wilkinson OJ, Feyerherm C, Fletcher EE, Wigley DB, Dillingham MS. Structures of RecBCD in complex with phage-encoded inhibitor proteins reveal distinctive strategies for evasion of a bacterial immunity hub. eLife 2022; 11:e83409. [PMID: 36533901 PMCID: PMC9836394 DOI: 10.7554/elife.83409] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/12/2022] [Accepted: 12/18/2022] [Indexed: 12/23/2022]
Abstract
Following infection of bacterial cells, bacteriophage modulate double-stranded DNA break repair pathways to protect themselves from host immunity systems and prioritise their own recombinases. Here, we present biochemical and structural analysis of two phage proteins, gp5.9 and Abc2, which target the DNA break resection complex RecBCD. These exemplify two contrasting mechanisms for control of DNA break repair in which the RecBCD complex is either inhibited or co-opted for the benefit of the invading phage. Gp5.9 completely inhibits RecBCD by preventing it from binding to DNA. The RecBCD-gp5.9 structure shows that gp5.9 acts by substrate mimicry, binding predominantly to the RecB arm domain and competing sterically for the DNA binding site. Gp5.9 adopts a parallel coiled-coil architecture that is unprecedented for a natural DNA mimic protein. In contrast, binding of Abc2 does not substantially affect the biochemical activities of isolated RecBCD. The RecBCD-Abc2 structure shows that Abc2 binds to the Chi-recognition domains of the RecC subunit in a position that might enable it to mediate the loading of phage recombinases onto its single-stranded DNA products.
Affiliation(s)
- Martin Wilkinson
  - Section of Structural Biology, Department of Infectious Disease, Faculty of Medicine, Imperial College London, London, United Kingdom
- Oliver J Wilkinson
  - DNA:protein Interactions Unit, School of Biochemistry, University of Bristol, Bristol, United Kingdom
- Connie Feyerherm
  - DNA:protein Interactions Unit, School of Biochemistry, University of Bristol, Bristol, United Kingdom
- Emma E Fletcher
  - DNA:protein Interactions Unit, School of Biochemistry, University of Bristol, Bristol, United Kingdom
- Dale B Wigley
  - Section of Structural Biology, Department of Infectious Disease, Faculty of Medicine, Imperial College London, London, United Kingdom
- Mark S Dillingham
  - DNA:protein Interactions Unit, School of Biochemistry, University of Bristol, Bristol, United Kingdom
3
Subramaniam S, Smith GR. RecBCD enzyme and Chi recombination hotspots as determinants of self vs. non-self: Myths and mechanisms. Advances in Genetics 2022; 109:1-37. [PMID: 36334915 PMCID: PMC10047805 DOI: 10.1016/bs.adgen.2022.06.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 01/27/2023]
Abstract
Bacteria face a challenge when DNA enters their cells by transformation, mating, or phage infection. Should they treat this DNA as an invasive foreigner and destroy it, or consider it one of their own and potentially benefit from incorporating new genes or alleles to gain useful functions? It is frequently stated that the short nucleotide sequence Chi (5' GCTGGTGG 3'), a hotspot of homologous genetic recombination recognized by Escherichia coli's RecBCD helicase-nuclease, allows E. coli to distinguish its DNA (self) from any other DNA (non-self) and to destroy non-self DNA, and that Chi is "over-represented" in the E. coli genome. We show here that these latter statements (dogmas) are not supported by available evidence. We note Chi's wide-spread occurrence and activity in distantly related bacterial species and phages. We illustrate multiple, highly non-random features of the genomes of E. coli and coliphage P1 that account for Chi's high frequency and genomic position, leading us to propose that P1 selects for Chi's enhancement of recombination, whereas E. coli selects for the preferred codons in Chi. We discuss other, previously described mechanisms for self vs. non-self determination involving RecBCD and for RecBCD's destruction of DNA that cannot recombine, whether foreign or domestic, with or without Chi.
Affiliation(s)
- Gerald R Smith
  - Division of Basic Sciences, Fred Hutchinson Cancer Center, Seattle, WA, United States
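The question of whether the Chi octamer quoted in the abstract above (5' GCTGGTGG 3') is "over-represented" can be made concrete with a small counting sketch. The Python below is illustrative only and is not the analysis used by the authors: it counts Chi on both strands of a sequence and compares the count with the number expected under an independent-base model, which is an assumption introduced here. The function names and the toy sequence are hypothetical.

```python
from collections import Counter

# E. coli Chi octamer, written 5'->3', as quoted in the abstract.
CHI = "GCTGGTGG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_motif(seq: str, motif: str) -> int:
    """Count occurrences of motif on the given strand, allowing overlaps."""
    seq = seq.upper()
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def expected_count(seq: str, motif: str) -> float:
    """Expected occurrences if bases were drawn independently.

    Assumption made for this sketch: a zeroth-order model based only on
    the base composition of seq. Codon usage and other non-random genome
    features are ignored.
    """
    seq = seq.upper()
    n = len(seq)
    if n < len(motif):
        return 0.0
    freqs = Counter(seq)
    probability = 1.0
    for base in motif:
        probability *= freqs[base] / n
    return probability * (n - len(motif) + 1)


def chi_summary(seq: str, motif: str = CHI) -> dict:
    """Observed counts on each strand and the expected total for both."""
    rc = reverse_complement(motif)
    return {
        "forward": count_motif(seq, motif),
        "reverse": count_motif(seq, rc),
        "expected_both_strands": expected_count(seq, motif)
        + expected_count(seq, rc),
    }


if __name__ == "__main__":
    toy = "AAGCTGGTGGTTCCACCAGCAA"  # hypothetical toy sequence
    print(chi_summary(toy))
```

A ratio of observed to expected counts well above 1 under this simple model would not by itself show selection for Chi, since the abstract attributes much of Chi's frequency to preferred codons and other non-random genome features.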
4
Meunier A, Cornet F, Campos M. Bacterial cell proliferation: from molecules to cells. FEMS Microbiol Rev 2021; 45:5912836. [PMID: 32990752 PMCID: PMC7794046 DOI: 10.1093/femsre/fuaa046] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 02/11/2020] [Accepted: 09/10/2020] [Indexed: 12/11/2022]
Abstract
Bacterial cell proliferation is highly efficient, both because bacteria grow fast and multiply with a low failure rate. This efficiency is underpinned by the robustness of the cell cycle and its synchronization with cell growth and cytokinesis. Recent advances in bacterial cell biology brought about by single-cell physiology in microfluidic chambers suggest a series of simple phenomenological models at the cellular scale, coupling cell size and growth with the cell cycle. We contrast the apparent simplicity of these mechanisms based on the addition of a constant size between cell cycle events (e.g. two consecutive initiations of DNA replication or cell division) with the complexity of the underlying regulatory networks. Beyond the paradigm of cell cycle checkpoints, the coordination between the DNA and division cycles and cell growth is largely mediated by a wealth of other mechanisms. We propose our perspective on these mechanisms, through the prism of the known crosstalk between DNA replication and segregation, cell division and cell growth or size. We argue that the precise knowledge of these molecular mechanisms is critical to integrate the diverse layers of controls at different time and space scales into synthetic and verifiable models.
Affiliation(s)
- Alix Meunier
  - Centre de Biologie Intégrative de Toulouse (CBI Toulouse), Laboratoire de Microbiologie et Génétique Moléculaires (LMGM), Université de Toulouse, UPS, CNRS, IBCG, 165 rue Marianne Grunberg-Manago, 31062 Toulouse, France
- François Cornet
  - Centre de Biologie Intégrative de Toulouse (CBI Toulouse), Laboratoire de Microbiologie et Génétique Moléculaires (LMGM), Université de Toulouse, UPS, CNRS, IBCG, 165 rue Marianne Grunberg-Manago, 31062 Toulouse, France
- Manuel Campos
  - Centre de Biologie Intégrative de Toulouse (CBI Toulouse), Laboratoire de Microbiologie et Génétique Moléculaires (LMGM), Université de Toulouse, UPS, CNRS, IBCG, 165 rue Marianne Grunberg-Manago, 31062 Toulouse, France
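The "addition of a constant size between cell cycle events" mentioned in the abstract above can be illustrated with a toy recurrence. The sketch below is a simplification introduced here and is not taken from the review: it assumes the constant increment is added between birth and division, that division is perfectly symmetric, and that there is no noise. Under those assumptions the birth size follows s(n+1) = (s(n) + Δ) / 2, so any deviation from Δ halves every generation. The function name and parameter values are hypothetical.

```python
from __future__ import annotations


def simulate_adder(birth_size: float, added_size: float, generations: int) -> list[float]:
    """Return birth sizes over successive generations for a noise-free adder.

    Assumptions of this sketch: each cell adds the constant `added_size`
    between birth and division, then divides exactly in half.
    """
    sizes = [birth_size]
    for _ in range(generations):
        division_size = sizes[-1] + added_size  # constant increment per cycle
        sizes.append(division_size / 2)         # symmetric division
    return sizes


if __name__ == "__main__":
    # A cell born four times larger than the increment relaxes toward it.
    print(simulate_adder(birth_size=4.0, added_size=1.0, generations=3))
    # → [4.0, 2.5, 1.75, 1.375]
```

The same recurrence started from a birth size below the increment converges from the other side, which is why this kind of rule corrects size deviations without any explicit size sensing.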
5
Abstract
Since the nucleoid was isolated from bacteria in the 1970s, two fundamental questions emerged and are still in the spotlight: how bacteria organize their chromosomes to fit inside the cell and how nucleoid organization enables essential biological processes. During the last decades, knowledge of bacterial chromosome organization has advanced considerably, and today, such chromosomes are considered to be highly organized and dynamic structures that are shaped by multiple factors in a multiscale manner. Here we review not only the classical well-known factors involved in chromosome organization but also novel components that have recently been shown to dynamically shape the 3D structuring of the bacterial genome. We focus on the different functional elements that control short-range organization and describe how they collaborate in the establishment of the higher-order folding and disposition of the chromosome. Recent advances have opened new avenues for a deeper understanding of the principles and mechanisms of chromosome organization in bacteria. Expected final online publication date for the Annual Review of Microbiology, Volume 75 is October 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Affiliation(s)
- Virginia S Lioy
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Ivan Junier
  - Université Grenoble Alpes, CNRS, TIMC-IMAG, 38000 Grenoble, France
- Frédéric Boccard
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
6
Fallon AM. DNA recombination and repair in Wolbachia: RecA and related proteins. Mol Genet Genomics 2021; 296:437-456. [PMID: 33507381 DOI: 10.1007/s00438-020-01760-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/19/2020] [Accepted: 12/23/2020] [Indexed: 12/15/2022]
Abstract
Wolbachia is an obligate intracellular bacterium that has undergone extensive genomic streamlining in its arthropod and nematode hosts. Because the gene encoding the bacterial DNA recombination/repair protein RecA is not essential in Escherichia coli, abundant expression of this protein in a mosquito cell line persistently infected with Wolbachia strain wStri was unexpected. However, RecA's role in the lytic cycle of bacteriophage lambda provides an explanation for retention of recA in strains known to encode lambda-like WO prophages. To examine DNA recombination/repair capacities in Wolbachia, a systematic examination of RecA and related proteins in complete or nearly complete Wolbachia genomes from supergroups A, B, C, D, E, F, J and S was undertaken. Genes encoding proteins including RecA, RecF, RecO, RecR, RecG and Holliday junction resolvases RuvA, RuvB and RuvC are uniformly absent from Wolbachia in supergroup C and have reduced representation in supergroups D and J, suggesting that recombination and repair activities are compromised in nematode-associated Wolbachia, relative to strains that infect arthropods. An exception is filarial Wolbachia strain wMhie, assigned to supergroup F, which occurs in a nematode host from a poikilothermic lizard. Genes encoding LexA and error-prone polymerases are absent from all Wolbachia genomes, suggesting that the SOS functions induced by RecA-mediated activation of LexA do not occur, despite retention of genes encoding a few proteins that respond to LexA induction in E. coli. Three independent E. coli accessions converge on a single Wolbachia UvrD helicase, which interacts with mismatch repair proteins MutS and MutL, encoded in nearly all Wolbachia genomes. With the exception of MutL, which has been mapped to a eukaryotic association module in Phage WO, proteins involved in recombination/repair are uniformly represented by single protein annotations. 
Putative phage-encoded MutL proteins are restricted to Wolbachia supergroups A and B and show higher amino acid identity than chromosomally encoded MutL orthologs. This analysis underscores differences between nematode and arthropod-associated Wolbachia and describes aspects of DNA metabolism that potentially impact development of procedures for transformation and genetic manipulation of Wolbachia.
Affiliation(s)
- Ann M Fallon
  - Department of Entomology, University of Minnesota, 1980 Folwell Ave, St. Paul, MN, 55108, USA
7
Buton A, Bobay LM. Evolution of Chi motifs in Proteobacteria. G3: Genes, Genomes, Genetics 2021; 11:6064151. [PMID: 33561247 PMCID: PMC8022716 DOI: 10.1093/g3journal/jkaa054] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/11/2020] [Accepted: 11/01/2020] [Indexed: 11/18/2022]
Abstract
Homologous recombination is a key pathway found in nearly all bacterial taxa. The recombination complex not only allows bacteria to repair DNA double-strand breaks but also promotes adaption through the exchange of DNA between cells. In Proteobacteria, this process is mediated by the RecBCD complex, which relies on the recognition of a DNA motif named Chi to initiate recombination. The Chi motif has been characterized in Escherichia coli and analogous sequences have been found in several other species from diverse families, suggesting that this mode of action is widespread across bacteria. However, the sequences of Chi-like motifs are known for only five bacterial species: E. coli, Haemophilus influenzae, Bacillus subtilis, Lactococcus lactis, and Staphylococcus aureus. In this study, we detected putative Chi motifs in a large dataset of Proteobacteria and identified four additional motifs sharing high sequence similarity and similar properties to the Chi motif of E. coli in 85 species of Proteobacteria. Most Chi motifs were detected in Enterobacteriaceae and this motif appears well conserved in this family. However, we did not detect Chi motifs for the majority of Proteobacteria, suggesting that different motifs are used in these species. Altogether these results substantially expand our knowledge on the evolution of Chi motifs and on the recombination process in bacteria.
Affiliation(s)
- Angélique Buton
  - Department of Biology, University of North Carolina Greensboro, Greensboro, NC 27402, USA
- Louis-Marie Bobay
  - Department of Biology, University of North Carolina Greensboro, Greensboro, NC 27402, USA
8
Kuzmenko A, Oguienko A, Esyunina D, Yudin D, Petrova M, Kudinova A, Maslova O, Ninova M, Ryazansky S, Leach D, Aravin AA, Kulbachinskiy A. DNA targeting and interference by a bacterial Argonaute nuclease. Nature 2020; 587:632-637. [PMID: 32731256 DOI: 10.1038/s41586-020-2605-1] [Citation(s) in RCA: 89] [Impact Index Per Article: 22.3] [Received: 03/05/2020] [Accepted: 07/24/2020] [Indexed: 12/21/2022]
Abstract
Members of the conserved Argonaute protein family use small RNA guides to locate their mRNA targets and regulate gene expression and suppress mobile genetic elements in eukaryotes. Argonautes are also present in many bacterial and archaeal species. Unlike eukaryotic proteins, several prokaryotic Argonaute proteins use small DNA guides to cleave DNA, a process known as DNA interference. However, the natural functions and targets of DNA interference are poorly understood, and the mechanisms of DNA guide generation and target discrimination remain unknown. Here we analyse the activity of a bacterial Argonaute nuclease from Clostridium butyricum (CbAgo) in vivo. We show that CbAgo targets multicopy genetic elements and suppresses the propagation of plasmids and infection by phages. CbAgo induces DNA interference between homologous sequences and triggers DNA degradation at double-strand breaks in the target DNA. The loading of CbAgo with locus-specific small DNA guides depends on both its intrinsic endonuclease activity and the cellular double-strand break repair machinery. A similar interaction was reported for the acquisition of new spacers during CRISPR adaptation, and prokaryotic genomes that encode Ago nucleases are enriched in CRISPR-Cas systems. These results identify molecular mechanisms that generate guides for DNA interference and suggest that the recognition of foreign nucleic acids by prokaryotic defence systems involves common principles.
Affiliation(s)
- Anton Kuzmenko
  - Institute of Molecular Genetics, Russian Academy of Sciences, Moscow, Russia
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Anastasiya Oguienko
  - Institute of Molecular Genetics, Russian Academy of Sciences, Moscow, Russia
- Daria Esyunina
  - Institute of Molecular Genetics, Russian Academy of Sciences, Moscow, Russia
- Denis Yudin
  - Institute of Molecular Genetics, Russian Academy of Sciences, Moscow, Russia
  - Department of Biology, Institute of Molecular Biology and Biophysics, ETH Zurich, Zurich, Switzerland
- Mayya Petrova
  - Institute of Molecular Genetics, Russian Academy of Sciences, Moscow, Russia
- Alina Kudinova
  - Institute of Molecular Genetics, Russian Academy of Sciences, Moscow, Russia
- Olga Maslova
  - Institute of Molecular Genetics, Russian Academy of Sciences, Moscow, Russia
- Maria Ninova
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Sergei Ryazansky
  - Institute of Molecular Genetics, Russian Academy of Sciences, Moscow, Russia
- David Leach
  - Institute of Cell Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Alexei A Aravin
  - Institute of Molecular Genetics, Russian Academy of Sciences, Moscow, Russia
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
9
Wimmer F, Beisel CL. CRISPR-Cas Systems and the Paradox of Self-Targeting Spacers. Front Microbiol 2020; 10:3078. [PMID: 32038537 PMCID: PMC6990116 DOI: 10.3389/fmicb.2019.03078] [Citation(s) in RCA: 49] [Impact Index Per Article: 12.3] [Received: 09/01/2019] [Accepted: 12/19/2019] [Indexed: 12/26/2022]
Abstract
CRISPR-Cas immune systems in bacteria and archaea record prior infections as spacers within each system’s CRISPR arrays. Spacers are normally derived from invasive genetic material and direct the immune system to complementary targets as part of future infections. However, not all spacers appear to be derived from foreign genetic material and instead can originate from the host genome. Their presence poses a paradox, as self-targeting spacers would be expected to induce an autoimmune response and cell death. In this review, we discuss the known frequency of self-targeting spacers in natural CRISPR-Cas systems, how these spacers can be incorporated into CRISPR arrays, and how the host can evade lethal attack. We also discuss how self-targeting spacers can become the basis for alternative functions performed by CRISPR-Cas systems that extend beyond adaptive immunity. Overall, the acquisition of genome-targeting spacers poses a substantial risk but can aid in the host’s evolution and potentially lead to or support new functionalities.
Affiliation(s)
- Franziska Wimmer
  - Helmholtz Institute for RNA-Based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, Germany
- Chase L Beisel
  - Helmholtz Institute for RNA-Based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, Germany
  - Medical Faculty, University of Würzburg, Würzburg, Germany
10
A conformational switch in response to Chi converts RecBCD from phage destruction to DNA repair. Nat Struct Mol Biol 2020; 27:71-77. [PMID: 31907455 PMCID: PMC7000243 DOI: 10.1038/s41594-019-0355-2] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 10/17/2019] [Accepted: 11/21/2019] [Indexed: 12/04/2022]
Abstract
The RecBCD complex plays key roles in phage DNA degradation, CRISPR array acquisition (adaptation) and host DNA repair. The switch between these roles is regulated by a DNA sequence called Chi. We report cryo-EM structures of the Escherichia coli RecBCD complex bound to several different DNA forks containing a Chi sequence, including one in which Chi is recognised and others in which it is not. The Chi-recognised structure shows conformational changes in regions of the protein that contact Chi and reveals a tortuous path taken by the DNA. Sequence specificity arises from interactions with both the RecC subunit and the sequence itself. These structures provide molecular details for how Chi is recognised and insights into the changes that occur in response to Chi binding that switch RecBCD from bacteriophage destruction and CRISPR spacer acquisition, to constructive host DNA repair.
11
Abstract
Microbial populations exchange genetic material through a process called homologous recombination. Although this process has been studied in particular organisms, we lack an understanding of its differential impact over the genome and across microbes with different life-styles. We used a common analytical framework to assess this process in a representative set of microorganisms. Our results uncovered important trends. First, microbes with different lifestyles are differentially impacted, with endosymbionts and obligate pathogens being those less prone to undergo this process. Second, certain genetic elements such as restriction-modification systems seem to be associated with higher rates of recombination. Most importantly, recombined genomes show the footprints of natural selection in which recombined regions preferentially contain genes that can be related to specific ecological adaptations. Taken together, our results clarify the relative contributions of factors modulating homologous recombination and show evidence for a clear role of this process in shaping microbial genomes and driving ecological adaptations. Homologous recombination (HR) enables the exchange of genetic material between and within species. Recent studies suggest that this process plays a major role in the microevolution of microbial genomes, contributing to core genome homogenization and to the maintenance of cohesive population structures. However, we still have a very poor understanding of the possible adaptive roles of intraspecific HR and of the factors that determine its differential impact across clades and lifestyles. Here we used a unified methodological framework to assess HR in 338 complete genomes from 54 phylogenetically diverse and representative prokaryotic species, encompassing different lifestyles and a broad phylogenetic distribution. 
Our results indicate that lifestyle and presence of restriction-modification (RM) machineries are among the main factors shaping HR patterns, with symbionts and intracellular pathogens having the lowest HR levels. Similarly, the size of exchanged genomic fragments correlated with the presence of RM and competence machineries. Finally, genes exchanged by HR showed functional enrichments which could be related to adaptations to different environments and ecological strategies. Taken together, our results clarify the factors underlying HR impact and suggest important adaptive roles of genes exchanged through this mechanism. Our results also revealed that the extent of genetic exchange correlated with lifestyle and some genomic features. Moreover, the genes in exchanged regions were enriched for functions that reflected specific adaptations, supporting identification of HR as one of the main evolutionary mechanisms shaping prokaryotic core genomes.
12
Charubin K, Bennett RK, Fast AG, Papoutsakis ET. Engineering Clostridium organisms as microbial cell-factories: challenges & opportunities. Metab Eng 2018; 50:173-191. [DOI: 10.1016/j.ymben.2018.07.012] [Citation(s) in RCA: 46] [Impact Index Per Article: 7.7] [Received: 06/08/2018] [Revised: 07/18/2018] [Accepted: 07/19/2018] [Indexed: 11/25/2022]
13
Bacterial RecA Protein Promotes Adenoviral Recombination during In Vitro Infection. mSphere 2018; 3:3/3/e00105-18. [PMID: 29925671 PMCID: PMC6010623 DOI: 10.1128/msphere.00105-18] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 02/24/2018] [Accepted: 06/03/2018] [Indexed: 12/30/2022]
Abstract
Adenoviruses are common human mucosal pathogens of the gastrointestinal, respiratory, and genitourinary tracts and ocular surface. Here, we report finding Chi-like sequences in adenovirus recombination hot spots. Adenovirus coinfection in the presence of bacterial RecA protein facilitated homologous recombination between viruses. Genetic recombination led to evolution of an important external feature on the adenoviral capsid, namely, the penton base protein hypervariable loop 2, which contains the arginine-glycine-aspartic acid motif critical to viral internalization. We speculate that free Rec proteins present in gastrointestinal secretions upon bacterial cell death facilitate the evolution of human adenoviruses through homologous recombination, an example of viral commensalism and the complexity of virus-host interactions, including regional microbiota. Adenovirus infections in humans are common and sometimes lethal. Adenovirus-derived vectors are also commonly chosen for gene therapy in human clinical trials. We have shown in previous work that homologous recombination between adenoviral genomes of human adenovirus species D (HAdV-D), the largest and fastest growing HAdV species, is responsible for the rapid evolution of this species. Because adenovirus infection initiates in mucosal epithelia, particularly at the gastrointestinal, respiratory, genitourinary, and ocular surfaces, we sought to determine a possible role for mucosal microbiota in adenovirus genome diversity. By analysis of known recombination hot spots across 38 human adenovirus genomes in species D (HAdV-D), we identified nucleotide sequence motifs similar to bacterial Chi sequences, which facilitate homologous recombination in the presence of bacterial Rec enzymes. These motifs, referred to here as ChiAD, were identified immediately 5′ to the sequence encoding penton base hypervariable loop 2, which expresses the arginine-glycine-aspartate moiety critical to adenoviral cellular entry. 
Coinfection with two HAdV-Ds in the presence of an Escherichia coli lysate increased recombination; this was blocked in a RecA mutant strain, E. coli DH5α, or upon RecA depletion. Recombination increased in the presence of E. coli lysate despite a general reduction in viral replication. RecA colocalized with viral DNA in HAdV-D-infected cell nuclei and was shown to bind specifically to ChiAD sequences. These results indicate that adenoviruses may repurpose bacterial recombination machinery, a sharing of evolutionary mechanisms across a diverse microbiota, and unique example of viral commensalism.
14
Pavankumar TL, Sinha AK, Ray MK. Biochemical characterization of RecBCD enzyme from an Antarctic Pseudomonas species and identification of its cognate Chi (χ) sequence. PLoS One 2018; 13:e0197476. [PMID: 29775464 PMCID: PMC5959072 DOI: 10.1371/journal.pone.0197476] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 02/16/2018] [Accepted: 03/13/2018] [Indexed: 11/18/2022]
Abstract
Pseudomonas syringae Lz4W RecBCD enzyme, RecBCDPs, is a trimeric protein complex composed of RecC, RecB, and RecD subunits. RecBCD enzyme is essential for P. syringae growth at low temperature, and it protects cells from low temperature induced replication arrest. In this study, we show that the RecBCDPs enzyme displays distinct biochemical behaviors. Unlike E. coli RecBCD enzyme, the RecD subunit is indispensable for RecBCDPs function. The RecD motor activity is essential for the production of Chi-like fragments in P. syringae, highlighting a distinct role for the P. syringae RecD subunit in DNA repair and recombination. Here, we demonstrate that the RecBCDPs enzyme recognizes a unique octameric DNA sequence, 5′-GCTGGCGC-3′ (ChiPs), that attenuates nuclease activity of the enzyme when it enters dsDNA from the 3′-end. We propose that the reduced translocation activities manifested by motor-defective mutants cause cold sensitivity in P. syringae, emphasizing the importance of DNA processing and recombination functions in rescuing low temperature induced replication fork arrest.
Affiliation(s)
- Theetha L. Pavankumar
- CSIR-Centre for Cellular and Molecular Biology, Hyderabad, India
- * E-mail: (TLP); (MKR)
| | - Anurag K. Sinha
- CSIR-Centre for Cellular and Molecular Biology, Hyderabad, India
| | - Malay K. Ray
- CSIR-Centre for Cellular and Molecular Biology, Hyderabad, India
- * E-mail: (TLP); (MKR)
|
15
|
Badrinarayanan A, Le TBK, Spille JH, Cisse II, Laub MT. Global analysis of double-strand break processing reveals in vivo properties of the helicase-nuclease complex AddAB. PLoS Genet 2017; 13:e1006783. [PMID: 28489851 PMCID: PMC5443536 DOI: 10.1371/journal.pgen.1006783] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 10/10/2016] [Revised: 05/24/2017] [Accepted: 04/26/2017] [Indexed: 12/03/2022] Open
Abstract
In bacteria, double-strand break (DSB) repair via homologous recombination is thought to be initiated through the bi-directional degradation and resection of DNA ends by a helicase-nuclease complex such as AddAB. The activity of AddAB has been well-studied in vitro, with translocation speeds of 400–2000 bp/s on linear DNA suggesting that a large section of DNA around a break site is processed for repair. However, the translocation rate and activity of AddAB in vivo are not known, and how AddAB is regulated to prevent excessive DNA degradation around a break site is unclear. To examine the functions and mechanistic regulation of AddAB inside bacterial cells, we developed a next-generation sequencing-based approach to assay DNA processing after a site-specific DSB was introduced on the chromosome of Caulobacter crescentus. Using this assay we determined the in vivo rates of DSB processing by AddAB and found that putative chi sites attenuate processing in a RecA-dependent manner. This RecA-mediated regulation of AddAB prevents the excessive loss of DNA around a break site, limiting the effects of DSB processing on transcription. In sum, our results, taken together with prior studies, support a mechanism for regulating AddAB that couples two key events of DSB repair–the attenuation of DNA-end processing and the initiation of homology search by RecA–thereby helping to ensure that genomic integrity is maintained during DSB repair.
Double-strand breaks (DSBs) are a threat to genome integrity and are faithfully repaired via homologous recombination. The initial processing of DSB ends that prepares them for recombination has been well-studied in vitro, but is less well characterized in vivo. We describe a deep sequencing-based assay for assessing the early steps of DSB processing in bacterial cells by the helicase-nuclease complex AddAB. We find that a combination of chi site recognition and RecA loading is required to attenuate AddAB activity. In the absence of RecA, the chromosome is excessively degraded with a concomitant loss in transcription. Our results, along with prior studies, support a model for how chi recognition and RecA together regulate AddAB to maintain genome integrity and facilitate recombination.
Affiliation(s)
- Anjana Badrinarayanan
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, United States of America
- National Centre for Biological Sciences (NCBS), Tata Institute of Fundamental Research, Bangalore, India
| | - Tung B. K. Le
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, United States of America
- Department of Molecular Microbiology, John Innes Centre, Norwich, United Kingdom
| | - Jan-Hendrik Spille
- Department of Physics, Massachusetts Institute of Technology, Cambridge, MA, United States of America
| | - Ibrahim I. Cisse
- Department of Physics, Massachusetts Institute of Technology, Cambridge, MA, United States of America
| | - Michael T. Laub
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, United States of America
- Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, MA, United States of America
|
16
|
On the First k Moments of the Random Count of a Pattern in a Multistate Sequence Generated by a Markov Source. J Appl Probab 2016. [DOI: 10.1017/s0021900200007403] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/07/2022]
Abstract
In this paper we develop an explicit formula that allows us to compute the first k moments of the random count of a pattern in a multistate sequence generated by a Markov source. We derive efficient algorithms that allow us to deal with any pattern (low or high complexity) in any Markov model (homogeneous or not). We then apply these results to the distribution of DNA patterns in genomic sequences, and we show that moment-based developments (namely Edgeworth's expansion and Gram-Charlier type-B series) allow us to improve the reliability of common asymptotic approximations, such as Gaussian or Poisson approximations.
|
17
|
Nuel G. On the First k Moments of the Random Count of a Pattern in a Multistate Sequence Generated by a Markov Source. J Appl Probab 2016. [DOI: 10.1239/jap/1294170523] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]
Abstract
In this paper we develop an explicit formula that allows us to compute the first k moments of the random count of a pattern in a multistate sequence generated by a Markov source. We derive efficient algorithms that allow us to deal with any pattern (low or high complexity) in any Markov model (homogeneous or not). We then apply these results to the distribution of DNA patterns in genomic sequences, and we show that moment-based developments (namely Edgeworth's expansion and Gram-Charlier type-B series) allow us to improve the reliability of common asymptotic approximations, such as Gaussian or Poisson approximations.
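As a rough illustration of the simplest quantity mentioned in this abstract, the sketch below computes only the first moment (the expected count) of a word in a sequence drawn from a stationary, homogeneous, first-order Markov chain. It is not the algorithm of the paper, which covers the first k moments and heterogeneous models; the function names, the dictionary-based transition matrix, and the power-iteration helper are assumptions made for this example.

```python
# Minimal sketch (assumption: stationary, homogeneous, first-order Markov chain).
# Names such as stationary_distribution and expected_count are hypothetical.

def stationary_distribution(P, states, iters=1000):
    """Approximate the stationary distribution of P by power iteration."""
    pi = {s: 1.0 / len(states) for s in states}
    for _ in range(iters):
        pi = {t: sum(pi[s] * P[s][t] for s in states) for t in states}
    return pi

def expected_count(word, n, P, pi):
    """Expected number of (possibly overlapping) occurrences of word
    in a sequence of length n started from the stationary distribution."""
    if len(word) > n:
        return 0.0
    # Probability that the word starts at one given position.
    p = pi[word[0]]
    for a, b in zip(word, word[1:]):
        p *= P[a][b]
    # By linearity of expectation, sum over the n - len(word) + 1 start positions.
    return (n - len(word) + 1) * p

if __name__ == "__main__":
    states = "ACGT"
    # Uniform transitions reduce the chain to independent, equiprobable letters.
    P = {s: {t: 0.25 for t in states} for s in states}
    pi = stationary_distribution(P, states)
    print(expected_count("GCTGGTGG", 4**8 + 7, P, pi))
```

Higher moments depend on how the word overlaps with itself, which is why they need the more elaborate machinery the abstract refers to.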
|
18
|
Taylor AF, Amundsen SK, Smith GR. Unexpected DNA context-dependence identifies a new determinant of Chi recombination hotspots. Nucleic Acids Res 2016; 44:8216-28. [PMID: 27330137 PMCID: PMC5041463 DOI: 10.1093/nar/gkw541] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 04/06/2016] [Accepted: 06/03/2016] [Indexed: 11/23/2022] Open
Abstract
Homologous recombination occurs especially frequently near special chromosomal sites called hotspots. In Escherichia coli, Chi hotspots control RecBCD enzyme, a protein machine essential for the major pathway of DNA break-repair and recombination. RecBCD generates recombinogenic single-stranded DNA ends by unwinding DNA and cutting it a few nucleotides to the 3′ side of 5′ GCTGGTGG 3′, the sequence historically equated with Chi. To test if sequence context affects Chi activity, we deep-sequenced the products of a DNA library containing 10 random base-pairs on each side of the Chi sequence and cut by purified RecBCD. We found strongly enhanced cutting at Chi with certain preferred sequences, such as A or G at nucleotides 4–7, on the 3′ flank of the Chi octamer. These sequences also strongly increased Chi hotspot activity in E. coli cells. Our combined enzymatic and genetic results redefine the Chi hotspot sequence, implicate the nuclease domain in Chi recognition, indicate that nicking of one strand at Chi is RecBCD's biologically important reaction in living cells, and enable more precise analysis of Chi's role in recombination and genome evolution.
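The flank analysis described in this abstract can be pictured with a small sketch: locate each Chi octamer (5′ GCTGGTGG 3′) in a sequence, extract the bases on its 3′ side, and check whether positions 4–7 of that flank are all A or G. This is only an illustration, not the authors' pipeline. The function names are hypothetical, only the given strand is scanned, flank numbering is assumed to start at the first base after the octamer, and "A or G at nucleotides 4–7" is read here as a purine at every one of those positions, which is one possible interpretation.

```python
# Minimal sketch; chi_flanks and purine_at_4_to_7 are hypothetical helper names.
CHI = "GCTGGTGG"  # Chi octamer as given in the abstract

def chi_flanks(seq, flank_len=10):
    """Return (start_index, 3'-flank) for every Chi occurrence on the given strand."""
    seq = seq.upper()
    hits = []
    start = seq.find(CHI)
    while start != -1:
        end = start + len(CHI)
        hits.append((start, seq[end:end + flank_len]))
        start = seq.find(CHI, start + 1)
    return hits

def purine_at_4_to_7(flank):
    """True if flank positions 4-7 (1-based, assumed numbering) are all A or G."""
    window = flank[3:7]
    return len(window) == 4 and all(base in "AG" for base in window)

if __name__ == "__main__":
    example = "TTT" + CHI + "CCCAGGACCC"
    for pos, flank in chi_flanks(example):
        print(pos, flank, purine_at_4_to_7(flank))
```

A fuller analysis would also scan the reverse complement, since Chi is recognized in an orientation-dependent way.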
Affiliation(s)
- Andrew F Taylor
- Division of Basic Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, Seattle, WA 98109, USA
| | - Susan K Amundsen
- Division of Basic Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, Seattle, WA 98109, USA
| | - Gerald R Smith
- Division of Basic Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, Seattle, WA 98109, USA
|
19
|
Marraffini LA. CRISPR-Cas immunity in prokaryotes. Nature 2015; 526:55-61. [PMID: 26432244 DOI: 10.1038/nature15386] [Citation(s) in RCA: 513] [Impact Index Per Article: 57.0] [Received: 05/29/2015] [Accepted: 08/07/2015] [Indexed: 12/12/2022]
Abstract
Prokaryotic organisms are threatened by a large array of viruses and have developed numerous defence strategies. Among these, only clustered, regularly interspaced short palindromic repeat (CRISPR)-Cas systems provide adaptive immunity against foreign elements. Upon viral injection, a small sequence of the viral genome, known as a spacer, is integrated into the CRISPR locus to immunize the host cell. Spacers are transcribed into small RNA guides that direct the cleavage of the viral DNA by Cas nucleases. Immunization through spacer acquisition enables a unique form of evolution whereby a population not only rapidly acquires resistance to its predators but also passes this resistance mechanism vertically to its progeny.
Affiliation(s)
- Luciano A Marraffini
- Laboratory of Bacteriology, The Rockefeller University, 1230 York Avenue, New York, New York 10065, USA
|
20
|
Elhai J. Highly Iterated Palindromic Sequences (HIPs) and Their Relationship to DNA Methyltransferases. Life (Basel) 2015; 5:921-48. [PMID: 25789551 PMCID: PMC4390886 DOI: 10.3390/life5010921] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 01/01/2015] [Revised: 02/24/2015] [Accepted: 03/09/2015] [Indexed: 11/16/2022] Open
Abstract
The sequence GCGATCGC (Highly Iterated Palindrome, HIP1) is commonly found at high frequency in cyanobacterial genomes. An important clue to its function may be the presence of two orphan DNA methyltransferases that recognize internal sequences GATC and CGATCG. An examination of genomes from 97 cyanobacteria, both free-living and obligate symbionts, showed that there are exceptional cases in which HIP1 is at a low frequency or nearly absent. In some of these cases, it appears to have been replaced by a different GC-rich palindromic sequence, alternate HIPs. When HIP1 is at a high frequency, GATC- and CGATCG-specific methyltransferases are generally present in the genome. When an alternate HIP is at high frequency, a methyltransferase specific for that sequence is present. The pattern of 1-nt deviations from HIP1 sequences is biased towards the first and last nucleotides, i.e., those that distinguish CGATCG from HIP1. Taken together, the results point to a role of DNA methylation in the creation or functioning of HIP sites. A model is presented that postulates the existence of a GmeC-dependent mismatch repair system whose activity creates and maintains HIP sequences.
Affiliation(s)
- Jeff Elhai
- Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, VA 23284, USA.
|
21
|
Bobay LM, Touchon M, Rocha EPC. Manipulating or superseding host recombination functions: a dilemma that shapes phage evolvability. PLoS Genet 2013; 9:e1003825. [PMID: 24086157 PMCID: PMC3784561 DOI: 10.1371/journal.pgen.1003825] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Received: 05/21/2013] [Accepted: 08/08/2013] [Indexed: 11/18/2022] Open
Abstract
Phages, like many parasites, tend to have small genomes and may encode autonomous functions or manipulate those of their hosts. Recombination functions are essential for phage replication and diversification. They are also nearly ubiquitous in bacteria. The E. coli genome encodes many copies of an octamer (Chi) motif that upon recognition by RecBCD favors repair of double strand breaks by homologous recombination. This might allow self from non-self discrimination because RecBCD degrades DNA lacking Chi. Bacteriophage Lambda, an E. coli parasite, lacks Chi motifs, but escapes degradation by inhibiting RecBCD and encoding its own autonomous recombination machinery. We found that only half of 275 lambdoid genomes encode recombinases, the remaining relying on the host's machinery. Unexpectedly, we found that some lambdoid phages contain extremely high numbers of Chi motifs concentrated between the phage origin of replication and the packaging site. This suggests a tight association between replication, packaging and RecBCD-mediated recombination in these phages. Indeed, phages lacking recombinases strongly over-represent Chi motifs. Conversely, phages encoding recombinases and inhibiting host recombination machinery select for the absence of Chi motifs. Host and phage recombinases use different mechanisms and the latter are more tolerant to sequence divergence. Accordingly, we show that phages encoding their own recombination machinery have more mosaic genomes resulting from recent recombination events and have more diverse gene repertoires, i.e. larger pan genomes. We discuss the costs and benefits of superseding or manipulating host recombination functions and how this decision shapes phage genome structure and evolvability.
Bacterial viruses, called bacteriophages, are extremely abundant in the biosphere. They have key roles in the regulation of bacterial populations and in the diversification of bacterial genomes. Among these viruses, lambdoid phages are very abundant in enterobacteria and exchange genetic material very frequently. This latter process is thought to increase phage diversity and therefore facilitate adaptation to hosts. Recombination is also essential for the replication of many lambdoid phages. Lambdoids have been described to encode their own recombination genes and inhibit their hosts'. In this study, we show that lambdoids are split regarding their capacity to encode autonomous recombination functions and that this affects the abundance of recombination-related sequence motifs. Half of the phages encode an autonomous system and inhibit their hosts'. The trade-off between superseding and manipulating the hosts' recombination functions has important consequences. The phages encoding autonomous recombination functions have more diverse gene repertoires and recombine more frequently. Viruses, like many other parasites, have small genomes and depend on their hosts for several housekeeping functions. Hence, they often face trade-offs between supersession and manipulation of molecular machineries. Our results suggest these trade-offs may shape viral gene repertoires, their sequence composition and even influence their evolvability.
Affiliation(s)
- Louis-Marie Bobay
- Microbial Evolutionary Genomics, Institut Pasteur, Paris, France
- CNRS, UMR3525, Paris, France
- Université Pierre et Marie Curie, Cellule Pasteur UPMC, Paris, France
| | - Marie Touchon
- Microbial Evolutionary Genomics, Institut Pasteur, Paris, France
- CNRS, UMR3525, Paris, France
| | - Eduardo P. C. Rocha
- Microbial Evolutionary Genomics, Institut Pasteur, Paris, France
- CNRS, UMR3525, Paris, France
|
22
|
Nunvar J, Licha I, Schneider B. Evolution of REP diversity: a comparative study. BMC Genomics 2013; 14:385. [PMID: 23758774 PMCID: PMC3686654 DOI: 10.1186/1471-2164-14-385] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 02/13/2013] [Accepted: 06/03/2013] [Indexed: 12/05/2022] Open
Abstract
Background Repetitive extragenic palindromic elements (REPs) constitute a group of bacterial genomic repeats known for their high abundance and several roles in host cells' physiology. We analyzed the phylogenetic distribution of particular REP classes in genomic sequences of sixty-three bacterial strains belonging to the Pseudomonas fluorescens species complex and ten strains of Stenotrophomonas sp., in order to assess intraspecific REP diversity and to gain insight into long-term REP evolution. Results Based on proximity to RAYT (REP-associated tyrosine transposase) genes, twenty-two and thirteen unique REP classes were determined in fluorescent pseudomonads and stenotrophomonads, respectively. In stenotrophomonads, REP elements were typically found in tens or a few hundred copies per genome. REPs of fluorescent pseudomonads were generally more numerous, occurring in hundreds or even over a thousand perfect copies of a particular REP class per genome. REP sequences showed highly heterogeneous distribution. The abundances of REP classes roughly followed host strains' phylogeny, differing markedly among individual clades. High abundances of particular REP classes appeared to depend on the presence of the cognate RAYT gene, and deviations from this state could be attributed to recent or ancient mutations of rayt-flanking REPs, or RAYT loss. RAYTs of both studied bacterial groups are monophyletic, and their cognate REPs show species-specific characteristics, suggesting shared evolutionary history of REPs, RAYTs and their hosts. Conclusions The results of our large-scale analysis show that REP elements constitute intriguingly dynamic components of genomes of fluorescent pseudomonads and stenotrophomonads, and indicate that REP diversification and proliferation are ongoing processes. High numbers of REPs have apparently been retained during the entire evolutionary time since the establishment of these two bacterial lineages, probably because of their beneficial effect on host long-term fitness. REP elements in these bacteria represent a suitable platform to study the interplay between repeated elements, their mobilizers and host bacterial cells.
Affiliation(s)
- Jaroslav Nunvar
- Department of Genetics and Microbiology, Faculty of Science, Charles University, Prague 2, Czech Republic.
|
23
|
Lopez-Vernaza MA, Leach DRF. WITHDRAWN: Symmetries and Asymmetries Associated with Non-Random Segregation of Sister DNA Strands in Escherichia coli. Semin Cell Dev Biol 2013:S1084-9521(13)00077-3. [PMID: 23692810 DOI: 10.1016/j.semcdb.2013.05.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2013] [Accepted: 05/06/2013] [Indexed: 11/19/2022]
Abstract
The Publisher regrets that this article is an accidental duplication of an article that has already been published, http://dx.doi.org/10.1016/j.semcdb.2013.05.010. The duplicate article has therefore been withdrawn.
Affiliation(s)
- Manuel A Lopez-Vernaza
- Institute of Cell Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, EH9 3JR, United Kingdom
|
24
|
Lopez-Vernaza MA, Leach DRF. Symmetries and asymmetries associated with non-random segregation of sister DNA strands in Escherichia coli. Semin Cell Dev Biol 2013; 24:610-7. [PMID: 23685127 DOI: 10.1016/j.semcdb.2013.05.010] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 11/19/2022]
Abstract
The successful inheritance of genetic information across generations is a complex process requiring replication of the genome and its faithful segregation into two daughter cells. At each replication cycle there is a risk that new DNA strands incorporate genetic changes caused by miscopying of parental information. By contrast the parental strands retain the original information. This raises the intriguing possibility that specific cell lineages might inherit "immortal" parental DNA strands via non-random segregation. If so, this requires an understanding of the mechanisms of non-random segregation. Here, we review several aspects of asymmetry in the very symmetrical cell, Escherichia coli, in the interest of exploring the potential basis for non-random segregation of leading- and lagging-strand replicated chromosome arms. These considerations lead us to propose a model for DNA replication that integrates chromosome segregation and genomic localisation with non-random strand segregation.
Affiliation(s)
- Manuel A Lopez-Vernaza
- Institute of Cell Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JR, United Kingdom
|
25
|
Demarre G, Galli E, Barre FX. The FtsK Family of DNA Pumps. Adv Exp Med Biol 2013; 767:245-62. [PMID: 23161015 DOI: 10.1007/978-1-4614-5037-5_12] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Indexed: 12/12/2022]
Abstract
Interest in proteins of the FtsK family initially arose from their implication in many primordial processes in which DNA needs to be transported from one cell compartment to another in eubacteria. In the first section of this chapter, we address a list of the cellular functions of the different members of the FtsK family that have been studied so far. Soon after their discovery, interest in the FtsK proteins spread because of their unique biochemical properties: most DNA transport systems rely on the assembly of complex multicomponent machines. In contrast, six FtsK proteins are sufficient to assemble into a fast and powerful DNA pump; the pump transports closed circular double stranded DNA molecules without any covalent-bond breakage or topological alteration; transport is oriented despite the intrinsic symmetrical nature of the double stranded DNA helix and can occur across cell membranes. The different activities required for the oriented transport of DNA across cell compartments are achieved by three separate modules within the FtsK proteins: a DNA translocation module, an orientation module and an anchoring module. In the second part of this chapter, we review the structural and biochemical properties of these different modules.
Affiliation(s)
- Gaëlle Demarre
- Centre de Génétique Moléculaire, CNRS, Gif sur Yvette, Cedex, France,
|
26
|
Elhai J, Liu H, Taton A. Detection of horizontal transfer of individual genes by anomalous oligomer frequencies. BMC Genomics 2012; 13:245. [PMID: 22702893 PMCID: PMC3497702 DOI: 10.1186/1471-2164-13-245] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 12/29/2011] [Accepted: 05/18/2012] [Indexed: 11/10/2022] Open
Abstract
Background Understanding the history of life requires that we understand the transfer of genetic material across phylogenetic boundaries. Detecting genes that were acquired by means other than vertical descent is a basic step in that process. Detection by discordant phylogenies is computationally expensive and not always definitive. Many have used easily computed compositional features as an alternative procedure. However, different compositional methods produce different predictions, and the effectiveness of any method is not well established. Results The ability of octamer frequency comparisons to detect genes artificially seeded in cyanobacterial genomes was markedly increased by using as a training set those genes that are highly conserved over all bacteria. Using a subset of octamer frequencies in such tests also increased effectiveness, but this depended on the specific target genome and the source of the contaminating genes. The presence of high frequency octamers and the GC content of the contaminating genes were important considerations. A method comprising best practices from these tests was devised, the Core Gene Similarity (CGS) method, and it performed better than simple octamer frequency analysis, codon bias, or GC contrasts in detecting seeded genes or naturally occurring transposons. From a comparison of predictions with phylogenetic trees, it appears that the effectiveness of the method is confined to horizontal transfer events that have occurred recently in evolutionary time. Conclusions The CGS method may be an improvement over existing surrogate methods to detect genes of foreign origin.
Affiliation(s)
- Jeff Elhai
- Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, VA 23284, USA.
|
27
|
Xu L, Kuo J, Liu JK, Wong TY. Bacterial phylogenetic tree construction based on genomic translation stop signals. Microb Inform Exp 2012; 2:6. [PMID: 22651236 PMCID: PMC3466146 DOI: 10.1186/2042-5783-2-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 01/23/2012] [Accepted: 04/15/2012] [Indexed: 11/10/2022]
Abstract
Background The efficiencies of the stop codons TAA, TAG, and TGA in protein synthesis termination are not the same. These variations could allow many genes to be regulated. There are many similar nucleotide trimers found on the second and third reading-frames of a gene. They are called premature stop codons (PSC). Like stop codons, the PSC in bacterial genomes are also highly biased in terms of their quantities and qualities on the genes. Phylogenetically related species often share a similar PSC profile. We want to know whether the selective forces that influence the stop codons and the PSC usage biases in a genome are related. We also wish to know how strongly these trimers in a genome are related to the natural history of the bacterium. Knowing these relations may provide better knowledge of the phylogeny of bacteria. Results A 16SrRNA-alignment tree of 19 well-studied α-, β- and γ-Proteobacteria Type species is used as standard reference for bacterial phylogeny. The genomes of sixty-one bacteria, belonging to the α-, β- and γ-Proteobacteria subphyla, are used for this study. The stop codons and PSC are collectively termed “Translation Stop Signals” (TSS). A gene is represented by nine scalars corresponding to the numbers of counts of TAA, TAG, and TGA on each of the three reading-frames of that gene. “Translation Stop Signals Ratio” (TSSR) is the ratio between the TSS counts. Four types of TSSR are investigated. The TSSR-1, TSSR-2 and TSSR-3 are each a 3-scalar series corresponding respectively to the average ratio of TAA: TAG: TGA on the first, second, and third reading-frames of all genes in a genome. The Genomic-TSSR is a 9-scalar series representing the ratio of distribution of all TSS on the three reading-frames of all genes in a genome. Results show that bacteria grouped by their similarities based on TSSR-1, TSSR-2, or TSSR-3 values could only partially resolve the phylogeny of the species. However, grouping bacteria based on their Genomic-TSSR values resulted in clusters of bacteria identical to those bacterial clusters of the reference tree. Unlike the 16SrRNA method, the Genomic-TSSR tree is also able to separate closely related species/strains at high resolution. Species and strains separated by the Genomic-TSSR grouping method are often in good agreement with those classified by other taxonomic methods. Correspondence analysis of individual genes shows that most genes in a bacterial genome share a similar TSSR value. However, within a chromosome, the Genic-TSSR values of genes near the replication origin region (Ori) are more similar to each other than those of genes near the terminus region (Ter). Conclusion The translation stop signals on the three reading-frames of the genes on a bacterial genome are interrelated, possibly due to frequent off-frame recombination facilitated by translational-associated recombination (TSR). However, TSR may not occur randomly in a bacterial chromosome. Genes near the Ori region are often highly expressed and a bacterium always maintains multiple copies of Ori. Frequent collisions between DNA polymerase and RNA polymerase would create many DNA strand-breaks on the genes, whereas DNA strand-break induced homologous recombination is more likely to take place between genes with similar sequence. Thus, localized recombination could explain why the TSSR of genes near the Ori region are more similar to each other. The quantity and quality of these TSS in a genome strongly reflect the natural history of a bacterium. We propose that the Genomic-TSSR can be used as a subjective biomarker to represent the phyletic status of a bacterium.
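The nine-scalar gene representation described in this abstract can be sketched as follows: count TAA, TAG, and TGA trimers in each of the three reading frames of a gene, then pool the counts over all genes. This is an illustrative reading of the abstract rather than the authors' code. The function names are hypothetical, and normalizing the pooled counts to fractions that sum to 1 is an assumption, since the abstract does not spell out how the ratio is expressed.

```python
# Minimal sketch; tss_counts and genomic_tssr are hypothetical names.
STOPS = ("TAA", "TAG", "TGA")

def tss_counts(gene):
    """Nine counts for one gene: (TAA, TAG, TGA) in frame 1, then frame 2, then frame 3."""
    gene = gene.upper()
    counts = []
    for frame in range(3):
        # Non-overlapping trimers read in this frame; incomplete trailing bases are ignored.
        codons = [gene[i:i + 3] for i in range(frame, len(gene) - 2, 3)]
        counts.extend(codons.count(stop) for stop in STOPS)
    return counts

def genomic_tssr(genes):
    """Pool the nine counts over all genes and express them as fractions (assumed normalization)."""
    totals = [0] * 9
    for gene in genes:
        for i, count in enumerate(tss_counts(gene)):
            totals[i] += count
    grand_total = sum(totals)
    if grand_total == 0:
        return [0.0] * 9
    return [t / grand_total for t in totals]

if __name__ == "__main__":
    print(tss_counts("ATGTAATGA"))
    print(genomic_tssr(["ATGTAATGA"]))
```

Genomes could then be compared by any distance between their nine-component vectors; the abstract does not state which clustering method was used, so none is assumed here.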
Affiliation(s)
- Lijing Xu
- Department of Biological Sciences, Bioinformatics Program, The University of Memphis, Memphis, TN, USA
| | - Jimmy Kuo
- Department of Planning and Research, National Museum of Marine Biology and Aquarium, Pingtung, Taiwan
| | - Jong-Kang Liu
- Department of Biological Sciences, National Sun Yat-sen University, Kaohsiung, Taiwan
| | - Tit-Yee Wong
- Department of Biological Sciences, Bioinformatics Program, The University of Memphis, Memphis, TN, USA
|
28
|
Touzain F, Petit MA, Schbath S, El Karoui M. DNA motifs that sculpt the bacterial chromosome. Nat Rev Microbiol 2011; 9:15-26. [PMID: 21164534 DOI: 10.1038/nrmicro2477] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.4] [Indexed: 12/14/2022]
Abstract
During the bacterial cell cycle, the processes of chromosome replication, DNA segregation, DNA repair and cell division are coordinated by precisely defined events. Tremendous progress has been made in recent years in identifying the mechanisms that underlie these processes. A striking feature common to these processes is that non-coding DNA motifs play a central part, thus 'sculpting' the bacterial chromosome. Here, we review the roles of these motifs in the mechanisms that ensure faithful transmission of genetic information to daughter cells. We show how their chromosomal distribution is crucial for their function and how it can be analysed quantitatively. Finally, the potential roles of these motifs in bacterial chromosome evolution are discussed.
Affiliation(s)
- Fabrice Touzain
- INRA, UMR 1319, Institut Micalis, FR-78352, Jouy-en-Josas, France
|
29
|
Matilla I, Alfonso C, Rivas G, Bolt EL, de la Cruz F, Cabezon E. The conjugative DNA translocase TrwB is a structure-specific DNA-binding protein. J Biol Chem 2010; 285:17537-44. [PMID: 20375020 DOI: 10.1074/jbc.m109.084137] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Indexed: 02/03/2023] Open
Abstract
TrwB is a DNA-dependent ATPase involved in DNA transport during bacterial conjugation. The protein presents structural similarity to hexameric molecular motors such as F(1)-ATPase, FtsK, or ring helicases, suggesting that TrwB also operates as a motor, using energy released from ATP hydrolysis to pump single-stranded DNA through its central channel. In this work, we have carried out an extensive analysis with various DNA substrates to determine the preferred substrate for TrwB. Oligonucleotides with G-rich sequences forming G4 DNA structures were the optimal substrates for TrwB ATPase activity. The protein bound with 100-fold higher affinity to G4 DNA than to single-stranded DNA of the same sequence. Moreover, TrwB formed oligomeric protein complexes only with oligonucleotides presenting such a G-quadruplex DNA structure, consistent with stoichiometry of six TrwB monomers to G4 DNA, as demonstrated by gel filtration chromatography and analytical ultracentrifugation experiments. A protein-DNA complex was also formed with unstructured oligonucleotides, but the molecular mass corresponded to one monomer protein bound to one oligonucleotide molecule. Sequences capable of forming G-quadruplex structures are widespread through genomes and are thought to play a biological function in transcriptional regulation. They form stable structures that can obstruct DNA replication, requiring the action of specific helicases to resolve them. Nevertheless, TrwB displayed no G4 DNA unwinding activity. These observations are discussed in terms of a possible role for TrwB in recognizing G-quadruplex structures as loading sites on the DNA.
Affiliation(s)
- Inmaculada Matilla
- Departamento de Biología Molecular, Universidad de Cantabria, and Instituto de Biomedicina y Biotecnología de Cantabria, CSIC-UC-IDICAN, 39011 Santander, Spain
|
30
|
Szczepańska AK. Bacteriophage-encoded functions engaged in initiation of homologous recombination events. Crit Rev Microbiol 2010; 35:197-220. [PMID: 19563302 DOI: 10.1080/10408410902983129] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 10/20/2022]
Abstract
Recombination plays a significant role in bacteriophage biology. Functions promoting recombination are involved in key stages of phage multiplication and drive phage evolution. Their biological role is reflected by the great variety of phages existing in the environment. This work presents the role of recombination in the phage life cycle and highlights the discrete character of phage-encoded recombination functions (anti-RecBCD activities, 5'→3' DNA exonucleases, single-stranded DNA binding proteins, single-stranded DNA annealing proteins, and recombinases). The focus of this review is on phage proteins that initiate genetic exchange. Importance of recombination is reviewed based on the accepted coli-phages T4 and lambda models, the recombination system of phage P22, and the recently characterized recombination functions of Bacillus subtilis phage SPP1 and mycobacteriophage Che9c. Key steps of the molecular mechanisms involving phage recombination functions and their application in molecular engineering are discussed.
Affiliation(s)
- Agnieszka K Szczepańska
- Department of Microbial Biochemistry, Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Warsaw, Poland.
|
31
|
RecBCD enzyme and the repair of double-stranded DNA breaks. Microbiol Mol Biol Rev 2009; 72:642-71, Table of Contents. [PMID: 19052323 DOI: 10.1128/mmbr.00020-08] [Citation(s) in RCA: 404] [Impact Index Per Article: 26.9] [Indexed: 12/22/2022] Open
Abstract
The RecBCD enzyme of Escherichia coli is a helicase-nuclease that initiates the repair of double-stranded DNA breaks by homologous recombination. It also degrades linear double-stranded DNA, protecting the bacteria from phages and extraneous chromosomal DNA. The RecBCD enzyme is, however, regulated by a cis-acting DNA sequence known as Chi (crossover hotspot instigator) that activates its recombination-promoting functions. Interaction with Chi causes an attenuation of the RecBCD enzyme's vigorous nuclease activity, switches the polarity of the attenuated nuclease activity to the 5' strand, changes the operation of its motor subunits, and instructs the enzyme to begin loading the RecA protein onto the resultant Chi-containing single-stranded DNA. This enzyme is a prototypical example of a molecular machine: the protein architecture incorporates several autonomous functional domains that interact with each other to produce a complex, sequence-regulated, DNA-processing machine. In this review, we discuss the biochemical mechanism of the RecBCD enzyme with particular emphasis on new developments relating to the enzyme's structure and DNA translocation mechanism.
|
32
|
Using RSAT oligo-analysis and dyad-analysis tools to discover regulatory signals in nucleic sequences. Nat Protoc 2008; 3:1589-603. [PMID: 18802440 DOI: 10.1038/nprot.2008.98] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.6] [Indexed: 01/29/2023]
Abstract
This protocol explains how to discover functional signals in genomic sequences by detecting over- or under-represented oligonucleotides (words) or spaced pairs thereof (dyads) with the Regulatory Sequence Analysis Tools (http://rsat.ulb.ac.be/rsat/). Two typical applications are presented: (i) predicting transcription factor-binding motifs in promoters of coregulated genes and (ii) discovering phylogenetic footprints in promoters of orthologous genes. The steps of this protocol include purging genomic sequences to discard redundant fragments, discovering over-represented patterns and assembling them to obtain degenerate motifs, scanning sequences and drawing feature maps. The main strength of the method is its statistical ground: the binomial significance provides an efficient control on the rate of false positives. In contrast with optimization-based pattern discovery algorithms, the method supports the detection of under- as well as over-represented motifs. Computation times vary from seconds (gene clusters) to minutes (whole genomes). The execution of the whole protocol should take approximately 1 h.
|
33
|
Duggin IG, Wake RG, Bell SD, Hill TM. The replication fork trap and termination of chromosome replication. Mol Microbiol 2008; 70:1323-33. [PMID: 19019156 DOI: 10.1111/j.1365-2958.2008.06500.x] [Citation(s) in RCA: 82] [Impact Index Per Article: 5.1] [Indexed: 11/28/2022]
Abstract
Bacteria that have a circular chromosome with a bidirectional DNA replication origin are thought to utilize a 'replication fork trap' to control termination of replication. The fork trap is an arrangement of replication pause sites that ensures that the two replication forks fuse within the terminus region of the chromosome, approximately opposite the origin on the circular map. However, the biological significance of the replication fork trap has been mysterious, as its inactivation has no obvious consequence. Here we review the research that led to the replication fork trap theory, and we aim to integrate several recent findings that contribute towards an understanding of the physiological roles of the replication fork trap. Likely roles include the prevention of over-replication, and the optimization of post-replicative mechanisms of chromosome segregation, such as that involving FtsK in Escherichia coli.
Affiliation(s)
- Iain G Duggin
- Sir William Dunn School of Pathology, University of Oxford, Oxford OX1 3RE, UK.
|
34
|
Sernova NV, Gelfand MS. Identification of replication origins in prokaryotic genomes. Brief Bioinform 2008; 9:376-91. [PMID: 18660512 DOI: 10.1093/bib/bbn031] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.9] [Indexed: 11/12/2022] Open
Abstract
The availability of hundreds of complete bacterial genomes has created new challenges and, simultaneously, opportunities for bioinformatics. In the area of statistical analysis of genomic sequences, the studies of nucleotide compositional bias and gene bias between strands and replichores paved the way for the development of tools for prediction of bacterial replication origins. Only a few (about 20) origin regions for eubacteria and archaea have been proven experimentally. One reason for that may be that this is now considered an essentially bioinformatics problem, where predictions are sufficiently reliable not to run labor-intensive experiments, unless specifically needed. Here we describe the main existing approaches to the identification of replication origin (oriC) and termination (terC) loci in prokaryotic chromosomes and characterize a number of computational tools based on various skew types and other types of evidence. We also classify the eubacterial and archaeal chromosomes by predictability of their replication origins using skew plots. Finally, we discuss possible combined approaches to the identification of the oriC sites that may be used to improve the prediction tools, in particular, the analysis of DnaA binding sites using comparative genomic methods.
Affiliation(s)
- Natalia V Sernova
- Institute for Information Transmission Problems (Kharkevich Institute), Russian Academy of Sciences, Bolshoi Karetny pereulok, 19, Moscow, 127994, Russia
|
35
|
Esnault E, Valens M, Espéli O, Boccard F. Chromosome structuring limits genome plasticity in Escherichia coli. PLoS Genet 2008; 3:e226. [PMID: 18085828 PMCID: PMC2134941 DOI: 10.1371/journal.pgen.0030226] [Citation(s) in RCA: 93] [Impact Index Per Article: 5.8] [Received: 05/22/2007] [Accepted: 11/06/2007] [Indexed: 11/22/2022] Open
Abstract
Chromosome organizations of related bacterial genera are well conserved despite a very long divergence period. We have assessed the forces limiting bacterial genome plasticity in Escherichia coli by measuring the respective effect of altering different parameters, including DNA replication, compositional skew of replichores, coordination of gene expression with DNA replication, replication-associated gene dosage, and chromosome organization into macrodomains. Chromosomes were rearranged by large inversions. Changes in the compositional skew of replichores, in the coordination of gene expression with DNA replication or in the replication-associated gene dosage have only a moderate effect on cell physiology because large rearrangements inverting the orientation of several hundred genes inside a replichore are only slightly detrimental. By contrast, changing the balance between the two replication arms has a more drastic effect, and the recombinational rescue of replication forks is required for cell viability when one of the chromosome arms is less than half the length of the other one. Macrodomain organization also appears to be a major factor restricting chromosome plasticity, and two types of inverted configurations severely affect the cell cycle. First, the disruption of the Ter macrodomain with replication forks merging far from the normal replichore junction provoked chromosome segregation defects. The second major problematic configuration resulted from inversions between Ori and Right macrodomains, which perturb nucleoid distribution and early steps of cytokinesis. Consequences for the control of the bacterial cell cycle and for the evolution of bacterial chromosome configuration are discussed.
Genomic analyses have revealed that bacterial genomes are dynamic entities that evolve through various processes including intrachromosome genetic rearrangements, gene duplication, and gene loss or acquisition by gene transfer. Nevertheless, comparison of bacterial chromosomes from related genera revealed a conservation of genetic organization. Most bacterial genomes are circular molecules, and DNA replication proceeds bidirectionally from a single origin to an opposite region where replication forks meet. The replication process imprints the bacterial chromosome because initiation and termination at defined loci result in strand biases due to the mutational differences occurring during leading- and lagging-strand synthesis. We analyze the strength of different parameters that may limit genome plasticity. We show that the preferential positioning of essential genes on the leading strand, the proximity of genes involved in transcription and translation to the origin of replication on the leading strand, and the presence of biased motifs along the replichores operate only as long-term positive selection determinants. By contrast, selection operates to maintain replication arms of similar lengths. Finally, we demonstrate that spatial structuring of the chromosome strongly impedes genome plasticity. Genetic evidence supports the presence of two steps in the cell cycle controlled by the spatial organization of the chromosome.
Affiliation(s)
- Emilie Esnault
- Centre de Génétique Moléculaire du CNRS, 91198 Gif-sur-Yvette, France
| | - Michèle Valens
- Centre de Génétique Moléculaire du CNRS, 91198 Gif-sur-Yvette, France
| | - Olivier Espéli
- Centre de Génétique Moléculaire du CNRS, 91198 Gif-sur-Yvette, France
| | - Frédéric Boccard
- Centre de Génétique Moléculaire du CNRS, 91198 Gif-sur-Yvette, France
|
36
|
Identification of DNA motifs implicated in maintenance of bacterial core genomes by predictive modeling. PLoS Genet 2007; 3:1614-21. [PMID: 17941709 PMCID: PMC1976330 DOI: 10.1371/journal.pgen.0030153] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.7] [Received: 04/16/2007] [Accepted: 07/23/2007] [Indexed: 12/19/2022] Open
Abstract
Bacterial biodiversity at the species level, in terms of gene acquisition or loss, is so immense that it raises the question of how essential chromosomal regions are spared from uncontrolled rearrangements. Protection of the genome likely depends on specific DNA motifs that impose limits on the regions that undergo recombination. Although most such motifs remain unidentified, they are theoretically predictable based on their genomic distribution properties. We examined the distribution of the "crossover hotspot instigator," or Chi, in Escherichia coli, and found that its exceptional distribution is restricted to the core genome common to three strains. We then formulated a set of criteria that were incorporated in a statistical model to search core genomes for motifs potentially involved in genome stability in other species. Our strategy led us to identify and biologically validate two distinct heptamers that possess Chi properties, one in Staphylococcus aureus, and the other in several streptococci. This strategy paves the way for wide-scale discovery of other important functional noncoding motifs that distinguish core genomes from the strain-variable regions.
|
37
|
Fall S, Mercier A, Bertolla F, Calteau A, Gueguen L, Perrière G, Vogel TM, Simonet P. Horizontal gene transfer regulation in bacteria as a "spandrel" of DNA repair mechanisms. PLoS One 2007; 2:e1055. [PMID: 17957239 PMCID: PMC2013936 DOI: 10.1371/journal.pone.0001055] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.9] [Received: 04/03/2007] [Accepted: 10/02/2007] [Indexed: 12/01/2022] Open
Abstract
Horizontal gene transfer (HGT) is recognized as the major force for bacterial genome evolution. Yet, numerous questions remain about the transferred genes, their function, quantity and frequency. The extent to which genetic transformation by exogenous DNA has occurred over evolutionary time was initially addressed by an in silico approach using the complete genome sequence of the Ralstonia solanacearum GMI1000 strain. Methods based on phylogenetic reconstruction of prokaryote homologous genes families detected 151 genes (13.3%) of foreign origin in the R. solanacearum genome and tentatively identified their bacterial origin. These putative transfers were analyzed in comparison to experimental transformation tests involving 18 different genomic DNA positions in the genome as sites for homologous or homeologous recombination. Significant transformation frequency differences were observed among these positions tested regardless of the overall genomic divergence of the R. solanacearum strains tested as recipients. The genomic positions containing the putative exogenous DNA were not systematically transformed at the highest frequencies. The two genomic “hot spots”, which contain recA and mutS genes, exhibited transformation frequencies from 2 to more than 4 orders of magnitude higher than positions associated with other genes depending on the recipient strain. These results support the notion that the bacterial cell is equipped with active mechanisms to modulate acquisition of new DNA in different genomic positions. Bio-informatics study correlated recombination “hot-spots” to the presence of Chi-like signature sequences with which recombination might be preferentially initiated. The fundamental role of HGT is certainly not limited to the critical impact that the very rare foreign genes acquired mainly by chance can have on the bacterial adaptation potential. 
The frequency with which HGT with homologous and homeologous DNA happens in the environment might have led the bacteria to hijack DNA repair mechanisms in order to generate genetic diversity without losing too much genomic stability.
Affiliation(s)
- Saliou Fall
- Environmental Microbial Genomics Group, Laboratoire AMPERE UMR CNRS 5005, Ecole Centrale de Lyon et Université de Lyon, Ecully, France
| | - Anne Mercier
- Ecologie Microbienne, UMR CNRS 5557, Université Claude Bernard–Lyon 1, Villeurbanne, France
| | - Franck Bertolla
- Ecologie Microbienne, UMR CNRS 5557, Université Claude Bernard–Lyon 1, Villeurbanne, France
| | - Alexandra Calteau
- Laboratoire de Biométrie et Biologie Évolutive, UMR CNRS 5558, Université Claude Bernard–Lyon 1, Villeurbanne, France
| | - Laurent Gueguen
- Laboratoire de Biométrie et Biologie Évolutive, UMR CNRS 5558, Université Claude Bernard–Lyon 1, Villeurbanne, France
| | - Guy Perrière
- Laboratoire de Biométrie et Biologie Évolutive, UMR CNRS 5558, Université Claude Bernard–Lyon 1, Villeurbanne, France
| | - Timothy M. Vogel
- Environmental Microbial Genomics Group, Laboratoire AMPERE UMR CNRS 5005, Ecole Centrale de Lyon et Université de Lyon, Ecully, France
| | - Pascal Simonet
- Environmental Microbial Genomics Group, Laboratoire AMPERE UMR CNRS 5005, Ecole Centrale de Lyon et Université de Lyon, Ecully, France
|
38
|
Touchon M, Rocha EPC. From GC skews to wavelets: a gentle guide to the analysis of compositional asymmetries in genomic data. Biochimie 2007; 90:648-59. [PMID: 17988781 DOI: 10.1016/j.biochi.2007.09.015] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.9] [Received: 06/18/2007] [Accepted: 09/21/2007] [Indexed: 12/29/2022]
Abstract
Compositional asymmetries are pervasive in DNA sequences. They are the result of the asymmetric interactions between DNA and cellular mechanisms such as replication and transcription. Here, we review many of the methods that have been proposed over the years to analyse compositional asymmetries in DNA sequences. Among these we list GC skews, oligonucleotide skews and wavelets, which among other uses have been extensively employed to delimitate origins and termini of replication in genomes. We also review the use of multivariate methods, such as factorial correspondence analysis, discriminant analysis and analysis of variance, which allow assigning compositional strand asymmetries to the different biological processes shaping sequence composition. Finally, we review methods that have been used to infer substitution matrices and allow understanding the mutational processes underlying strand asymmetry. We focus on replication asymmetries because they have been more thoroughly studied, but the methods may be adapted, and often are, to other problems. Although strand asymmetry has been studied more frequently through compositional skews of nucleotides or oligonucleotides, we recall that, depending on the goal of the analysis, other methods may be more appropriate to answer certain biological questions. We also refer to programs freely available to analyse strand asymmetry.
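As a rough illustration of the GC skew approach mentioned in this abstract, the following Python sketch computes a windowed skew, (G − C)/(G + C), and its cumulative sum. The function names, the window size, and the toy sequence are assumptions made for illustration; they are not taken from the review.

```python
def gc_skew(seq, window=1000, step=None):
    """Return (window_start, skew) pairs, with skew = (G - C) / (G + C)."""
    step = step or window          # non-overlapping windows by default
    seq = seq.upper()
    out = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        # A window without any G or C gets a skew of 0.0 by convention here.
        out.append((start, (g - c) / (g + c) if g + c else 0.0))
    return out


def cumulative_skew(skews):
    """Running sum of the per-window skew values."""
    total, out = 0.0, []
    for _, value in skews:
        total += value
        out.append(total)
    return out


# Hypothetical toy sequence: G-rich first half, C-rich second half.
toy = "GGGCA" * 40 + "CCCGA" * 40
skews = gc_skew(toy, window=50)
cum = cumulative_skew(skews)
# Index of the window where the cumulative skew peaks (the switch point).
peak_window = max(range(len(cum)), key=cum.__getitem__)
```

On a real chromosome, the extremes of the cumulative curve are commonly read as candidate origin and terminus positions; which extreme corresponds to which depends on the strand and skew convention in use.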
Affiliation(s)
- Marie Touchon
- Atelier de Bioinformatique, Université Pierre et Marie Curie-Paris 6, Paris, France
|
39
|
Abstract
The study of chromosome segregation in bacteria has gained strong insights from the use of cytology techniques. A global view of chromosome choreography during the cell cycle is emerging, highlighting as a next challenge the description of the molecular mechanisms and factors involved. Here, we review one such factor, the FtsK DNA translocase. FtsK couples segregation of the chromosome terminus, the ter region, with cell division. It is a powerful and fast translocase that reads chromosome polarity to find the end, thereby sorting sister ter regions on either side of the division septum, and activating the last steps of segregation. Recent data have revealed the structure of the FtsK motor and how translocation is oriented by specific DNA motifs, termed KOPS, and suggest novel mechanisms for translocation and sensing chromosome polarity.
Affiliation(s)
- Sarah Bigot
- Laboratoire de Microbiologie et de Génétique Moléculaire du CNRS, Université Paul Sabatier--Toulouse III, 118 route de Narbonne, 31062 Toulouse Cedex, France.
|
40
|
Robin S, Schbath S, Vandewalle V. Statistical tests to compare motif count exceptionalities. BMC Bioinformatics 2007; 8:84. [PMID: 17346349 PMCID: PMC1838430 DOI: 10.1186/1471-2105-8-84] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.2] [Received: 09/15/2006] [Accepted: 03/08/2007] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Finding over- or under-represented motifs in biological sequences is now a common task in genomics. Thanks to p-value calculation for motif counts, exceptional motifs are identified and represent candidate functional motifs. The present work addresses the related question of comparing the exceptionality of one motif in two different sequences. Just comparing the motif count p-values in each sequence is indeed not sufficient to decide if this motif is significantly more exceptional in one sequence compared to the other one. A statistical test is required. RESULTS: We develop and analyze two statistical tests, an exact binomial one and an asymptotic likelihood ratio test, to decide whether the exceptionality of a given motif is equivalent or significantly different in two sequences of interest. For that purpose, motif occurrences are modeled by Poisson processes, with special care for overlapping motifs. Both tests can take the sequence compositions into account. As an illustration, we compare the octamer exceptionalities in the Escherichia coli K-12 backbone versus variable strain-specific loops. CONCLUSION: The exact binomial test is particularly suited to small counts. For large counts, we advise using the likelihood ratio test, which is asymptotic but strongly correlated with the exact binomial test and very simple to use.
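The idea behind an exact binomial comparison can be sketched as follows. This is a simplified illustration and not the authors' exact statistic: it assumes the two motif counts are independent Poisson variables whose means are proportional to known expected counts, so that, conditional on the total, the first count is binomial. The function names, the expected counts `e1` and `e2`, and the observed counts are all hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


def upper_tail_pvalue(n1, n2, e1, e2):
    """P(N1 >= n1 | N1 + N2 = n1 + n2) under the null hypothesis.

    Under the null, N1 and N2 are independent Poisson counts with means
    proportional to e1 and e2, so N1 given the total is
    Binomial(n1 + n2, e1 / (e1 + e2)).
    """
    n = n1 + n2
    p = e1 / (e1 + e2)
    return sum(binom_pmf(k, n, p) for k in range(n1, n + 1))


# Hypothetical example: 8 occurrences in sequence 1, 2 in sequence 2,
# with equal expected counts in both sequences.
p_val = upper_tail_pvalue(n1=8, n2=2, e1=5.0, e2=5.0)
```

A small `p_val` would suggest that the motif is more over-represented in the first sequence than in the second, given the assumed expectations. Overlapping occurrences, which the abstract says require special care, are not handled in this sketch.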
Affiliation(s)
- Stéphane Robin
- INA PG/ENGREF/INRA, UMR518 Unité Mathématiques et Informatique Appliquées, 75005 Paris, France
| | - Sophie Schbath
- INRA, UR1077 Unité Mathématique, Informatique et Génome, 78350 Jouy-en-Josas, France
| | - Vincent Vandewalle
- INA PG/ENGREF/INRA, UMR518 Unité Mathématiques et Informatique Appliquées, 75005 Paris, France
|
41
|
Arakawa K, Uno R, Nakayama Y, Tomita M. Validating the significance of genomic properties of Chi sites from the distribution of all octamers in Escherichia coli. Gene 2007; 392:239-46. [PMID: 17270364 DOI: 10.1016/j.gene.2006.12.022] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 08/19/2006] [Revised: 12/15/2006] [Accepted: 12/18/2006] [Indexed: 10/23/2022]
Abstract
Chi sites (5'-GCTGGTGG-3') are homologous recombinational hotspot octamer sequences, which attenuate the exonuclease activity of RecBCD in Escherichia coli. They are overrepresented in the genome (1008 occurrences), preferentially located within coding regions (98%), oriented in the direction of replication (75%), and occur most commonly on the mRNA-synonymous sense strand of the double helix (79%). Previous statistical studies of the genome sequence suggested that these genomic properties of Chi sites appear to be related to their role in recombinational repair and therefore to replication and transcription. In this study, we employ three mathematical models to predict the properties of Chi sites from single nucleotide and multi-nucleotide compositions, and validate them statistically using the distribution of all octamer sequences in the entire genome, or exclusively within ORFs. The model based on the overall distribution of all octamers provided better predictions than the single nucleotide composition model, and the ORF and sense strand preference of Chi sites were shown to be within the standard deviation of all octamers. In contrast, the orientation bias of the Chi sites in the direction of replication was significant, although the bias was not as pronounced as with the single nucleotide composition model, suggesting a selective pressure related to the role of RecBCD in replication.
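The strand-orientation statistics quoted in this abstract rest on counting the Chi octamer separately on each strand. A minimal Python sketch of that counting step is shown below; the helper names and the toy sequence are assumptions for illustration, and the sketch does not reproduce the authors' models.

```python
CHI = "GCTGGTGG"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_overlapping(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps."""
    count, pos = 0, seq.find(motif)
    while pos != -1:
        count += 1
        pos = seq.find(motif, pos + 1)
    return count


def strand_counts(seq, motif=CHI):
    """Occurrences of the motif on the given strand and on the opposite strand."""
    seq = seq.upper()
    return {
        "forward": count_overlapping(seq, motif),
        # A site on the opposite strand appears as the reverse complement here.
        "reverse": count_overlapping(seq, reverse_complement(motif)),
    }


# Hypothetical toy sequence: two forward Chi sites and one on the other strand.
toy = "AAGCTGGTGGTT" + "CCACCAGCAA" + "GCTGGTGG"
counts = strand_counts(toy)
```

To relate such counts to the direction of replication, a genome would first have to be split into its two replichores, which requires origin and terminus coordinates that are not part of this sketch.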
Affiliation(s)
- Kazuharu Arakawa
- Institute for Advanced Biosciences, Keio University, Fujisawa 252-8520, Japan
|
42
|
Fearnhead P, Sherlock C. An exact Gibbs sampler for the Markov-modulated Poisson process. J R Stat Soc Series B Stat Methodol 2006. [DOI: 10.1111/j.1467-9868.2006.00566.x] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.8] [Indexed: 11/30/2022]
|
43
|
Rocha EPC, Touchon M, Feil EJ. Similar compositional biases are caused by very different mutational effects. Genome Res 2006; 16:1537-47. [PMID: 17068325 PMCID: PMC1665637 DOI: 10.1101/gr.5525106] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.9] [Indexed: 11/24/2022]
Abstract
Compositional replication strand bias, commonly referred to as GC skew, is present in many genomes of prokaryotes, eukaryotes, and viruses. Although cytosine deamination in ssDNA (resulting in C→T changes on the leading strand) is often invoked as its major cause, the precise contributions of this and other substitution types are currently unknown. It is also unclear if the underlying mutational asymmetries are the same among taxa, are stable over time, or how close the observed biases are to mutational equilibrium. We analyzed nearly neutral sites of seven taxa each with between three and six complete bacterial genomes, and inferred the substitution spectra of fourfold degenerate positions in nonhighly expressed genes. Using a bootstrap procedure, we extracted compositional biases associated with replication and identified the significant asymmetries. Although all taxa showed an overrepresentation of G relative to C on the leading strand (and imbalances between A and T), widely variable substitution asymmetries are noted. Surprisingly, all substitution types show significant asymmetry in at least one taxon, but none were universally biased in all taxa. Notably, in the two most biased genomes, A→G, rather than C→T, shapes the compositional bias. Given the variability in these biases, we propose that the process is multifactorial. Finally, we also find that most genomes are not at compositional equilibrium, and suggest that mutational-based heterotachy is deeply imprinted in the history of biological macromolecules. This shows that similar compositional biases associated with the same essential well-conserved process, replication, do not reflect similar mutational processes in different genomes, and that caution is required in inferring the roles of specific mutational biases on the basis of contemporary patterns of sequence composition.
Affiliation(s)
- Eduardo P C Rocha
- Unité Génétique des Génomes Bactériens, URA 2171, Institut Pasteur, 75015 Paris, France.
|
44
|
Jensen RB. Analysis of the terminus region of the Caulobacter crescentus chromosome and identification of the dif site. J Bacteriol 2006; 188:6016-9. [PMID: 16885470 PMCID: PMC1540080 DOI: 10.1128/jb.00330-06] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Indexed: 11/20/2022] Open
Abstract
The terminus region of the Caulobacter crescentus chromosome and the dif chromosome dimer resolution site were characterized. The Caulobacter genome contains skewed sequences that abruptly switch strands at dif and may have roles in chromosome maintenance and segregation. Absence of dif or the XerCD recombinase results in a chromosome segregation defect. The Caulobacter terminus region is unusual, since it contains many essential or highly expressed genes.
Affiliation(s)
- Rasmus B Jensen
- Department of Life Sciences and Chemistry, Roskilde University, Universitetsvej 1, DK-4000 Roskilde, Denmark.
|
45
|
Uno R, Nakayama Y, Tomita M. Over-representation of Chi sequences caused by di-codon increase in Escherichia coli K-12. Gene 2006; 380:30-7. [PMID: 16854534 DOI: 10.1016/j.gene.2006.05.013] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 12/28/2005] [Revised: 04/20/2006] [Accepted: 05/09/2006] [Indexed: 11/17/2022]
Abstract
Chi sequences (5'-GCTGGTGG-3') are cis-acting 8 bp sequence elements that enhance homologous recombination promoted by the RecBCD pathway in Escherichia coli. The genome of E. coli K-12 MG1655 contains 1009 Chi sequences and this frequency far exceeds the expected value for occurrence of an 8 bp sequence in a genome of this size. It is generally thought that the over-representation of Chi sequences indicates that they have been selected for during evolution because of their function in recombination. The genes from three E. coli strains (K-12, O157 and CFT) were classified into three categories (island, match to other E. coli, and backbone). Island genes have a different base composition and codon usage in comparison with those in the backbone genes, therefore they were relatively new and not yet adapted to the base composition patterns and codon usage typical of the recipient genome. The over-representation of Chi sequences was examined by comparing Chi frequencies and codon frequencies between island and backbone genes. The difference in the CTGGTG di-codon frequency between the backbone and island genes was correlated with the frequency of Chi sequences which were translated in the Leu-Val (-G/CTG/GTG/G-) reading frame in the K-12 strain. These results suggest that the main reading frame of Chi sequences increased as a result of the di-codon CTG-GTG increasing under a genome-wide pressure for adapting to the codon usage and base composition of the E. coli K-12 strain, and that the RecBCD recombinase might adjust its recognition sequence to a frequently occurring oligomer such as G-CTG-GTG-G.
Affiliation(s)
- Reina Uno
- Institute for Advanced Biosciences, Keio University, Tsuruoka, 997-0014, Japan.
46
Hendrickson H, Lawrence JG. Selection for Chromosome Architecture in Bacteria. J Mol Evol 2006; 62:615-29. [PMID: 16612541 DOI: 10.1007/s00239-005-0192-2] [Citation(s) in RCA: 70] [Impact Index Per Article: 3.9] [Received: 08/03/2005] [Accepted: 12/31/2005] [Indexed: 02/04/2023]
Abstract
Bacterial chromosomes are immense polymers whose faithful replication and segregation are crucial to cell survival. The ability of proteins such as FtsK to move unidirectionally toward the replication terminus, and direct DNA translocation into the appropriate daughter cell during cell division, requires that bacterial genomes maintain an architecture for the orderly replication and segregation of chromosomes. We suggest that proteins that locate the replication terminus exploit strand-biased sequences that are overrepresented on one DNA strand, and that selection increases with decreased distance to the replication terminus. We report a generalized method for detecting these architecture imparting sequences (AIMS) and have identified AIMS in nearly all bacterial genomes. Their increased abundance on leading strands and decreased abundance on lagging strands toward replication termini are not the result of changes in mutational bias; rather, they reflect a gradient of long-term positive selection for AIMS. The maintenance of the pattern of AIMS across the genomes of related bacteria independent of their positions within individual genes suggests a well-conserved role in genome biology. The stable gradient of AIMS abundance from replication origin to terminus suggests that the replicore acts as a target of selection, where selection for chromosome architecture results in the maintenance of gene order and in the lack of high-frequency DNA inversion within replicores.
Affiliation(s)
- Heather Hendrickson
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA
47
Bigot S, Saleh OA, Lesterlin C, Pages C, El Karoui M, Dennis C, Grigoriev M, Allemand JF, Barre FX, Cornet F. KOPS: DNA motifs that control E. coli chromosome segregation by orienting the FtsK translocase. EMBO J 2005; 24:3770-80. [PMID: 16211009 PMCID: PMC1276719 DOI: 10.1038/sj.emboj.7600835] [Citation(s) in RCA: 151] [Impact Index Per Article: 7.9] [Received: 06/08/2005] [Accepted: 09/14/2005] [Indexed: 11/09/2022] Open
Abstract
Bacterial chromosomes are organized in replichores of opposite sequence polarity. This conserved feature suggests a role in chromosome dynamics. Indeed, sequence polarity controls resolution of chromosome dimers in Escherichia coli. Chromosome dimers form by homologous recombination between sister chromosomes. They are resolved by the combined action of two tyrosine recombinases, XerC and XerD, acting at a specific chromosomal site, dif, and a DNA translocase, FtsK, which is anchored at the division septum and sorts chromosomal DNA to daughter cells. Evidence suggests that DNA motifs oriented from the replication origin towards dif provide FtsK with the necessary information to faithfully distribute chromosomal DNA to either side of the septum, thereby bringing the dif sites together at the end of this process. However, the nature of the DNA motifs acting as FtsK-orienting polar sequences (KOPS) was unknown. Using genetics, bioinformatics and biochemistry, we have identified a family of DNA motifs in the E. coli chromosome with KOPS activity.
Affiliation(s)
- Sarah Bigot
- LMGM, CNRS, 118, route de Narbonne, Toulouse, France
- Carine Pages
- LMGM, CNRS, 118, route de Narbonne, Toulouse, France
- François-Xavier Barre
- LMGM, CNRS, 118, route de Narbonne, Toulouse, France
- CGM, CNRS, Avenue de la Terrasse, 91198 Gif-sur-Yvette, France. Tel.: +33 169 82 32 24; Fax: +33 169 82 31 60; E-mail:
- François Cornet
- LMGM, CNRS, 118, route de Narbonne, Toulouse, France
- LMGM, CNRS, 118, route de Narbonne, 31062 Toulouse Cedex, France. Tel.: +33 561 335 986; Fax: +33 561 335 886; E-mail:
48
Rocha EPC, Cornet E, Michel B. Comparative and evolutionary analysis of the bacterial homologous recombination systems. PLoS Genet 2005; 1:e15. [PMID: 16132081 PMCID: PMC1193525 DOI: 10.1371/journal.pgen.0010015] [Citation(s) in RCA: 237] [Impact Index Per Article: 12.5] [Received: 03/31/2005] [Accepted: 06/09/2005] [Indexed: 11/18/2022] Open
Abstract
Homologous recombination is a housekeeping process involved in the maintenance of chromosome integrity and generation of genetic variability. Although detailed biochemical studies have described the mechanism of action of its components in model organisms, there is no recent extensive assessment of this knowledge, using comparative genomics and taking advantage of available experimental data on recombination. Using comparative genomics, we assessed the diversity of recombination processes among bacteria, and simulations suggest that we missed very few homologs. The work included the identification of orthologs and the analysis of their evolutionary history and genomic context. Some genes, for proteins such as RecA, the resolvases, and RecR, were found to be nearly ubiquitous, suggesting that the large majority of bacterial genomes are capable of homologous recombination. Yet many genomes show incomplete sets of presynaptic systems, with RecFOR being more frequent than RecBCD/AddAB. There is a significant pattern of co-occurrence between these systems and antirecombinant proteins such as the ones of mismatch repair and SbcB, but no significant association with nonhomologous end joining, which seems rare in bacteria. Surprisingly, a large number of genomes in which homologous recombination has been reported lack many of the enzymes involved in the presynaptic systems. The lack of obvious correlation between the presence of characterized presynaptic genes and experimental data on the frequency of recombination suggests the existence of still-unknown presynaptic mechanisms in bacteria. It also indicates that, at the moment, the assessment of the intrinsic stability or recombination isolation of bacteria in most cases cannot be inferred from the identification of known recombination proteins in the genomes.
Genomes evolve mostly by modifications involving large pieces of genetic material (DNA). Exchanges of chromosome pieces between different organisms as well as intragenomic movements of DNA regions are the result of a process named homologous recombination. The central actor of this process, the RecA protein, is amazingly conserved from bacteria to human. In addition to its role in the generation of genetic variability, homologous recombination is also the guardian of genome integrity, as it acts to repair DNA damage. RecA-catalyzed DNA exchange (synapse) is facilitated by the action of presynaptic enzymes and completed by postsynaptic enzymes (resolvases). In addition, some enzymes counteract RecA. Here, the researchers assess the diversity of recombination proteins among 117 different bacterial species. They find that resolvases are nearly as ubiquitous and as well conserved at the sequence level as RecA. This suggests that the large majority of bacterial genomes are capable of homologous recombination. Presynaptic systems are less ubiquitous, and there is no obvious correlation between their presence and experimental data on the frequency of recombination. However, there is a significant pattern of co-occurrence between these systems and antirecombinant proteins.
Affiliation(s)
- Eduardo P C Rocha
- Unité Génétique des Génomes Bactériens, Institut Pasteur, Paris, France.
49
Abstract
There are clear theoretical reasons and many well-documented examples which show that repetitive DNA is essential for genome function. Generic repeated signals in the DNA are necessary to format expression of unique coding sequence files and to organise additional functions essential for genome replication and accurate transmission to progeny cells. Repetitive DNA sequence elements are also fundamental to the cooperative molecular interactions forming nucleoprotein complexes. Here, we review the surprising abundance of repetitive DNA in many genomes, describe its structural diversity, and discuss dozens of cases where the functional importance of repetitive elements has been studied in molecular detail. In particular, the fact that repeat elements serve either as initiators or boundaries for heterochromatin domains and provide a significant fraction of scaffolding/matrix attachment regions (S/MARs) suggests that the repetitive component of the genome plays a major architectonic role in higher order physical structuring. Employing an information science model, the 'functionalist' perspective on repetitive DNA leads to new ways of thinking about the systemic organisation of cellular genomes and provides several novel possibilities involving repeat elements in evolutionarily significant genome reorganisation. These ideas may facilitate the interpretation of comparisons between sequenced genomes, where the repetitive DNA component is often greater than the coding sequence component.
Affiliation(s)
- James A Shapiro
- Department of Biochemistry and Molecular Biology, University of Chicago, 920 E. 58th Street, Chicago, IL 60637, USA.
50
Abstract
Statistics on Markov chains are widely used for the study of patterns in biological sequences. Statistics on these models can be computed through several approaches. Gaussian approximations based on the central limit theorem (CLT) are among the most popular. Unfortunately, in order to find a pattern of interest, these methods have to deal with tail distribution events, where the CLT performs especially poorly. In this paper, we propose a new approach based on large deviations theory to assess pattern statistics. We first recall theoretical results for empirical mean (level 1) as well as empirical distribution (level 2) large deviations on Markov chains. Then, we present the applications of these results, focusing on numerical issues. LD-SPatt is the name of the GPL software implementing these algorithms. We compare this approach to several existing ones in terms of complexity and reliability and show that the large deviations are more reliable than the Gaussian approximations in absolute values as well as in terms of ranking and are at least as reliable as compound Poisson approximations. Finally, we discuss some further possible improvements and applications of this new method.
Affiliation(s)
- G Nuel
- Laboratoire Statistique et Génome, Tour Evry 2, 523 place des terasses, 91034 Evry, France.