1
Parisi G, Palopoli N, Tosatto SC, Fornasari MS, Tompa P. "Protein" no longer means what it used to. Curr Res Struct Biol 2021; 3:146-152. [PMID: 34308370 PMCID: PMC8283027 DOI: 10.1016/j.crstbi.2021.06.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/12/2021] [Revised: 06/18/2021] [Accepted: 06/22/2021] [Indexed: 01/02/2023]
Abstract
Every biologist knows that the word protein describes a group of macromolecules essential to sustain life on Earth. As biologists, we are invariably trained under a protein paradigm established in the early twentieth century. However, in recent years, the term protein has revealed itself as a euphemism for the overwhelming heterogeneity of these compounds. Most of our current studies target carefully selected subsets of proteins, but we tend to think and write about these as representative of the whole population. Here we discuss how seeking universal definitions and general rules in any arbitrarily segmented study can lead to misleading conclusions. Of course, it is not our purpose to discourage the use of the word protein. Instead, we suggest embracing the extended universe of proteins to reach a deeper understanding of their full potential, realizing that the term encompasses a group of molecules that is very heterogeneous in terms of size, shape, chemistry and functions; that is, the term protein no longer means what it used to.
Affiliation(s)
- Gustavo Parisi
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, CONICET, Bernal, Buenos Aires, Argentina
- Nicolas Palopoli
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, CONICET, Bernal, Buenos Aires, Argentina
- María Silvina Fornasari
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, CONICET, Bernal, Buenos Aires, Argentina
- Peter Tompa
- VIB-VUB Center for Structural Biology (CSB), Brussels, Belgium
- Structural Biology Brussels (SBB), Vrije Universiteit Brussel (VUB), Brussels, Belgium
- Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
2
Danchin A. Three overlooked key functional classes for building up minimal synthetic cells. Synth Biol (Oxf) 2021; 6:ysab010. [PMID: 35174295 PMCID: PMC8842674 DOI: 10.1093/synbio/ysab010] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 01/22/2021] [Revised: 04/09/2021] [Accepted: 04/12/2021] [Indexed: 12/14/2022]
Abstract
Assembly of minimal genomes revealed many genes encoding unknown functions. Three overlooked functional categories account for some of them. Cells are prone to make errors and age. As a first key function, discrimination between proper and changed entities is indispensable. Discrimination requires management of information, an authentic, yet abstract, currency of reality. For example, proteins age, sometimes very fast. The cell must identify, then get rid of, old proteins without destroying young ones. Implementing discrimination in cells leads to the second set of functions, usually ignored. Being abstract, information must nevertheless be embodied into material entities, with unavoidable idiosyncratic properties. This brings about novel unmet needs. Hence, the buildup of cells elicits specific but awkward material implementations, ‘kludges’ that become essential under particular settings, while being difficult to identify. Finally, a third functional category characterizes the need for growth, with metabolic implementations allowing the cell to put together the growth of its cytoplasm, membranes, and genome, spanning different spatial dimensions. Solving this metabolic quandary, critical for engineering novel synthetic biology chassis, uncovered an unexpected role for CTP synthetase as the coordinator of nonhomothetic growth. Because a significant number of SynBio constructs aim at creating cell factories, we expect that they will be attacked by viruses (it is not by chance that the function of the CRISPR system was identified in industrial settings). Substantiating the role of CTP, natural selection has dealt with this hurdle via synthesis of the antimetabolite 3′‐deoxy‐3′,4′‐didehydro‐CTP, recruited for antiviral immunity in all domains of life.
Affiliation(s)
- Antoine Danchin
- Kodikos Labs/Stellate Therapeutics, Institut Cochin, Paris, France
- School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, Hong Kong University, Pokfulam, SAR Hong Kong, China
3
4
de Lorenzo V, Sekowska A, Danchin A. Chemical reactivity drives spatiotemporal organisation of bacterial metabolism. FEMS Microbiol Rev 2014; 39:96-119. [PMID: 25227915 DOI: 10.1111/1574-6976.12089] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Indexed: 01/20/2023]
Abstract
In this review, we examine how bacterial metabolism is shaped by chemical constraints acting on the material and dynamic layout of enzymatic networks and beyond. These are moulded not only for optimisation of given metabolic objectives (e.g. synthesis of a particular amino acid or nucleotide) but also for curbing the detrimental reactivity of chemical intermediates. Besides substrate channelling, toxicity is avoided by barriers to free diffusion (i.e. compartments) that separate otherwise incompatible reactions, along with ways of distinguishing damaging from harmless molecules. On the other hand, enzymes age and their operating lifetime must be tuned to upstream and downstream reactions. This time dependence of metabolic pathways creates time-linked information, learning and memory. These features suggest that the physical structure of existing biosystems, from operon assemblies to multicellular development, may ultimately stem from the need to restrain chemical damage and limit the waste inherent to basic metabolic functions. This provides a new twist to our comprehension of fundamental biological processes in live systems as well as practical take-home lessons for the forward DNA-based engineering of novel biological objects.
Affiliation(s)
- Víctor de Lorenzo
- Systems Biology Program, Centro Nacional de Biotecnología CSIC, Cantoblanco-Madrid, Spain
- Agnieszka Sekowska
- AMAbiotics SAS, Institut du Cerveau et de la Moëlle Épinière, Hôpital de la Pitié-Salpêtrière, Paris, France
- Antoine Danchin
- AMAbiotics SAS, Institut du Cerveau et de la Moëlle Épinière, Hôpital de la Pitié-Salpêtrière, Paris, France
5
Norris V, Merieau A. Plasmids as scribbling pads for operon formation and propagation. Res Microbiol 2013; 164:779-87. [PMID: 23587635 DOI: 10.1016/j.resmic.2013.04.003] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 03/18/2013] [Accepted: 04/01/2013] [Indexed: 12/31/2022]
Abstract
Many bacterial genes are in operons and the process whereby operons are formed is therefore fundamental. To help elucidate this process, we propose in the Scribbling Pad hypothesis that bacteria have been constantly using plasmids for genetic experimentation and, in particular, for the construction of operons. This hypothesis simultaneously solves the problems of the creation of operons and the way operons are propagated. We cite results in the literature to support the hypothesis and make experimental predictions to test it.
Affiliation(s)
- Vic Norris
- Theoretical Biology Unit, Department of Biology, University of Rouen, 76821 Mont Saint Aignan cedex, France.
6
Ahmad T, Sablok G, Tatarinova TV, Xu Q, Deng XX, Guo WW. Evaluation of codon biology in citrus and Poncirus trifoliata based on genomic features and frame corrected expressed sequence tags. DNA Res 2013; 20:135-50. [PMID: 23315666 PMCID: PMC3628444 DOI: 10.1093/dnares/dss039] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Indexed: 11/29/2022]
Abstract
Citrus, as one of the globally important fruit trees, has been an object of interest for understanding genetics and evolutionary processes in fruit crops. Meta-analyses of 19 Citrus species, including 4 globally and economically important Citrus sinensis, Citrus clementina, Citrus reticulata, and 1 Citrus relative Poncirus trifoliata, were performed. We observed that codons ending with A or T at the wobble position were preferred over C- or G-ending codons, indicating a close association with the AT richness of Citrus species and P. trifoliata. The present study postulates a large repertoire of optimal codons for the Citrus genus and P. trifoliata and demonstrates that GCT and GGT are evolutionarily conserved optimal codons. Our observations suggested that mutational bias is the dominant force in shaping the codon usage bias (CUB) in Citrus and P. trifoliata. Correspondence analysis (COA) revealed that the principal axis [axis 1; COA/relative synonymous codon usage (RSCU)] contributes only a minor portion (∼10.96%) of the recorded variance. In all analysed species, except P. trifoliata, Gravy and aromaticity played minor roles in resolving CUB. Compositional constraints were found to be strongly associated with the amino acid signatures in Citrus species and P. trifoliata. Our present analysis postulates compositional constraints in Citrus species and P. trifoliata and a plausible role of stress with GC3 and the coevolution pattern of amino acids.
Affiliation(s)
- Touqeer Ahmad
- Key Laboratory of Horticultural Plant Biology MOE, Huazhong Agricultural University, Wuhan 430070, China
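The abstract above relies on relative synonymous codon usage (RSCU). As an illustration only, the following Python sketch computes RSCU with the commonly used definition (observed codon count divided by the count expected if all synonymous codons for that amino acid were used equally). It is not the authors' pipeline; the function name `rscu`, the use of the standard genetic code, and the toy input sequence are assumptions made for this example.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: standard code applies).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for an in-frame coding sequence (hypothetical helper)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing incomplete codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    # Group sense codons into synonymous families, one family per amino acid.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the sequence: RSCU undefined
        for c in codons:
            # observed count / (family total / number of synonymous codons)
            values[c] = counts[c] * len(codons) / total
    return values


example = rscu("GCTGCTGCTGCC")  # toy sequence: four alanine codons
print(example["GCT"], example["GCC"], example["GCA"])
```

An RSCU above 1 means a codon is used more often than expected under equal synonymous usage, and a value below 1 means it is used less often.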
7
Codon Usage Patterns in Corynebacterium glutamicum: Mutational Bias, Natural Selection and Amino Acid Conservation. Comp Funct Genomics 2010; 2010:343569. [PMID: 20445740 PMCID: PMC2860111 DOI: 10.1155/2010/343569] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 10/12/2009] [Revised: 01/29/2010] [Accepted: 02/04/2010] [Indexed: 11/17/2022]
Abstract
The alternative synonymous codons in Corynebacterium glutamicum, a well-known bacterium used in industry for the production of amino acids, have been investigated by multivariate analysis. As C. glutamicum is a GC-rich organism, G and C are expected to predominate at the third position of codons. Indeed, overall codon usage analyses have indicated that C and/or G ending codons are predominant in this organism. Through multivariate statistical analysis, apart from mutational selection, we identified three other trends of codon usage variation among the genes. Firstly, the majority of highly expressed genes are scattered towards the positive end of the first axis, whereas the majority of lowly expressed genes are clustered towards the other end of the first axis. Furthermore, the distinct difference between the two sets of genes was that the C ending codons are predominant in putatively highly expressed genes, suggesting that the C ending codons are translationally optimal in this organism. Secondly, the majority of the putatively highly expressed genes tend to be located on the leading strand, which indicates that replicational and transcriptional selection might be invoked. Thirdly, highly expressed genes are more conserved than lowly expressed genes, as judged by synonymous and nonsynonymous substitutions among orthologous genes from the genomes of C. glutamicum and C. diphtheriae. We also analyzed other factors such as the length of genes and hydrophobicity that might influence codon usage and found their contributions to be weak.
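The abstract above refers to G and C content at the third codon position (often abbreviated GC3). A minimal Python sketch of that quantity follows, assuming an in-frame coding sequence. The function name `gc3` and the toy sequence are hypothetical, and this simple version counts every complete codon (some published variants exclude Met, Trp and stop codons), so it is an illustration and not the method used in the paper.

```python
def gc3(cds):
    """Fraction of complete codons whose third base is G or C (hypothetical helper)."""
    cds = cds.upper()
    third_bases = cds[2::3]  # positions 3, 6, 9, ... of an in-frame sequence
    if not third_bases:
        return 0.0  # no complete codon available
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Toy example: codons ATG, GCC, AAA have third bases G, C, A.
print(gc3("ATGGCCAAA"))
```

Comparing such values between putatively highly and lowly expressed gene sets is one simple way to look for the kind of third-position bias the abstract describes.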
8
The genome sequence of Psychrobacter arcticus 273-4, a psychroactive Siberian permafrost bacterium, reveals mechanisms for adaptation to low-temperature growth. Appl Environ Microbiol 2010; 76:2304-12. [PMID: 20154119 DOI: 10.1128/aem.02101-09] [Citation(s) in RCA: 123] [Impact Index Per Article: 8.8] [Indexed: 11/20/2022]
Abstract
Psychrobacter arcticus strain 273-4, which grows at temperatures as low as -10 °C, is the first cold-adapted bacterium from a terrestrial environment whose genome was sequenced. Analysis of the 2.65-Mb genome suggested that some of the strategies employed by P. arcticus 273-4 for survival under cold and stress conditions are changes in membrane composition, synthesis of cold shock proteins, and the use of acetate as an energy source. Comparative genome analysis indicated that in a significant portion of the P. arcticus proteome there is reduced use of the acidic amino acids and proline and arginine, which is consistent with increased protein flexibility at low temperatures. Differential amino acid usage occurred in all gene categories, but it was more common in gene categories essential for cell growth and reproduction, suggesting that P. arcticus evolved to grow at low temperatures. Amino acid adaptations and the gene content likely evolved in response to the long-term freezing temperatures (-10 °C to -12 °C) of the Kolyma (Siberia) permafrost soil from which this strain was isolated. Intracellular water likely does not freeze at these in situ temperatures, which allows P. arcticus to live at subzero temperatures.
9
Why there is more to protein evolution than protein function: splicing, nucleosomes and dual-coding sequence. Biochem Soc Trans 2009; 37:756-61. [PMID: 19614589 DOI: 10.1042/bst0370756] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.1] [Indexed: 12/19/2022]
Abstract
There is considerable variation in the rate at which different proteins evolve. Why is this? Classically, it has been considered that the density of functionally important sites must predict rates of protein evolution. Likewise, amino acid choice is usually assumed to reflect optimal protein function. In the present article, we briefly review evidence suggesting that this protein function-centred view is too simplistic. In particular, we concentrate on how selection acting during the protein's production history can also affect protein evolutionary rates and amino acid choice. Exploring the role of selection at the DNA and RNA level, we specifically address how the need (i) to specify exonic splice enhancer motifs in pre-mRNA, and (ii) to ensure nucleosome positioning on DNA have an impact on amino acid choice and rates of evolution. For both, we review evidence that sequence affected by more than one coding demand is particularly constrained. Strikingly, in mammals, splicing-related constraints are quantitatively as important as expression parameters in predicting rates of protein evolution. These results indicate that there is substantially more to protein evolution than protein functional constraints.
10
Barbe V, Cruveiller S, Kunst F, Lenoble P, Meurice G, Sekowska A, Vallenet D, Wang T, Moszer I, Médigue C, Danchin A. From a consortium sequence to a unified sequence: the Bacillus subtilis 168 reference genome a decade later. Microbiology (Reading) 2009; 155:1758-1775. [PMID: 19383706 PMCID: PMC2885750 DOI: 10.1099/mic.0.027839-0] [Citation(s) in RCA: 257] [Impact Index Per Article: 17.1] [Received: 01/26/2009] [Revised: 02/25/2009] [Accepted: 02/25/2009] [Indexed: 11/18/2022]
Abstract
Comparative genomics is the cornerstone of identification of gene functions. The immense number of living organisms precludes experimental identification of functions except in a handful of model organisms. The bacterial domain is split into large branches, among which the Firmicutes occupy a considerable space. Bacillus subtilis has been the model of Firmicutes for decades and its genome has been a reference for more than 10 years. Sequencing the genome involved more than 30 laboratories, with different expertise, in an attempt to make the most of the experimental information that could be associated with the sequence. This had the expected drawback that the sequencing expertise was quite varied among the groups involved, especially at a time when sequencing genomes was extremely hard work. The recent development of very efficient, fast and accurate sequencing techniques, in parallel with the development of high-level annotation platforms, motivated the present resequencing work. The updated sequence has been reannotated in agreement with the UniProt protein knowledge base, keeping in perspective the split between the paleome (genes necessary for sustaining and perpetuating life) and the cenome (genes required for occupation of a niche, suggesting here that B. subtilis is an epiphyte). This should permit investigators to make reliable inferences to prepare validation experiments in a variety of domains of bacterial growth and development as well as build up accurate phylogenies.
Affiliation(s)
- Valérie Barbe
- CEA, Institut de Génomique, Génoscope, 2 rue Gaston Crémieux, 91057 Évry, France
- Stéphane Cruveiller
- CEA, Institut de Génomique, Laboratoire de Génomique Comparative/CNRS UMR8030, Génoscope, 2 rue Gaston Crémieux, 91057 Évry, France
- Frank Kunst
- CEA, Institut de Génomique, Génoscope, 2 rue Gaston Crémieux, 91057 Évry, France
- Patricia Lenoble
- CEA, Institut de Génomique, Génoscope, 2 rue Gaston Crémieux, 91057 Évry, France
- Guillaume Meurice
- Institut Pasteur, Intégration et Analyse Génomiques, 28 rue du Docteur Roux, 75724 Paris Cedex 15, France
- Agnieszka Sekowska
- Institut Pasteur, Génétique des Génomes Bactériens/CNRS URA2171, 28 rue du Docteur Roux, 75724 Paris Cedex 15, France
- David Vallenet
- CEA, Institut de Génomique, Laboratoire de Génomique Comparative/CNRS UMR8030, Génoscope, 2 rue Gaston Crémieux, 91057 Évry, France
- Tingzhang Wang
- Institut Pasteur, Génétique des Génomes Bactériens/CNRS URA2171, 28 rue du Docteur Roux, 75724 Paris Cedex 15, France
- Ivan Moszer
- Institut Pasteur, Intégration et Analyse Génomiques, 28 rue du Docteur Roux, 75724 Paris Cedex 15, France
- Claudine Médigue
- CEA, Institut de Génomique, Laboratoire de Génomique Comparative/CNRS UMR8030, Génoscope, 2 rue Gaston Crémieux, 91057 Évry, France
- Antoine Danchin
- Institut Pasteur, Génétique des Génomes Bactériens/CNRS URA2171, 28 rue du Docteur Roux, 75724 Paris Cedex 15, France
11
de Bivort BL, Perlstein EO, Kunes S, Schreiber SL. Amino acid metabolic origin as an evolutionary influence on protein sequence in yeast. J Mol Evol 2009; 68:490-7. [PMID: 19357800 PMCID: PMC2687519 DOI: 10.1007/s00239-009-9218-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 09/02/2008] [Revised: 11/25/2008] [Accepted: 02/19/2009] [Indexed: 11/25/2022]
Abstract
The metabolic cycle of Saccharomyces cerevisiae consists of alternating oxidative (respiration) and reductive (glycolysis) energy-yielding reactions. The intracellular concentrations of amino acid precursors generated by these reactions oscillate accordingly, attaining maximal concentration during the middle of their respective yeast metabolic cycle phases. Typically, the amino acids themselves are most abundant at the end of their precursor's phase. We show that this metabolic cycling has likely biased the amino acid composition of proteins across the S. cerevisiae genome. In particular, we observed that the metabolic source of amino acids is the single most important source of variation in the amino acid compositions of functionally related proteins and that this signal appears only in (facultative) organisms using both oxidative and reductive metabolism. Periodically expressed proteins are enriched for amino acids generated in the preceding phase of the metabolic cycle. Proteins expressed during the oxidative phase contain more glycolysis-derived amino acids, whereas proteins expressed during the reductive phase contain more respiration-derived amino acids. Rare amino acids (e.g., tryptophan) are greatly overrepresented or underrepresented, relative to the proteomic average, in periodically expressed proteins, whereas common amino acids vary by a few percent. Genome-wide, we infer that 20,000 to 60,000 residues have been modified by this previously unappreciated pressure. This trend is strongest in ancient proteins, suggesting that oscillating endogenous amino acid availability exerted genome-wide selective pressure on protein sequences across evolutionary time.
Affiliation(s)
- Benjamin L de Bivort
- Rowland Institute at Harvard, Harvard University, 100 Edwin Land Boulevard, Cambridge, MA 02142, USA.
12
Abstract
ORFan genes can constitute a large fraction of a bacterial genome, but due to their lack of homologs, their functions have remained largely unexplored. To determine if particular features of ORFan-encoded proteins promote their presence in a genome, we analyzed properties of ORFans that originated over a broad evolutionary timescale. We also compared ORFan genes to another class of acquired genes, heterogeneous occurrence in prokaryotes (HOPs), which have homologs in other bacteria. A total of 54 ORFan and HOP genes selected from different phylogenetic depths in the Escherichia coli lineage were cloned, expressed, purified, and subjected to circular dichroism (CD) spectroscopy. A majority of genes could be expressed, but only 18 yielded sufficient soluble protein for spectral analysis. Of these, half were significantly alpha-helical, three were predominantly beta-sheet, and six were of intermediate/indeterminate structure. Although a higher proportion of HOPs yielded soluble proteins with resolvable secondary structures, ORFans resembled HOPs with regard to most of the other features tested. Overall, we found that those ORFan and HOP genes that have persisted in the E. coli lineage were more likely to encode soluble and folded proteins, more likely to display environmental modulation of their gene expression, and by extrapolation, are more likely to be functional.
Affiliation(s)
- Hema Prasad Narra
- Department of Biochemistry & Molecular Biophysics, University of Arizona, Tucson, AZ 85721, USA
13
Chapter 1 A Phylogenetic View of Bacterial Ribonucleases. Prog Mol Biol Transl Sci 2009; 85:1-41. [DOI: 10.1016/s0079-6603(08)00801-5] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Indexed: 01/06/2023]
14
Danchin A, Fang G, Noria S. The extant core bacterial proteome is an archive of the origin of life. Proteomics 2007; 7:875-89. [PMID: 17370266 DOI: 10.1002/pmic.200600442] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.5] [Indexed: 11/10/2022]
Abstract
Genes consistently present in a clique of genomes, preferring the leading DNA strands, are deemed persistent. The persistent bacterial proteome organises around intermediary and RNA metabolism, and RNA-related information transfer, with a significant contribution to compartmentalisation. Despite inevitable losses during evolution, the extant persistent proteome displays functions present early on. Proteins coded by genes staying clustered in a majority of genomes constitute a network of mutual attraction made up of three concentric circles. The outer one, mostly devoted to metabolism, breaks into small pieces and fades away. The second, more continuous, one organises around class I tRNA synthetases. The well-connected inner circle comprises the ribosome and information transfer. This reflects the progressive construction of cells, starting from the metabolism of coenzymes, nucleotides and fatty acids-related molecules. Subsequently, a core set of aminoacyl-tRNA synthetases scaffolded around RNA, connected to cell division machinery and organised metabolism around translation. This remarkable organisation reflects the evolution of life from small-molecule metabolism to the RNA world, suggesting that extant microorganisms carry the marks of the ancient processes that created life. Further analysis suggests that RNA degradation, associated with the presence of iron, still plays a role in extant metabolism, including the evolution of genome structures.
Affiliation(s)
- Antoine Danchin
- Génétique des Génomes Bactériens, Institut Pasteur, Paris, France.
15
Norris V, den Blaauwen T, Cabin-Flaman A, Doi RH, Harshey R, Janniere L, Jimenez-Sanchez A, Jin DJ, Levin PA, Mileykovskaya E, Minsky A, Saier M, Skarstad K. Functional taxonomy of bacterial hyperstructures. Microbiol Mol Biol Rev 2007; 71:230-53. [PMID: 17347523 PMCID: PMC1847379 DOI: 10.1128/mmbr.00035-06] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.2] [Indexed: 11/20/2022]
Abstract
The levels of organization that exist in bacteria extend from macromolecules to populations. Evidence that there is also a level of organization intermediate between the macromolecule and the bacterial cell is accumulating. This is the level of hyperstructures. Here, we review a variety of spatially extended structures, complexes, and assemblies that might be termed hyperstructures. These include ribosomal or "nucleolar" hyperstructures; transertion hyperstructures; putative phosphotransferase system and glycolytic hyperstructures; chemosignaling and flagellar hyperstructures; DNA repair hyperstructures; cytoskeletal hyperstructures based on EF-Tu, FtsZ, and MreB; and cell cycle hyperstructures responsible for DNA replication, sequestration of newly replicated origins, segregation, compaction, and division. We propose principles for classifying these hyperstructures and finally illustrate how thinking in terms of hyperstructures may lead to a different vision of the bacterial cell.
Affiliation(s)
- Vic Norris
- Department of Science, University of Rouen, 76821 Mont Saint Aignan Cedex, France.
16
Delcher AL, Bratke KA, Powers EC, Salzberg SL. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics 2007; 23:673-9. [PMID: 17237039 PMCID: PMC2387122 DOI: 10.1093/bioinformatics/btm009] [Citation(s) in RCA: 2377] [Impact Index Per Article: 139.8] [Indexed: 11/12/2022]
Abstract
MOTIVATION: The Glimmer gene-finding software has been successfully used for finding genes in bacteria, archaea and viruses representing hundreds of species. We describe several major changes to the Glimmer system, including improved methods for identifying both coding regions and start codons. We also describe a new module of Glimmer that can distinguish host and endosymbiont DNA. This module was developed in response to the discovery that eukaryotic genome sequencing projects sometimes inadvertently capture the DNA of intracellular bacteria living in the host. RESULTS: The new methods dramatically reduce the rate of false-positive predictions, while maintaining Glimmer's 99% sensitivity rate at detecting genes in most species, and they find substantially more correct start sites, as measured by comparisons to known and well-curated genes. We show that our interpolated Markov model (IMM) DNA discriminator correctly separated 99% of the sequences in a recent genome project that produced a mixture of sequences from the bacterium Prochloron didemni and its sea squirt host, Lissoclinum patella. AVAILABILITY: Glimmer is OSI Certified Open Source and available at http://cbcb.umd.edu/software/glimmer.
Affiliation(s)
- Arthur L Delcher
- Center for Bioinformatics & Computational Biology, University of Maryland, College Park, MD 20742, USA.
17
Turlin E, Pascal G, Rousselle JC, Lenormand P, Ngo S, Danchin A, Derzelle S. Proteome analysis of the phenotypic variation process in Photorhabdus luminescens. Proteomics 2006; 6:2705-25. [PMID: 16548063 DOI: 10.1002/pmic.200500646] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.7] [Indexed: 11/08/2022]
Abstract
Photorhabdus luminescens is an insect pathogen associated with specific soil nematodes. The bacterium has a complex life cycle with a symbiotic stage in which bacteria colonize the intestinal tract of the nematodes, and a pathogenic stage against susceptible larval-stage insects. Symbiosis-"deficient" phenotypic variants (known as secondary forms) arise during prolonged incubation. Correspondence analysis of the in silico proteome translated from the genome sequence of strain TT01 identified two major biases in the amino acid composition of the proteins. We analyzed the proteome, separating three classes of extracts: cellular, extracellular, and membrane-associated proteins, resolved by 2-DE. Approximately 450 spots matching the translation products of 231 different coding DNA sequences were identified by PMF. A comparative analysis was performed to characterize the protein content of both variants. Differences were evident during stationary growth phase. Very few proteins were found in variant II supernatants, and numerous proteins were lacking in the membrane-associated fraction. Proteins up-regulated by the phenotypic variation phenomenon were involved in oxidative stress, energy metabolism, and translation. The transport and binding of iron, sugars and amino acids were also affected and molecular chaperones were strongly down-regulated. A potential role for H-NS in phenotypic variation control is discussed.
Affiliation(s)
- Evelyne Turlin
- Unité de Génétique des Génomes Bactériens, Département de Structure et Dynamique des Génomes, Institut Pasteur, Paris, France.
18
Frutos R, Viari A, Ferraz C, Morgat A, Eychenié S, Kandassamy Y, Chantal I, Bensaid A, Coissac E, Vachiery N, Demaille J, Martinez D. Comparative genomic analysis of three strains of Ehrlichia ruminantium reveals an active process of genome size plasticity. J Bacteriol 2006; 188:2533-42. [PMID: 16547041 PMCID: PMC1428390 DOI: 10.1128/jb.188.7.2533-2542.2006] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.6] [Indexed: 11/20/2022]
Abstract
Ehrlichia ruminantium is the causative agent of heartwater, a major tick-borne disease of livestock in Africa that has been introduced in the Caribbean and is threatening to emerge and spread on the American mainland. We sequenced the complete genomes of two strains of E. ruminantium of differing phenotypes, strains Gardel (Erga; 1,499,920 bp), from the island of Guadeloupe, and Welgevonden (Erwe; 1,512,977 bp), originating in South Africa and maintained in Guadeloupe in a different cell environment. Comparative genomic analysis of these two strains was performed with the recently published parent strain of Erwe (Erwo) and other Rickettsiales (Anaplasma, Wolbachia, and Rickettsia spp.). Gene order is highly conserved between the E. ruminantium strains and with A. marginale. In contrast, there is very little conservation of gene order with members of the Rickettsiaceae. However, gene order may be locally conserved, as illustrated by the tuf operons. Eighteen truncated protein-encoding sequences (CDSs) differentiate Erga from Erwe/Erwo, whereas four other truncated CDSs differentiate Erwe from Erwo. Moreover, E. ruminantium displays the lowest coding ratio observed among bacteria due to unusually long intergenic regions. This is related to an active process of genome expansion/contraction targeted at tandem repeats in noncoding regions and based on the addition or removal of ca. 150-bp tandem units. This process seems to be specific to E. ruminantium and is not observed in the other Rickettsiales.
Affiliation(s)
- Roger Frutos
- CIRAD TA30/G, Campus International de Baillarguet, 34398 Montpellier Cedex 5, France.
|
19
|
Bailly-Bechet M, Danchin A, Iqbal M, Marsili M, Vergassola M. Codon usage domains over bacterial chromosomes. PLoS Comput Biol 2006; 2:e37. [PMID: 16683018 PMCID: PMC1447655 DOI: 10.1371/journal.pcbi.0020037] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.1] [Received: 11/21/2005] [Accepted: 03/13/2006] [Indexed: 11/19/2022] Open
Abstract
The geography of codon bias distributions over prokaryotic genomes and its impact upon chromosomal organization are analyzed. To this aim, we introduce a clustering method based on information theory, specifically designed to cluster genes according to their codon usage and apply it to the coding sequences of Escherichia coli and Bacillus subtilis. One of the clusters identified in each of the organisms is found to be related to expression levels, as expected, but other groups feature an over-representation of genes belonging to different functional groups, namely horizontally transferred genes, motility, and intermediary metabolism. Furthermore, we show that genes with a similar bias tend to be close to each other on the chromosome and organized in coherent domains, more extended than operons, demonstrating a role of translation in structuring bacterial chromosomes. It is argued that a sizeable contribution to this effect comes from the dynamical compartmentalization induced by the recycling of tRNAs, leading to gene expression rates dependent on their genomic and expression context.
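As a rough illustration of what comparing genes "according to their codon usage" with an information-theoretic measure can look like, the sketch below computes codon frequency distributions and the Jensen-Shannon divergence between them. This is not the clustering method of the paper, which the abstract does not specify; the function names and toy sequences are hypothetical, and raw codon frequencies are used rather than frequencies normalized per amino acid.

```python
from collections import Counter
from math import log2


def codon_frequencies(cds):
    """Relative codon frequencies of an in-frame coding sequence (hypothetical helper)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


def jensen_shannon(p, q):
    """Jensen-Shannon divergence, in bits, between two codon distributions."""
    keys = set(p) | set(q)
    # Mixture distribution; every key has positive mass in at least one input.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl_to_mixture(a):
        return sum(a[k] * log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


# Toy sequences (assumptions): gene_a and gene_b encode the same peptide
# with different synonymous codons; gene_c is identical to gene_a.
gene_a = "ATGGCTGCTGCTAAA"
gene_b = "ATGGCAGCAGCAAAG"
gene_c = "ATGGCTGCTGCTAAA"

d_ab = jensen_shannon(codon_frequencies(gene_a), codon_frequencies(gene_b))
d_ac = jensen_shannon(codon_frequencies(gene_a), codon_frequencies(gene_c))
print(d_ab, d_ac)
```

A pairwise divergence of this kind could feed any standard clustering routine; genes with identical codon usage score 0, and the score grows as their codon choices diverge.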
Affiliation(s)
- Marc Bailly-Bechet
- CNRS URA 2171, Institut Pasteur, Unité Génétique in silico, Paris, France
- Antoine Danchin
- CNRS URA 2171, Institut Pasteur, Unité Génétique des Génomes Bactériens, Paris, France
- Mudassar Iqbal
- Abdus Salam International Center for Theoretical Physics, Trieste, Italy
- Computing Laboratory, University of Kent, Canterbury, Kent, United Kingdom
- Matteo Marsili
- Abdus Salam International Center for Theoretical Physics, Trieste, Italy
- Massimo Vergassola
- CNRS URA 2171, Institut Pasteur, Unité Génétique in silico, Paris, France
- * To whom correspondence should be addressed. E-mail:
|
20
|
Das S, Paul S, Dutta C. Evolutionary constraints on codon and amino acid usage in two strains of human pathogenic actinobacteria Tropheryma whipplei. J Mol Evol 2006; 62:645-58. [PMID: 16557339 DOI: 10.1007/s00239-005-0164-6] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 07/04/2005] [Accepted: 12/20/2005] [Indexed: 12/13/2022]
Abstract
The factors governing codon and amino acid usages in the predicted protein-coding sequences of Tropheryma whipplei TW08/27 and Twist genomes have been analyzed. Multivariate analysis identifies the replicational-transcriptional selection coupled with DNA strand-specific asymmetric mutational bias as a major driving force behind the significant interstrand variations in synonymous codon usage patterns in T. whipplei genes, while a residual intrastrand synonymous codon bias is imparted by a selection force operating at the level of translation. The strand-specific mutational pressure has little influence on the amino acid usage, for which the mean hydropathy level and aromaticity are the major sources of variation, both having nearly equal impact. In spite of the intracellular lifestyle, the amino acid usage in highly expressed gene products of T. whipplei follows the cost-minimization hypothesis. The products of the highly expressed genes of these relatively A + T-rich actinobacteria prefer to use the residues encoded by GC-rich codons, probably due to greater conservation of a GC-rich ancestral state in the highly expressed genes, as suggested by the lower values of the rate of nonsynonymous divergences between orthologous sequences of highly expressed genes from the two strains of T. whipplei. Both the genomes under study are characterized by the presence of two distinct groups of membrane-associated genes, products of which exhibit significant differences in primary and potential secondary structures as well as in the propensity of protein disorder.
Affiliation(s)
- Sabyasachi Das
- Bioinformatics Centre, Indian Institute of Chemical Biology, 4 Raja S. C. Mullick Road, Kolkata 700 032, India
|
21
|
Jestin JL. Degeneracy in the genetic code and its symmetries by base substitutions. C R Biol 2006; 329:168-71. [PMID: 16545757 DOI: 10.1016/j.crvi.2006.01.003] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 01/05/2006] [Accepted: 01/10/2006] [Indexed: 11/16/2022]
Abstract
Degeneracy in the genetic code is known to minimise the deleterious effects of the most frequent base substitutions: transitions at the third base of codons are generally synonymous substitutions. Transversions that alter degeneracy were reported by Rumer. Here the other transversions are shown to leave invariant degeneracy when applied to the first base of codons. As a summary, degeneracy is considered with respect to all three types of base substitutions, the transitions and the two types of transversions. The symmetries of degeneracy by base substitutions are independent of the representation of the genetic code and discussed with respect to the quasi-universality of the genetic code.
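The statement that third-base transitions are generally synonymous can be checked directly against the standard genetic code. The sketch below is only an illustration of that one claim, not a reproduction of the paper's symmetry analysis; the helper name is hypothetical, and a stop-to-stop change is counted as synonymous by assumption.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered with
# bases T, C, A, G at each position; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Transitions exchange pyrimidines (T <-> C) or purines (A <-> G).
TRANSITION = {"T": "C", "C": "T", "A": "G", "G": "A"}


def synonymous_transitions(position):
    """Count codons whose transition at `position` (0, 1 or 2) keeps the
    encoded amino acid (or stop signal) unchanged. Hypothetical helper."""
    count = 0
    for codon, aa in CODON_TABLE.items():
        mutated = (
            codon[:position]
            + TRANSITION[codon[position]]
            + codon[position + 1:]
        )
        if CODON_TABLE[mutated] == aa:
            count += 1
    return count


third = synonymous_transitions(2)
first = synonymous_transitions(0)
print(f"third base: {third} of {len(CODON_TABLE)} codons unchanged")
print(f"first base: {first} of {len(CODON_TABLE)} codons unchanged")
```

In the standard code the only third-base transitions that change the meaning are ATA/ATG (isoleucine versus methionine) and TGA/TGG (stop versus tryptophan), whereas first-base transitions are almost never synonymous.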
Affiliation(s)
- Jean-Luc Jestin
- Unité de chimie organique, département de biologie structurale et chimie, Institut Pasteur, 28, rue du Docteur-Roux, 75724 Paris cedex 15, France
|
22
|
Vallenet D, Labarre L, Rouy Z, Barbe V, Bocs S, Cruveiller S, Lajus A, Pascal G, Scarpelli C, Médigue C. MaGe: a microbial genome annotation system supported by synteny results. Nucleic Acids Res 2006; 34:53-65. [PMID: 16407324 PMCID: PMC1326237 DOI: 10.1093/nar/gkj406] [Citation(s) in RCA: 323] [Impact Index Per Article: 17.9] [Indexed: 11/14/2022] Open
Abstract
Magnifying Genomes (MaGe) is a microbial genome annotation system based on a relational database containing information on bacterial genomes, as well as a web interface to achieve genome annotation projects. Our system allows one to initiate the annotation of a genome at the early stage of the finishing phase. MaGe's main features are (i) integration of annotation data from bacterial genomes enhanced by a gene coding re-annotation process using accurate gene models, (ii) integration of results obtained with a wide range of bioinformatics methods, among which exploration of gene context by searching for conserved synteny and reconstruction of metabolic pathways, (iii) an advanced web interface allowing multiple users to refine the automatic assignment of gene product functions. MaGe is also linked to numerous well-known biological databases and systems. Our system has been thoroughly tested during the annotation of complete bacterial genomes (Acinetobacter baylyi ADP1, Pseudoalteromonas haloplanktis, Frankia alni) and is currently used in the context of several new microbial genome annotation projects. In addition, MaGe allows for annotation curation and exploration of already published genomes from various genera (e.g. Yersinia, Bacillus and Neisseria). MaGe can be accessed at .
Affiliation(s)
- David Vallenet
- Atelier de Génomique Comparative, CNRS-UMR8030, 2 rue Gaston Crémieux, 91057 Evry, Cedex, France.
|
23
|
Pascal G, Médigue C, Danchin A. Persistent biases in the amino acid composition of prokaryotic proteins. Bioessays 2006; 28:726-38. [PMID: 16850406 DOI: 10.1002/bies.20431] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Indexed: 11/08/2022]
Abstract
Correspondence analysis of 28 proteomes selected to span the entire realm of prokaryotes revealed universal biases in the proteins' amino acid distribution. Integral Inner Membrane Proteins always form an individual cluster, which can then be used to predict protein localisation in unknown proteomes, independently of the organism's biotope or kingdom. Orphan proteins are consistently rich in aromatic residues. Another bias is also ubiquitous: the amino acid composition is driven by the G + C content of the first codon position. An unexpected bias is driven, in many proteomes, by the AAN box of the genetic code, suggesting some functional biochemical relationship between asparagine and lysine. Less-significant biases are driven by the rare amino acids, cysteine and tryptophan. Some allow identification of species-specific functions or localisation such as surface or exported proteins. Errors in genome annotations are also revealed by correspondence analysis, making it useful for quality control and correction.
Affiliation(s)
- Géraldine Pascal
- Genoscope/CNRS UMR 8030, Atelier de Génomique Comparative, Evry, France
|
24
|
Médigue C, Krin E, Pascal G, Barbe V, Bernsel A, Bertin PN, Cheung F, Cruveiller S, D'Amico S, Duilio A, Fang G, Feller G, Ho C, Mangenot S, Marino G, Nilsson J, Parrilli E, Rocha EPC, Rouy Z, Sekowska A, Tutino ML, Vallenet D, von Heijne G, Danchin A. Coping with cold: the genome of the versatile marine Antarctica bacterium Pseudoalteromonas haloplanktis TAC125. Genome Res 2005; 15:1325-35. [PMID: 16169927 PMCID: PMC1240074 DOI: 10.1101/gr.4126905] [Citation(s) in RCA: 285] [Impact Index Per Article: 15.0] [Indexed: 11/24/2022]
Abstract
A considerable fraction of life develops in the sea at temperatures lower than 15 °C. Little is known about the adaptive features selected under those conditions. We present the analysis of the genome sequence of the fast growing Antarctica bacterium Pseudoalteromonas haloplanktis TAC125. We find that it copes with the increased solubility of oxygen at low temperature by multiplying dioxygen scavenging while deleting whole pathways producing reactive oxygen species. Dioxygen-consuming lipid desaturases achieve both protection against oxygen and synthesis of lipids making the membrane fluid. A remarkable strategy for avoidance of reactive oxygen species generation is developed by P. haloplanktis, with elimination of the ubiquitous molybdopterin-dependent metabolism. The P. haloplanktis proteome reveals a concerted amino acid usage bias specific to psychrophiles, consistently appearing apt to accommodate asparagine, a residue prone to make proteins age. Adding to its originality, P. haloplanktis further differs from its marine counterparts with recruitment of a plasmid origin of replication for its second chromosome.
Affiliation(s)
- Claudine Médigue
- Genoscope, CNRS-UMR 8030, Atelier de Génomique Comparative, 91006 Evry Cedex, France
|