1
Gonzalez V, Abarca-Hurtado J, Arancibia A, Claverías F, Guevara MR, Orellana R. Novel Insights on Extracellular Electron Transfer Networks in the Desulfovibrionaceae Family: Unveiling the Potential Significance of Horizontal Gene Transfer. Microorganisms 2024; 12:1796. [PMID: 39338472 PMCID: PMC11434368 DOI: 10.3390/microorganisms12091796] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2024] [Revised: 07/24/2024] [Accepted: 07/25/2024] [Indexed: 09/30/2024]
Abstract
Some sulfate-reducing bacteria (SRB), mainly belonging to the Desulfovibrionaceae family, have evolved the capability to conserve energy through microbial extracellular electron transfer (EET), suggesting that this process may be more widespread than previously believed. While previous evidence has shown that mobile genetic elements drive the plasticity and evolution of SRB and iron-reducing bacteria (FeRB), few studies have investigated the shared molecular mechanisms related to EET. To address this, we analyzed the prevalence and abundance of EET elements, and how these elements contributed to differentiation, among 42 members of the Desulfovibrionaceae family and 23 and 59 members of Geobacteraceae and Shewanellaceae, respectively. Proteins involved in EET, such as the cytochromes PpcA and CymA, the outer membrane protein OmpJ, and the iron-sulfur cluster-binding CbcT, exhibited widespread distribution within Desulfovibrionaceae. Some of these showed modular diversification. Additional evidence revealed that horizontal gene transfer was involved in the acquisition and loss of critical genes, increasing the diversification and plasticity between the three families. The results suggest that specific EET genes were widely disseminated through horizontal transfer, where some changes reflected environmental adaptations. These findings enhance our comprehension of the evolution and distribution of proteins involved in EET processes, shedding light on their role in iron and sulfur biogeochemical cycling.
Affiliation(s)
- Valentina Gonzalez
  - Laboratorio de Biología Celular y Ecofisiología Microbiana, Facultad de Ciencias Naturales y Exactas, Universidad de Playa Ancha, Leopoldo Carvallo 270, Valparaíso 2360001, Chile
  - Laboratorio de Microbiología Molecular y Biotecnología Ambiental, Departamento de Química & Centro de Biotecnología Daniel Alkalay-Lowitt, Universidad Técnica Federico Santa María, Avenida España 1680, Valparaíso 2390123, Chile
  - Departamento de Química y Medio Ambiente, Sede Viña del Mar, Universidad Técnica Federico Santa María, Avenida Federico Santa María 6090, Viña del Mar 2520000, Chile
- Josefina Abarca-Hurtado
  - Laboratorio de Biología Celular y Ecofisiología Microbiana, Facultad de Ciencias Naturales y Exactas, Universidad de Playa Ancha, Leopoldo Carvallo 270, Valparaíso 2360001, Chile
- Alejandra Arancibia
  - Laboratorio de Biología Celular y Ecofisiología Microbiana, Facultad de Ciencias Naturales y Exactas, Universidad de Playa Ancha, Leopoldo Carvallo 270, Valparaíso 2360001, Chile
  - HUB Ambiental UPLA, Universidad de Playa Ancha, Leopoldo Carvallo 207, Playa Ancha, Valparaíso 2340000, Chile
- Fernanda Claverías
  - Laboratorio de Microbiología Molecular y Biotecnología Ambiental, Departamento de Química & Centro de Biotecnología Daniel Alkalay-Lowitt, Universidad Técnica Federico Santa María, Avenida España 1680, Valparaíso 2390123, Chile
- Miguel R. Guevara
  - Laboratorio de Data Science, Facultad de Ingeniería, Universidad de Playa Ancha, Leopoldo Carvallo 270, Valparaíso 2340000, Chile
- Roberto Orellana
  - Laboratorio de Biología Celular y Ecofisiología Microbiana, Facultad de Ciencias Naturales y Exactas, Universidad de Playa Ancha, Leopoldo Carvallo 270, Valparaíso 2360001, Chile
  - HUB Ambiental UPLA, Universidad de Playa Ancha, Leopoldo Carvallo 207, Playa Ancha, Valparaíso 2340000, Chile
  - Núcleo Milenio BioGEM, Valparaíso 2390123, Chile
2
McCutcheon JP, Garber AI, Spencer N, Warren JM. How do bacterial endosymbionts work with so few genes? PLoS Biol 2024; 22:e3002577. [PMID: 38626194 PMCID: PMC11020763 DOI: 10.1371/journal.pbio.3002577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/18/2024]
Abstract
The move from a free-living environment to a long-term residence inside a host eukaryotic cell has profound effects on bacterial function. While endosymbioses are found in many eukaryotes, from protists to plants to animals, the bacteria that form these host-beneficial relationships are even more diverse. Endosymbiont genomes can become radically smaller than their free-living relatives, and their few remaining genes show extreme compositional biases. The details of how these reduced and divergent gene sets work, and how they interact with their host cell, remain mysterious. This Unsolved Mystery reviews how genome reduction alters endosymbiont biology and highlights a "tipping point" where the loss of the ability to build a cell envelope coincides with a marked erosion of translation-related genes.
Affiliation(s)
- John P. McCutcheon
  - Biodesign Institute and School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America
  - Howard Hughes Medical Institute, Chevy Chase, Maryland, United States of America
- Arkadiy I. Garber
  - Biodesign Institute and School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America
- Noah Spencer
  - Biodesign Institute and School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America
- Jessica M. Warren
  - Biodesign Institute and School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America
  - Howard Hughes Medical Institute, Chevy Chase, Maryland, United States of America
3
Biosynthetic constraints on amino acid synthesis at the base of the food chain may determine their use in higher-order consumer genomes. PLoS Genet 2023; 19:e1010635. [PMID: 36780875 PMCID: PMC9956874 DOI: 10.1371/journal.pgen.1010635] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Revised: 02/24/2023] [Accepted: 01/24/2023] [Indexed: 02/15/2023]
Abstract
Dietary nutrient composition is essential for shaping important fitness traits and behaviours. Many organisms are protein limited, and for Drosophila melanogaster this limitation manifests at the level of the single most limiting essential Amino Acid (AA) in the diet. The identity of this AA and its effects on female fecundity is readily predictable by a procedure called exome matching in which the sum of AAs encoded by a consumer's exome is used to predict the relative proportion of AAs required in its diet. However, the exome matching calculation does not weight AA contributions to the overall profile by protein size or expression. Here, we update the exome matching calculation to include these weightings. Surprisingly, although nearly half of the transcriptome is differentially expressed when comparing male and female flies, we found that creating transcriptome-weighted exome matched diets for each sex did not enhance their fecundity over that supported by exome matching alone. These data indicate that while organisms may require different amounts of dietary protein across conditions, the relative proportion of the constituent AAs remains constant. Interestingly, we also found that exome matched AA profiles are generally conserved across taxa and that the composition of these profiles might be explained by energetic and elemental limitations on microbial AA synthesis. Thus, it appears that ecological constraints amongst autotrophs shape the relative proportion of AAs that are available across trophic levels and that this constrains biomass composition.
4
Panda A, Tuller T. Determinants of associations between codon and amino acid usage patterns of microbial communities and the environment inferred based on a cross-biome metagenomic analysis. NPJ Biofilms Microbiomes 2023; 9:5. [PMID: 36693851 PMCID: PMC9873608 DOI: 10.1038/s41522-023-00372-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2022] [Accepted: 01/11/2023] [Indexed: 01/25/2023]
Abstract
Codon and amino acid usage were associated with almost every aspect of microbial life. However, how the environment may impact the codon and amino acid choice of microbial communities at the habitat level is not clearly understood. Therefore, in this study, we analyzed codon and amino acid usage patterns of a large number of environmental samples collected from diverse ecological niches. Our results suggested that samples derived from similar environmental niches, in general, show overall similar codon and amino acid distribution as compared to samples from other habitats. To substantiate the relative impact of the environment, we considered several factors, such as their similarity in GC content, or in functional or taxonomic abundance. Our analysis demonstrated that none of these factors can fully explain the trends that we observed at the codon or amino acid level, implying a direct environmental influence on them. Further, our analysis demonstrated different levels of selection on codon bias in different microbial communities, with the highest bias in host-associated environments such as the digestive system or oral samples and the lowest level of selection in soil and water samples. Considering a large number of metagenomic samples, we showed here that microorganisms collected from similar environmental backgrounds exhibit similar patterns of codon and amino acid usage irrespective of the location or time from where the samples were collected. Thus, our study suggested a direct impact of the environment on codon and amino acid usage of microorganisms that cannot be explained considering the influence of other factors.
Affiliation(s)
- Arup Panda
  - Department of Biomedical Engineering, Tel Aviv University, Tel Aviv, 69978, Israel
- Tamir Tuller
  - Department of Biomedical Engineering, Tel Aviv University, Tel Aviv, 69978, Israel
5
Barceló-Antemate D, Fontove-Herrera F, Santos W, Merino E. The effect of the genomic GC content bias of prokaryotic organisms on the secondary structures of their proteins. PLoS One 2023; 18:e0285201. [PMID: 37141209 PMCID: PMC10159118 DOI: 10.1371/journal.pone.0285201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2022] [Accepted: 04/17/2023] [Indexed: 05/05/2023]
Abstract
One of the main characteristics of prokaryotic genomes is the ratio in which guanine-cytosine bases are used in their DNA sequences. This is known as the genomic GC content and varies widely, from values below 20% to values greater than 74%. It has been demonstrated that the genomic GC content varies in accordance with the phylogenetic distribution of organisms and influences the amino acid composition of their corresponding proteomes. This bias is particularly important for amino acids that are coded by GC content-rich codons such as alanine, glycine, and proline, as well as amino acids that are coded by AT-rich codons, such as lysine, asparagine, and isoleucine. In our study, we extend these results by considering the effect of the genomic GC content on the secondary structure of proteins. On a set of 192 representative prokaryotic genomes and proteome sequences, we identified through a bioinformatic study that the composition of the secondary structures of the proteomes varies in relation to the genomic GC content; random coils increase as the genomic GC content increases, while alpha-helices and beta-sheets present an inverse relationship. In addition, we found that the tendency of an amino acid to form part of a secondary structure of proteins is not ubiquitous, as previously expected, but varies according to the genomic GC content. Finally, we discovered that for some specific groups of orthologous proteins, the GC content of genes biases the composition of secondary structures of the proteins for which they code.
Affiliation(s)
- Diana Barceló-Antemate
  - Departamento de Microbiología Molecular, Instituto de Biotecnología, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
  - Centro de Investigación en Dinámica Celular, Instituto de Investigación en Ciencias Básicas y Aplicadas, Universidad Autónoma del Estado de Morelos (UAEM), Cuernavaca, Morelos, México
- Walter Santos
  - Departamento de Microbiología Molecular, Instituto de Biotecnología, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
- Enrique Merino
  - Departamento de Microbiología Molecular, Instituto de Biotecnología, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
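Several abstracts in this list, including entry 5 above, rely on genomic GC content, the fraction of guanine and cytosine bases in a DNA sequence. As a minimal sketch of that quantity (not the method of any cited paper), the function below counts G and C among unambiguous bases; the name `gc_content` and the choice to ignore ambiguous symbols such as N are assumptions made for illustration.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of G and C among unambiguous A/C/G/T bases.

    Ambiguous symbols (for example N) are ignored; this is an
    illustrative convention, not one taken from the cited papers.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return gc / total


if __name__ == "__main__":
    # Toy sequence, invented for this example.
    print(f"{gc_content('ATGCGCGGTTAA'):.2%}")
```

Applied per genome, a value computed this way is what the abstracts describe as ranging from below 20% to above 74% across prokaryotes.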
6
Masłowska-Górnicz A, van den Bosch MRM, Saccenti E, Suarez-Diez M. A large-scale analysis of codon usage bias in 4868 bacterial genomes shows association of codon adaptation index with GC content, protein functional domains and bacterial phenotypes. Biochim Biophys Acta Gene Regul Mech 2022; 1865:194826. [PMID: 35605953 DOI: 10.1016/j.bbagrm.2022.194826] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/10/2022] [Revised: 05/05/2022] [Accepted: 05/12/2022] [Indexed: 06/15/2023]
Abstract
Multiple synonymous codons code for the same amino acid, resulting in the degeneracy of the genetic code and in the preferred use of some codons, called codon bias usage (CBU). We performed a large-scale analysis of codon usage bias, analysing the distribution of the codon adaptation index (CAI) and the codon relative adaptiveness index (RA) in 4868 bacterial genomes. We found that CAI values differ significantly between protein functional domains and the parts of the protein outside domains, and we show how CAI, GC content and preferred usage of polymerase III alpha subunits are related. Additionally, we give evidence of the association between CAI and bacterial phenotypes.
Affiliation(s)
- Anna Masłowska-Górnicz
  - Laboratory of Systems and Synthetic Biology, Wageningen University & Research, Stippeneng 4, 6708 WE Wageningen, the Netherlands
- Melanie R M van den Bosch
  - Laboratory of Systems and Synthetic Biology, Wageningen University & Research, Stippeneng 4, 6708 WE Wageningen, the Netherlands
- Edoardo Saccenti
  - Laboratory of Systems and Synthetic Biology, Wageningen University & Research, Stippeneng 4, 6708 WE Wageningen, the Netherlands
- Maria Suarez-Diez
  - Laboratory of Systems and Synthetic Biology, Wageningen University & Research, Stippeneng 4, 6708 WE Wageningen, the Netherlands
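The abstract of entry 6 above refers to the codon adaptation index (CAI) and codon relative adaptiveness without giving formulas. A minimal sketch of the commonly used formulation is shown here: each codon's relative adaptiveness is its count in a reference gene set divided by the count of the most frequent synonymous codon, and CAI is the geometric mean of those values over a coding sequence. The two-family `SYNONYMOUS` table, the function names, the toy sequences, and the 0.5 pseudocount for unseen codons are all assumptions for illustration; the cited paper's exact procedure is not stated in the abstract.

```python
import math
from collections import Counter

# Only two synonymous codon families (standard genetic code) to keep the
# sketch short; a real analysis would cover every degenerate amino acid.
SYNONYMOUS = {
    "Lys": ["AAA", "AAG"],
    "Ala": ["GCT", "GCC", "GCA", "GCG"],
}


def codons(cds: str) -> list[str]:
    """Split a coding sequence into complete triplets."""
    cds = cds.upper()
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def relative_adaptiveness(reference_cds: list[str]) -> dict[str, float]:
    """Weight each codon by its count relative to the top synonymous codon."""
    counts = Counter(c for cds in reference_cds for c in codons(cds))
    weights: dict[str, float] = {}
    for family in SYNONYMOUS.values():
        top = max(counts[c] for c in family)
        if top == 0:
            continue  # amino acid absent from the reference set
        for c in family:
            # Assumed pseudocount of 0.5 for unseen codons avoids log(0).
            weights[c] = (counts[c] or 0.5) / top
    return weights


def cai(cds: str, weights: dict[str, float]) -> float:
    """Geometric mean of the weights of the scorable codons in cds."""
    logs = [math.log(weights[c]) for c in codons(cds) if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


if __name__ == "__main__":
    # Hypothetical reference set: codons AAA AAA AAG GCC GCC GCT.
    w = relative_adaptiveness(["AAAAAAAAGGCCGCCGCT"])
    print(cai("AAAGCC", w), cai("AAGGCT", w))
```

A sequence built only from the most frequent codon of each family scores 1, and scores fall toward 0 as rarer synonymous codons are used.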
7
Mahajan S, Agashe D. Evolutionary jumps in bacterial GC content. G3 (Bethesda) 2022; 12:jkac108. [PMID: 35579351 PMCID: PMC9339322 DOI: 10.1093/g3journal/jkac108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2022] [Accepted: 04/20/2022] [Indexed: 11/29/2022]
Abstract
Genomic GC (Guanine-Cytosine) content is a fundamental molecular trait linked with many key genomic features such as codon and amino acid use. Across bacteria, GC content is surprisingly diverse and has been studied for many decades; yet its evolution remains incompletely understood. Since it is difficult to observe GC content evolve on laboratory time scales, phylogenetic comparative approaches are instrumental; but this dimension is rarely studied systematically in the case of bacterial GC content. We applied phylogenetic comparative models to analyze GC content evolution in multiple bacterial groups across 2 major bacterial phyla. We find that GC content diversifies via a combination of gradual evolution and evolutionary "jumps." Surprisingly, unlike prior reports that solely focused on reductions in GC, we found a comparable number of jumps with both increased and decreased GC content. Overall, many of the identified jumps occur in lineages beyond the well-studied peculiar examples of endosymbiotic and AT-rich marine bacteria and do not support the predicted role of oxygen dependence. Our analysis of rapid and large shifts in GC content thus identifies new clades and novel contexts to further understand the ecological and evolutionary drivers of this important genomic trait.
Affiliation(s)
- Saurabh Mahajan
  - National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bengaluru 560065, India
  - Atria University, Bengaluru 560024, India
- Deepa Agashe
  - National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bengaluru 560065, India
8
Shulgina Y, Eddy SR. A computational screen for alternative genetic codes in over 250,000 genomes. eLife 2021; 10:71402. [PMID: 34751130 PMCID: PMC8629427 DOI: 10.7554/elife.71402] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 06/18/2021] [Accepted: 10/26/2021] [Indexed: 11/25/2022]
Abstract
The genetic code has been proposed to be a ‘frozen accident,’ but the discovery of alternative genetic codes over the past four decades has shown that it can evolve to some degree. Since most examples were found anecdotally, it is difficult to draw general conclusions about the evolutionary trajectories of codon reassignment and why some codons are affected more frequently. To fill in the diversity of genetic codes, we developed Codetta, a computational method to predict the amino acid decoding of each codon from nucleotide sequence data. We surveyed the genetic code usage of over 250,000 bacterial and archaeal genome sequences in GenBank and discovered five new reassignments of arginine codons (AGG, CGA, and CGG), representing the first sense codon changes in bacteria. In a clade of uncultivated Bacilli, the reassignment of AGG to become the dominant methionine codon likely evolved by a change in the amino acid charging of an arginine tRNA. The reassignments of CGA and/or CGG were found in genomes with low GC content, an evolutionary force that likely helped drive these codons to low frequency and enable their reassignment.

All life forms rely on a ‘code’ to translate their genetic information into proteins. This code relies on limited permutations of three nucleotides – the building blocks that form DNA and other types of genetic information. Each ‘triplet’ of nucleotides – or codon – encodes a specific amino acid, the basic component of proteins. Reading the sequence of codons in the right order will let the cell know which amino acid to assemble next on a growing protein. For instance, the codon CGG – formed of the nucleotides guanine (G) and cytosine (C) – codes for the amino acid arginine. From bacteria to humans, most life forms rely on the same genetic code. Yet certain organisms have evolved to use slightly different codes, where one or several codons have an altered meaning.

To better understand how alternative genetic codes have evolved, Shulgina and Eddy set out to find more organisms featuring these altered codons, creating a new software called Codetta that can analyze the genome of a microorganism and predict the genetic code it uses. Codetta was then used to sift through the genetic information of 250,000 microorganisms. This was made possible by the sequencing, in recent years, of the genomes of hundreds of thousands of bacteria and other microorganisms – including many never studied before. These analyses revealed five groups of bacteria with alternative genetic codes, all of which had changes in the codons that code for arginine. Amongst these, four had genomes with a low proportion of guanine and cytosine nucleotides. This may have made some guanine and cytosine-rich arginine codons very rare in these organisms and, therefore, easier to be reassigned to encode another amino acid. The work by Shulgina and Eddy demonstrates that Codetta is a new, useful tool that scientists can use to understand how genetic codes evolve. In addition, it can also help to ensure the accuracy of widely used protein databases, which assume which genetic code organisms use to predict protein sequences from their genomes.
Affiliation(s)
- Sean R Eddy
  - Molecular & Cellular Biology, Harvard University, Cambridge, United States
9
The efficacy of partial 16S rRNA gene sequencing for precise determination of phylogenetic relatedness among Salmonellae. Scientific African 2021. [DOI: 10.1016/j.sciaf.2021.e01004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
10
Ely B. Genomic GC content drifts downward in most bacterial genomes. PLoS One 2021; 16:e0244163. [PMID: 34038432 PMCID: PMC8153448 DOI: 10.1371/journal.pone.0244163] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 12/08/2020] [Accepted: 05/07/2021] [Indexed: 11/18/2022]
Abstract
In every kingdom of life, GC->AT transitions occur more frequently than any other type of mutation due to the spontaneous deamination of cytidine. In eukaryotic genomes, this slow loss of GC base pairs is counteracted by biased gene conversion, which increases genomic GC content as part of the recombination process. However, this type of biased gene conversion has not been observed in bacterial genomes, so we hypothesized that GC->AT transitions cause a reduction of genomic GC content in prokaryotic genomes on an evolutionary time scale. To test this hypothesis, we used a phylogenetic approach to analyze triplets of closely related genomes representing a wide range of the bacterial kingdom. The resulting data indicate that genomic GC content is drifting downward in bacterial genomes where GC base pairs comprise 40% or more of the total genome. In contrast, genomes containing less than 40% GC base pairs have fewer opportunities for GC->AT transitions to occur, so genomic GC content is relatively stable or actually increasing. It should be noted that this observed change in genomic GC content is the net change in shared parts of the genome and does not apply to parts of the genome that have been lost or acquired since the genomes being compared shared a common ancestor. However, a more detailed analysis of two Caulobacter genomes revealed that the acquisition of mobile elements by the two genomes actually reduced the total genomic GC content as well.
Affiliation(s)
- Bert Ely
  - Department of Biological Sciences, University of South Carolina, Columbia, South Carolina, United States of America
11
The Changing Face of the Family Enterobacteriaceae (Order: "Enterobacterales"): New Members, Taxonomic Issues, Geographic Expansion, and New Diseases and Disease Syndromes. Clin Microbiol Rev 2021; 34:e00174-20. [PMID: 33627443 DOI: 10.1128/cmr.00174-20] [Citation(s) in RCA: 86] [Impact Index Per Article: 21.5] [Indexed: 12/12/2022]
Abstract
The family Enterobacteriaceae has undergone significant morphogenetic changes in its more than 85-year history, particularly during the past 2 decades (2000 to 2020). The development and introduction of new and novel molecular methods coupled with innovative laboratory techniques have led to many advances. We now know that the global range of enterobacteria is much more expansive than previously recognized, as they play important roles in the environment in vegetative processes and through widespread environmental distribution through insect vectors. In humans, many new species have been described, some associated with specific disease processes. Some established species are now observed in new infectious disease settings and syndromes. The results of molecular taxonomic and phylogenetics studies suggest that the current family Enterobacteriaceae should possibly be divided into seven or more separate families. The logarithmic explosion in the number of enterobacterial species described brings into question the relevancy, need, and mechanisms to potentially identify these taxa. This review covers the progression, transformation, and morphogenesis of the family from the seminal Centers for Disease Control and Prevention publication (J. J. Farmer III, B. R. Davis, F. W. Hickman-Brenner, A. McWhorter, et al., J Clin Microbiol 21:46-76, 1985, https://doi.org/10.1128/JCM.21.1.46-76.1985) to the present.
12
Su W, Liu ML, Yang YH, Wang JS, Li SH, Lv H, Dao FY, Yang H, Lin H. PPD: A Manually Curated Database for Experimentally Verified Prokaryotic Promoters. J Mol Biol 2021; 433:166860. [PMID: 33539888 DOI: 10.1016/j.jmb.2021.166860] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Received: 09/30/2020] [Revised: 12/13/2020] [Accepted: 01/27/2021] [Indexed: 12/16/2022]
Abstract
As a key region, the promoter plays a key role in transcription regulation. A eukaryotic promoter database called EPD has been constructed to store eukaryotic POL II promoters. Although there are some promoter databases for specific prokaryotic species or specific promoter types, such as RegulonDB for Escherichia coli K-12, DBTBS for Bacillus subtilis and Pro54DB for sigma 54 promoters, the diversity of prokaryotes and the development of sequencing technology mean that huge numbers of prokaryotic promoters are scattered across numerous published articles, which makes it inconvenient for researchers to explore the process of gene regulation in prokaryotes. In this study, we constructed a Prokaryotic Promoter Database (PPD), which records the experimentally validated promoters in prokaryotes from published articles. Up to now, PPD has stored 129,148 promoters across 63 prokaryotic species, manually extracted from published papers. We provide a friendly interface for users to browse, search, blast, visualize, submit and download data. The PPD will provide a relatively comprehensive resource of prokaryotic promoters for the study of prokaryotic gene transcription. The PPD is freely available and easily accessible at http://lin-group.cn/database/ppd/.
Affiliation(s)
- Wei Su
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Meng-Lu Liu
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Yu-He Yang
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Jia-Shu Wang
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Shi-Hao Li
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Hao Lv
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Fu-Ying Dao
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Hui Yang
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Hao Lin
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
13
Abstract
Reconstruction of target genomes from sequence data produced by instruments that are agnostic as to the species-of-origin may be confounded by contaminant DNA. Whether introduced during sample processing or through co-extraction alongside the target DNA, if insufficient care is taken during the assembly process, the final assembled genome may be a mixture of data from several species. Such assemblies can confound sequence-based biological inference and, when deposited in public databases, may be included in downstream analyses by users unaware of underlying problems. We present BlobToolKit, a software suite to aid researchers in identifying and isolating non-target data in draft and publicly available genome assemblies. BlobToolKit can be used to process assembly, read and analysis files for fully reproducible interactive exploration in the browser-based Viewer. BlobToolKit can be used during assembly to filter non-target DNA, helping researchers produce assemblies with high biological credibility. We have been running an automated BlobToolKit pipeline on eukaryotic assemblies publicly available in the International Nucleotide Sequence Data Collaboration and are making the results available through a public instance of the Viewer at https://blobtoolkit.genomehubs.org/view. We aim to complete analysis of all publicly available genomes and then maintain currency with the flow of new genomes. We have worked to embed these views into the presentation of genome assemblies at the European Nucleotide Archive, providing an indication of assembly quality alongside the public record with links out to allow full exploration in the Viewer.
14
Mittal A, Changani AM, Taparia S, Goel D, Parihar A, Singh I. Structural disorder originates beyond narrow stoichiometric margins of amino acids in naturally occurring folded proteins. J Biomol Struct Dyn 2020; 39:2364-2375. [PMID: 32238088 DOI: 10.1080/07391102.2020.1751299] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/24/2022]
Abstract
Rigorous analyses of Euclidean distances between non-peptide bonded residues in structures of several thousand naturally occurring folded proteins yielded a surprising "margin of life" for percentage occurrence of individual amino acids in naturally occurring folded proteins. On one hand, the concept of "margin of life", referring to lower than expected variances in average stoichiometric occurrences of individual amino acids in folded proteins, remains unchallenged since its discovery a decade ago. On the other hand, within this past decade there has been a strong emergence of a gradual paradigm shift in biology, from sequence-structure-function in proteins to sequence-disorder-function, fuelled by discoveries on functional implications of intrinsically disordered proteins (primary sequences that do not form stable structures). Thus, the applicability of "margin of life" to peptide-bonded residues in all known natural proteins, whether adopting stable structures or intrinsically disordered, needs to be explored. Therefore, in this work, we analyze compositions of the complete naturally occurring primary sequence space (over 560000 sequences) after dividing it into mutually exclusive subsets of structured and intrinsically disordered proteins along with a subset without any structural information. While finding that occurrence of different peptides (up to pentapeptides) is a direct consequence of the relative occurrences of their constituting residues in folded proteins, we report that structural disorder in natural proteins originates beyond the narrow stoichiometric margins of amino acids found in structured proteins. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Aditya Mittal
  - Kusuma School of Biological Sciences, Indian Institute of Technology Delhi (IIT Delhi), New Delhi, India
  - Supercomputing Facility for Bioinformatics & Computational Biology, Indian Institute of Technology Delhi (IIT Delhi), New Delhi, India
- Sakshi Taparia
  - Department of Mathematics (Bachelors program in Mathematics & Computing), Indian Institute of Technology Delhi (IIT Delhi), New Delhi, India
- Deepanshu Goel
  - Department of Biochemical Engineering and Biotechnology (Bachelors program), Indian Institute of Technology Delhi (IIT Delhi), New Delhi, India
- Animesh Parihar
  - Department of Biochemical Engineering and Biotechnology (Bachelors program), Indian Institute of Technology Delhi (IIT Delhi), New Delhi, India
- Ishan Singh
  - Department of Computer Science & Engineering (Bachelors program Computer Science), Indian Institute of Technology Delhi (IIT Delhi), New Delhi, India

15
Pal A, Saha BK, Saha J. Comparative in silico analysis of ftsZ gene from different bacteria reveals the preference for core set of codons in coding sequence structuring and secondary structural elements determination. PLoS One 2019; 14:e0219231. [PMID: 31841523 PMCID: PMC6913975 DOI: 10.1371/journal.pone.0219231] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/17/2019] [Accepted: 11/28/2019] [Indexed: 11/19/2022] Open
Abstract
The deluge of sequence information in recent times provides us with an excellent opportunity to compare organisms on a large genomic scale. In this study we have tried to decipher the variation in the gene organization and structuring of a vital bacterial gene called ftsZ, which codes for an integral component of bacterial cell division, the FtsZ protein. FtsZ is homologous to the tubulin protein and has been found to be ubiquitous in eubacteria. FtsZ is showing increasing promise as a target for antibacterial drug discovery. Our study of ftsZ from 143 different bacterial species spanning a wide range of morphological and physiological types demonstrates that the ftsZ gene of about ninety-three percent of the organisms shows a relatively biased codon usage profile and significant GC deviation from the genomic GC content. Comparative codon usage analysis of ftsZ and a core housekeeping gene, rpoB, demonstrated that the codon usage pattern of the ftsZ CDS is shaped by natural selection to a large extent and mimics that of a housekeeping gene. We have also detected a tendency among the different organisms to utilize a core set of codons in structuring the ftsZ coding sequence. We observed that the compositional frequency of the amino acid serine in the FtsZ protein appears to be an indicator of the bacterial lifestyle. Our meticulous analysis of the ftsZ gene linked with the corresponding FtsZ protein shows that there is a bias towards the use of specific synonymous codons, particularly in the helix and strand regions of the multi-domain FtsZ protein. Overall, our findings suggest that in an indispensable and vital protein such as FtsZ, there is an inherent tendency to maintain form for optimized performance in spite of the extrinsic variability in coding features.
Affiliation(s)
- Ayon Pal
  - Microbiology & Computational Biology Laboratory, Department of Botany, Raiganj University, Raiganj, West Bengal, India
- Barnan Kumar Saha
  - Microbiology & Computational Biology Laboratory, Department of Botany, Raiganj University, Raiganj, West Bengal, India
- Jayanti Saha
  - Microbiology & Computational Biology Laboratory, Department of Botany, Raiganj University, Raiganj, West Bengal, India

16
Lee MD, Ahlgren NA, Kling JD, Walworth NG, Rocap G, Saito MA, Hutchins DA, Webb EA. Marine Synechococcus isolates representing globally abundant genomic lineages demonstrate a unique evolutionary path of genome reduction without a decrease in GC content. Environ Microbiol 2019; 21:1677-1686. [DOI: 10.1111/1462-2920.14552] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 11/07/2018] [Revised: 01/10/2019] [Accepted: 01/29/2019] [Indexed: 11/29/2022]
Affiliation(s)
- Michael D. Lee
  - Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA
  - Exobiology, Ames Research Center, Moffett Field, CA, USA
- Joshua D. Kling
  - Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA
- Nathan G. Walworth
  - Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA
- Gabrielle Rocap
  - School of Oceanography, University of Washington, Seattle, WA, USA
- Mak A. Saito
  - Marine Chemistry and Geochemistry, Woods Hole Oceanographic Institute, Woods Hole, MA, USA
- David A. Hutchins
  - Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA
- Eric A. Webb
  - Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA

17
Venev SV, Zeldovich KB. Thermophilic Adaptation in Prokaryotes Is Constrained by Metabolic Costs of Proteostasis. Mol Biol Evol 2019; 35:211-224. [PMID: 29106597 PMCID: PMC5850847 DOI: 10.1093/molbev/msx282] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 11/13/2022] Open
Abstract
Prokaryotes evolved to thrive in an extremely diverse set of habitats, and their proteomes bear signatures of environmental conditions. Although correlations between amino acid usage and environmental temperature are well-documented, understanding of the mechanisms of thermal adaptation remains incomplete. Here, we couple the energetic costs of protein folding and protein homeostasis to build a microscopic model explaining both the overall amino acid composition and its temperature trends. Low biosynthesis costs lead to low diversity of physical interactions between amino acid residues, which in turn makes proteins less stable and drives up chaperone activity to maintain appropriate levels of folded, functional proteins. Assuming that the cost of chaperone activity is proportional to the fraction of unfolded client proteins, we simulated thermal adaptation of model proteins subject to minimization of the total cost of amino acid synthesis and chaperone activity. For the first time, we predicted both the proteome-average amino acid abundances and their temperature trends simultaneously, and found strong correlations between model predictions and 402 genomes of bacteria and archaea. The energetic constraint on protein evolution is more apparent in highly expressed proteins, selected by codon adaptation index. We found that in bacteria, highly expressed proteins are similar in composition to thermophilic ones, whereas in archaea no correlation between predicted expression level and thermostability was observed. At the same time, thermal adaptations of highly expressed proteins in bacteria and archaea are nearly identical, suggesting that universal energetic constraints prevail over the phylogenetic differences between these domains of life.
Affiliation(s)
- Sergey V Venev
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, 368 Plantation St, Worcester, MA
- Konstantin B Zeldovich
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, 368 Plantation St, Worcester, MA

18
Du MZ, Zhang C, Wang H, Liu S, Wei W, Guo FB. The GC Content as a Main Factor Shaping the Amino Acid Usage During Bacterial Evolution Process. Front Microbiol 2018; 9:2948. [PMID: 30581420 PMCID: PMC6292993 DOI: 10.3389/fmicb.2018.02948] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 06/08/2018] [Accepted: 11/16/2018] [Indexed: 11/13/2022] Open
Abstract
Understanding how proteins evolve is important, and the order of amino acids being recruited into the genetic codons was found to be an important factor shaping the amino acid composition of proteins. The latest work about the last universal common ancestor (LUCA) makes it possible to determine the potential factors shaping amino acid compositions during evolution. Those LUCA genes/proteins from Methanococcus maripaludis S2, which is one of the possible LUCA, were investigated. The evolutionary rates of these genes positively correlate with GC contents with P-value significantly lower than 0.05 for 94% homologous genes. Linear regression results showed that compositions of amino acids coded by GC-rich codons positively contribute to the evolutionary rates, while these amino acids tend to be gained in GC-rich organisms according to our results. The first principal component correlates with the GC content very well. The ratios of amino acids of the LUCA proteins coded by GC rich codons positively correlate with the GC content of different bacteria genomes, while the ratios of amino acids coded by AT rich codons negatively correlate with the increase of GC content of genomes. Next, we found that the recruitment order does correlate with the amino acid compositions, but gain and loss in codons showed newly recruited amino acids are not significantly increased along with the evolution. Thus, we conclude that GC content is a primary factor shaping amino acid compositions. GC content shapes amino acid composition to trade off the cost of amino acids with bases, which could be caused by the energy efficiency.
Affiliation(s)
- Meng-Ze Du
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Huan Wang
  - School of Life Sciences, Chongqing University, Chongqing, China
- Shuo Liu
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Wen Wei
  - School of Life Sciences, Chongqing University, Chongqing, China
- Feng-Biao Guo
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
  - Centre for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China

19
Rao Y, Wang Z, Luo W, Sheng W, Zhang R, Chai X. Base composition is the primary factor responsible for the variation of amino acid usage in zebra finch (Taeniopygia guttata). PLoS One 2018; 13:e0204796. [PMID: 30517105 PMCID: PMC6281210 DOI: 10.1371/journal.pone.0204796] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/12/2018] [Accepted: 09/15/2018] [Indexed: 11/19/2022] Open
Abstract
In the present study, we examined amino acid usage in the zebra finch (Taeniopygia guttata) proteome. We found that tRNA abundance, base composition, hydrophobicity and aromaticity, protein secondary structure, cysteine residue (Cys) content and protein molecular weight had a significant impact on the amino acid usage of the zebra finch. The above factors explained 22.85%, 25.37%, 10.91%, 5.06%, 4.21%, and 3.14% of the total variability, respectively. Altogether, approximately 70% of the total variability in zebra finch could be explained by such factors. Comparison of the amino acid usage between zebra finch, chicken (Gallus gallus) and human (Homo sapiens) suggested that the average usage frequency of the various amino acids is generally consistent among them. Correspondence analysis indicated that base composition was the primary factor affecting the amino acid usage in zebra finch. This trend was different from chicken, but similar to human. Other factors affecting the amino acid usage in zebra finch, such as isochore structure, protein secondary structure, Cys frequency and protein molecular weight, also showed trends similar to those in human. We do not know whether the similar amino acid usage trend between human and zebra finch is related to the distinctive neural and behavioral traits, but it is worth studying in depth.
Affiliation(s)
- Yousheng Rao
  - Department of Biological Technology, Nanchang Normal University, Nanchang, Jiangxi, China
  - Jiang Xi Province Key Lab of Genetic Improvement of Indigenous Chicken Breeds, Nanchang, Jiangxi, China
- Zhangfeng Wang
  - Department of Biological Technology, Nanchang Normal University, Nanchang, Jiangxi, China
  - Jiang Xi Province Key Lab of Genetic Improvement of Indigenous Chicken Breeds, Nanchang, Jiangxi, China
- Wen Luo
  - Jiang Xi Province Key Lab of Genetic Improvement of Indigenous Chicken Breeds, Nanchang, Jiangxi, China
- Wentao Sheng
  - Department of Biological Technology, Nanchang Normal University, Nanchang, Jiangxi, China
  - Jiang Xi Province Key Lab of Genetic Improvement of Indigenous Chicken Breeds, Nanchang, Jiangxi, China
- Rendian Zhang
  - Department of Biological Technology, Nanchang Normal University, Nanchang, Jiangxi, China
  - Jiang Xi Province Key Lab of Genetic Improvement of Indigenous Chicken Breeds, Nanchang, Jiangxi, China
- Xuewen Chai
  - Department of Biological Technology, Nanchang Normal University, Nanchang, Jiangxi, China
  - Jiang Xi Province Key Lab of Genetic Improvement of Indigenous Chicken Breeds, Nanchang, Jiangxi, China

20
Kwak J, Park J. What we can see from very small size sample of metagenomic sequences. BMC Bioinformatics 2018; 19:399. [PMID: 30390617 PMCID: PMC6215618 DOI: 10.1186/s12859-018-2431-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 04/16/2018] [Accepted: 10/10/2018] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND Since the analysis of a large number of metagenomic sequences consumes substantial computing resources and takes a long time, we examined selected small parts of metagenomic sequences as "samples" of the full sequences, both for a mock community and for 10 different existing metagenomics case studies. A mock community with 10 bacterial strains was prepared, and their mixed genomes were sequenced by HiSeq. The hits of a BLAST search against the reference genome of each strain were counted. Each of 176 different small parts selected from these sequences was also searched by BLAST and its hits were counted, in order to compare them to the original search results from the full sequences. We also prepared small parts of sequences selected from 10 publicly downloadable research data sets of the MG-RAST service, and analyzed these samples with MG-RAST. RESULTS Both the BLAST search tests of the mock community and the results from the publicly downloadable MG-RAST research data show that sampling an extremely small part of the sequence data is useful for estimating brief taxonomic information about the original metagenomic sequences. For 9 cases out of 10, the most annotated classes from the MG-RAST analyses of the selected partial sample sequences are the same as the ones from the originals. CONCLUSIONS When a researcher wants to estimate brief information about a metagenome's taxonomic distribution with fewer computing resources and in a shorter time, the researcher can analyze a selected small part of the metagenomic sequences. With this approach, we can also build a strategy to monitor metagenome samples over a wider geographic area, more frequently.
Affiliation(s)
- Jaesik Kwak
  - Graduate Program in Technology Policy, Yonsei University, 50 Yonsei Ro, Seodaemun Gu, Seoul, 038722 South Korea
- Joonhong Park
  - School of Civil and Environmental Engineering, Yonsei University, 50 Yonsei Ro, Seodaemun Gu, Seoul, 038722 South Korea

21
Sun J, Meng Z, Wu K, Liu B, Zhang S, Liu Y, Wang Y, Zheng H, Huang J, Zhou P. Tracing the origin of Treponema pallidum in China using next-generation sequencing. Oncotarget 2018; 7:42904-42918. [PMID: 27344187 PMCID: PMC5189996 DOI: 10.18632/oncotarget.10154] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 12/20/2015] [Accepted: 06/01/2016] [Indexed: 12/29/2022] Open
Abstract
Syphilis is a systemic sexually transmitted disease caused by Treponema pallidum ssp. pallidum (TPA). The origin and genetic background of Chinese TPA strains remain unclear. We identified a total of 329 single-nucleotide variants (SNVs) in eight Chinese TPA strains using next-generation sequencing. All of the TPA strains were clustered into three lineages, and Chinese TPA strains were grouped in Lineage 2 based on phylogenetic analysis. The phylogeographical data showed that TPA strains originated earlier than did T. pallidum ssp. pertenue (TPE) and T. pallidum ssp. endemicum (TPN) strains and that Chinese TPA strains might be derived from recombination between Lineage 1 and Lineage 3. Moreover, we found through a homology modeling analysis that a nonsynonymous substitution (I415F) in the PBP3 protein might affect the structural flexibility of PBP3 and the binding constant for substrates based on its possible association with penicillin resistance in T. pallidum. Our findings provide new insight into the molecular foundation of the evolutionary origin of TPA and support the development of novel diagnostic/therapeutic technology for syphilis.
Affiliation(s)
- Jun Sun
  - STD Institute, Shanghai Skin Disease Hospital, Shanghai, China
- Zhefeng Meng
  - Oncology Bioinformatics Center, Minhang Hospital, Fudan University, Shanghai, China
- Kaiqi Wu
  - School of Laboratory Medicine, Wenzhou Medical University, Wenzhou, Zhejiang, China
- Biao Liu
  - School of Laboratory Medicine, Wenzhou Medical University, Wenzhou, Zhejiang, China
- Sufang Zhang
  - Shanghai Skin Disease Hospital, Clinical School of Anhui Medical University, Shanghai, China
- Yudan Liu
  - Shanghai Skin Disease Hospital, Clinical School of Anhui Medical University, Shanghai, China
- Yuezhu Wang
  - Shanghai-MOST Key Laboratory for Disease and Health Genomics, Chinese National Human Genome Center and National Engineering Center for Biochip at Shanghai, Shanghai, China
- Huajun Zheng
  - Shanghai-MOST Key Laboratory for Disease and Health Genomics, Chinese National Human Genome Center and National Engineering Center for Biochip at Shanghai, Shanghai, China
- Jian Huang
  - Shanghai-MOST Key Laboratory for Disease and Health Genomics, Chinese National Human Genome Center and National Engineering Center for Biochip at Shanghai, Shanghai, China
  - Key Laboratory of Systems Biomedicine (Ministry of Education) and Collaborative Innovation Center of Systems Biomedicine, Shanghai Center for Systems Biomedicine, Shanghai Jiao Tong University, Shanghai, China
- Pingyu Zhou
  - STD Institute, Shanghai Skin Disease Hospital, Shanghai, China
  - Shanghai Skin Disease Hospital, Clinical School of Anhui Medical University, Shanghai, China

22
Carels N, Gumiel M, da Mota FF, de Carvalho Moreira CJ, Azambuja P. A Metagenomic Analysis of Bacterial Microbiota in the Digestive Tract of Triatomines. Bioinform Biol Insights 2017; 11:1177932217733422. [PMID: 28989277 PMCID: PMC5624349 DOI: 10.1177/1177932217733422] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 12/23/2016] [Accepted: 04/10/2017] [Indexed: 12/04/2022] Open
Abstract
The digestive tract of triatomines (DTT) is an ecological niche favored by microbiota whose enzymatic profile is adapted to the specific substrate availability in this medium. This report describes the molecular enzymatic properties that promote bacterial prominence in the DTT. The microbiota composition was assessed previously based on 16S ribosomal DNA, and whole sequenced genomes of bacteria from the same genera were used to calculate the GC level of rare and prominent bacterial species in the DTT. The enzymatic reactions encoded by coding sequences of both rare and common bacterial species were then compared and revealed key functions explaining why some genera outcompete others in the DTT. Representativeness of DTT microbiota was investigated by shotgun sequencing of DNA extracted from bacteria grown in liquid Luria-Bertani broth (LB) medium. Results showed that GC-rich bacteria outcompete GC-poor bacteria and are the dominant components of the DTT microbiota. In addition, oxidoreductases are the main enzymatic components of these bacteria. In particular, nitrate reductases (anaerobic respiration), oxygenases (catabolism of complex substrates), acetate-CoA ligase (tricarboxylic acid cycle and energy metabolism), and kinase (signaling pathway) were the major enzymatic determinants present together with a large group of minor enzymes including hydrogenases involved in energy and amino acid metabolism. In conclusion, despite their slower growth in liquid LB medium, bacteria from GC-rich genera outcompete the GC-poor bacteria because their specific enzymatic abilities impart a selective advantage in the DTT.
Affiliation(s)
- Nicolas Carels
  - Laboratório de Modelagem de Sistemas Biológicos, National Institute for Science and Technology on Innovation in Neglected Diseases (INCT-IDN), Centro de Desenvolvimento Tecnológico em Saúde (CDTS), Fundação Oswaldo Cruz (FIOCRUZ), Rio de Janeiro, Brazil
- Marcial Gumiel
  - Laboratório de Bioquímica e Fisiologia de Insetos, Instituto Oswaldo Cruz, Fundação Oswaldo Cruz (IOC/FIOCRUZ), Rio de Janeiro, Brazil
- Fabio Faria da Mota
  - Laboratório de Biologia Computacional e Sistemas, Instituto Oswaldo Cruz, Fundação Oswaldo Cruz (IOC/FIOCRUZ), Rio de Janeiro, Brazil
- Patricia Azambuja
  - Laboratório de Bioquímica e Fisiologia de Insetos, Instituto Oswaldo Cruz, Fundação Oswaldo Cruz (IOC/FIOCRUZ), Rio de Janeiro, Brazil
  - Departamento de Entomologia Molecular, Instituto Nacional de Entomologia Molecular (INCT-EM), Rio de Janeiro, Brazil

23
Machado H, Gram L. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium. Front Microbiol 2017; 8:1204. [PMID: 28706512 PMCID: PMC5489566 DOI: 10.3389/fmicb.2017.01204] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 02/23/2017] [Accepted: 06/13/2017] [Indexed: 11/13/2022] Open
Abstract
Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two misidentified strains. Genome analyses also revealed occurrence of higher and lower GC content clades, correlating with phylogenetic clusters. Pan- and core-genome analysis revealed the conservation of 25% of the genome throughout the genus, with a large and open pan-genome. The major source of genomic diversity could be traced to the smaller chromosome and plasmids. Several of the physiological traits studied in the genus did not correlate with phylogenetic data. Since horizontal gene transfer (HGT) is often suggested as a source of genetic diversity and a potential driver of genomic evolution in bacterial species, we looked into evidence of such in Photobacterium genomes. Genomic islands were the source of genomic differences between strains of the same species. Also, we found transposase genes and CRISPR arrays that suggest multiple encounters with foreign DNA. Presence of genomic exchange traits was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms.
Affiliation(s)
- Henrique Machado
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Matematiktorvet, Kgs Lyngby, Denmark
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Hørsholm, Denmark
- Lone Gram
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Matematiktorvet, Kgs Lyngby, Denmark

24
Cuthbertson L, Amores-Arrocha H, Malard LA, Els N, Sattler B, Pearce DA. Characterisation of Arctic Bacterial Communities in the Air above Svalbard. BIOLOGY 2017; 6:biology6020029. [PMID: 28481257 PMCID: PMC5485476 DOI: 10.3390/biology6020029] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 12/27/2016] [Revised: 04/20/2017] [Accepted: 04/21/2017] [Indexed: 01/09/2023]
Abstract
Atmospheric dispersal of bacteria is increasingly acknowledged as an important factor influencing bacterial community biodiversity, biogeography and bacteria–human interactions, including those linked to human health. However, knowledge about patterns in microbial aerobiology is still relatively scarce, and this can be attributed, in part, to a lack of consensus on appropriate sampling and analytical methodology. In this study, three different methods were used to investigate aerial biodiversity over Svalbard: impaction, membrane filtration and drop plates. Sites around Svalbard were selected due to their relatively remote location, low human population, geographical location with respect to air movement and the tradition and history of scientific investigation on the archipelago, ensuring the presence of existing research infrastructure. The aerial bacterial biodiversity found was similar to that described in other aerobiological studies from both polar and non-polar environments, with Proteobacteria, Actinobacteria, and Firmicutes being the predominant groups. Twelve different phyla were detected in the air collected above Svalbard, although the diversity was considerably lower than in urban environments elsewhere. However, only 58 of 196 bacterial genera detected were consistently present, suggesting potentially higher levels of heterogeneity. Viable bacteria were present at all sampling locations, showing that living bacteria are ubiquitous in the air around Svalbard. Sampling location influenced the results obtained, as did sampling method. Specifically, impaction with a Sartorius MD8 produced a significantly higher number of viable colony forming units (CFUs) than drop plates alone.
Affiliation(s)
- Lewis Cuthbertson
  - Department of Applied Sciences, Faculty of Health and Life Sciences, University of Northumbria at Newcastle, Ellison Building, Newcastle-upon-Tyne NE1 8ST, UK.
- Herminia Amores-Arrocha
  - Department of Applied Sciences, Faculty of Health and Life Sciences, University of Northumbria at Newcastle, Ellison Building, Newcastle-upon-Tyne NE1 8ST, UK.
- Lucie A Malard
  - Department of Applied Sciences, Faculty of Health and Life Sciences, University of Northumbria at Newcastle, Ellison Building, Newcastle-upon-Tyne NE1 8ST, UK.
- Nora Els
  - Institute of Ecology, Austrian Polar Research Institute, University of Innsbruck, Technikerstrasse 25, 6020 Innsbruck, Austria.
- Birgit Sattler
  - Institute of Ecology, Austrian Polar Research Institute, University of Innsbruck, Technikerstrasse 25, 6020 Innsbruck, Austria.
- David A Pearce
  - Department of Applied Sciences, Faculty of Health and Life Sciences, University of Northumbria at Newcastle, Ellison Building, Newcastle-upon-Tyne NE1 8ST, UK.

25
Jag V, Poehlein A, Bengelsdorf FR, Daniel R, Dürre P. Genome sequencing and description of Oerskovia enterophila VJag, an agar- and cellulose-degrading bacterium. Stand Genomic Sci 2017; 12:30. [PMID: 28484582 PMCID: PMC5418683 DOI: 10.1186/s40793-017-0244-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 04/26/2016] [Accepted: 04/27/2017] [Indexed: 01/28/2023] Open
Abstract
A nonmotile, Gram-positive bacterium that shows an elongated and branching cell shape was isolated from soil samples from the botanical garden of Ulm University, Ulm, Germany. Here, the isolation procedure, identification, genome sequencing and metabolic features of the strain are described. Phylogenetic analysis identified the isolated strain as Oerskovia enterophila. The genus Oerskovia belongs to the family Cellulomonadaceae within the order Actinomycetales. The length of cells of O. enterophila ranges from 1 μm to 15 μm, depending on the growth phase. In the exponential growth phase, cells show an elongated and branching shape, whereas cells break up into round or coccoid elements in the stationary growth phase. The 4,535,074 bp long genome consists of 85 contigs with 3918 protein-coding genes and 57 RNA genes. The isolated strain was shown to degrade numerous complex carbon sources such as cellulose, chitin, and starch, which can be found ubiquitously in nature. Moreover, analysis of the genomic sequence revealed the genetic potential to degrade these compounds.
Affiliation(s)
- Vanessa Jag
  - Institut für Mikrobiologie und Biotechnologie, Universität Ulm, Albert-Einstein-Allee 11, D-89081 Ulm, Germany
- Anja Poehlein
  - Genomic and Applied Microbiology & Göttingen Genomics Laboratory, Institute of Microbiology and Genetics, Georg-August-University Göttingen, Grisebachstr. 8, D-37077 Göttingen, Germany
- Frank R Bengelsdorf
  - Institut für Mikrobiologie und Biotechnologie, Universität Ulm, Albert-Einstein-Allee 11, D-89081 Ulm, Germany
- Rolf Daniel
  - Genomic and Applied Microbiology & Göttingen Genomics Laboratory, Institute of Microbiology and Genetics, Georg-August-University Göttingen, Grisebachstr. 8, D-37077 Göttingen, Germany
- Peter Dürre
  - Institut für Mikrobiologie und Biotechnologie, Universität Ulm, Albert-Einstein-Allee 11, D-89081 Ulm, Germany

26
Matobole RM, van Zyl LJ, Parker-Nance S, Davies-Coleman MT, Trindade M. Antibacterial Activities of Bacteria Isolated from the Marine Sponges Isodictya compressa and Higginsia bidentifera Collected from Algoa Bay, South Africa. Mar Drugs 2017; 15:E47. [PMID: 28218694 PMCID: PMC5334627 DOI: 10.3390/md15020047] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 11/08/2016] [Accepted: 01/30/2017] [Indexed: 11/16/2022] Open
Abstract
Due to the rise in multi-drug resistant pathogens and other diseases, there is renewed interest in marine sponge endosymbionts as a rich source of natural products (NPs). The South African marine environment is rich in marine biota that remains largely unexplored and may represent an important source for the discovery of novel NPs. We first investigated the bacterial diversity associated with five South African marine sponges, whose microbial populations had not previously been investigated, and selected the two sponges (Isodictya compressa and Higginsia bidentifera) with the highest species richness to culture bacteria. By employing 33 different growth conditions, 415 sponge-associated bacterial isolates were cultured and screened for antibacterial activity. Thirty-five isolates showed antibacterial activity, twelve of which exhibited activity against the multi-drug resistant Escherichia coli 1699, implying that some of the bioactive compounds could be novel. Genome sequencing of two of these isolates confirmed that they harbour uncharacterized biosynthetic pathways that may encode novel chemical structures.
Collapse
Affiliation(s)
- Relebohile Matthew Matobole
- Institute for Microbial Biotechnology and Metagenomics (IMBM), Department of Biotechnology, University of the Western Cape, Robert Sobukwe Road, Bellville 7535, Cape Town, South Africa.
| | - Leonardo Joaquim van Zyl
- Institute for Microbial Biotechnology and Metagenomics (IMBM), Department of Biotechnology, University of the Western Cape, Robert Sobukwe Road, Bellville 7535, Cape Town, South Africa.
| | - Shirley Parker-Nance
- Department of Zoology, Nelson Mandela Metropolitan University, University Way, Port Elizabeth 6031, South Africa.
- South African Institute for Aquatic Biodiversity (SAIAB), Somerset Street, Grahamstown 6139, South Africa.
| | - Michael T Davies-Coleman
- Department of Chemistry, University of the Western Cape, Robert Sobukwe Road, Bellville 7535, Cape Town, South Africa.
| | - Marla Trindade
- Institute for Microbial Biotechnology and Metagenomics (IMBM), Department of Biotechnology, University of the Western Cape, Robert Sobukwe Road, Bellville 7535, Cape Town, South Africa.
| |
|
27
|
Tan SY, Tan IKP, Tan MF, Dutta A, Choo SW. Evolutionary study of Yersinia genomes deciphers emergence of human pathogenic species. Sci Rep 2016; 6:36116. [PMID: 27796355 PMCID: PMC5086877 DOI: 10.1038/srep36116] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 04/18/2016] [Accepted: 10/11/2016] [Indexed: 12/25/2022] Open
Abstract
On record, there are 17 species in the Yersinia genus, of which three are known to be pathogenic to humans. While the chromosomal and pYV (or pCD1) plasmid-borne virulence genes as well as pathogenesis of these three species are well studied, their genomic evolution is poorly understood. Our study aims to predict the key evolutionary events that led to the emergence of pathogenic Yersinia species by analyzing gene gain-and-loss, virulence genes, and “Clustered regularly-interspaced short palindromic repeats”. Our results suggest that the most recent ancestor shared by the human pathogenic Yersinia was most probably an environmental species that had adapted to the human body. This might have led to ecological specialization that diverged Yersinia into ecotypes and distinct lineages based on differential gene gain-and-loss in different niches. Our data also suggest that the Y. pseudotuberculosis group might be the donor of the ail virulence gene to Y. enterocolitica. Hence, we postulate that evolution of human pathogenic Yersinia might not be totally in parallel, but instead, there were lateral gene transfer events. Furthermore, the presence of virulence genes seems to be important for the positive selection of the virulence plasmid. Our studies provide better insights into the evolutionary biology of these bacteria.
Affiliation(s)
- Shi Yang Tan
- Department of Oral and Craniofacial Sciences, Faculty of Dentistry, University of Malaya, 50603 Kuala Lumpur, Malaysia.,Genome Informatics Research Laboratory, High Impact Research Building, University of Malaya, 50603 Kuala Lumpur, Malaysia
| | - Irene Kit Ping Tan
- Institute of Biological Sciences, Faculty of Science, University of Malaya, 50603 Kuala Lumpur, Malaysia
| | - Mui Fern Tan
- Genome Informatics Research Laboratory, High Impact Research Building, University of Malaya, 50603 Kuala Lumpur, Malaysia
| | - Avirup Dutta
- Genome Informatics Research Laboratory, High Impact Research Building, University of Malaya, 50603 Kuala Lumpur, Malaysia
| | - Siew Woh Choo
- Department of Oral and Craniofacial Sciences, Faculty of Dentistry, University of Malaya, 50603 Kuala Lumpur, Malaysia.,Genome Informatics Research Laboratory, High Impact Research Building, University of Malaya, 50603 Kuala Lumpur, Malaysia
| |
|
28
|
Lal D, Verma M, Behura SK, Lal R. Codon usage bias in phylum Actinobacteria: relevance to environmental adaptation and host pathogenicity. Res Microbiol 2016; 167:669-677. [DOI: 10.1016/j.resmic.2016.06.003] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 09/02/2015] [Revised: 06/08/2016] [Accepted: 06/10/2016] [Indexed: 10/21/2022]
|
29
|
Barrero‐Canosa J, Moraru C, Zeugner L, Fuchs BM, Amann R. Direct‐geneFISH: a simplified protocol for the simultaneous detection and quantification of genes and rRNA in microorganisms. Environ Microbiol 2016; 19:70-82. [DOI: 10.1111/1462-2920.13432] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Indexed: 11/26/2022]
Affiliation(s)
- Jimena Barrero‐Canosa
- Department of Molecular Ecology, Max Planck Institute for Marine Microbiology, Celsiusstr. 1, D‐28359 Bremen, Germany
| | - Cristina Moraru
- Department of Biology of Geological Processes, Institute for Chemistry and Biology of the Marine Environment (ICBM), Carl‐von‐Ossietzky‐Straße 9‐11, D‐26111 Oldenburg, Germany
| | - Laura Zeugner
- Department of Molecular Ecology, Max Planck Institute for Marine Microbiology, Celsiusstr. 1, D‐28359 Bremen, Germany
| | - Bernhard M. Fuchs
- Department of Molecular Ecology, Max Planck Institute for Marine Microbiology, Celsiusstr. 1, D‐28359 Bremen, Germany
| | - Rudolf Amann
- Department of Molecular Ecology, Max Planck Institute for Marine Microbiology, Celsiusstr. 1, D‐28359 Bremen, Germany
| |
|
30
|
Abdul Rahman N, Parks DH, Vanwonterghem I, Morrison M, Tyson GW, Hugenholtz P. A Phylogenomic Analysis of the Bacterial Phylum Fibrobacteres. Front Microbiol 2016; 6:1469. [PMID: 26779135 PMCID: PMC4704652 DOI: 10.3389/fmicb.2015.01469] [Citation(s) in RCA: 63] [Impact Index Per Article: 7.0] [Received: 09/01/2015] [Accepted: 12/07/2015] [Indexed: 12/13/2022] Open
Abstract
The Fibrobacteres has been recognized as a bacterial phylum for over a decade, but little is known about the group beyond its environmental distribution, and characterization of its sole cultured representative genus, Fibrobacter, after which the phylum was named. Based on these incomplete data, it is thought that cellulose hydrolysis, anaerobic metabolism, and lack of motility are unifying features of the phylum. There are also contradicting views as to whether an uncultured sister lineage, candidate phylum TG3, should be included in the Fibrobacteres. Recently, chitin-degrading cultured representatives of TG3 were isolated from a hypersaline soda lake, and the genome of one species, Chitinivibrio alkaliphilus, sequenced and described in detail. Here, we performed a comparative analysis of Fibrobacter succinogenes, C. alkaliphilus and eight near or substantially complete Fibrobacteres/TG3 genomes of environmental populations recovered from termite gut, anaerobic digester, and sheep rumen metagenomes. We propose that TG3 should be amalgamated with the Fibrobacteres phylum based on robust monophyly of the two lineages and shared character traits. Polymer hydrolysis, using a distinctive set of glycoside hydrolases and binding domains, appears to be a prominent feature of members of the Fibrobacteres. Not all members of this phylum are strictly anaerobic as some termite gut Fibrobacteres have respiratory chains adapted to the microaerophilic conditions found in this habitat. Contrary to expectations, flagella-based motility is predicted to be an ancestral and common trait in this phylum and has only recently been lost in F. succinogenes and its relatives based on phylogenetic distribution of flagellar genes. Our findings extend current understanding of the Fibrobacteres and provide an improved basis for further investigation of this phylum.
Affiliation(s)
- Nurdyana Abdul Rahman
- Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, The University of Queensland Brisbane, QLD, Australia
| | - Donovan H Parks
- Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, The University of Queensland Brisbane, QLD, Australia
| | - Inka Vanwonterghem
- Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, QLD, Australia; Advanced Water Management Center, The University of Queensland, Brisbane, QLD, Australia
| | - Mark Morrison
- Microbial Biology and Metagenomics, The University of Queensland Diamantina Institute, Translational Research Institute Brisbane, QLD, Australia
| | - Gene W Tyson
- Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, The University of Queensland Brisbane, QLD, Australia
| | - Philip Hugenholtz
- Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, QLD, Australia; Genomics and Computational Biology, Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
| |
|
31
|
Decoding Biomass-Sensing Regulons of Clostridium thermocellum Alternative Sigma-I Factors in a Heterologous Bacillus subtilis Host System. PLoS One 2016; 11:e0146316. [PMID: 26731480 PMCID: PMC4711584 DOI: 10.1371/journal.pone.0146316] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 10/22/2015] [Accepted: 12/15/2015] [Indexed: 11/25/2022] Open
Abstract
The Gram-positive, anaerobic, cellulolytic, thermophile Clostridium (Ruminiclostridium) thermocellum secretes a multi-enzyme system called the cellulosome to solubilize plant cell wall polysaccharides. During the saccharolytic process, the enzymatic composition of the cellulosome is modulated according to the type of polysaccharide(s) present in the environment. C. thermocellum has a set of eight alternative RNA polymerase sigma (σ) factors that are activated in response to extracellular polysaccharides and share sequence similarity to the Bacillus subtilis σI factor. The aim of the present work was to demonstrate whether individual C. thermocellum σI-like factors regulate specific cellulosomal genes, focusing on C. thermocellum σI6 and σI3 factors. To search for putative σI6- and σI3-dependent promoters, bioinformatic analysis of the upstream regions of the cellulosomal genes was performed. Because of the limited genetic tools available for C. thermocellum, the functionality of the predicted σI6- and σI3-dependent promoters was studied in B. subtilis as a heterologous host. This system enabled observation of the activation of 10 predicted σI6-dependent promoters associated with the C. thermocellum genes: sigI6 (itself, Clo1313_2778), xyn11B (Clo1313_0522), xyn10D (Clo1313_0177), xyn10Z (Clo1313_2635), xyn10Y (Clo1313_1305), cel9V (Clo1313_0349), cseP (Clo1313_2188), sigI1 (Clo1313_2174), cipA (Clo1313_0627), and rsgI5 (Clo1313_0985). Additionally, we observed the activation of 4 predicted σI3-dependent promoters associated with the C. thermocellum genes: sigI3 (itself, Clo1313_1911), pl11 (Clo1313_1983), ce12 (Clo1313_0693) and cipA. Our results suggest possible regulons of σI6 and σI3 in C. thermocellum, as well as the σI6 and σI3 promoter consensus sequences. The proposed -35 and -10 promoter consensus elements of σI6 are CNNAAA and CGAA, respectively. 
Additionally, a less conserved CGA sequence next to the C in the -35 element and a highly conserved AT sequence three bases downstream of the -10 element were also identified as important nucleotides for promoter recognition. Regarding σI3, the proposed -35 and -10 promoter consensus elements are CCCYYAAA and CGWA, respectively. The present study provides new clues for understanding these recently discovered alternative σI factors.
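The consensus elements quoted in this abstract can be turned into a simple sequence scan. The following is only a rough sketch, not the authors' bioinformatic pipeline: the function names are hypothetical, and the spacer length between the -35 and -10 elements is not given in the abstract, so it is exposed as a parameter that the caller must choose.

```python
import re

# IUPAC nucleotide codes used in the consensus elements quoted above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]",  # any base
    "Y": "[CT]",    # pyrimidine
    "W": "[AT]",    # weak (A or T)
}


def iupac_to_regex(motif):
    """Translate an IUPAC motif such as 'CNNAAA' into a regex fragment."""
    return "".join(IUPAC[ch] for ch in motif.upper())


def find_promoter_candidates(seq, minus35, minus10, spacer_min, spacer_max):
    """Return (start, matched_sequence) for each -35/spacer/-10 candidate.

    spacer_min and spacer_max are assumptions supplied by the caller;
    the abstract does not state the spacing between the two elements.
    """
    pattern = re.compile(
        "(?=("
        + iupac_to_regex(minus35)
        + "[ACGT]{%d,%d}" % (spacer_min, spacer_max)
        + iupac_to_regex(minus10)
        + "))"
    )
    # The lookahead lets candidates that overlap each other all be reported.
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


# Toy sequence (made up for illustration) containing one sigma-I6-like site.
toy = "TT" + "CGCAAA" + "ACGTACGTACGTAC" + "CGAA" + "TT"
hits = find_promoter_candidates(toy, "CNNAAA", "CGAA", 12, 16)
```

For the σI3 elements the same call would use "CCCYYAAA" and "CGWA". Short motifs like these match frequently by chance, so hits from such a scan are candidates only.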
|
32
|
Yeoh YK, Sekiguchi Y, Parks DH, Hugenholtz P. Comparative Genomics of Candidate Phylum TM6 Suggests That Parasitism Is Widespread and Ancestral in This Lineage. Mol Biol Evol 2015; 33:915-27. [PMID: 26615204 PMCID: PMC4776705 DOI: 10.1093/molbev/msv281] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.9] [Indexed: 12/13/2022] Open
Abstract
Candidate phylum TM6 is a major bacterial lineage recognized through culture-independent rRNA surveys to be low abundance members in a wide range of habitats; however, they are poorly characterized due to a lack of pure culture representatives. Two recent genomic studies of TM6 bacteria revealed small genomes and limited gene repertoire, consistent with known or inferred dependence on eukaryotic hosts for their metabolic needs. Here, we obtained additional near-complete genomes of TM6 populations from agricultural soil and upflow anaerobic sludge blanket reactor metagenomes which, together with the two publicly available TM6 genomes, represent seven distinct family level lineages in the TM6 phylum. Genome-based phylogenetic analysis confirms that TM6 is an independent phylum level lineage in the bacterial domain, possibly affiliated with the Patescibacteria superphylum. All seven genomes are small (1.0–1.5 Mb) and lack complete biosynthetic pathways for various essential cellular building blocks including amino acids, lipids, and nucleotides. These and other features identified in the TM6 genomes such as a degenerated cell envelope, ATP/ADP translocases for parasitizing host ATP pools, and protein motifs to facilitate eukaryotic host interactions indicate that parasitism is widespread in this phylum. Phylogenetic analysis of ATP/ADP translocase genes suggests that the ancestral TM6 lineage was also parasitic. We propose the name Dependentiae (phyl. nov.) to reflect dependence of TM6 bacteria on host organisms.
Affiliation(s)
- Yun Kit Yeoh
- Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, The University of Queensland, St Lucia, QLD, Australia; Institute for Molecular Bioscience, The University of Queensland, St Lucia, QLD, Australia
| | - Yuji Sekiguchi
- Biomedical Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Ibaraki, Japan
| | - Donovan H Parks
- Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, The University of Queensland, St Lucia, QLD, Australia
| | - Philip Hugenholtz
- Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, The University of Queensland, St Lucia, QLD, Australia; Institute for Molecular Bioscience, The University of Queensland, St Lucia, QLD, Australia
| |
|
33
|
White RA, Power IM, Dipple GM, Southam G, Suttle CA. Metagenomic analysis reveals that modern microbialites and polar microbial mats have similar taxonomic and functional potential. Front Microbiol 2015; 6:966. [PMID: 26441900 PMCID: PMC4585152 DOI: 10.3389/fmicb.2015.00966] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 01/12/2015] [Accepted: 08/31/2015] [Indexed: 12/15/2022] Open
Abstract
Within the subarctic climate of Clinton Creek, Yukon, Canada, lies an abandoned and flooded open-pit asbestos mine that harbors rapidly growing microbialites. To understand their formation we completed a metagenomic community profile of the microbialites and their surrounding sediments. Assembled metagenomic data revealed that bacteria within the phylum Proteobacteria numerically dominated this system, although the relative abundances of taxa within the phylum varied among environments. Bacteria belonging to Alphaproteobacteria and Gammaproteobacteria were dominant in the microbialites and sediments, respectively. The microbialites were also home to many other groups associated with microbialite formation including filamentous cyanobacteria and dissimilatory sulfate-reducing Deltaproteobacteria, consistent with the idea of a shared global microbialite microbiome. Other members were present that are typically not associated with microbialites including Gemmatimonadetes and iron-oxidizing Betaproteobacteria, which participate in carbon metabolism and iron cycling. Compared to the sediments, the microbialite microbiome has significantly more genes associated with photosynthetic processes (e.g., photosystem II reaction centers, carotenoid, and chlorophyll biosynthesis) and carbon fixation (e.g., CO dehydrogenase). The Clinton Creek microbialite communities had strikingly similar functional potentials to non-lithifying microbial mats from the Canadian High Arctic and Antarctica, but are functionally distinct from non-lithifying mats or biofilms from Yellowstone. Clinton Creek microbialites also share metabolic genes (R² < 0.750) with freshwater microbial mats from Cuatro Ciénegas, Mexico, but are more similar to polar Arctic mats (R² > 0.900). These metagenomic profiles from an anthropogenic microbialite-forming ecosystem provide context to microbialite formation on a human-relevant timescale.
Affiliation(s)
- Richard Allen White
- Department of Microbiology and Immunology, University of British Columbia Vancouver, BC, Canada
| | - Ian M Power
- Department of Earth, Ocean and Atmospheric Sciences, University of British Columbia Vancouver, BC, Canada
| | - Gregory M Dipple
- Department of Earth, Ocean and Atmospheric Sciences, University of British Columbia Vancouver, BC, Canada
| | - Gordon Southam
- School of Earth Sciences, University of Queensland Brisbane, QLD, Australia
| | - Curtis A Suttle
- Department of Microbiology and Immunology, University of British Columbia Vancouver, BC, Canada; Department of Earth, Ocean and Atmospheric Sciences, University of British Columbia Vancouver, BC, Canada; Department of Botany, University of British Columbia Vancouver, BC, Canada; Canadian Institute for Advanced Research Toronto, ON, Canada
| |
|
34
|
Brbić M, Warnecke T, Kriško A, Supek F. Global Shifts in Genome and Proteome Composition Are Very Tightly Coupled. Genome Biol Evol 2015; 7:1519-32. [PMID: 25971281 PMCID: PMC4494046 DOI: 10.1093/gbe/evv088] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Accepted: 05/09/2015] [Indexed: 02/05/2023] Open
Abstract
The amino acid composition (AAC) of proteomes differs greatly between microorganisms and is associated with the environmental niche they inhabit, suggesting that these changes may be adaptive. Similarly, the oligonucleotide composition of genomes varies and may confer advantages at the DNA/RNA level. These influences overlap in protein-coding sequences, making it difficult to gauge their relative contributions. We disentangle these effects by systematically evaluating the correspondence between intergenic nucleotide composition, where protein-level selection is absent, the AAC, and ecological parameters of 909 prokaryotes. We find that G + C content, the most frequently used measure of genomic composition, cannot capture diversity in AAC and across ecological contexts. However, di-/trinucleotide composition in intergenic DNA predicts amino acid frequencies of proteomes to the point where very little cross-species variability remains unexplained (91% of variance accounted for). Qualitatively similar results were obtained for 49 fungal genomes, where 80% of the variability in AAC could be explained by the composition of introns and intergenic regions. Upon factoring out oligonucleotide composition and phylogenetic inertia, the residual AAC is poorly predictive of the microbes' ecological preferences, in stark contrast with the original AAC. Moreover, highly expressed genes do not exhibit more prominent environment-related AAC signatures than lowly expressed genes, despite contributing more to the effective proteome. Thus, evolutionary shifts in overall AAC appear to occur almost exclusively through factors shaping the global oligonucleotide content of the genome. We discuss these results in light of contravening evidence from biophysical data and further reading frame-specific analyses that suggest that adaptation takes place at the protein level.
Affiliation(s)
- Maria Brbić
- Division of Electronics, Rudjer Boskovic Institute, Zagreb, Croatia; Molecular Basis of Ageing, Mediterranean Institute for Life Sciences (MedILS), Split, Croatia
| | - Tobias Warnecke
- MRC Clinical Sciences Centre, Imperial College, Hammersmith Campus, London, United Kingdom
| | - Anita Kriško
- Molecular Basis of Ageing, Mediterranean Institute for Life Sciences (MedILS), Split, Croatia
| | - Fran Supek
- Division of Electronics, Rudjer Boskovic Institute, Zagreb, Croatia; EMBL/CRG Systems Biology Unit, Centre for Genomic Regulation, Barcelona, Spain
| |
|
35
|
The Caulobacter crescentus transducing phage Cr30 is a unique member of the T4-like family of myophages. Curr Microbiol 2015; 70:854-8. [PMID: 25773204 DOI: 10.1007/s00284-015-0799-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 01/08/2015] [Accepted: 02/08/2015] [Indexed: 01/25/2023]
Abstract
Bacteriophage Cr30 has proven useful for the transduction of Caulobacter crescentus. Nucleotide sequencing of Cr30 DNA revealed that the Cr30 genome consists of 155,997 bp of DNA that codes for 287 proteins and five tRNAs. In contrast to the 67% GC content of the host genome, the GC content of the Cr30 genome is only 38%. This lower GC content causes both the codon usage pattern and the amino acid composition of the Cr30 proteins to be quite different from those of the host bacteria. As a consequence, the Cr30 mRNAs probably are translated at a rate that is slower than the normal rate for host mRNAs. A phylogenetic comparison of the genome indicates that Cr30 is a member of the T4-like family that is most closely related to a new group of T-like phages exemplified by ϕM12.
|
36
|
Krick T, Verstraete N, Alonso LG, Shub DA, Ferreiro DU, Shub M, Sánchez IE. Amino Acid metabolism conflicts with protein diversity. Mol Biol Evol 2014; 31:2905-12. [PMID: 25086000 PMCID: PMC4209132 DOI: 10.1093/molbev/msu228] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Indexed: 01/09/2023] Open
Abstract
The 20 protein-coding amino acids are found in proteomes with different relative abundances. The most abundant amino acid, leucine, is nearly an order of magnitude more prevalent than the least abundant amino acid, cysteine. Amino acid metabolic costs differ similarly, constraining their incorporation into proteins. On the other hand, a diverse set of protein sequences is necessary to build functional proteomes. Here, we present a simple model for a cost-diversity trade-off postulating that natural proteomes minimize amino acid metabolic flux while maximizing sequence entropy. The model explains the relative abundances of amino acids across a diverse set of proteomes. We found that the data are remarkably well explained when the cost function accounts for amino acid chemical decay. More than 100 organisms reach comparable solutions to the trade-off by different combinations of proteome cost and sequence diversity. Quantifying the interplay between proteome size and entropy shows that proteomes can get optimally large and diverse.
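One standard way to formalize a cost-diversity trade-off of this kind is to maximize Shannon entropy minus a weighted mean cost, which yields frequencies proportional to exp(-λ · cost). The sketch below illustrates that textbook result only. It is not taken from the paper: the cost values, the weight `lam`, and the function names are all hypothetical.

```python
import math


def maxent_frequencies(costs, lam):
    """Frequencies maximizing entropy minus lam * mean cost.

    The maximizer has the closed form p_i proportional to exp(-lam * cost_i).
    `costs` maps amino acid -> cost (illustrative values, not measured data).
    """
    weights = {aa: math.exp(-lam * c) for aa, c in costs.items()}
    total = sum(weights.values())
    return {aa: w / total for aa, w in weights.items()}


def shannon_entropy(freqs):
    """Shannon entropy in nats of a frequency dictionary."""
    return -sum(p * math.log(p) for p in freqs.values() if p > 0)


# Hypothetical relative costs for two amino acids, for illustration only.
toy_costs = {"L": 1.0, "C": 3.0}
no_penalty = maxent_frequencies(toy_costs, lam=0.0)    # uniform frequencies
with_penalty = maxent_frequencies(toy_costs, lam=1.0)  # cheaper residue favored
```

With `lam = 0` the cost term vanishes and the frequencies are uniform (maximum diversity); increasing `lam` shifts usage toward cheaper amino acids at the expense of entropy.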
Affiliation(s)
- Teresa Krick
- Departamento de Matemática, Facultad de Ciencias Exactas y Naturales and IMAS-CONICET, Universidad de Buenos Aires, Buenos Aires, Argentina
| | - Nina Verstraete
- Protein Physiology Laboratory, Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales and IQUIBICEN-CONICET, Universidad de Buenos Aires, Buenos Aires, Argentina
| | - David A Shub
- Department of Biological Sciences, University at Albany, State University of New York
| | - Diego U Ferreiro
- Protein Physiology Laboratory, Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales and IQUIBICEN-CONICET, Universidad de Buenos Aires, Buenos Aires, Argentina
| | - Michael Shub
- IMAS-CONICET, Universidad de Buenos Aires, Buenos Aires, Argentina
| | - Ignacio E Sánchez
- Protein Physiology Laboratory, Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales and IQUIBICEN-CONICET, Universidad de Buenos Aires, Buenos Aires, Argentina
| |
|
37
|
Rao Y, Wang Z, Chai X, Nie Q, Zhang X. Hydrophobicity and aromaticity are primary factors shaping variation in amino acid usage of chicken proteome. PLoS One 2014; 9:e110381. [PMID: 25329059 PMCID: PMC4199684 DOI: 10.1371/journal.pone.0110381] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 12/02/2013] [Accepted: 09/22/2014] [Indexed: 11/18/2022] Open
Abstract
Amino acids are utilized with different frequencies both among species and among genes within the same genome. To date, no study on the amino acid usage pattern of chicken has been performed. In the present study, we carried out a systematic examination of the amino acid usage in the chicken proteome. Our data indicated that the relative amino acid usage is positively correlated with the tRNA gene copy number. GC contents, including GC1, GC2, GC3, GC content of CDS and GC content of the introns, were correlated with most of the amino acid usage, especially for GC rich and GC poor amino acids; however, multiple linear regression analyses indicated that only approximately 10–40% of the variation in amino acid usage can be explained by GC content for GC rich and GC poor amino acids. For other intermediate GC content amino acids, only approximately 10% of the variation can be explained. Correspondence analyses demonstrated that the main factors responsible for the variation of amino acid usage in chicken are hydrophobicity, aromaticity and genomic GC content. Gene expression level also influenced the amino acid usage significantly. We argued that the amino acid usage of chicken proteome likely reflects a balance or near balance between the action of selection, mutation, and genetic drift.
Affiliation(s)
- Yousheng Rao
- Department of Biological Technology, Nanchang Normal University, Nanchang, Jiangxi, China
- Guangdong Provincial Key Laboratory of Agro-animal Genomics and Molecular Breeding, South China Agricultural University, Guangzhou, Guangdong, China
| | - Zhangfeng Wang
- Department of Biological Technology, Nanchang Normal University, Nanchang, Jiangxi, China
| | - Xuewen Chai
- Department of Biological Technology, Nanchang Normal University, Nanchang, Jiangxi, China
| | - Qinghua Nie
- Guangdong Provincial Key Laboratory of Agro-animal Genomics and Molecular Breeding, South China Agricultural University, Guangzhou, Guangdong, China
| | - Xiquan Zhang
- Guangdong Provincial Key Laboratory of Agro-animal Genomics and Molecular Breeding, South China Agricultural University, Guangzhou, Guangdong, China
| |
|
38
|
Zhou HQ, Ning LW, Zhang HX, Guo FB. Analysis of the relationship between genomic GC Content and patterns of base usage, codon usage and amino acid usage in prokaryotes: similar GC content adopts similar compositional frequencies regardless of the phylogenetic lineages. PLoS One 2014; 9:e107319. [PMID: 25255224 PMCID: PMC4177787 DOI: 10.1371/journal.pone.0107319] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 05/09/2014] [Accepted: 08/08/2014] [Indexed: 11/19/2022] Open
Abstract
The GC contents of 2670 prokaryotic genomes that belong to diverse phylogenetic lineages were analyzed in this paper. These genomes had GC contents that ranged from 13.5% to 74.9%. We analyzed the distance of base frequencies at the three codon positions, codon frequencies, and amino acid compositions across genomes with respect to the differences in the GC content of these prokaryotic species. We found that although the phylogenetic lineages were remote among some species, a similar genomic GC content forced them to adopt similar base usage patterns at the three codon positions, codon usage patterns, and amino acid usage patterns. Our work demonstrates that in prokaryotic genomes: a) base usage, codon usage, and amino acid usage change with GC content with a linear correlation; b) the distance of each usage has a linear correlation with the GC content difference; and c) GC content is more essential than phylogenetic lineage in determining base usage, codon usage, and amino acid usage. This work is exceptional in that we adopted intuitively graphic methods for all analyses, and we used these analyses to examine as many as 2670 prokaryotes. We hope that this work is helpful for understanding common features in the organization of microbial genomes.
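The quantities this abstract compares, overall GC content and base usage at the three codon positions, are simple to compute. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the input is assumed to be an in-frame coding sequence, and this is not the authors' analysis code.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a nucleotide sequence (0.0 if empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3) for an in-frame coding sequence.

    Trailing bases that do not complete a codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    cds = cds[:usable]
    # Every third base starting at offset 0, 1, 2 gives codon positions 1, 2, 3.
    return tuple(gc_fraction(cds[offset::3]) for offset in range(3))


# Toy coding sequence with two codons, ATG and GCC.
gc1, gc2, gc3 = gc_by_codon_position("ATGGCC")
```

Here the first and second codon positions are each half G/C, while the third position is entirely G/C. Applied per genome, such values could be plotted against genomic GC content to examine the linear relationships the abstract describes.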
Affiliation(s)
- Hui-Qi Zhou
- Center of Bioinformatics and Key Laboratory for NeuroInformation of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, China
| | - Lu-Wen Ning
- Center of Bioinformatics and Key Laboratory for NeuroInformation of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, China
| | - Hui-Xiong Zhang
- Center of Bioinformatics and Key Laboratory for NeuroInformation of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, China
| | - Feng-Biao Guo
- Center of Bioinformatics and Key Laboratory for NeuroInformation of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, China
| |
|
39
|
Das Roy R, Bhardwaj M, Bhatnagar V, Chakraborty K, Dash D. How do eubacterial organisms manage aggregation-prone proteome? F1000Res 2014; 3:137. [PMID: 25339987 PMCID: PMC4193397 DOI: 10.12688/f1000research.4307.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/24/2014] [Indexed: 11/20/2022] Open
Abstract
Eubacterial genomes vary considerably in their nucleotide composition. The percentage of genetic material constituted by guanosine and cytosine (GC) nucleotides ranges from 20% to 70%. It has been posited that GC-poor organisms are more dependent on protein folding machinery. Previous studies have ascribed this to the accumulation of mildly deleterious mutations in these organisms due to population bottlenecks. This phenomenon has been supported by protein folding simulations, which showed that proteins encoded by GC-poor organisms are more prone to aggregation than proteins encoded by GC-rich organisms. To test this proposition using a genome-wide approach, we classified different eubacterial proteomes in terms of their aggregation propensity and chaperone-dependence using multiple machine learning models. In contrast to the expected decrease in protein aggregation with an increase in GC richness, we found that the aggregation propensity of proteomes increases with GC content. A similar and even more significant correlation was obtained with the GroEL-dependence of proteomes: GC-poor proteomes have evolved to be less dependent on GroEL than GC-rich proteomes. We thus propose that a decrease in eubacterial GC content may have been selected in organisms facing proteostasis problems.
Affiliation(s)
- Rishi Das Roy
- GNR Knowledge Centre for Genome Informatics, Institute of Genomics and Integrative Biology, Council of Scientific and Industrial Research, Delhi, 110007, India ; Department of Biotechnology, University of Pune, Pune, 411007, India
| | - Manju Bhardwaj
- Department of Computer Science, Maitreyi College, Chanakyapuri, Delhi, 110021, India
| | - Vasudha Bhatnagar
- Department of Computer Science, Faculty of Mathematical Sciences, University of Delhi, Delhi, 110007, India
| | - Kausik Chakraborty
- GNR Knowledge Centre for Genome Informatics, Institute of Genomics and Integrative Biology, Council of Scientific and Industrial Research, Delhi, 110007, India
| | - Debasis Dash
- GNR Knowledge Centre for Genome Informatics, Institute of Genomics and Integrative Biology, Council of Scientific and Industrial Research, Delhi, 110007, India ; Department of Biotechnology, University of Pune, Pune, 411007, India
|
40
|
GC constituents and relative codon expressed amino acid composition in cyanobacterial phycobiliproteins. Gene 2014; 546:162-71. [PMID: 24933001 DOI: 10.1016/j.gene.2014.06.024] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 10/15/2013] [Revised: 04/17/2014] [Accepted: 06/12/2014] [Indexed: 02/01/2023]
Abstract
The genomic and structural relationships of phycobiliproteins (PBPs) in different cyanobacterial species are determined by nucleotide as well as amino acid composition. The genomic GC constituents influence the amino acid variability and codon usage of particular PBP subunits. We analyzed 11 cyanobacterial species to explore the variation of amino acids and the causal relationship between GC constituents and codon usage. Analysis of GC content at the first, second and third codon positions showed relatively more amino acid variability at the G3+C3 position than at the first and second positions. The GC-rich level of amino acid-encoding codons, whether G-rich, C-rich or both, correlates with codon variability and amino acid availability. Fluctuation in amino acids such as Arg, Ala, His, Asp, Gly, Leu and Glu in the α and β subunits was observed at the G1C1 position; however, fluctuation in other amino acids such as Ser, Thr, Cys and Trp was observed at the G2C2 position. The coding selection pressure on amino acids such as Ala, Thr, Tyr, Asp, Gly, Ile, Leu, Asn, and Ser in the α and β subunits of PBPs was more pronounced at the G3C3 position. In this study, we observed that each subunit of PBPs is codon specific for particular amino acids. These results suggest that genomic constraint linked with GC constituents selects the codon for particular amino acids and, furthermore, that codon-level study may be a novel approach to explore many problems associated with the genomics and proteomics of cyanobacteria.
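The position-specific GC measures named in this abstract (G1C1, G2C2, G3C3) can be illustrated with a minimal Python sketch. It assumes an in-frame coding sequence as input; the function name `gc_by_codon_position` is hypothetical and is not taken from the cited study.

```python
def gc_by_codon_position(cds: str) -> dict:
    """Return the GC fraction at codon positions 1, 2 and 3.

    Assumes `cds` is an in-frame coding sequence (an illustrative
    assumption, not the cited authors' pipeline).
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    result = {}
    for pos in range(3):
        bases = seq[pos:usable:3]  # every third base starting at this codon position
        gc = sum(1 for base in bases if base in "GC")
        result[f"GC{pos + 1}"] = gc / len(bases) if bases else 0.0
    return result


if __name__ == "__main__":
    # Codons: ATG GCG GGC TAA
    print(gc_by_codon_position("ATGGCGGGCTAA"))
```

Comparing the three values across genes or genomes is one simple way to see how much of the GC signal sits at the third, mostly synonymous, position.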
|
41
|
Isazadeh S, Ozcer PO, Frigon D. Microbial community structure of wastewater treatment subjected to high mortality rate due to ozonation of return activated sludge. J Appl Microbiol 2014; 117:587-96. [PMID: 24738966 DOI: 10.1111/jam.12523] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 02/11/2014] [Revised: 03/24/2014] [Accepted: 04/07/2014] [Indexed: 11/30/2022]
Abstract
AIMS This study investigated the effects of return activated sludge (RAS) ozonation on the bacterial community structure of pilot-scale wastewater treatment systems. METHODS AND RESULTS Two parallel activated sludge reactors were operated to treat real municipal wastewater for 98 days. The RAS of one of the reactors was subjected to increasing doses of ozone during the experimental period, which resulted in a higher reduction in biosolids waste production and a higher bacterial growth rate. The bacterial community structures were investigated by 16S rRNA gene amplicon high-throughput pyrosequencing and fluorescence in situ hybridization (FISH). The structures remained highly similar throughout the experiment despite the ozone treatment. Comparative analyses between pyrosequencing and FISH revealed clear discrepancies in the proportion of some bacterial populations. CONCLUSIONS The results suggest that RAS ozonation is not a main environmental factor structuring the community composition. Instead, the parallel drifts and slight convergence of the two community structures indicate that other environmental factors such as influent wastewater composition and temperature may be more important. Care should be exercised in interpreting the proportion of sequence reads as pyrosequencing may be biased as compared to FISH. SIGNIFICANCE AND IMPACT OF THE STUDY This study provides new insights into the importance of indiscriminate high mortality rates brought by external factors (here ozonation) on microbial community structures of activated sludge systems.
Affiliation(s)
- S Isazadeh
- Department of Civil Engineering and Applied Mechanics, McGill University, Montreal, QC, Canada
|
42
|
Agashe D, Shankar N. The evolution of bacterial DNA base composition. J Exp Zool B Mol Dev Evol 2014; 322:517-28. [DOI: 10.1002/jez.b.22565] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Received: 10/03/2013] [Accepted: 01/22/2014] [Indexed: 11/08/2022]
Affiliation(s)
- Deepa Agashe
- National Center for Biological Sciences; Tata Institute of Fundamental Research; Bangalore India
| | - Nachiket Shankar
- National Center for Biological Sciences; Tata Institute of Fundamental Research; Bangalore India
|
43
|
Bonham-Carter O, Ali H, Bastola D. A base composition analysis of natural patterns for the preprocessing of metagenome sequences. BMC Bioinformatics 2014; 14 Suppl 11:S5. [PMID: 24564274 PMCID: PMC3816298 DOI: 10.1186/1471-2105-14-s11-s5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/05/2022] Open
Abstract
Background On the pretext that sequence reads and contigs often exhibit the same kinds of base usage that is also observed in the sequences from which they are derived, we offer a base composition analysis tool. Our tool uses these natural patterns to determine relatedness across sequence data. We introduce spectrum sets (sets of motifs) which are permutations of bacterial restriction sites and the base composition analysis framework to measure their proportional content in sequence data. We suggest that this framework will increase the efficiency during the pre-processing stages of metagenome sequencing and assembly projects. Results Our method is able to differentiate organisms and their reads or contigs. The framework shows how to successfully determine the relatedness between these reads or contigs by comparison of base composition. In particular, we show that two types of organismal-sequence data are fundamentally different by analyzing their spectrum set motif proportions (coverage). By the application of one of the four possible spectrum sets, encompassing all known restriction sites, we provide the evidence to claim that each set has a different ability to differentiate sequence data. Furthermore, we show that the spectrum set selection having relevance to one organism, but not to the others of the data set, will greatly improve performance of sequence differentiation even if the fragment size of the read, contig or sequence is not lengthy. Conclusions We show the proof of concept of our method by its application to ten trials of two or three freshly selected sequence fragments (reads and contigs) for each experiment across the six organisms of our set. Here we describe a novel and computationally effective pre-processing step for metagenome sequencing and assembly tasks. 
Furthermore, our base composition method has applications in phylogeny where it can be used to infer evolutionary distances between organisms based on the notion that related organisms often have much conserved code.
|
44
|
Relative amino acid composition signatures of organisms and environments. PLoS One 2013; 8:e77319. [PMID: 24204807 PMCID: PMC3808408 DOI: 10.1371/journal.pone.0077319] [Citation(s) in RCA: 67] [Impact Index Per Article: 5.6] [Received: 03/28/2013] [Accepted: 09/09/2013] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND Identifying organism-environment interactions at the molecular level is crucial to understanding how organisms adapt to and change the chemical and molecular landscape of their habitats. In this work we investigated whether relative amino acid compositions could be used as a molecular signature of an environment and whether such a signature could also be observed at the level of the cellular amino acid composition of the microorganisms that inhabit that environment. METHODOLOGIES/PRINCIPAL FINDINGS To address these questions we collected and analyzed environmental amino acid determinations from the literature, and estimated from complete genomic sequences the global relative amino acid abundances of organisms that are cognate to the different types of environment. Environmental relative amino acid abundances clustered into broad groups (ocean waters, host-associated environments, grassland environments, sandy soils and sediments, and forest soils), indicating the presence of amino acid signatures specific for each environment. These signatures correlate with those found in organisms. Nevertheless, the relative amino acid abundance of organisms was more influenced by GC content than by habitat or phylogeny. CONCLUSIONS Our results suggest that relative amino acid composition can be used as a signature of an environment. In addition, we observed that the relative amino acid composition of organisms is not highly determined by environment, reinforcing previous studies that find GC content to be the major factor correlating with amino acid composition in living organisms.
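As a rough illustration of what a "global relative amino acid abundance" can look like in practice, the following Python sketch pools residue counts over a set of protein sequences and normalizes them to fractions. The function name `relative_aa_composition` and the choice to skip non-standard residue codes are assumptions made here, not details taken from the cited study.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def relative_aa_composition(proteins):
    """Return the fraction of each standard amino acid over all given sequences.

    Residues outside the 20 standard codes (e.g. X, *) are skipped,
    which is a simplifying assumption for this sketch.
    """
    counts = Counter()
    for protein in proteins:
        counts.update(ch for ch in protein.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return {aa: 0.0 for aa in AMINO_ACIDS}
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


if __name__ == "__main__":
    composition = relative_aa_composition(["MKKA", "AAX"])
    print({aa: round(frac, 3) for aa, frac in composition.items() if frac > 0})
```

The resulting 20-element vectors can then be compared across organisms or environments, for example by clustering or principal component analysis.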
|
45
|
Chen W, Shao Y, Chen F. Evolution of complete proteomes: guanine-cytosine pressure, phylogeny and environmental influences blend the proteomic architecture. BMC Evol Biol 2013; 13:219. [PMID: 24088322 PMCID: PMC3850711 DOI: 10.1186/1471-2148-13-219] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 03/21/2013] [Accepted: 10/01/2013] [Indexed: 11/18/2022] Open
Abstract
Background Guanine-cytosine (GC) composition is an important feature of genomes. Likewise, amino acid composition is a distinct, but less valued, feature of proteomes. A major concern is that it is not clear what valuable information can be acquired from amino acid composition data. To address this concern, in-depth analyses of the amino acid composition of the complete proteomes from 63 archaea, 270 bacteria, and 128 eukaryotes were performed. Results Principal component analysis of the amino acid matrices showed that the main contributors to proteomic architecture were genomic GC variation, phylogeny, and environmental influences. GC pressure drove positive selection on Ala, Arg, Gly, Pro, Trp, and Val, and adverse selection on Asn, Lys, Ile, Phe, and Tyr. The physico-chemical framework of the complete proteomes withstood GC pressure by frequency complementation of GC-dependent amino acid pairs with similar physico-chemical properties. Gln, His, Ser, and Val were responsible for phylogeny and their constituted components could differentiate archaea, bacteria, and eukaryotes. Environmental niche was also a significant factor in determining proteomic architecture, especially for archaea for which the main amino acids were Cys, Leu, and Thr. In archaea, hyperthermophiles, acidophiles, mesophiles, psychrophiles, and halophiles gathered successively along the environment-based principal component. Concordance between proteomic architecture and the genetic code was also related closely to genomic GC content, phylogeny, and lifestyles. Conclusions Large-scale analyses of the complete proteomes of a wide range of organisms suggested that amino acid composition retained the trace of GC variation, phylogeny, and environmental influences during evolution. The findings from this study will help in the development of a global understanding of proteome evolution, and even biological evolution.
Affiliation(s)
- Wanping Chen
- Key Laboratory of Environment Correlative Dietology, Huazhong Agricultural University, Wuhan, Hubei Province 430070, China.
|
46
|
Ongley SE, Bian X, Zhang Y, Chau R, Gerwick WH, Müller R, Neilan BA. High-titer heterologous production in E. coli of lyngbyatoxin, a protein kinase C activator from an uncultured marine cyanobacterium. ACS Chem Biol 2013; 8:1888-93. [PMID: 23751865 DOI: 10.1021/cb400189j] [Citation(s) in RCA: 67] [Impact Index Per Article: 5.6] [Indexed: 01/17/2023]
Abstract
Many chemically complex cyanobacterial polyketides and nonribosomal peptides are of great pharmaceutical interest, but the levels required for exploitation are difficult to achieve from native sources. Here we develop a framework for the expression of these multifunctional cyanobacterial assembly lines in Escherichia coli using the lyngbyatoxin biosynthetic pathway, derived from a marine microbial assemblage dominated by the cyanobacterium Moorea producens. Heterologous expression of this pathway afforded high titers of both lyngbyatoxin A (25.6 mg L⁻¹) and its precursor indolactam-V (150 mg L⁻¹). Production, isolation, and identification of all expected chemical intermediates of lyngbyatoxin biosynthesis in E. coli also confirmed the previously proposed biosynthetic route, setting a solid chemical foundation for future pathway engineering. The successful production of the nonribosomal peptide lyngbyatoxin A in E. coli also opens the possibility for future heterologous expression, characterization, and exploitation of other cyanobacterial natural product pathways.
Affiliation(s)
- Sarah E. Ongley
- School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney 2052, Australia
| | - Xiaoying Bian
- Department of Microbial Natural Products, Helmholtz Institute for Pharmaceutical Research Saarland, Helmholtz Centre for Infection Research and Department of Pharmaceutical Biotechnology, Saarland University, Saarbrücken 66041, Germany
| | - Youming Zhang
- Shandong University-Helmholtz Joint Institute of Biotechnology, State Key Laboratory of Microbial Technology, Shandong University, Shanda Nanlu 27, 250100 Jinan, P. R. China
| | - Rocky Chau
- School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney 2052, Australia
| | - William H. Gerwick
- Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, and Skaggs School of Pharmacy and Pharmaceutical Science, University of California-San Diego, La Jolla, California 92093, United States
| | - Rolf Müller
- Department of Microbial Natural Products, Helmholtz Institute for Pharmaceutical Research Saarland, Helmholtz Centre for Infection Research and Department of Pharmaceutical Biotechnology, Saarland University, Saarbrücken 66041, Germany
| | - Brett A. Neilan
- School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney 2052, Australia
|
47
|
Bohlin J, Brynildsrud O, Vesth T, Skjerve E, Ussery DW. Amino acid usage is asymmetrically biased in AT- and GC-rich microbial genomes. PLoS One 2013; 8:e69878. [PMID: 23922837 PMCID: PMC3724673 DOI: 10.1371/journal.pone.0069878] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 02/19/2013] [Accepted: 06/14/2013] [Indexed: 11/18/2022] Open
Abstract
INTRODUCTION Genomic base composition ranges from less than 25% AT to more than 85% AT in prokaryotes. Since only a small fraction of prokaryotic genomes is not protein coding even a minor change in genomic base composition will induce profound protein changes. We examined how amino acid and codon frequencies were distributed in over 2000 microbial genomes and how these distributions were affected by base compositional changes. In addition, we wanted to know how genome-wide amino acid usage was biased in the different genomes and how changes to base composition and mutations affected this bias. To carry this out, we used a Generalized Additive Mixed-effects Model (GAMM) to explore non-linear associations and strong data dependences in closely related microbes; principal component analysis (PCA) was used to examine genomic amino acid- and codon frequencies, while the concept of relative entropy was used to analyze genomic mutation rates. RESULTS We found that genomic amino acid frequencies carried a stronger phylogenetic signal than codon frequencies, but that this signal was weak compared to that of genomic %AT. Further, in contrast to codon usage bias (CUB), amino acid usage bias (AAUB) was differently distributed in AT- and GC-rich genomes in the sense that AT-rich genomes did not prefer specific amino acids over others to the same extent as GC-rich genomes. AAUB was also associated with relative entropy; genomes with low AAUB contained more random mutations as a consequence of relaxed purifying selection than genomes with higher AAUB. CONCLUSION Genomic base composition has a substantial effect on both amino acid- and codon frequencies in bacterial genomes. While phylogeny influenced amino acid usage more in GC-rich genomes, AT-content was driving amino acid usage in AT-rich genomes. We found the GAMM model to be an excellent tool to analyze the genomic data used in this study.
Affiliation(s)
- Jon Bohlin
- Centre for Epidemiology and Biostatistics, Department of Food Safety and Infection Biology, Norwegian School of Veterinary Science, Oslo, Norway.
|
48
|
Alsop EB, Raymond J. Resolving prokaryotic taxonomy without rRNA: longer oligonucleotide word lengths improve genome and metagenome taxonomic classification. PLoS One 2013; 8:e67337. [PMID: 23840870 PMCID: PMC3698125 DOI: 10.1371/journal.pone.0067337] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 01/28/2013] [Accepted: 05/16/2013] [Indexed: 11/19/2022] Open
Abstract
Oligonucleotide signatures, especially tetranucleotide signatures, have been used as a method for homology binning by exploiting an organism’s inherent biases towards the use of specific oligonucleotide words. Tetranucleotide signatures have been especially useful in environmental metagenomic samples as many of these samples contain organisms from poorly classified phyla which cannot be easily identified using traditional homology methods, including NCBI BLAST. This study examines oligonucleotide signatures across 1,424 completed genomes from across the tree of life, substantially expanding upon previous work. A comprehensive analysis of mononucleotide through nonanucleotide word lengths suggests that longer word lengths substantially improve the classification of DNA fragments across a range of sizes of relevance to high throughput sequencing. We find that, at present, heptanucleotide signatures represent an optimal balance between prediction accuracy and computational time for resolving taxonomy using both genomic and metagenomic fragments. We directly compare the ability of tetranucleotide and heptanucleotide word lengths (tetranucleotide signatures are the current standard for oligonucleotide word usage analyses) for taxonomic binning of metagenome reads. We present evidence that heptanucleotide word lengths consistently provide more taxonomic resolving power, particularly in distinguishing between closely related organisms that are often present in metagenomic samples. This implies that longer oligonucleotide word lengths should replace tetranucleotide signatures for most analyses. Finally, we show that the application of longer word lengths to metagenomic datasets leads to more accurate taxonomic binning of DNA scaffolds and has the potential to substantially improve taxonomic assignment and assembly of metagenomic data.
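An oligonucleotide signature of the kind discussed in this abstract can be sketched in Python as a normalized frequency vector over all words of length k. This is a minimal illustration under stated assumptions: the function name `kmer_signature` is hypothetical, windows containing characters other than A, C, G or T are ignored, and the two strands are not merged by reverse complement, which a production binning tool might do.

```python
from collections import Counter
from itertools import product


def kmer_signature(seq, k=4):
    """Return relative frequencies of all 4**k DNA words of length k.

    The vector follows the lexicographic order of words over "ACGT".
    k=4 gives a tetranucleotide signature (256 values); k=7 gives a
    heptanucleotide signature (16,384 values).
    """
    seq = seq.upper()
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    words = ["".join(letters) for letters in product("ACGT", repeat=k)]
    # Normalize only over valid words, so windows with N etc. are dropped.
    total = sum(counts[word] for word in words)
    return [counts[word] / total if total else 0.0 for word in words]


if __name__ == "__main__":
    signature = kmer_signature("ACGTACGTTTGACCA", k=4)
    print(len(signature))
```

The vector length grows as 4^k, which is one way to see the trade-off between resolving power and computational cost that the abstract describes for longer word lengths.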
Affiliation(s)
- Eric B Alsop
- School of Earth and Space Exploration, Arizona State University, Tempe, Arizona, United States of America.
|
49
|
Abstract
Background Computational gene finding algorithms have proven their robustness in identifying genes in complete genomes. However, metagenomic sequencing has presented new challenges due to the incomplete and fragmented nature of the data. During the last few years, attempts have been made to extract complete and incomplete open reading frames (ORFs) directly from short reads and identify the coding ORFs, bypassing other challenging tasks such as the assembly of the metagenome. Results In this paper we introduce a metagenomics gene caller (MGC) which is an improvement over the state-of-the-art prediction algorithm Orphelia. Orphelia uses a two-stage machine learning approach and computes a model that classifies extracted ORFs from fragmented sequences. We hypothesise and demonstrate evidence that sequences need separate models based on their local GC-content in order to avoid the noise introduced to a single model computed with sequences from the entire GC spectrum. We have also added two amino-acid features based on the benefit of amino-acid usage shown in our previous research. Our algorithm is able to predict genes and translation initiation sites (TIS) more accurately than Orphelia which uses a single model. Conclusions Learning separate models for several pre-defined GC-content regions as opposed to a single model approach improves the performance of the neural network as demonstrated by the experimental results presented in this paper. The inclusion of amino-acid usage features also helps improve the overall accuracy of our algorithm. MGC's improvement sets the ground for further investigation into the use of GC-content to separate data for training models in machine learning based gene finders.
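The idea of training separate models per GC-content range can be sketched in Python as a simple routing step that maps each fragment to a model index by its GC fraction. The bin edges below are hypothetical placeholders, since the abstract does not state the ranges that MGC uses, and the function names are illustrative only.

```python
from bisect import bisect_right

# Hypothetical GC-fraction bin edges; the actual ranges are not given in the abstract.
GC_BIN_EDGES = [0.40, 0.50, 0.60]


def gc_content(seq):
    """Return the fraction of G and C bases in a sequence (0.0 if empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "GC") / len(seq)


def select_model_index(seq, edges=GC_BIN_EDGES):
    """Return the index of the GC bin, and hence the model, for this fragment.

    With three edges there are four bins: [0, 0.40), [0.40, 0.50),
    [0.50, 0.60) and [0.60, 1.0].
    """
    return bisect_right(edges, gc_content(seq))


if __name__ == "__main__":
    for fragment in ("ATATATAT", "ATGCATGA", "GCGCGCGC"):
        print(fragment, select_model_index(fragment))
```

In a full gene caller, each index would select a classifier trained only on fragments from that GC range; the routing step here shows just the partitioning.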
Affiliation(s)
- Achraf El Allali
- Department of Computer Science and Engineering, University of South Carolina, 315 Main Street, Columbia, SC 29208, USA.
|
50
|
Comparative genome characterization of Achromobacter members reveals potential genetic determinants facilitating the adaptation to a pathogenic lifestyle. Appl Microbiol Biotechnol 2013; 97:6413-25. [DOI: 10.1007/s00253-013-5018-3] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Received: 04/03/2013] [Revised: 05/24/2013] [Accepted: 05/26/2013] [Indexed: 12/22/2022]
|