1
Nie S, Wang A, Chen X, Gong Y, Yuan Y. Microbial-Related Metabolites May Be Involved in Eight Major Biological Processes and Represent Potential Diagnostic Markers in Gastric Cancer. Cancers (Basel) 2023; 15:5271. [PMID: 37958446 PMCID: PMC10649575 DOI: 10.3390/cancers15215271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2023] [Revised: 10/26/2023] [Accepted: 10/30/2023] [Indexed: 11/15/2023] Open
Abstract
Metabolites associated with microbes regulate human immunity, inhibit bacterial colonization, and promote pathogenicity. Integrating microbe and metabolome research in gastric cancer (GC) provides a direction for understanding the microbe-associated pathophysiological process of metabolic changes and disease occurrence. The present study included 30 GC patients with 30 cancerous tissues and paired non-cancerous tissues (NCs) as controls. LC-MS/MS metabolomics and 16S rRNA sequencing were performed to obtain the metabolic and microbial characteristics. Integrated analysis of the microbes and metabolomes was conducted to explore the coexistence relationship between the microbial and metabolic characteristics of GC and to identify microbial-related metabolite diagnostic markers. The metabolic analysis showed that the overall metabolite distribution differed between the GC tissues and the NC tissues: 25 metabolites were enriched in the NC tissues and 42 metabolites were enriched in the GC tissues. The α and β microbial diversities were higher in the GC tissues than in the NC tissues, with 11 differential phyla and 52 differential genera. In the correlation and coexistence integrated analysis, 66 differential metabolites were correlated and coexisted with specific differential microbes. The microbes in the GC tissue likely regulated eight metabolic pathways. In the efficacy evaluation of the microbial-related differential metabolites in the diagnosis of GC, 12 differential metabolites (area under the curve [AUC] >0.9) exerted relatively high diagnostic efficiency, and the combined diagnostic efficacy of 5 to 6 microbial-related differential metabolites was higher than the diagnostic efficacy of a single feature. Therefore, microbial diversity and metabolite distribution differed between the GC tissues and the NC tissues. Microbial-related metabolites may be involved in eight major metabolism-based biological processes in GC and represent potential diagnostic markers.
Affiliation(s)
- Siru Nie
- Tumor Etiology and Screening Department of Cancer Institute and General Surgery, The First Hospital of China Medical University, Shenyang 110001, China; (S.N.); (A.W.); (X.C.)
- Key Laboratory of Cancer Etiology and Prevention in Liaoning Education Department, The First Hospital of China Medical University, Shenyang 110001, China
- Key Laboratory of GI Cancer Etiology and Prevention in Liaoning Province, The First Hospital of China Medical University, Shenyang 110001, China
| | - Ang Wang
- Tumor Etiology and Screening Department of Cancer Institute and General Surgery, The First Hospital of China Medical University, Shenyang 110001, China; (S.N.); (A.W.); (X.C.)
- Key Laboratory of Cancer Etiology and Prevention in Liaoning Education Department, The First Hospital of China Medical University, Shenyang 110001, China
- Key Laboratory of GI Cancer Etiology and Prevention in Liaoning Province, The First Hospital of China Medical University, Shenyang 110001, China
| | - Xiaohui Chen
- Tumor Etiology and Screening Department of Cancer Institute and General Surgery, The First Hospital of China Medical University, Shenyang 110001, China; (S.N.); (A.W.); (X.C.)
- Key Laboratory of Cancer Etiology and Prevention in Liaoning Education Department, The First Hospital of China Medical University, Shenyang 110001, China
- Key Laboratory of GI Cancer Etiology and Prevention in Liaoning Province, The First Hospital of China Medical University, Shenyang 110001, China
| | - Yuehua Gong
- Tumor Etiology and Screening Department of Cancer Institute and General Surgery, The First Hospital of China Medical University, Shenyang 110001, China; (S.N.); (A.W.); (X.C.)
- Key Laboratory of Cancer Etiology and Prevention in Liaoning Education Department, The First Hospital of China Medical University, Shenyang 110001, China
- Key Laboratory of GI Cancer Etiology and Prevention in Liaoning Province, The First Hospital of China Medical University, Shenyang 110001, China
| | - Yuan Yuan
- Tumor Etiology and Screening Department of Cancer Institute and General Surgery, The First Hospital of China Medical University, Shenyang 110001, China; (S.N.); (A.W.); (X.C.)
- Key Laboratory of Cancer Etiology and Prevention in Liaoning Education Department, The First Hospital of China Medical University, Shenyang 110001, China
- Key Laboratory of GI Cancer Etiology and Prevention in Liaoning Province, The First Hospital of China Medical University, Shenyang 110001, China
2
Dong MJ, Luo H, Gao F. Ori-Finder 2022: A Comprehensive Web Server for Prediction and Analysis of Bacterial Replication Origins. GENOMICS, PROTEOMICS & BIOINFORMATICS 2022; 20:1207-1213. [PMID: 36257484 DOI: 10.1016/j.gpb.2022.10.002] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 04/05/2022] [Revised: 09/21/2022] [Accepted: 10/11/2022] [Indexed: 12/26/2022]
Abstract
The replication of DNA is a complex biological process that is essential for life. Bacterial DNA replication is initiated at genomic loci referred to as replication origins (oriCs). Integrating the Z-curve method, DnaA box distribution, and comparative genomic analysis, in 2008 we developed Ori-Finder, a web server to predict bacterial oriCs, which has helped clarify the characteristics of bacterial oriCs. The oriCs of hundreds of sequenced bacterial genomes have been annotated in the genome reports using Ori-Finder and the predicted results have been deposited in DoriC, a manually curated database of oriCs. This has facilitated large-scale data mining of functional elements in oriCs and strand-biased analysis. Here, we describe Ori-Finder 2022 with an updated prediction framework, an interactive visualization module, a new analysis module, and a user-friendly interface. More species-specific indicator genes and functional elements of oriCs are integrated into the updated framework, which has also been redesigned to predict oriCs in draft genomes. The interactive visualization module displays more genomic information related to oriCs and their functional elements. The analysis module includes regulatory protein annotation, repeat sequence discovery, homologous oriC search, and strand-biased analyses. The redesigned interface provides additional customization options for oriC prediction. Ori-Finder 2022 is freely available at http://tubic.tju.edu.cn/Ori-Finder/ and https://tubic.org/Ori-Finder/.
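As a rough illustration of the Z-curve method named in the abstract above, the sketch below computes the three cumulative Z-curve components for a DNA string. It assumes the standard Z-curve definition (purine/pyrimidine, amino/keto, and weak/strong hydrogen-bond disparities). The function name `z_curve` is hypothetical, and this is not Ori-Finder's code, which also combines DnaA box distribution and comparative genomics.

```python
def z_curve(seq):
    """Return the cumulative Z-curve components (x, y, z) of a DNA string.

    x_n: purines (A, G) minus pyrimidines (C, T)
    y_n: amino bases (A, C) minus keto bases (G, T)
    z_n: weak H-bond bases (A, T) minus strong H-bond bases (G, C)
    Characters other than A, C, G, T leave the counts unchanged.
    """
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    xs, ys, zs = [], [], []
    for base in seq.upper():
        if base in counts:
            counts[base] += 1
        a, c, g, t = (counts[b] for b in "ACGT")
        xs.append((a + g) - (c + t))
        ys.append((a + c) - (g + t))
        zs.append((a + t) - (g + c))
    return xs, ys, zs


if __name__ == "__main__":
    x, y, z = z_curve("ACGT")
    print(x, y, z)
```

Extrema in such cumulative curves are commonly inspected as candidate origin or terminus positions, although a real predictor would need additional evidence.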
Affiliation(s)
- Mei-Jing Dong
- Department of Physics, School of Science, Tianjin University, Tianjin 300072, China
| | - Hao Luo
- Department of Physics, School of Science, Tianjin University, Tianjin 300072, China
| | - Feng Gao
- Department of Physics, School of Science, Tianjin University, Tianjin 300072, China; Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China; SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), Tianjin 300072, China.
3
Dineen RL, Penno C, Kelleher P, Bourin MJB, O'Connell‐Motherway M, van Sinderen D. Molecular analysis of the replication functions of the bifidobacterial conjugative megaplasmid pMP7017. Microb Biotechnol 2021; 14:1494-1511. [PMID: 33939264 PMCID: PMC8313286 DOI: 10.1111/1751-7915.13810] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/16/2021] [Accepted: 03/22/2021] [Indexed: 11/29/2022] Open
Abstract
pMP7017 is a conjugative megaplasmid isolated from the gut commensal Bifidobacterium breve JCM7017 and was shown to encode two putative replicases, designated here as RepA and RepB. In the current work, RepB was identified as the pMP7017 replicative initiator, as the repB gene and its surrounding region were shown to be sufficient to allow autonomous replication in two bifidobacterial species, B. breve and Bifidobacterium longum subsp. longum. RepB was shown to bind to a repeat sequence downstream of its coding sequence and this region was determined to be essential for efficient replication. Based on our results, we hypothesize that pMP7017 is an iteron-regulated plasmid (IRP) under strict auto-regulatory control. Recombinantly produced and purified RepB was determined to exist as a dimer in solution, differing from replicases of other IRPs, which exist as a mix of dimers and monomers. Furthermore, a stable low-copy Bifidobacterium-E. coli shuttle vector, pRD1.3, was created which can be employed for cloning and expression of large genes, as was demonstrated by the cloning and heterologous expression of the 5.1 kb apuB gene encoding the extracellular amylopullulanase from B. breve UCC2003 into B. longum subsp. longum NCIMB8809.
Affiliation(s)
- Rebecca L. Dineen
- APC Microbiome Ireland, University College Cork, Western Road, Cork, Ireland
- School of Microbiology, University College Cork, Western Road, Cork, Ireland
| | - Christophe Penno
- CNRS UMR 6553 EcoBio, Universite de Rennes 1, Campus de Beaulieu, Bat. 14A, Rennes cedex 35042, France
| | - Philip Kelleher
- APC Microbiome Ireland, University College Cork, Western Road, Cork, Ireland
- School of Microbiology, University College Cork, Western Road, Cork, Ireland
| | - Maxence J. B. Bourin
- APC Microbiome Ireland, University College Cork, Western Road, Cork, Ireland
- School of Microbiology, University College Cork, Western Road, Cork, Ireland
| | | | - Douwe van Sinderen
- APC Microbiome Ireland, University College Cork, Western Road, Cork, Ireland
- School of Microbiology, University College Cork, Western Road, Cork, Ireland
4
Castillo AI, Almeida RPP. Evidence of gene nucleotide composition favoring replication and growth in a fastidious plant pathogen. G3-GENES GENOMES GENETICS 2021; 11:6170658. [PMID: 33715000 PMCID: PMC8495750 DOI: 10.1093/g3journal/jkab076] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/19/2021] [Accepted: 03/02/2021] [Indexed: 11/13/2022]
Abstract
Nucleotide composition (GC content) varies across bacterial species, genome regions, and specific genes. In Xylella fastidiosa, a vector-borne fastidious plant pathogen infecting multiple crops, GC content ranges between ∼51% and 52%; however, these values were gathered using limited genomic data. We evaluated GC content variations across X. fastidiosa subspecies fastidiosa (N = 194), subsp. pauca (N = 107), and subsp. multiplex (N = 39). Genomes were classified based on plant host and geographic origin; individual genes within each genome were classified based on gene function, strand, length, ortholog group, Core vs. Accessory, and Recombinant vs. Non-recombinant. GC content was calculated for each gene within each evaluated genome. The effects of genome and gene level variables were evaluated with a mixed effect ANOVA, and the marginal-GC content was calculated for each gene. Also, the correlation between gene-specific GC content vs. natural selection (dN/dS) and recombination/mutation (r/m) was estimated. Our analyses show that intra-genomic changes in nucleotide composition in X. fastidiosa are small and influenced by multiple variables. Higher AT-richness is observed in genes involved in replication and translation, and genes in the leading strand. In addition, we observed a negative correlation between high-AT and dN/dS in subsp. pauca. The relationship between recombination and GC content varied between core and accessory genes. We hypothesize that distinct evolutionary forces and energetic constraints both drive and limit these small variations in nucleotide composition.
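The per-gene GC content calculation described in the abstract above can be sketched as follows. This is a minimal illustration, assuming genes are supplied as plain DNA strings keyed by a gene identifier; the names `gc_content` and `per_gene_gc`, and the example gene labels, are hypothetical and do not come from the cited study.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq.

    Returns 0.0 when the sequence contains no unambiguous bases.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def per_gene_gc(genes):
    """Map each gene identifier to the GC content of its sequence."""
    return {name: gc_content(seq) for name, seq in genes.items()}


if __name__ == "__main__":
    # Hypothetical toy sequences, for illustration only.
    example = {"geneA": "ATGCGC", "geneB": "ATATAT"}
    print(per_gene_gc(example))
```

Ambiguous bases such as N are excluded from the denominator here, which is one possible design choice; the study's own handling is not stated in the abstract.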
Affiliation(s)
- Andreina I Castillo
- Department of Environmental Science, Policy and Management, University of California, Berkeley, CA 94720, USA
| | - Rodrigo P P Almeida
- Department of Environmental Science, Policy and Management, University of California, Berkeley, CA 94720, USA
5
Moura MN, Cardoso DC, Cristiano MP. The tight genome size of ants: diversity and evolution under ancestral state reconstruction and base composition. Zool J Linn Soc 2020. [DOI: 10.1093/zoolinnean/zlaa135] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/12/2022]
Abstract
The mechanisms and processes driving change and variation in the genome size (GS) are not well known, and only a small set of ant species has been studied. Ants are an ecologically successful insect group present in most distinct ecosystems worldwide. Considering their wide distribution and ecological plasticity in different environmental contexts, we aimed to expand GS estimation within Formicidae to examine distribution patterns and variation in GS and base composition and to reconstruct the ancestral state of this character in an attempt to elucidate the generalized pattern of genomic expansions. Genome size estimates were generated for 99 ant species, including new GS estimates for 91 species of ants, and the mean GS of Formicidae was found to be 0.38 pg. The AT/GC ratio was 62.40/37.60. The phylogenetic reconstruction suggested an ancestral GS of 0.38 pg according to the Bayesian inference/Markov chain Monte Carlo method and 0.37 pg according to maximum likelihood and parsimony methods; significant differences in GS were observed between the subfamilies sampled. Our results suggest that the evolution of GS in Formicidae occurred through loss and accumulation of non-coding regions, mainly transposable elements, and occasionally by whole genome duplication. However, further studies are needed to verify whether these changes in DNA content are related to colonization processes, as suggested at the intraspecific level.
Affiliation(s)
- Mariana Neves Moura
- Programa de Pós-graduação em Ecologia, Departamento de Biologia Geral, Universidade Federal de Viçosa, Minas Gerais, Brazil
| | - Danon Clemes Cardoso
- Programa de Pós-graduação em Ecologia, Departamento de Biologia Geral, Universidade Federal de Viçosa, Minas Gerais, Brazil
- Departamento de Biodiversidade, Evolução e Meio Ambiente, Universidade Federal de Ouro Preto, Minas Gerais, Brazil
| | - Maykon Passos Cristiano
- Programa de Pós-graduação em Ecologia, Departamento de Biologia Geral, Universidade Federal de Viçosa, Minas Gerais, Brazil
- Departamento de Biodiversidade, Evolução e Meio Ambiente, Universidade Federal de Ouro Preto, Minas Gerais, Brazil
6
Lu J, Salzberg SL. SkewIT: The Skew Index Test for large-scale GC Skew analysis of bacterial genomes. PLoS Comput Biol 2020; 16:e1008439. [PMID: 33275607 PMCID: PMC7717575 DOI: 10.1371/journal.pcbi.1008439] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 06/08/2020] [Accepted: 10/13/2020] [Indexed: 02/07/2023] Open
Abstract
GC skew is a phenomenon observed in many bacterial genomes, wherein the two replication strands of the same chromosome contain different proportions of guanine and cytosine nucleotides. Here we demonstrate that this phenomenon, which was first discovered in the mid-1990s, can be used today as an analysis tool for the 15,000+ complete bacterial genomes in NCBI's Refseq library. In order to analyze all 15,000+ genomes, we introduce a new method, SkewIT (Skew Index Test), that calculates a single metric representing the degree of GC skew for a genome. Using this metric, we demonstrate how GC skew patterns are conserved within certain bacterial phyla, e.g. Firmicutes, but show different patterns in other phylogenetic groups such as Actinobacteria. We also discovered that outlier values of SkewIT highlight potential bacterial mis-assemblies. Using our newly defined metric, we identify multiple mis-assembled chromosomal sequences in previously published complete bacterial genomes. We provide a SkewIT web app https://jenniferlu717.shinyapps.io/SkewIT/ that calculates SkewI for any user-provided bacterial sequence. The web app also provides an interactive interface for the data generated in this paper, allowing users to further investigate the SkewI values and thresholds of the Refseq-97 complete bacterial genomes. Individual scripts for analysis of bacterial genomes are provided in the following repository: https://github.com/jenniferlu717/SkewIT.
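To make the notion of GC skew in the abstract above concrete, the sketch below computes the conventional windowed skew (G − C)/(G + C) and its running sum along a sequence. It is an illustration under stated assumptions only: it is not the SkewI metric defined by SkewIT, and the function names `gc_skew` and `cumulative`, along with the default window size, are hypothetical choices.

```python
def gc_skew(seq, window=1000):
    """Return (G - C) / (G + C) for consecutive non-overlapping windows.

    A window with no G or C contributes 0.0. The last window may be
    shorter than `window`.
    """
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews


def cumulative(values):
    """Running sum of a list of numbers."""
    total, out = 0.0, []
    for v in values:
        total += v
        out.append(total)
    return out


if __name__ == "__main__":
    skews = gc_skew("GGGGCCCC", window=4)
    print(skews, cumulative(skews))
```

In many bacterial chromosomes the sign of the windowed skew switches near the replication origin and terminus, so the turning points of the cumulative curve are often used as rough estimates of those positions.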
Affiliation(s)
- Jennifer Lu
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, United States
- Center for Computational Biology, Whiting School of Engineering, Johns Hopkins University, Baltimore, Maryland, United States
| | - Steven L. Salzberg
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, United States
- Center for Computational Biology, Whiting School of Engineering, Johns Hopkins University, Baltimore, Maryland, United States
- Departments of Computer Science and Biostatistics, Johns Hopkins University, Baltimore, Maryland, United States
7
Sabater C, Molinero-García N, Castro-Bravo N, Diez-Echave P, Hidalgo-García L, Delgado S, Sánchez B, Gálvez J, Margolles A, Ruas-Madiedo P. Exopolysaccharide Producing Bifidobacterium animalis subsp. lactis Strains Modify the Intestinal Microbiota and the Plasmatic Cytokine Levels of BALB/c Mice According to the Type of Polymer Synthesized. Front Microbiol 2020; 11:601233. [PMID: 33324384 PMCID: PMC7726137 DOI: 10.3389/fmicb.2020.601233] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/31/2020] [Accepted: 11/05/2020] [Indexed: 12/26/2022] Open
Abstract
Bacteria-host interactions are mediated by different microbial associated molecular patterns which are most often surface structures such as, among others, exopolysaccharides (EPSs). In this work, the capability of two isogenic EPS-producing Bifidobacterium animalis subsp. lactis strains to modulate the gut microbiota of healthy mice was assessed. Each strain produces a different type of polymer; the ropy strain S89L synthesized a rhamnose-rich, high-molecular weight EPS in higher abundance than the non-ropy DMS10140 strain. BALB/c mice were orally fed for 10 days with milk-bifidobacterial suspensions and followed afterward for 7 post-intervention days (wash-out period). The colonic content of mice was collected at several sampling points to perform a metataxonomic analysis. In addition, the influence of specific microbial clades, apparently stimulated by the ropy and non-ropy strains, on mouse plasmatic cytokine levels was investigated through hierarchical association testing. Analysis of 16S rRNA gene sequences showed that the abundance of Firmicutes phylum significantly increased 7 days after ceasing the treatment with both strains. The relative abundance of Alloprevotella genus also rose, but after shorter post-treatment times (3 days for both DMS10140 and S89L strains). Some bacterial clades were specifically modulated by one or another strain. As such, the non-ropy DMS10140 strain exerted a significant influence on Intestinomonas genus, which increased after 4 post-administration days. On the other hand, feeding with the ropy strain S89L led to an increase in sequences of Faecalibaculum genus at 4 post-treatment days, while the abundance of Erysipelotrichaceae and Lactobacillaceae families increased for prolonged times. Association testing revealed that several lactobacilli and bifidobacteria significantly stimulated by the ropy S89L strain were positively associated with the levels of certain cytokines, including IL-5 and IL-27. These results highlight relevant changes in mice gut microbiota produced after administration of the ropy S89L strain that were associated with a potential immune modulation effect.
Affiliation(s)
- Carlos Sabater
- Department of Microbiology and Biochemistry of Dairy Products, Instituto de Productos Lácteos de Asturias - Consejo Superior de Investigaciones Científicas (IPLA-CSIC), Villaviciosa, Spain; Microhealth Group, Instituto de Investigación Sanitaria del Principado de Asturias (ISPA), Oviedo, Spain
| | - Natalia Molinero-García
- Department of Microbiology and Biochemistry of Dairy Products, Instituto de Productos Lácteos de Asturias - Consejo Superior de Investigaciones Científicas (IPLA-CSIC), Villaviciosa, Spain; Microhealth Group, Instituto de Investigación Sanitaria del Principado de Asturias (ISPA), Oviedo, Spain
| | - Nuria Castro-Bravo
- Department of Microbiology and Biochemistry of Dairy Products, Instituto de Productos Lácteos de Asturias - Consejo Superior de Investigaciones Científicas (IPLA-CSIC), Villaviciosa, Spain; Microhealth Group, Instituto de Investigación Sanitaria del Principado de Asturias (ISPA), Oviedo, Spain
| | - Patricia Diez-Echave
- CIBER-EHD, Department of Pharmacology, Center for Biomedical Research (CIBM), University of Granada, Granada, Spain; Instituto de Investigación Biosanitaria de Granada (ibs.GRANADA), Granada, Spain
| | - Laura Hidalgo-García
- CIBER-EHD, Department of Pharmacology, Center for Biomedical Research (CIBM), University of Granada, Granada, Spain; Instituto de Investigación Biosanitaria de Granada (ibs.GRANADA), Granada, Spain
| | - Susana Delgado
- Department of Microbiology and Biochemistry of Dairy Products, Instituto de Productos Lácteos de Asturias - Consejo Superior de Investigaciones Científicas (IPLA-CSIC), Villaviciosa, Spain; Microhealth Group, Instituto de Investigación Sanitaria del Principado de Asturias (ISPA), Oviedo, Spain
| | - Borja Sánchez
- Department of Microbiology and Biochemistry of Dairy Products, Instituto de Productos Lácteos de Asturias - Consejo Superior de Investigaciones Científicas (IPLA-CSIC), Villaviciosa, Spain; Microhealth Group, Instituto de Investigación Sanitaria del Principado de Asturias (ISPA), Oviedo, Spain
| | - Julio Gálvez
- CIBER-EHD, Department of Pharmacology, Center for Biomedical Research (CIBM), University of Granada, Granada, Spain; Instituto de Investigación Biosanitaria de Granada (ibs.GRANADA), Granada, Spain
| | - Abelardo Margolles
- Department of Microbiology and Biochemistry of Dairy Products, Instituto de Productos Lácteos de Asturias - Consejo Superior de Investigaciones Científicas (IPLA-CSIC), Villaviciosa, Spain; Microhealth Group, Instituto de Investigación Sanitaria del Principado de Asturias (ISPA), Oviedo, Spain
| | - Patricia Ruas-Madiedo
- Department of Microbiology and Biochemistry of Dairy Products, Instituto de Productos Lácteos de Asturias - Consejo Superior de Investigaciones Científicas (IPLA-CSIC), Villaviciosa, Spain; Microhealth Group, Instituto de Investigación Sanitaria del Principado de Asturias (ISPA), Oviedo, Spain
8
Sperlea T, Muth L, Martin R, Weigel C, Waldminghaus T, Heider D. gammaBOriS: Identification and Taxonomic Classification of Origins of Replication in Gammaproteobacteria using Motif-based Machine Learning. Sci Rep 2020; 10:6727. [PMID: 32317695 PMCID: PMC7174414 DOI: 10.1038/s41598-020-63424-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/18/2019] [Accepted: 03/31/2020] [Indexed: 01/23/2023] Open
Abstract
The biology of bacterial cells is, in general, based on information encoded on circular chromosomes. Regulation of chromosome replication is an essential process that mostly takes place at the origin of replication (oriC), a locus unique per chromosome. Identification of high numbers of oriC is a prerequisite for systematic studies that could lead to insights into oriC functioning as well as the identification of novel drug targets for antibiotic development. Current methods for identifying oriC sequences rely on chromosome-wide nucleotide disparities and are therefore limited to fully sequenced genomes, leaving a large number of genomic fragments unstudied. Here, we present gammaBOriS (Gammaproteobacterial oriC Searcher), which identifies oriC sequences on gammaproteobacterial chromosomal fragments. It does so by employing motif-based machine learning methods. Using gammaBOriS, we created BOriS DB, which currently contains 25,827 gammaproteobacterial oriC sequences from 1,217 species, thus making it the largest available database for oriC sequences to date. Furthermore, we present gammaBOriTax, a machine-learning based approach for taxonomic classification of oriC sequences, which was trained on the sequences in BOriS DB. Finally, we extracted the motifs relevant for identification and classification decisions of the models. Our results suggest that machine learning sequence classification approaches can offer great support in functional motif identification.
Affiliation(s)
- Theodor Sperlea
- Faculty of Mathematics and Computer Science, University of Marburg, Hans-Meerwein-Str. 6, D-35032, Marburg, Lahn, Germany
| | - Lea Muth
- Faculty of Mathematics and Computer Science, University of Marburg, Hans-Meerwein-Str. 6, D-35032, Marburg, Lahn, Germany
| | - Roman Martin
- Faculty of Mathematics and Computer Science, University of Marburg, Hans-Meerwein-Str. 6, D-35032, Marburg, Lahn, Germany
| | - Christoph Weigel
- Institute of Biotechnology, Faculty III, Technische Universität Berlin (TUB), Straße des 17. Juni 135, D-10623, Berlin, Germany
| | - Torsten Waldminghaus
- Chromosome Biology Group, LOEWE Center for Synthetic Microbiology (SYNMIKRO), Philipps-Universität Marburg, D-35043, Marburg, Lahn, Germany
| | - Dominik Heider
- Faculty of Mathematics and Computer Science, University of Marburg, Hans-Meerwein-Str. 6, D-35032, Marburg, Lahn, Germany.
9
Kadnikov VV, Mardanov AV, Beletsky AV, Karnachuk OV, Ravin NV. Complete Genome of a Member of a New Bacterial Lineage in the Microgenomates Group Reveals an Unusual Nucleotide Composition Disparity Between Two Strands of DNA and Limited Metabolic Potential. Microorganisms 2020; 8:microorganisms8030320. [PMID: 32106565 PMCID: PMC7143001 DOI: 10.3390/microorganisms8030320] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 01/27/2020] [Revised: 02/22/2020] [Accepted: 02/23/2020] [Indexed: 11/26/2022] Open
Abstract
The candidate phyla radiation is a large monophyletic lineage comprising unculturable bacterial taxa with small cell and genome sizes, mostly known from genomes obtained from environmental sources without cultivation. Here, we present the closed complete genome of a member of the superphylum Microgenomates obtained from the metagenome of a deep subsurface thermal aquifer. Phylogenetic analysis indicates that the new bacterium, designated Ch65, represents a novel phylum-level lineage within the Microgenomates group, sibling to the candidate phylum Collierbacteria. The Ch65 genome has a highly unusual nucleotide composition, with one strand highly enriched in cytosine versus guanine throughout its whole length. Such nucleotide composition asymmetry, also detected in the members of Ca. Collierbacteria and Ca. Beckwithbacteria, suggests that most of the Ch65 chromosome is replicated in one direction. A genome analysis predicted that the Ch65 bacterium has fermentative metabolism and could produce acetate and lactate. It lacks respiratory capacity, as well as complete pathways for the biosynthesis of lipids, amino acids, and nucleotides. The Embden–Meyerhof glycolytic pathway and nonoxidative pentose phosphate pathway are mostly complete, although glucokinase, 6-phosphofructokinase, and transaldolase were not found. The Ch65 bacterium lacks secreted glycoside hydrolases and conventional transporters for importing sugars and amino acids. Overall, the metabolic predictions imply that Ch65 adopts the lifestyle of a symbiont/parasite, or a scavenger, obtaining resources from the lysed microbial biomass. We propose the provisional taxonomic assignment ‘Candidatus Chazhemtobacterium aquaticus’, genus ‘Chazhemtobacterium’, family ‘Chazhemtobacteraceae’ in the Microgenomates group.
Affiliation(s)
- Vitaly V. Kadnikov
- Institute of Bioengineering, Research Center of Biotechnology of the Russian Academy of Sciences, Moscow 119071, Russia
| | - Andrey V. Mardanov
- Institute of Bioengineering, Research Center of Biotechnology of the Russian Academy of Sciences, Moscow 119071, Russia
| | - Alexey V. Beletsky
- Institute of Bioengineering, Research Center of Biotechnology of the Russian Academy of Sciences, Moscow 119071, Russia
| | - Olga V. Karnachuk
- Laboratory of Biochemistry and Molecular Biology, Tomsk State University, Tomsk 634050, Russia
| | - Nikolai V. Ravin
- Institute of Bioengineering, Research Center of Biotechnology of the Russian Academy of Sciences, Moscow 119071, Russia
10
Wang D, Gao F. Comprehensive Analysis of Replication Origins in Saccharomyces cerevisiae Genomes. Front Microbiol 2019; 10:2122. [PMID: 31572328 PMCID: PMC6753640 DOI: 10.3389/fmicb.2019.02122] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 11/16/2018] [Accepted: 08/29/2019] [Indexed: 12/15/2022] Open
Abstract
DNA replication initiates from multiple replication origins (ORIs) in eukaryotes. Discovery and characterization of replication origins are essential for a better understanding of the molecular mechanism of DNA replication. In this study, the features of autonomously replicating sequences (ARSs) in Saccharomyces cerevisiae have been comprehensively analyzed as follows. Firstly, we carried out the analysis of the ARSs available in S. cerevisiae S288C. By evaluating the sequence similarity of experimentally established ARSs, we found that 94.32% of ARSs are unique across the whole genome of S. cerevisiae S288C and those with high sequence similarity are prone to locate in subtelomeres. Subsequently, we built a non-redundant dataset with a total of 520 ARSs, which are based on ARSs annotation of S. cerevisiae S288C from SGD and then supplemented with those from OriDB and DeOri databases. We conducted a large-scale comparison of ORIs among the diverse budding yeast strains from a population genomics perspective. We found that 82.7% of ARSs are not only conserved in genomic sequence but also relatively conserved in chromosomal position. The non-conserved ARSs tend to distribute in the subtelomeric regions. We also conducted a pan-genome analysis of ARSs among the S. cerevisiae strains, and a total of 183 core ARSs existing in all yeast strains were determined. We extracted the genes adjacent to replication origins among the 104 yeast strains to examine whether there are differences in their gene functions. The result showed that the genes involved in the initiation of DNA replication, such as orc3, mcm2, mcm4, mcm6, and cdc45, are conservatively located adjacent to the replication origins. 
Furthermore, we found that the genes adjacent to conserved ARSs are significantly enriched in DNA binding, enzyme activity, transportation, and energy, whereas the genes adjacent to non-conserved ARSs are significantly enriched in response to environmental stress, metabolite biosynthetic processes, and biosynthesis of antibiotics. In general, we characterized the replication origins from genome-wide and population genomics perspectives, which should provide new insights into the replication mechanism of S. cerevisiae and facilitate the design of algorithms to identify genome-wide replication origins in yeast.
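The notion of "core" ARSs (those present in every strain) can be illustrated with a small sketch. This is not the authors' pipeline: the strain names, the ARS identifiers, and the function `core_and_accessory` are hypothetical placeholders, and the sketch assumes that presence or absence of each ARS per strain has already been decided by some upstream sequence comparison.

```python
def core_and_accessory(presence):
    """Split ARS identifiers into core and accessory sets.

    `presence` maps a strain name to the set of ARS identifiers found in
    that strain (hypothetical input; at least one strain is required).
    Core = present in every strain; accessory = present in some but not all.
    """
    sets = list(presence.values())
    core = set.intersection(*sets)   # shared by all strains
    pan = set.union(*sets)           # seen in at least one strain
    return core, pan - core


# Toy presence/absence data with made-up identifiers.
presence = {
    "strain_A": {"ars_001", "ars_002", "ars_003"},
    "strain_B": {"ars_001", "ars_002"},
    "strain_C": {"ars_001", "ars_002", "ars_004"},
}
core, accessory = core_and_accessory(presence)
print(sorted(core), sorted(accessory))
```

Using plain set operations keeps the definition explicit; a real analysis would also need a rule for deciding when two sequences in different strains count as the same ARS, which the abstract does not specify.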
Affiliation(s)
- Dan Wang
  - Department of Physics, School of Science, Tianjin University, Tianjin, China
- Feng Gao
  - Department of Physics, School of Science, Tianjin University, Tianjin, China
  - Key Laboratory of Systems Bioengineering, Ministry of Education, Tianjin University, Tianjin, China
  - SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering, Tianjin, China
|
11
|
Quan CL, Gao F. Quantitative analysis and assessment of base composition asymmetry and gene orientation bias in bacterial genomes. FEBS Lett 2019; 593:918-925. [PMID: 30941752 DOI: 10.1002/1873-3468.13374] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 01/31/2019] [Revised: 03/28/2019] [Accepted: 03/31/2019] [Indexed: 11/10/2022]
Abstract
Base composition asymmetry and gene orientation bias are two common structural features of bacterial genomes. Here, we calculate correlation coefficients between nucleotide disparities and coding sequence (CDS) skew, which provide insight into their relationship from an individual genome perspective. We find that GC and RY disparities correlate significantly with CDS skew: around 60% of the bacterial genomes under study have correlation coefficients > 0.9. We then present a model for quantitative assessment of nucleotide disparity and CDS skew in which a numerical index, R², is used for evaluation. We find that skew curves with higher R² perform better in predicting replication origins in bacteria.
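As a rough illustration of the quantities named in this abstract, the sketch below computes cumulative GC and RY disparity curves and a Pearson correlation against a cumulative CDS skew. It rests on assumptions not stated in the abstract: GC disparity is taken as the running count of G minus C, RY disparity as purines minus pyrimidines, CDS skew as a running sum of +1/-1 per gene by strand, and the disparity is sampled at each gene's start position. All function names and the toy data are hypothetical.

```python
from itertools import accumulate
from math import sqrt


def cumulative_disparity(seq, plus, minus):
    """Running count of bases in `plus` minus bases in `minus` along `seq`."""
    steps = [1 if b in plus else -1 if b in minus else 0 for b in seq.upper()]
    return list(accumulate(steps))


def gc_disparity(seq):
    return cumulative_disparity(seq, "G", "C")    # G minus C


def ry_disparity(seq):
    return cumulative_disparity(seq, "AG", "CT")  # purines minus pyrimidines


def pearson(xs, ys):
    """Pearson correlation; assumes equal lengths and non-constant inputs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Toy "genome": G-rich first half, C-rich second half.
toy = "GGGAGGTGGG" + "CCCTCCACCC"
gc = gc_disparity(toy)
ry = ry_disparity(toy)

# Toy CDS list: (0-based start, strand); strand is +1 or -1 (assumed encoding).
cds = [(2, 1), (5, 1), (8, 1), (12, -1), (15, -1), (18, -1)]
cds_skew = list(accumulate(strand for _, strand in cds))
gc_at_cds = [gc[start] for start, _ in cds]

r_gc_ry = pearson(gc, ry)
r_gc_cds = pearson(gc_at_cds, cds_skew)
print(round(r_gc_ry, 3), round(r_gc_cds, 3))
```

In this toy sequence the GC disparity curve peaks where the composition switches from G-rich to C-rich, which is the kind of extremum that skew-based methods inspect when looking for replication origins or termini.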
Affiliation(s)
- Chun-Lan Quan
  - Department of Physics, School of Science, Tianjin University, China
- Feng Gao
  - Department of Physics, School of Science, Tianjin University, China
  - Key Laboratory of Systems Bioengineering, Ministry of Education, Tianjin University, China
  - SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), China
|
12
|
Merino N, Zhang S, Tomita M, Suzuki H. Comparative genomics of Bacteria commonly identified in the built environment. BMC Genomics 2019; 20:92. [PMID: 30691394 PMCID: PMC6350394 DOI: 10.1186/s12864-018-5389-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/13/2018] [Accepted: 12/18/2018] [Indexed: 02/01/2023] Open
Abstract
BACKGROUND The microbial community of the built environment (BE) can impact the lives of people and has been studied for a variety of indoor, outdoor, underground, and extreme locations. Thus far, these microorganisms have mainly been investigated by culture-based methods or amplicon sequencing. However, both methods have limitations, complicating multi-study comparisons and limiting the knowledge gained regarding in-situ microbial lifestyles. A greater understanding of BE microorganisms can be achieved through basic information derived from the complete genome. Here, we investigate the level of diversity and genomic features (genome size, GC content, replication strand skew, and codon usage bias) from complete genomes of bacteria commonly identified in the BE, providing a first step towards understanding these bacterial lifestyles. RESULTS Here, we selected bacterial genera commonly identified in the BE (or "Common BE genomes") and compared them against other prokaryotic genera ("Other genomes"). The "Common BE genomes" were identified in various climates and in indoor, outdoor, underground, or extreme built environments. The diversity level of the 16S rRNA varied greatly between genera. The genome size, GC content and GC skew strength of the "Common BE genomes" were statistically larger than those of the "Other genomes" but were not practically significant. In contrast, the strength of selected codon usage bias (S value) was statistically higher with a large effect size in the "Common BE genomes" compared to the "Other genomes." CONCLUSION Of the four genomic features tested, the S value could play a more important role in understanding the lifestyles of bacteria living in the BE. This parameter could be indicative of bacterial growth rates, gene expression, and other factors, potentially affected by BE growth conditions (e.g., temperature, humidity, and nutrients). 
However, further experimental evidence, species-level BE studies, and classification by BE location are needed to define the relationship between genomic features and the lifestyles of BE bacteria more robustly.
Affiliation(s)
- Nancy Merino
  - Earth-Life Science Institute, Tokyo Institute of Technology, Ookayama, Meguro-ku, Tokyo, 152-8550, Japan
  - Department of Earth Sciences, University of Southern California, Stauffer Hall of Science, Los Angeles, CA, 90089, USA
- Shu Zhang
  - Global Research Center for Environment and Energy based on Nanomaterials Science, National Institute for Material Science, 1-1 Namiki, Tsukuba, Ibaraki, 305-0044, Japan
  - Section of Infection and Immunity, Herman Ostrow School of Dentistry of USC, University of Southern California, Los Angeles, CA, 90089-0641, USA
- Masaru Tomita
  - Faculty of Environment and Information Studies, Keio University, Fujisawa, Kanagawa, 252-0882, Japan
  - Institute for Advanced Biosciences, Keio University, Tsuruoka, Yamagata, 997-0035, Japan
- Haruo Suzuki
  - Faculty of Environment and Information Studies, Keio University, Fujisawa, Kanagawa, 252-0882, Japan
  - Institute for Advanced Biosciences, Keio University, Tsuruoka, Yamagata, 997-0035, Japan
|
13
|
Bochkareva OO, Moroz EV, Davydov II, Gelfand MS. Genome rearrangements and selection in multi-chromosome bacteria Burkholderia spp. BMC Genomics 2018; 19:965. [PMID: 30587126 PMCID: PMC6307245 DOI: 10.1186/s12864-018-5245-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 04/29/2018] [Accepted: 11/14/2018] [Indexed: 11/30/2022] Open
Abstract
BACKGROUND The genus Burkholderia consists of species that occupy remarkably diverse ecological niches. Its best known members are important pathogens, B. mallei and B. pseudomallei, which cause glanders and melioidosis, respectively. Burkholderia genomes are unusual due to their multichromosomal organization, generally comprising 2-3 chromosomes. RESULTS We performed integrated genomic analysis of 127 Burkholderia strains. The pan-genome is open, with saturation expected between 86,000 and 88,000 genes. The reconstructed rearrangements indicate a strong avoidance of intra-replichore inversions that is likely caused by selection against the transfer of large groups of genes between the leading and the lagging strands. Translocated genes also tend to retain their position in the leading or the lagging strand, and this selection is stronger for large syntenies. Integrated reconstruction of chromosome rearrangements in the context of strain phylogeny reveals parallel rearrangements that may indicate inversion-based phase variation and integration of new genomic islands. In particular, we detected parallel inversions in the second chromosomes of B. pseudomallei, with breakpoints formed by genes encoding membrane components of a multidrug resistance complex, that may be linked to a phase variation mechanism. Two genomic islands, spreading horizontally between chromosomes, were detected in the B. cepacia group. CONCLUSIONS This study demonstrates the power of integrated analysis of pan-genomes, chromosome rearrangements, and selection regimes. Non-random inversion patterns indicate selective pressure. Inversions are particularly frequent in a recent pathogen, B. mallei, and, together with periods of positive selection at other branches, may indicate adaptation to new niches. One such adaptation could be a possible phase variation mechanism in B. pseudomallei.
Affiliation(s)
- Olga O. Bochkareva
  - Kharkevich Institute for Information Transmission Problems, Moscow, Russia
  - Center of Life Sciences Skolkovo Institute of Science and Technology, Moscow, Russia
- Elena V. Moroz
  - Kharkevich Institute for Information Transmission Problems, Moscow, Russia
- Iakov I. Davydov
  - Department of Ecology and Evolution & Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Mikhail S. Gelfand
  - Kharkevich Institute for Information Transmission Problems, Moscow, Russia
  - Center of Life Sciences Skolkovo Institute of Science and Technology, Moscow, Russia
  - Faculty of Computer Science, Higher School of Economics, Moscow, Russia
|
14
|
Joesch-Cohen LM, Robinson M, Jabbari N, Lausted CG, Glusman G. Novel metrics for quantifying bacterial genome composition skews. BMC Genomics 2018; 19:528. [PMID: 29996771 PMCID: PMC6042203 DOI: 10.1186/s12864-018-4913-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/16/2018] [Accepted: 07/02/2018] [Indexed: 11/17/2022] Open
Abstract
Background Bacterial genomes have characteristic compositional skews, which are differences in nucleotide frequency between the leading and lagging DNA strands across a segment of a genome. It is thought that these strand asymmetries arise as a result of mutational biases and selective constraints, particularly for energy efficiency. Analysis of compositional skews in a diverse set of bacteria provides a comparative context in which mutational and selective environmental constraints can be studied. These analyses typically require finished and well-annotated genomic sequences. Results We present three novel metrics for examining genome composition skews; all three metrics can be computed for unfinished or partially-annotated genomes. The first two metrics (dot-skew and cross-skew) depend on sequence and gene annotation of a single genome, while the third metric (residual skew) highlights unusual genomes by subtracting a GC content-based model of a library of genome sequences. We applied these metrics to 7738 available bacterial genomes, including partial drafts, and identified outlier species. A phylogenetically diverse set of these outliers (i.e., Borrelia, Ehrlichia, Kinetoplastibacterium, and Phytoplasma) display similar skew patterns but share lifestyle characteristics, such as intracellularity and biosynthetic dependence on their hosts. Conclusions Our novel metrics appear to reflect the effects of biosynthetic constraints and adaptations to life within one or more hosts on genome composition. We provide results for each analyzed genome, software and interactive visualizations at http://db.systemsbiology.net/gestalt/skew_metrics.
Affiliation(s)
- Lena M Joesch-Cohen
  - Institute for Systems Biology, 401 Terry Ave N, Seattle, WA, 98109, USA
  - Brown University, Providence, RI, 02912, USA
- Max Robinson
  - Institute for Systems Biology, 401 Terry Ave N, Seattle, WA, 98109, USA
- Neda Jabbari
  - Institute for Systems Biology, 401 Terry Ave N, Seattle, WA, 98109, USA
- Gustavo Glusman
  - Institute for Systems Biology, 401 Terry Ave N, Seattle, WA, 98109, USA
|
15
|
Luo H, Quan CL, Peng C, Gao F. Recent development of Ori-Finder system and DoriC database for microbial replication origins. Brief Bioinform 2018; 20:1114-1124. [DOI: 10.1093/bib/bbx174] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 09/20/2017] [Revised: 12/04/2017] [Indexed: 01/28/2023] Open
Abstract
DNA replication begins at replication origins in all three domains of life. Identification and characterization of replication origins are important not only in providing insights into the structure and function of the replication origins but also in understanding the regulatory mechanisms of the initiation step in DNA replication. The Z-curve method has been used successfully to identify replication origins in archaeal genomes since 2002. Furthermore, the Ori-Finder and Ori-Finder 2 Web servers have been developed to predict replication origins in both bacterial and archaeal genomes based on the Z-curve method, and manually curated replication origins have been collected into an online database, DoriC. The Ori-Finder system and the DoriC database are currently used in research on DNA replication origins in prokaryotes, including: (i) identification of oriC regions in bacterial and archaeal genomes; (ii) discovery and analysis of the conserved sequences within oriC regions; and (iii) strand-biased analysis of bacterial genomes.
To date, a growing number of predictions made by the Ori-Finder system have been supported by subsequent experiments, and the Ori-Finder system has been used to identify the replication origins of > 100 newly sequenced prokaryotes in their genome reports. In addition, the data in the DoriC database have been widely used in large-scale analyses of replication origins and strand bias in prokaryotic genomes. Here, we review the development of the Ori-Finder system and the DoriC database as well as their applications. Some future directions for extending the application of Ori-Finder and DoriC are also presented.
|
16
|
Seco EM, Ayora S. Bacillus subtilis DNA polymerases, PolC and DnaE, are required for both leading and lagging strand synthesis in SPP1 origin-dependent DNA replication. Nucleic Acids Res 2017; 45:8302-8313. [PMID: 28575448 PMCID: PMC5737612 DOI: 10.1093/nar/gkx493] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 12/21/2016] [Accepted: 05/23/2017] [Indexed: 01/08/2023] Open
Abstract
Firmicutes have two distinct replicative DNA polymerases, the PolC leading strand polymerase, and PolC and DnaE synthesizing the lagging strand. We have reconstituted in vitro Bacillus subtilis bacteriophage SPP1 θ-type DNA replication, which initiates unidirectionally at oriL. With this system we show that DnaE is not restricted to lagging strand synthesis, as previously suggested. DnaG primase and DnaE polymerase are required for initiation of DNA replication on both strands. DnaE and DnaG synthesize in concert a hybrid RNA/DNA 'initiation primer' on both leading and lagging strands at the SPP1 oriL region, as does the eukaryotic Pol α complex. DnaE, as an RNA-primed DNA polymerase, extends this initial primer in a reaction modulated by DnaG and one single-strand binding protein (SSB, SsbA or G36P), and hands off the initiation primer to PolC, a DNA-primed DNA polymerase. Then, PolC, stimulated by DnaG and the SSBs, performs the bulk of DNA chain elongation at both leading and lagging strands. Overall, these modulations by the SSBs and DnaG may contribute to the mechanism of polymerase switch at Firmicutes replisomes.
Affiliation(s)
- Elena M Seco
  - Centro Nacional de Biotecnología (CNB-CSIC), 28049 Madrid, Spain
- Silvia Ayora
  - Centro Nacional de Biotecnología (CNB-CSIC), 28049 Madrid, Spain
|
17
|
Selection for energy efficiency drives strand-biased gene distribution in prokaryotes. Sci Rep 2017; 7:10572. [PMID: 28874819 PMCID: PMC5585166 DOI: 10.1038/s41598-017-11159-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 06/05/2017] [Accepted: 08/18/2017] [Indexed: 01/08/2023] Open
Abstract
Lagging-strand genes accumulate more deleterious mutations. Genes are thus preferably located on the leading strand, an observation known as strand-biased gene distribution (SGD). Despite this mechanistic understanding, a satisfactory quantitative model is still lacking. Replication-transcription collisions induce stalling of the replication machinery, expose DNA to various attacks, and are followed by error-prone repairs. We found that mutational biases in non-transcribed regions can explain ~71% of the variation in SGDs in 1,552 genomes, supporting the mutagenesis origin of SGD. Mutational biases introduce energetically cheaper nucleotides on the lagging strand and result in more expensive protein products; consistently, the cost difference between the two strands explains ~50% of the variance in SGDs. Protein costs decrease with increasing gene expression. At similar expression levels, protein products of leading-strand genes are generally cheaper than those of lagging-strand genes; however, highly expressed lagging-strand genes are still cheaper than lowly expressed leading-strand genes. Selection for energy efficiency thus drives some genes to the leading strand, especially those that are highly expressed and essential, but certainly not all genes. Stronger mutational biases are often associated with low-GC genomes; as low-GC genes encode expensive proteins, low-GC genomes tend to have stronger SGDs to alleviate the stronger pressure on efficient energy usage.
|