1
Tomasch J, Kopejtka K, Shivaramu S, Mujakić I, Koblížek M. On the evolution of chromosomal regions with high gene strand bias in bacteria. mBio 2024; 15:e0060224. [PMID: 38752745 PMCID: PMC11237797 DOI: 10.1128/mbio.00602-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Accepted: 04/17/2024] [Indexed: 06/13/2024] Open
Abstract
On circular bacterial chromosomes, the majority of genes are coded on the leading strand. This gene strand bias (GSB) ranges from up to 85% in some Bacillota to a little more than 50% in other phyla. The factors determining the extent of the strand bias remain to be found. Here, we report that species in the phylum Gemmatimonadota share a unique chromosome architecture, distinct from neighboring phyla: in a conserved 600-kb region around the terminus of replication, almost all genes were located on the leading strands, while on the remaining part of the chromosome, the strand preference was more balanced. The high strand bias (HSB) region harbors the rRNA clusters, core, and highly expressed genes. Selective pressure for reduction of collisions with DNA replication to minimize detrimental mutations can explain the conservation of essential genes in this region. Repetitive and mobile elements are underrepresented, suggesting reduced recombination frequency by structural isolation from other parts of the chromosome. We propose that the HSB region forms a distinct chromosomal domain. Gemmatimonadota chromosomes evolved mainly by expansion through horizontal gene transfer and duplications outside of the ancient high strand bias region. In support of our hypothesis, we could further identify two Spiroplasma strains on a similar evolutionary path.
IMPORTANCE On bacterial chromosomes, a preferred location of genes on the leading strand has evolved to reduce conflicts between replication and transcription. Despite a vast body of research, the question why bacteria show large differences in their gene strand bias is still not solved. The discovery of "hybrid" chromosomes in different phyla, including Gemmatimonadota, in which a conserved high strand bias is found exclusively in a region at ter, points toward a role of nucleoid structure, additional to replication, in the evolution of strand preferences. A fine-grained structural analysis of the ever-increasing number of available bacterial genomes could help to better understand the forces that shape the sequential and spatial organization of the cell's information content.
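The gene strand bias described in this abstract can be illustrated with a small calculation. The sketch below is not the authors' pipeline; it assumes a single circular chromosome with known origin (`ori`) and terminus (`ter`) coordinates, and the `Gene` record, function names, and toy coordinates are hypothetical. A gene counts as leading-strand when its transcription direction matches the direction in which the replication fork passes over it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    start: int   # hypothetical: 0-based position of the gene on the chromosome
    strand: int  # +1 for the forward strand, -1 for the reverse strand


def on_leading_strand(gene, ori, ter, length):
    """Return True if the gene is co-oriented with replication."""
    # Clockwise distances from the origin to the gene and to the terminus.
    to_gene = (gene.start - ori) % length
    to_ter = (ter - ori) % length
    # On the clockwise replichore the fork moves toward increasing
    # coordinates, so forward-strand genes are co-oriented with it.
    # On the other replichore the reverse-strand genes are co-oriented.
    clockwise = to_gene < to_ter
    return gene.strand == (1 if clockwise else -1)


def gene_strand_bias(genes, ori, ter, length):
    """Fraction of genes located on the leading strand."""
    leading = sum(on_leading_strand(g, ori, ter, length) for g in genes)
    return leading / len(genes)


# Toy chromosome of 1000 bp with ori at 0 and ter at 500.
genes = [Gene(100, +1), Gene(200, -1), Gene(600, -1), Gene(900, +1), Gene(700, -1)]
print(gene_strand_bias(genes, ori=0, ter=500, length=1000))  # 3 of 5 genes are leading
```

Restricting `genes` to those within a window around `ter` would give a regional value of the kind the abstract reports for the high strand bias region.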
Affiliation(s)
- Jürgen Tomasch
  - Laboratory of Anoxygenic Phototrophs, Institute of Microbiology of the Czech Academy of Sciences, Třeboň, Czechia
- Karel Kopejtka
  - Laboratory of Anoxygenic Phototrophs, Institute of Microbiology of the Czech Academy of Sciences, Třeboň, Czechia
- Sahana Shivaramu
  - Laboratory of Anoxygenic Phototrophs, Institute of Microbiology of the Czech Academy of Sciences, Třeboň, Czechia
- Izabela Mujakić
  - Laboratory of Anoxygenic Phototrophs, Institute of Microbiology of the Czech Academy of Sciences, Třeboň, Czechia
- Michal Koblížek
  - Laboratory of Anoxygenic Phototrophs, Institute of Microbiology of the Czech Academy of Sciences, Třeboň, Czechia
2
Atre M, Joshi B, Babu J, Sawant S, Sharma S, Sankar TS. Origin, evolution, and maintenance of gene-strand bias in bacteria. Nucleic Acids Res 2024; 52:3493-3509. [PMID: 38442257 DOI: 10.1093/nar/gkae155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 02/06/2024] [Accepted: 02/19/2024] [Indexed: 03/07/2024] Open
Abstract
Gene-strand bias is a characteristic feature of bacterial genome organization wherein genes are preferentially encoded on the leading strand of replication, promoting co-orientation of replication and transcription. This co-orientation bias has evolved to protect gene essentiality, expression, and genomic stability from the harmful effects of head-on replication-transcription collisions. However, the origin, variation, and maintenance of gene-strand bias remain elusive. Here, we reveal that the frequency of inversions that alter gene orientation exhibits large variation across bacterial populations and negatively correlates with gene-strand bias. The density, distance, and distribution of inverted repeats show a similar negative relationship with gene-strand bias explaining the heterogeneity in inversions. Importantly, these observations are broadly evident across the entire bacterial kingdom uncovering inversions and inverted repeats as primary factors underlying the variation in gene-strand bias and its maintenance. The distinct catalytic subunits of replicative DNA polymerase have co-evolved with gene-strand bias, suggesting a close link between replication and the origin of gene-strand bias. Congruently, inversion frequencies and inverted repeats vary among bacteria with different DNA polymerases. In summary, we propose that the nature of replication determines the fitness cost of replication-transcription collisions, establishing a selection gradient on gene-strand bias by fine-tuning DNA sequence repeats and, thereby, gene inversions.
Affiliation(s)
- Malhar Atre
  - School of Biology, Indian Institute of Science Education and Research, Thiruvananthapuram, Kerala 695551, India
- Bharat Joshi
  - School of Biology, Indian Institute of Science Education and Research, Thiruvananthapuram, Kerala 695551, India
- Jebin Babu
  - School of Biology, Indian Institute of Science Education and Research, Thiruvananthapuram, Kerala 695551, India
- Shabduli Sawant
  - School of Biology, Indian Institute of Science Education and Research, Thiruvananthapuram, Kerala 695551, India
- Shreya Sharma
  - School of Biology, Indian Institute of Science Education and Research, Thiruvananthapuram, Kerala 695551, India
- T Sabari Sankar
  - School of Biology, Indian Institute of Science Education and Research, Thiruvananthapuram, Kerala 695551, India
3
Ebu SM, Ray L, Panda AN, Gouda SK. De novo assembly and comparative genome analysis for polyhydroxyalkanoates-producing Bacillus sp. BNPI-92 strain. J Genet Eng Biotechnol 2023; 21:132. [PMID: 37991636 PMCID: PMC10665291 DOI: 10.1186/s43141-023-00578-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2022] [Accepted: 10/26/2023] [Indexed: 11/23/2023]
Abstract
BACKGROUND Certain Bacillus species play a vital role in polyhydroxyalkanoate (PHA) production. However, most of these isolates were not properly identified to species level when they were reported. RESULTS From NGS analysis, 5719 genes were predicted in the de novo genome assembly. Based on genome annotation using the RAST server, a sequence of 5,527,513 bp was predicted with 5679 protein-coding sequences. The genome has a GC content of 35.1% and comprises 156 contigs. In the RAST server analysis, subsystem coverage (43%) and non-subsystem coverage (57%) were generated. OrthoVenn comparative genome analysis indicated that Bacillus sp. BNPI-92 shared 2930 gene clusters (core genes) with B. cereus ATCC 14579 T (AE016877), B. paranthracis Mn5T (MACE01000012), B. thuringiensis ATCC 10792 T (ACNF01000156), and B. anthracis Ames T (AE016879) strains. For our strain, the largest number of gene clusters (190) was shared with B. cereus ATCC 14579 T (AE016877). In OrthoVenn pairwise analysis, the maximum number of overlapping gene clusters (5414) was detected between Bacillus sp. BNPI-92 and B. cereus ATCC 14579 T. Average nucleotide identity (ANI) measures such as OriginalANI and OrthoANI, in silico digital DNA-DNA hybridization (isDDH), the Type (Strain) Genome Server (TYGS), and the Genome-Genome Distance Calculator (GGDC) all related Bacillus sp. BNPI-92 most closely to the B. cereus ATCC 14579 T strain. Therefore, based on the combination of RAST annotation, OrthoVenn server, ANI, and isDDH results, Bacillus sp. BNPI-92 was strongly confirmed to be a B. cereus type strain. It was designated as B. cereus BNPI-92 strain. In the B. cereus BNPI-92 whole genome sequence, PHA biosynthesis encoding genes such as phaP, phaQ, phaR (PHA synthesis repressor phaR gene sequence), phaB/phbB, and phaC were predicted on the same operon. These gene clusters were designated as phaPQRBC. However, phaA was located on other operons.
CONCLUSIONS This newly obtained isolate was found to be a new strain based on comparative genomic analysis, and it was also identified as a potential candidate for PHA biosynthesis.
Affiliation(s)
- Seid Mohammed Ebu
  - Department of Applied Biology, SoANS, Adama Science and Technology University, Oromia, Ethiopia
- Lopamudra Ray
  - School of Law, Campus-16 Adjunct Faculty, School of Biotech, Campus-11 KIIT University, Bhubaneswar, Odisha, 751024, India
- Ananta N Panda
  - School of Biotechnology, Campus-11 KIIT University, Bhubaneswar, Odisha, 751024, India
- Sudhansu K Gouda
  - School of Biotechnology, Campus-11 KIIT University, Bhubaneswar, Odisha, 751024, India
4
Moeckel C, Zaravinos A, Georgakopoulos-Soares I. Strand Asymmetries Across Genomic Processes. Comput Struct Biotechnol J 2023; 21:2036-2047. [PMID: 36968020 PMCID: PMC10030826 DOI: 10.1016/j.csbj.2023.03.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2023] [Revised: 03/08/2023] [Accepted: 03/08/2023] [Indexed: 03/12/2023] Open
Abstract
Across biological systems, a number of genomic processes, including transcription, replication, DNA repair, and transcription factor binding, display intrinsic directionalities. These directionalities are reflected in the asymmetric distribution of nucleotides, motifs, genes, transposon integration sites, and other functional elements across the two complementary strands. Strand asymmetries, including GC skews and mutational biases, have shaped the nucleotide composition of diverse organisms. The investigation of strand asymmetries often serves as a method to understand underlying biological mechanisms, including protein binding preferences, transcription factor interactions, retrotransposition, DNA damage and repair preferences, transcription-replication collisions, and mutagenesis mechanisms. Research into this subject also enables the identification of functional genomic sites, such as replication origins and transcription start sites. Improvements in our ability to detect and quantify DNA strand asymmetries will provide insights into diverse functionalities of the genome, the contribution of different mutational mechanisms in germline and somatic mutagenesis, and our knowledge of genome instability and evolution, which all have significant clinical implications in human disease, including cancer. In this review, we describe key developments that have been made across the field of genomic strand asymmetries, as well as the discovery of associated mechanisms.
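One of the strand asymmetries named in this abstract, GC skew, is simple to compute. The following is a minimal sketch rather than any method from the review: it uses the common definition (G - C) / (G + C) over fixed windows, and the function names, window size, and toy sequence are illustrative assumptions. In many bacterial genomes the cumulative skew changes direction near the replication origin and terminus.

```python
from itertools import accumulate


def gc_skew(seq):
    """GC skew (G - C) / (G + C) of one sequence; 0.0 if it has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return 0.0 if g + c == 0 else (g - c) / (g + c)


def windowed_gc_skew(seq, window, step):
    """GC skew of each full window of length `window`, advancing by `step`."""
    return [gc_skew(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


def cumulative_gc_skew(seq, window, step):
    """Running sum of the windowed skew values."""
    return list(accumulate(windowed_gc_skew(seq, window, step)))


toy = "GGGGCCCC"  # hypothetical sequence: G-rich half followed by C-rich half
print(windowed_gc_skew(toy, window=4, step=4))    # [1.0, -1.0]
print(cumulative_gc_skew(toy, window=4, step=4))  # [1.0, 0.0]
```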
Affiliation(s)
- Camille Moeckel
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Apostolos Zaravinos
  - Department of Life Sciences, European University Cyprus, Diogenis Str., 6, Nicosia 2404, Cyprus
  - Cancer Genetics, Genomics and Systems Biology laboratory, Basic and Translational Cancer Research Center (BTCRC), Nicosia 1516, Cyprus
  - Corresponding author at: Department of Life Sciences, European University Cyprus, Diogenis Str., 6, Nicosia 2404, Cyprus
- Ilias Georgakopoulos-Soares
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Corresponding author
5
Abstract
The phylum "Candidatus Omnitrophica" (candidate division OP3) is ubiquitous in anaerobic habitats but is currently characterized only by draft genomes from metagenomes and single cells. We had visualized cells of the phylotype OP3 LiM in methanogenic cultures on limonene as small epibiotic cells. In this study, we enriched OP3 cells by double density gradient centrifugation and obtained the first closed genome of an apparently clonal OP3 cell population by applying metagenomics and PCR for gap closure. Filaments of acetoclastic Methanosaeta, the largest morphotype in the culture community, contained empty cells, cells devoid of rRNA or of both rRNA and DNA, and dead cells according to transmission electron microscopy (TEM), thin-section TEM, scanning electron microscopy (SEM), catalyzed reporter deposition-fluorescence in situ hybridization (CARD-FISH), and LIVE/DEAD imaging. OP3 LiM cells were ultramicrobacteria (200 to 300 nm in diameter) and showed two physiological stages in CARD-FISH fluorescence signals: strong signals of OP3 LiM cells attached to Bacteria and to Archaea indicated many rRNA molecules and an active metabolism, whereas free-living OP3 cells had weak signals. Metaproteomics revealed that OP3 LiM lives with highly expressed secreted proteins involved in depolymerization and uptake of macromolecules and an active glycolysis and energy conservation by the utilization of pyruvate via a pyruvate:ferredoxin oxidoreductase and an Rnf complex (ferredoxin:NAD oxidoreductase). Besides sugar fermentation, a nucleotidyl transferase may contribute to energy conservation by phosphorolysis, the phosphate-dependent depolymerization of nucleic acids. Thin-section TEM showed distinctive structures of predation. Our study demonstrated a predatory metabolism for OP3 LiM cells, and therefore, we propose the name "Candidatus Velamenicoccus archaeovorus" gen. nov., sp. nov., for OP3 LiM. IMPORTANCE Epibiotic bacteria are known to live on and off bacterial cells. 
Here, we describe the ultramicrobacterial anaerobic epibiont OP3 LiM living on Archaea and Bacteria. We detected sick and dead cells of the filamentous archaeon Methanosaeta in slowly growing methanogenic cultures. OP3 LiM lives as a sugar fermenter, likely on polysaccharides from outer membranes, and has the genomic potential to live as a syntroph. The predatory lifestyle of OP3 LiM was supported by its genome, the first closed genome for the phylum "Candidatus Omnitrophica," and by images of cell-to-cell contact with prey cells. We propose naming OP3 LiM "Candidatus Velamenicoccus archaeovorus." Its metabolic versatility explains the ubiquitous presence of "Candidatus Omnitrophica" in anoxic habitats and gives ultramicrobacterial epibionts an important role in the recycling and remineralization of microbial biomass. The removal of polysaccharides from outer membranes by ultramicrobacteria may also influence biological interactions between pro- and eukaryotes.
6
Miura MC, Nagata S, Tamaki S, Tomita M, Kanai A. Distinct Expansion of Group II Introns During Evolution of Prokaryotes and Possible Factors Involved in Its Regulation. Front Microbiol 2022; 13:849080. [PMID: 35295308 PMCID: PMC8919778 DOI: 10.3389/fmicb.2022.849080] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/05/2022] [Accepted: 02/07/2022] [Indexed: 11/23/2022] Open
Abstract
Group II introns (G2Is) are ribozymes that have retroelement characteristics in prokaryotes. Although G2Is are suggested to have been an important evolutionary factor in the prokaryote-to-eukaryote transition, comprehensive analyses of these introns among the tens of thousands of prokaryotic genomes currently available are still limited. Here, we developed a bioinformatic pipeline that systematically collects G2Is and applied it to prokaryotic genomes. We found that in bacteria, 25% (447 of 1,790) of the total representative genomes had an average of 5.3 G2Is, and in archaea, 9% (28 of 296) of the total representative genomes had an average of 3.0 G2Is. The greatest number of G2Is per genome was 101 in Arthrospira platensis (phylum Cyanobacteriota). A comprehensive sequence analysis of the intron-encoded protein (IEP) in each G2I sequence was conducted and resulted in the addition of three new IEP classes (U1-U3) to the previous classification. This analysis suggested that about 30% of all IEPs are non-canonical IEPs. The number of G2Is per genome was defined almost at the phylum level, and at least in the following two phyla, Firmicutes, and Cyanobacteriota, the type of IEP was largely associated as a factor in the G2I increase, i.e., there was an explosive increase in G2Is with bacterial C-type IEPs, mainly in the phylum Firmicutes, and in G2Is with CL-type IEPs, mainly in the phylum Cyanobacteriota. We also systematically analyzed the relationship between genomic signatures and the mechanism of these increases in G2Is. This is the first study to systematically characterize G2Is in the prokaryotic phylogenies.
Affiliation(s)
- Masahiro C. Miura
  - Institute for Advanced Biosciences, Keio University, Tsuruoka, Japan
  - Systems Biology Program, Graduate School of Media and Governance, Keio University, Fujisawa, Japan
- Shohei Nagata
  - Institute for Advanced Biosciences, Keio University, Tsuruoka, Japan
- Satoshi Tamaki
  - Institute for Advanced Biosciences, Keio University, Tsuruoka, Japan
- Masaru Tomita
  - Institute for Advanced Biosciences, Keio University, Tsuruoka, Japan
  - Systems Biology Program, Graduate School of Media and Governance, Keio University, Fujisawa, Japan
  - Faculty of Environment and Information Studies, Keio University, Fujisawa, Japan
- Akio Kanai
  - Institute for Advanced Biosciences, Keio University, Tsuruoka, Japan
  - Systems Biology Program, Graduate School of Media and Governance, Keio University, Fujisawa, Japan
  - Faculty of Environment and Information Studies, Keio University, Fujisawa, Japan
7
Yang L, Xue Y, Wei J, Dai Q, Li P. Integrating metabolomic data with machine learning approach for discovery of Q-markers from Jinqi Jiangtang preparation against type 2 diabetes. Chin Med 2021; 16:30. [PMID: 33741031 PMCID: PMC7980607 DOI: 10.1186/s13020-021-00438-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/10/2020] [Accepted: 03/10/2021] [Indexed: 02/06/2023] Open
Abstract
Background Jinqi Jiangtang (JQJT) has been widely used in clinical practice to prevent and treat type 2 diabetes. However, little research has been done to identify and classify its quality markers (Q-markers) associated with anti-diabetes bioactivity. In this study, a strategy combining mass spectrometry-based untargeted metabolomics with a backpropagation artificial neural network (BP-ANN)-based machine learning approach was proposed to screen Q-markers from JQJT preparation. Methods This strategy mainly involved chemical profiling of herbal medicines, statistical processing of metabolomic datasets, detection of different anti-diabetes activities, and establishment of a BP-ANN model. The chemical features of seventy-eight batches of JQJT extracts were first profiled by using the untargeted UPLC-LTQ-Orbitrap metabolomic approach. The chemical features obtained, which were associated with different anti-diabetes activities based on three modes of action, were normalized, ranked, and then pre-selected by using ReliefF feature selection. A BP-ANN model was then established and optimized to screen Q-markers based on mean impact value (MIV). Results An optimized BP-ANN architecture was established with high accuracy of R > 0.9983 and relatively low error of MSE < 0.0014, which showed better performance than that of the partial least square (PLS) model (R² < 0.5). Meanwhile, the BP-ANN model was subsequently applied to further screen potential bioactive components from the pre-selected chemical features by calculating their MIVs. With this machine learning model, 10 potential Q-markers with bioactivity were discovered from JQJT. The tested anti-diabetes bioactivities of 78 batches of JQJT could be accurately predicted. Conclusions This proposed artificial intelligence approach is desirable for quick and easy identification of Q-markers with bioactivity from JQJT preparation. Supplementary Information The online version contains supplementary material available at 10.1186/s13020-021-00438-x.
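The mean impact value (MIV) used here to rank features can be sketched independently of any particular network. The abstract does not give the perturbation size, so the example assumes the commonly used scheme in which each input feature is raised and lowered by 10% and the resulting change in model output is averaged over all samples. The `predict` callable, the toy linear model standing in for a trained BP-ANN, and the sample values are hypothetical.

```python
def mean_impact_values(predict, samples, delta=0.10):
    """Return one MIV per feature for a model given as a `predict` callable.

    predict: function mapping a list of feature values to a single number.
    samples: list of feature vectors (lists of equal length).
    delta:   relative perturbation applied to one feature at a time
             (assumed to be 10%; the abstract does not state it).
    """
    n_features = len(samples[0])
    mivs = []
    for j in range(n_features):
        diffs = []
        for row in samples:
            up, down = list(row), list(row)
            up[j] = row[j] * (1 + delta)    # feature j increased
            down[j] = row[j] * (1 - delta)  # feature j decreased
            diffs.append(predict(up) - predict(down))
        mivs.append(sum(diffs) / len(diffs))
    return mivs


# Toy stand-in for a trained network: output depends on features 0 and 2 only.
def toy_model(x):
    return 2.0 * x[0] + 0.0 * x[1] - 1.0 * x[2]


samples = [[1.0, 1.0, 1.0], [3.0, 1.0, 1.0]]
print(mean_impact_values(toy_model, samples))
```

Features are then ranked by the absolute value of their MIV; in this toy case feature 0 has the largest impact and feature 1 has none.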
Affiliation(s)
- Lele Yang
  - State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences, University of Macau, Macau, China
- Yan Xue
  - State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences, University of Macau, Macau, China
- Jinchao Wei
  - State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences, University of Macau, Macau, China
- Qi Dai
  - Chengdu Institute for Food and Drug Control, Chengdu, China
- Peng Li
  - State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences, University of Macau, Macau, China
8
Sonbol S, Siam R. The association of group IIB intron with integrons in hypersaline environments. Mob DNA 2021; 12:8. [PMID: 33648565 PMCID: PMC7923331 DOI: 10.1186/s13100-021-00234-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/12/2020] [Accepted: 01/27/2021] [Indexed: 11/25/2022] Open
Abstract
Background Group II introns are mobile genetic elements used as efficient gene targeting tools. They function as both ribozymes and retroelements. Group IIC introns are the only class reported so far to be associated with integrons. In order to identify group II introns linked with integrons and CALINs (clusters of attC sites lacking a neighboring integron integrase) within halophiles, we mined for integrons in 28 assembled metagenomes from hypersaline environments and 104 publicly available halophilic genomes using Integron Finder, followed by a BLAST search for group II intron reverse transcriptases (RTs). Results We report the presence of different group II introns associated with integrons and integron-related sequences, denoted UHB.F1, UHB.I2, H.ha.F1 and H.ha.F2. The first two were identified within putative integrons in the metagenome of the Tanatar-5 hypersaline soda lake and belong to the IIC and IIB intron classes, respectively; the first was a truncated intron. Other truncated introns, H.ha.F1 and H.ha.F2, were also detected in a CALIN within the extreme halophile Halorhodospira halochloris, both belonging to group IIB introns. The intron-encoded proteins (IEPs) identified within group IIB introns belonged to different classes: CL1 class in UHB.I2 and bacterial class E in H.ha.F1 and H.ha.F2. A newly identified insertion sequence (ISHahl1) of the IS200/605 superfamily was also identified adjacent to the H. halochloris CALIN. Finally, an abundance of toxin-antitoxin (TA) systems was observed within the identified integrons. Conclusion So far, this is the first investigation of group II introns within integrons in halophilic genomes and metagenomes from hypersaline environments. We report the presence of group IIB introns associated with integrons or CALINs. This study provides the basis for understanding the role of group IIB introns in the evolution of halophiles and their potential biotechnological role.
Supplementary Information The online version contains supplementary material available at 10.1186/s13100-021-00234-2.
Affiliation(s)
- Sarah Sonbol
  - Biology Department and the Graduate Program of Biotechnology, School of Sciences and Engineering, the American University in Cairo, New Cairo, Cairo, 11835, Egypt
- Rania Siam
  - Biology Department and the Graduate Program of Biotechnology, School of Sciences and Engineering, the American University in Cairo, New Cairo, Cairo, 11835, Egypt
  - University of Medicine and Health Sciences, Basseterre, Saint Kitts and Nevis
9
Liu Z, Feng J, Yu B, Ma Q, Liu B. The functional determinants in the organization of bacterial genomes. Brief Bioinform 2020; 22:5892344. [PMID: 32793986 DOI: 10.1093/bib/bbaa172] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/30/2020] [Revised: 06/30/2020] [Accepted: 07/07/2020] [Indexed: 12/13/2022] Open
Abstract
Bacterial genomes are now recognized as interacting intimately with cellular processes. Uncovering organizational mechanisms of bacterial genomes has been a primary focus of researchers to reveal the potential cellular activities. The advances in both experimental techniques and computational models provide a tremendous opportunity for understanding these mechanisms, and various studies have been proposed to explore the organization rules of bacterial genomes associated with functions recently. This review focuses mainly on the principles that shape the organization of bacterial genomes, both locally and globally. We first illustrate local structures as operons/transcription units for facilitating co-transcription and horizontal transfer of genes. We then clarify the constraints that globally shape bacterial genomes, such as metabolism, transcription and replication. Finally, we highlight challenges and opportunities to advance bacterial genomic studies and provide application perspectives of genome organization, including pathway hole assignment and genome assembly and understanding disease mechanisms.
Affiliation(s)
- Bin Yu
  - College of Mathematics and Physics, Qingdao University of Science and Technology
- Qin Ma
  - Department of Biomedical Informatics, the Ohio State University
10
Lato DF, Golding GB. Spatial Patterns of Gene Expression in Bacterial Genomes. J Mol Evol 2020; 88:510-520. [PMID: 32506154 PMCID: PMC7324424 DOI: 10.1007/s00239-020-09951-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 11/27/2019] [Accepted: 05/08/2020] [Indexed: 01/06/2023]
Abstract
Gene expression in bacteria is a remarkably controlled and intricate process impacted by many factors. One such factor is the genomic position of a gene within a bacterial genome. Genes located near the origin of replication generally have a higher expression level, increased dosage, and are often more conserved than genes located farther from the origin of replication. The majority of the studies involved with these findings have only noted this phenomenon in a single gene or cluster of genes that was re-located to pre-determined positions within a bacterial genome. In this work, we look at the overall expression levels from eleven bacterial data sets from Escherichia coli, Bacillus subtilis, Streptomyces, and Sinorhizobium meliloti. We have confirmed that gene expression tends to decrease when moving away from the origin of replication in the majority of the replicons analysed in this study. This study sheds light on the impact of genomic location on molecular trends such as gene expression and highlights the importance of accounting for spatial trends in bacterial molecular analysis.
Affiliation(s)
- Daniella F Lato
  - Department of Biology, McMaster University, 1280 Main St. West, Hamilton, ON, L8S 4K1, Canada
- G Brian Golding
  - Department of Biology, McMaster University, 1280 Main St. West, Hamilton, ON, L8S 4K1, Canada
11
Westmann CA, Alves LDF, Silva-Rocha R, Guazzaroni ME. Mining Novel Constitutive Promoter Elements in Soil Metagenomic Libraries in Escherichia coli. Front Microbiol 2018; 9:1344. [PMID: 29973927 PMCID: PMC6019500 DOI: 10.3389/fmicb.2018.01344] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 02/18/2018] [Accepted: 05/31/2018] [Indexed: 11/13/2022] Open
Abstract
Although functional metagenomics has been widely employed for the discovery of genes relevant to biotechnology and biomedicine, its potential for assessing the diversity of transcriptional regulatory elements of microbial communities has remained poorly explored. Here, we experimentally mined novel constitutive promoter sequences in metagenomic libraries by combining a bi-directional reporter vector, high-throughput fluorescence assays and predictive computational methods. Through the expression profiling of fluorescent clones from two independent soil sample libraries, we have analyzed the regulatory dynamics of 260 clones with candidate promoters as a set of active metagenomic promoters in the host Escherichia coli. Through an in-depth analysis of selected clones, we were able to further explore the architecture of metagenomic fragments and to report the presence of multiple promoters per fragment with a dominant promoter driving the expression profile. These approaches resulted in the identification of 33 novel active promoters from metagenomic DNA originating from very diverse phylogenetic groups. The in silico and in vivo analysis of these individual promoters allowed the generation of a constitutive promoter consensus for exogenous sequences recognizable by E. coli in metagenomic studies. The results presented here demonstrate the potential of functional metagenomics for exploring environmental bacterial communities as a source of novel regulatory genetic parts to expand the toolbox for microbial engineering.
Affiliation(s)
- Cauã A Westmann
  - Department of Cellular and Molecular Biology, FMRP, University of São Paulo, Ribeirão Preto, Brazil
- Luana de Fátima Alves
  - Department of Biology, FFCLRP, University of São Paulo, Ribeirão Preto, Brazil
  - Department of Biochemistry, FMRP, University of São Paulo, Ribeirão Preto, Brazil
- Rafael Silva-Rocha
  - Department of Cellular and Molecular Biology, FMRP, University of São Paulo, Ribeirão Preto, Brazil
12
Luo H, Quan CL, Peng C, Gao F. Recent development of Ori-Finder system and DoriC database for microbial replication origins. Brief Bioinform 2018; 20:1114-1124. [DOI: 10.1093/bib/bbx174] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 09/20/2017] [Revised: 12/04/2017] [Indexed: 01/28/2023] Open
Abstract
DNA replication begins at replication origins in all three domains of life. Identification and characterization of replication origins are important not only in providing insights into the structure and function of the replication origins but also in understanding the regulatory mechanisms of the initiation step in DNA replication. The Z-curve method has been used in the identification of replication origins in archaeal genomes successfully since 2002. Furthermore, the Web servers of Ori-Finder and Ori-Finder 2 have been developed to predict replication origins in both bacterial and archaeal genomes based on the Z-curve method, and the replication origins with manual curation have been collected into an online database, DoriC. Ori-Finder system and DoriC database are currently used in the research field of DNA replication origins in prokaryotes, including: (i) identification of oriC regions in bacterial and archaeal genomes; (ii) discovery and analysis of the conserved sequences within oriC regions; and (iii) strand-biased analysis of bacterial genomes.
To date, a growing number of Ori-Finder predictions have been supported by subsequent experiments, and the Ori-Finder system has been used to identify the replication origins of > 100 newly sequenced prokaryotes in their genome reports. In addition, the data in the DoriC database have been widely used in large-scale analyses of replication origins and strand bias in prokaryotic genomes. Here, we review the development of the Ori-Finder system and the DoriC database as well as their applications. Some future directions and aspects for extending the application of Ori-Finder and DoriC are also presented.
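As a rough illustration of the Z-curve representation named in this abstract, the sketch below computes the three cumulative components of a DNA sequence using the standard definitions: x contrasts purines (A, G) with pyrimidines (C, T), y contrasts amino (A, C) with keto (G, T) bases, and z contrasts weak (A, T) with strong (G, C) hydrogen-bonding bases. The function name `z_curve` and the handling of ambiguous bases are assumptions made for this example. It is not the Ori-Finder implementation, which the abstract describes only as being based on the Z-curve method.

```python
def z_curve(seq):
    """Return the cumulative Z-curve components (xs, ys, zs) of a DNA sequence.

    Hypothetical helper for illustration only; not part of Ori-Finder.
    """
    # Per-base increments (dx, dy, dz):
    #   dx: +1 purine (A, G),  -1 pyrimidine (C, T)
    #   dy: +1 amino (A, C),   -1 keto (G, T)
    #   dz: +1 weak (A, T),    -1 strong (G, C)
    step = {
        "A": (1, 1, 1),
        "C": (-1, 1, -1),
        "G": (1, -1, -1),
        "T": (-1, -1, 1),
    }
    xs, ys, zs = [], [], []
    x = y = z = 0
    for base in seq.upper():
        # Assumption: ambiguous symbols such as N contribute nothing.
        dx, dy, dz = step.get(base, (0, 0, 0))
        x, y, z = x + dx, y + dy, z + dz
        xs.append(x)
        ys.append(y)
        zs.append(z)
    return xs, ys, zs


if __name__ == "__main__":
    xs, ys, zs = z_curve("AACG")
    print(xs, ys, zs)
```

Plotting such components along a chromosome and inspecting their extrema is one common way to look for candidate origin and terminus regions. Any real prediction would need the additional evidence that dedicated tools combine with the curves.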
|
13
|
Selection for energy efficiency drives strand-biased gene distribution in prokaryotes. Sci Rep 2017; 7:10572. [PMID: 28874819 PMCID: PMC5585166 DOI: 10.1038/s41598-017-11159-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 06/05/2017] [Accepted: 08/18/2017] [Indexed: 01/08/2023] Open
Abstract
Lagging-strand genes accumulate more deleterious mutations. Genes are thus preferably located on the leading strand, an observation known as strand-biased gene distribution (SGD). Despite this mechanistic understanding, a satisfactory quantitative model is still lacking. Replication-transcription collisions induce stalling of the replication machinery, expose DNA to various attacks, and are followed by error-prone repairs. We found that mutational biases in non-transcribed regions can explain ~71% of the variations in SGDs in 1,552 genomes, supporting the mutagenesis origin of SGD. Mutational biases introduce energetically cheaper nucleotides on the lagging strand, and result in more expensive protein products; consistently, the cost difference between the two strands explains ~50% of the variance in SGDs. Protein costs decrease with increasing gene expression. At similar expression levels, protein products of leading-strand genes are generally cheaper than lagging-strand genes; however, highly-expressed lagging genes are still cheaper than lowly-expressed leading genes. Selection for energy efficiency thus drives some genes to the leading strand, especially those highly expressed and essential, but certainly not all genes. Stronger mutational biases are often associated with low-GC genomes; as low-GC genes encode expensive proteins, low-GC genomes thus tend to have stronger SGDs to alleviate the stronger pressure on efficient energy usage.
|
14
|
diCenzo GC, Finan TM. The Divided Bacterial Genome: Structure, Function, and Evolution. Microbiol Mol Biol Rev 2017; 81:e00019-17. [PMID: 28794225 PMCID: PMC5584315 DOI: 10.1128/mmbr.00019-17] [Citation(s) in RCA: 135] [Impact Index Per Article: 19.3] [Indexed: 12/18/2022] Open
Abstract
Approximately 10% of bacterial genomes are split between two or more large DNA fragments, a genome architecture referred to as a multipartite genome. This multipartite organization is found in many important organisms, including plant symbionts, such as the nitrogen-fixing rhizobia, and plant, animal, and human pathogens, including the genera Brucella, Vibrio, and Burkholderia. The availability of many complete bacterial genome sequences means that we can now examine on a broad scale the characteristics of the different types of DNA molecules in a genome. Recent work has begun to shed light on the unique properties of each class of replicon, the unique functional role of chromosomal and nonchromosomal DNA molecules, and how the exploitation of novel niches may have driven the evolution of the multipartite genome. The aims of this review are to (i) outline the literature regarding bacterial genomes that are divided into multiple fragments, (ii) provide a meta-analysis of completed bacterial genomes from 1,708 species as a way of reviewing the abundant information present in these genome sequences, and (iii) provide an encompassing model to explain the evolution and function of the multipartite genome structure. This review covers, among other topics, salient genome terminology; mechanisms of multipartite genome formation; the phylogenetic distribution of multipartite genomes; how each part of a genome differs with respect to genomic signatures, genetic variability, and gene functional annotation; how each DNA molecule may interact; as well as the costs and benefits of this genome structure.
Affiliation(s)
- George C diCenzo
- Department of Biology, McMaster University, Hamilton, Ontario, Canada
- Turlough M Finan
- Department of Biology, McMaster University, Hamilton, Ontario, Canada
|
15
|
Sadhasivam A, Vetrivel U. Genome-wide codon usage profiling of ocular infective Chlamydia trachomatis serovars and drug target identification. J Biomol Struct Dyn 2017. [PMID: 28627970 DOI: 10.1080/07391102.2017.1343685] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 01/04/2023]
Abstract
Chlamydia trachomatis (C.t) is a Gram-negative obligate intracellular bacterium and a major cause of infectious blindness and sexually transmitted diseases. Among the varied serovars of this organism, A, B and C are reported as prominent ocular pathogens. Genomic studies of these strains should aid in deciphering potential drug targets and genomic influence on pathogenesis. Hence, in this study we performed deep statistical profiling of codon usage in these serovars. The overall base composition analysis reveals that these serovars are biased toward AU over GC. Similarly, relative synonymous codon usage also showed a preference for A/U-ending codons. Parity Rule 2 analysis indicated an unequal distribution of AT and GC, indicative of other unknown factors acting along with mutational pressure to influence codon usage bias (CUB). Moreover, absolute quantification of CUB also revealed lower bias across these serovars. The effect of natural selection on CUB was also confirmed by the neutrality plot, reinforcing that natural selection under mutational pressure plays a pivotal role in shaping the CUB in the strains studied. Correspondence analysis (COA) showed that C.t C/TW-3 has a unique trend in codon usage variation. Host influence analysis on shaping the codon usage pattern also inferred some speculative relativity. In a nutshell, our findings suggest that mutational pressure is the dominating factor in shaping CUB in the strains studied, followed by natural selection. We also propose potential drug targets based on cumulative analysis of strand bias, CUB and human non-homologue screening.
Affiliation(s)
- Anupriya Sadhasivam
- Centre for Bioinformatics, Kamalnayan Bajaj Institute for Research in Vision and Ophthalmology, Vision Research Foundation, Sankara Nethralaya, Chennai 600 006, Tamil Nadu, India
- Umashankar Vetrivel
- Centre for Bioinformatics, Kamalnayan Bajaj Institute for Research in Vision and Ophthalmology, Vision Research Foundation, Sankara Nethralaya, Chennai 600 006, Tamil Nadu, India
|
16
|
Merrikh H. Spatial and Temporal Control of Evolution through Replication-Transcription Conflicts. Trends Microbiol 2017; 25:515-521. [PMID: 28216294 DOI: 10.1016/j.tim.2017.01.008] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 09/30/2016] [Revised: 01/10/2017] [Accepted: 01/27/2017] [Indexed: 01/16/2023]
Abstract
Evolution could potentially be accelerated if an organism could selectively increase the mutation rate of specific genes that are actively under positive selection. Recently, a mechanism that cells can use to target rapid evolution to specific genes was discovered. This mechanism is driven by gene orientation-dependent encounters between DNA replication and transcription machineries. These encounters increase mutagenesis in lagging-strand genes, where replication-transcription conflicts are severe. Due to the orientation and transcription-dependent nature of this process, conflict-driven mutagenesis can be used by cells to spatially (gene-specifically) and temporally (only upon transcription induction) regulate the rate of gene evolution. Here, I summarize recent findings on this topic, and discuss the implications of increasing mutagenesis rates and accelerating evolution through active mechanisms.
Affiliation(s)
- Houra Merrikh
- Department of Microbiology, Health Sciences Building - J-wing, University of Washington, Seattle, WA 98195, USA.
|
17
|
Apostolou-Karampelis K, Nikolaou C, Almirantis Y. A novel skew analysis reveals substitution asymmetries linked to genetic code GC-biases and PolIII α-subunit isoforms. DNA Res 2016; 23:353-63. [PMID: 27345720 PMCID: PMC4991834 DOI: 10.1093/dnares/dsw021] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 02/11/2016] [Accepted: 05/09/2016] [Indexed: 11/30/2022] Open
Abstract
Strand biases reflect deviations from a null expectation of DNA evolution that assumes strand-symmetric substitution rates. Here, we present strong evidence that nearest-neighbour preferences are a strand-biased feature of bacterial genomes, indicating neighbour-dependent substitution asymmetries. To detect such asymmetries we introduce an alignment free index (relative abundance skews). The profiles of relative abundance skews along coding sequences can trace the phylogenetic relations of bacteria, suggesting that the patterns of neighbour-dependent substitution strand-biases are not common among different lineages, but are rather species-specific. Analysis of neighbour-dependent and codon-site skews sheds light on the origins of substitution asymmetries. Via a simple model we argue that the structure of the genetic code imposes position-dependent substitution strand-biases along coding sequences, as a response to GC mutation pressure. Thus, the organization of the genetic code per se can lead to an uneven distribution of nucleotides among different codon sites, even when requirements for specific codons and amino-acids are not accounted for. Moreover, our results suggest that strand-biases in replication fidelity of PolIII α-subunit induce substitution asymmetries, both neighbour-dependent and independent, on a genome scale. The role of DNA repair systems, such as transcription-coupled repair, is also considered.
Affiliation(s)
- Christoforos Nikolaou
- Computational Genomics Group, Department of Biology, University of Crete, 71409 Heraklion, Greece
- Yannis Almirantis
- Institute of Biosciences and Applications, National Center for Scientific Research "Demokritos", 15310 Athens, Greece
|
18
|
Overlapping genes: A significant genomic correlate of prokaryotic growth rates. Gene 2016; 582:143-7. [DOI: 10.1016/j.gene.2016.02.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 11/16/2015] [Accepted: 02/03/2016] [Indexed: 11/19/2022]
|
19
|
Touchon M, Rocha EPC. Coevolution of the Organization and Structure of Prokaryotic Genomes. Cold Spring Harb Perspect Biol 2016; 8:a018168. [PMID: 26729648 DOI: 10.1101/cshperspect.a018168] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Indexed: 11/25/2022]
Abstract
The cytoplasm of prokaryotes contains many molecular machines interacting directly with the chromosome. These vital interactions depend on the chromosome structure, as a molecule, and on the genome organization, as a unit of genetic information. Strong selection for the organization of the genetic elements implicated in these interactions drives replicon ploidy, gene distribution, operon conservation, and the formation of replication-associated traits. The genomes of prokaryotes are also very plastic with high rates of horizontal gene transfer and gene loss. The evolutionary conflicts between plasticity and organization lead to the formation of regions with high genetic diversity whose impact on chromosome structure is poorly understood. Prokaryotic genomes are remarkable documents of natural history because they carry the imprint of all of these selective and mutational forces. Their study allows a better understanding of molecular mechanisms, their impact on microbial evolution, and how they can be tinkered in synthetic biology.
Affiliation(s)
- Marie Touchon
- Microbial Evolutionary Genomics, Institut Pasteur, 75015 Paris, France; CNRS, UMR3525, 75015 Paris, France
- Eduardo P C Rocha
- Microbial Evolutionary Genomics, Institut Pasteur, 75015 Paris, France; CNRS, UMR3525, 75015 Paris, France
|
20
|
Zheng WX, Luo CS, Deng YY, Guo FB. Essentiality drives the orientation bias of bacterial genes in a continuous manner. Sci Rep 2015; 5:16431. [PMID: 26560889 PMCID: PMC4642330 DOI: 10.1038/srep16431] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 06/19/2015] [Accepted: 10/13/2015] [Indexed: 12/04/2022] Open
Abstract
Studies have found that bacterial genes are preferentially located on the leading strands. Subsequently, the preferences of essential genes and highly expressed genes were compared by classifying all genes into four groups, which showed that the former has an exclusive influence on orientation. However, only some functional classes of essential genes have this orientation bias. Nevertheless, previous studies only performed comparative analyses by differentiating the orientation bias extent of two types of genes. Thus, it is unclear whether the influence of essentiality on strand bias works continuously. Herein, we found a significant correlation between essentiality and orientation bias extent in 19 of 21 analyzed bacterial genomes, based on quantitative measurement of gene essentiality (or fitness). The correlation coefficient was much higher than that derived from binary essentiality measures (essential or non-essential). This suggested that genes with relatively lower essentiality, i.e., conditionally essential genes, also have some orientation bias, although it is weaker than that of absolutely essential genes. The results demonstrated the continuous influence of essentiality on orientation bias and provided details on this visible structural feature of bacterial genomes. It also proved that Geptop and IFIM could serve as useful resources of bacterial gene essentiality, particularly for quantitative analysis.
Affiliation(s)
- Wen-Xin Zheng
- School of Biomedical Engineering, Capital Medical University, Beijing 100069, China; Beijing Key Laboratory of Fundamental Research on Biomechanics in Clinical Application, Capital Medical University, Beijing 100069, China
- Cheng-Si Luo
- Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China; Center for Information in BioMedicine, University of Electronic Science and Technology of China, Chengdu, 610054, China; Key Laboratory for Neuro Information of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Yan-Yan Deng
- Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China; Center for Information in BioMedicine, University of Electronic Science and Technology of China, Chengdu, 610054, China; Key Laboratory for Neuro Information of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Feng-Biao Guo
- Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China; Center for Information in BioMedicine, University of Electronic Science and Technology of China, Chengdu, 610054, China; Key Laboratory for Neuro Information of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, 610054, China
|
21
|
Chou WC, Ma Q, Yang S, Cao S, Klingeman DM, Brown SD, Xu Y. Analysis of strand-specific RNA-seq data using machine learning reveals the structures of transcription units in Clostridium thermocellum. Nucleic Acids Res 2015; 43:e67. [PMID: 25765651 PMCID: PMC4446414 DOI: 10.1093/nar/gkv177] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 10/15/2013] [Accepted: 02/22/2015] [Indexed: 12/31/2022] Open
Abstract
Identification of transcription units (TUs) encoded in a bacterial genome is essential to elucidation of transcriptional regulation of the organism. To gain a detailed understanding of the dynamically composed TU structures, we have used four strand-specific RNA-seq (ssRNA-seq) datasets collected under two experimental conditions to derive the genomic TU organization of Clostridium thermocellum using a machine-learning approach. Our method accurately predicted the genomic boundaries of individual TUs based on two sets of parameters measuring the RNA-seq expression patterns across the genome: expression-level continuity and variance. A total of 2590 distinct TUs are predicted based on the four RNA-seq datasets. Among the predicted TUs, 44% have multiple genes. We assessed our prediction method on an independent set of RNA-seq data with longer reads. The evaluation confirmed the high quality of the predicted TUs. Functional enrichment analyses on a selected subset of the predicted TUs revealed interesting biology. To demonstrate the generality of the prediction method, we have also applied the method to RNA-seq data collected on Escherichia coli and achieved high prediction accuracies. The TU prediction program named SeqTU is publicly available at https://code.google.com/p/seqtu/. We expect that the predicted TUs can serve as the baseline information for studying transcriptional and post-transcriptional regulation in C. thermocellum and other bacteria.
Affiliation(s)
- Wen-Chi Chou
- Computational Systems Biology Lab, Department of Biochemistry and Molecular Biology, and Institute of Bioinformatics, University of Georgia, GA 30602, USA; BioEnergy Science Center, TN 37831, USA
- Qin Ma
- Computational Systems Biology Lab, Department of Biochemistry and Molecular Biology, and Institute of Bioinformatics, University of Georgia, GA 30602, USA; BioEnergy Science Center, TN 37831, USA
- Shihui Yang
- BioEnergy Science Center, TN 37831, USA; Biosciences Division, Oak Ridge National Laboratory, TN 37831, USA; National Bioenergy Center, National Renewable Energy Laboratory, Golden, CO 80401, USA
- Sha Cao
- Computational Systems Biology Lab, Department of Biochemistry and Molecular Biology, and Institute of Bioinformatics, University of Georgia, GA 30602, USA
- Dawn M Klingeman
- BioEnergy Science Center, TN 37831, USA; Biosciences Division, Oak Ridge National Laboratory, TN 37831, USA
- Steven D Brown
- BioEnergy Science Center, TN 37831, USA; Biosciences Division, Oak Ridge National Laboratory, TN 37831, USA
- Ying Xu
- Computational Systems Biology Lab, Department of Biochemistry and Molecular Biology, and Institute of Bioinformatics, University of Georgia, GA 30602, USA; BioEnergy Science Center, TN 37831, USA; College of Computer Science and Technology and School of Public Health, Jilin University, Changchun, Jilin 130012, China
|
22
|
Tremblay-Savard O, Benzaid B, Lang BF, El-Mabrouk N. Evolution of tRNA Repertoires in Bacillus Inferred with OrthoAlign. Mol Biol Evol 2015; 32:1643-56. [PMID: 25660374 DOI: 10.1093/molbev/msv029] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Indexed: 11/13/2022] Open
Abstract
OrthoAlign, an algorithm for the gene order alignment problem (alignment of orthologs), accounting for most genome-wide evolutionary events such as duplications, losses, rearrangements, and substitutions, was presented. OrthoAlign was used in a phylogenetic framework to infer the evolution of transfer RNA repertoires of 50 fully sequenced bacteria in the Bacillus genus. A prevalence of gene duplications and losses over rearrangement events was observed. The average rate of duplications inferred in Bacillus was 24 times lower than the one reported in Escherichia coli, whereas the average rates of losses and inversions were both 12 times lower. These rates were extremely low, suggesting a strong selective pressure acting on tRNA gene repertoires in Bacillus. An exhaustive analysis of the type, location, distribution, and length of evolutionary events was provided, together with ancestral configurations. OrthoAlign can be downloaded at: http://www.iro.umontreal.ca/~mabrouk/.
Affiliation(s)
- Olivier Tremblay-Savard
- Département d'informatique et de recherche opérationnelle (DIRO), Université de Montréal, CP 6128 succursale Centre-Ville, Montreal, QC, Canada
- Billel Benzaid
- Département d'informatique et de recherche opérationnelle (DIRO), Université de Montréal, CP 6128 succursale Centre-Ville, Montreal, QC, Canada
- B Franz Lang
- Département de biochimie, Université de Montréal, CP 6128 succursale Centre-Ville, Montreal, QC, Canada
- Nadia El-Mabrouk
- Département d'informatique et de recherche opérationnelle (DIRO), Université de Montréal, CP 6128 succursale Centre-Ville, Montreal, QC, Canada
|
23
|
Goswami A, Roy Chowdhury A, Sarkar M, Saha SK, Paul S, Dutta C. Strand-biased gene distribution, purine assymetry and environmental factors influence protein evolution in Bacillus. FEBS Lett 2015; 589:629-38. [PMID: 25639611 DOI: 10.1016/j.febslet.2015.01.028] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/03/2014] [Revised: 01/16/2015] [Accepted: 01/18/2015] [Indexed: 12/23/2022]
Abstract
A strong purine asymmetry, along with strand-biased gene distribution and the presence of PolC, prevails in Bacillus and some other members of Firmicutes, Fusobacteria and Tenericutes. The analysis of protein features in 21 Bacillus species of diverse metabolic, virulence and ecological traits revealed that purine asymmetry in conjunction with lineage/niche specific constraints significantly influences protein evolution in Bacillus. All Bacillus species, except for Se-respiring Bacillus selenitireducens, display distinct strand-specific biases in amino acid usage, which may affect the isoelectric point or surface charge distribution of proteins with prevalence of acidic and basic residues in the leading and lagging strand proteins, respectively.
Affiliation(s)
- Aranyak Goswami
- Structural Biology & Bioinformatics Division, CSIR - Indian Institute of Chemical Biology, 4, Raja S. C. Mullick Road, Kolkata 700032, India.
- Anindya Roy Chowdhury
- Structural Biology & Bioinformatics Division, CSIR - Indian Institute of Chemical Biology, 4, Raja S. C. Mullick Road, Kolkata 700032, India.
- Munmun Sarkar
- Structural Biology & Bioinformatics Division, CSIR - Indian Institute of Chemical Biology, 4, Raja S. C. Mullick Road, Kolkata 700032, India.
- Sanjoy Kumar Saha
- Structural Biology & Bioinformatics Division, CSIR - Indian Institute of Chemical Biology, 4, Raja S. C. Mullick Road, Kolkata 700032, India.
- Sandip Paul
- Structural Biology & Bioinformatics Division, CSIR - Indian Institute of Chemical Biology, 4, Raja S. C. Mullick Road, Kolkata 700032, India.
- Chitra Dutta
- Structural Biology & Bioinformatics Division, CSIR - Indian Institute of Chemical Biology, 4, Raja S. C. Mullick Road, Kolkata 700032, India.
|
24
|
Cao H, Butler K, Hossain M, Lewis JD. Variation in the fitness effects of mutations with population density and size in Escherichia coli. PLoS One 2014; 9:e105369. [PMID: 25121498 PMCID: PMC4133409 DOI: 10.1371/journal.pone.0105369] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 01/22/2014] [Accepted: 07/23/2014] [Indexed: 11/18/2022] Open
Abstract
The fitness effects of mutations are context specific and depend on both external (e.g., environment) and internal (e.g., cellular stress, genetic background) factors. The influence of population size and density on fitness effects is unknown, despite the central role population size plays in the supply and fixation of mutations. We addressed this issue by comparing the fitness of 92 Keio strains (Escherichia coli K12 single gene knockouts) at comparatively high (1.2×10⁷ CFUs/mL) and low (2.5×10² CFUs/mL) densities, which also differed in population size (high: 1.2×10⁸; low: 1.25×10³). Twenty-eight gene deletions (30%) exhibited a fitness difference, ranging from 5 to 174% (median: 35%), between the high and low densities. Our analyses suggest this variation among gene deletions in fitness responses reflected in part both gene orientation and function, of the gene properties we examined (genomic position, length, orientation, and function). Although we could not determine the relative effects of population density and size, our results suggest fitness effects of mutations vary with these two factors, and this variation is gene-specific. Besides being a mechanism for density-dependent selection (r-K selection), the dependence of fitness effects on population density and size has implications for any population that varies in size over time, including populations undergoing evolutionary rescue, species invasions into novel habitats, and cancer progression and metastasis. Further, combined with recent advances in understanding the roles of other context-specific factors in the fitness effects of mutations, our results will help address theoretical and applied biological questions more realistically.
Affiliation(s)
- Huansheng Cao
- Louis Calder Center–Biological Field Station and Department of Biological Sciences, Fordham University, Armonk, New York, United States of America
- Kevin Butler
- Louis Calder Center–Biological Field Station and Department of Biological Sciences, Fordham University, Armonk, New York, United States of America
- Mithi Hossain
- Louis Calder Center–Biological Field Station and Department of Biological Sciences, Fordham University, Armonk, New York, United States of America
- James D. Lewis
- Louis Calder Center–Biological Field Station and Department of Biological Sciences, Fordham University, Armonk, New York, United States of America
|
25
|
Abstract
Genomic DNA is used as the template for both replication and transcription, whose machineries may collide and result in mutagenesis, among other damages. Because head-on collisions are more deleterious than codirectional collisions, genes should be preferentially encoded on the leading strand to avoid head-on collisions, as is observed in most bacterial genomes examined. However, why are there still lagging strand encoded genes? Paul et al. recently proposed that these genes take advantage of the increased mutagenesis resulting from head-on collisions and are thus adaptively encoded on the lagging strand. We show that the evidence they provided is invalid and that the existence of lagging strand encoded genes is explainable by a balance between deleterious mutations that bring genes from the leading to the lagging strand and purifying selection purging such mutants. Therefore, the adaptive hypothesis is neither theoretically needed nor empirically supported.
Affiliation(s)
- Xiaoshu Chen
- Department of Ecology and Evolutionary Biology, University of Michigan
|
26
|
Saha SK, Goswami A, Dutta C. Association of purine asymmetry, strand-biased gene distribution and PolC within Firmicutes and beyond: a new appraisal. BMC Genomics 2014; 15:430. [PMID: 24899249 PMCID: PMC4070872 DOI: 10.1186/1471-2164-15-430] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 08/07/2013] [Accepted: 05/08/2014] [Indexed: 11/10/2022] Open
Abstract
Background: The Firmicutes often possess three conspicuous genome features: marked Purine Asymmetry (PAS) across two strands of replication, Strand-biased Gene Distribution (SGD) and presence of two isoforms of DNA polymerase III alpha subunit, PolC and DnaE. Despite considerable research efforts, it is not clear whether the co-existence of PAS, PolC and/or SGD is an essential and exclusive characteristic of the Firmicutes. The nature of correlations, if any, between these three features within and beyond the lineages of Firmicutes has also remained elusive. The present study has been designed to address these issues. Results: A large-scale analysis of diverse bacterial genomes indicates that PAS, PolC and SGD are neither essential nor exclusive features of the Firmicutes. PolC prevails in four bacterial phyla: Firmicutes, Fusobacteria, Tenericutes and Thermotogae, while PAS occurs only in subsets of Firmicutes, Fusobacteria and Tenericutes. There are five major compositional trends in Firmicutes: (I) an explicit PAS or G + A-dominance along the entire leading strand, (II) only G-dominance in the leading strand, (III) alternate stretches of purine-rich and pyrimidine-rich sequences, (IV) G + T dominance along the leading strand, and (V) no identifiable patterns in base usage. Presence of strong SGD has been observed not only in genomes having PAS, but also in genomes with G-dominance along their leading strands, an observation that defies the notion of co-occurrence of PAS and SGD in Firmicutes. The PolC-containing non-Firmicutes organisms often have alternate stretches of R-dominant and Y-dominant sequences along their genomes and most of them show relatively weak, but significant SGD. Firmicutes having G + A-dominance or G-dominance along the leading strand usually show distinct base usage patterns in three codon sites of genes. Probable molecular mechanisms that might have incurred such usage patterns have been proposed.
Conclusion: Co-occurrence of PAS, strong SGD and PolC should not be regarded as a genome signature of the Firmicutes. Presence of PAS in a species may warrant PolC and strong SGD, but PolC and/or SGD does not necessarily imply PAS.
Affiliation(s)
- Chitra Dutta
- Structural Biology & Bioinformatics Division, CSIR - Indian Institute of Chemical Biology, 4, Raja S. C. Mullick Road, Kolkata 700032, India.
|
27
|
Gao F. Recent Advances in the Identification of Replication Origins Based on the Z-curve Method. Curr Genomics 2014; 15:104-12. [PMID: 24822028 PMCID: PMC4009838 DOI: 10.2174/1389202915999140328162938] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 09/25/2013] [Revised: 11/04/2013] [Accepted: 11/05/2013] [Indexed: 12/19/2022] Open
Abstract
Precise DNA replication is critical for the maintenance of genetic integrity in all organisms. In all three domains of life, DNA replication starts at a specialized locus, termed as the replication origin, oriC or ORI, and its identification is vital to understanding the complex replication process. In bacteria and eukaryotes, replication initiates from single and multiple origins, respectively, while archaea can adopt either of the two modes. The Z-curve method has been successfully used to identify replication origins in genomes of various species, including multiple oriCs in some archaea. Based on the Z-curve method and comparative genomics analysis, we have developed a web-based system, Ori-Finder, for finding oriCs in bacterial genomes with high accuracy. Predicted oriC regions in bacterial genomes are organized into an online database, DoriC. Recently, archaeal oriC regions identified by both in vivo and in silico methods have also been included in the database. Here, we summarize the recent advances of in silico prediction of oriCs in bacterial and archaeal genomes using the Z-curve based method.
Affiliation(s)
- Feng Gao
- Department of Physics, Tianjin University, Tianjin 300072, China
28
|
Lopez-Vernaza MA, Leach DRF. WITHDRAWN: Symmetries and Asymmetries Associated with Non-Random Segregation of Sister DNA Strands in Escherichia coli. Semin Cell Dev Biol 2013:S1084-9521(13)00077-3. [PMID: 23692810 DOI: 10.1016/j.semcdb.2013.05.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2013] [Accepted: 05/06/2013] [Indexed: 11/19/2022]
Abstract
The Publisher regrets that this article is an accidental duplication of an article that has already been published, http://dx.doi.org/10.1016/j.semcdb.2013.05.010. The duplicate article has therefore been withdrawn.
Affiliation(s)
- Manuel A Lopez-Vernaza
- Institute of Cell Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, EH9 3JR, United Kingdom
29
|
Lopez-Vernaza MA, Leach DRF. Symmetries and asymmetries associated with non-random segregation of sister DNA strands in Escherichia coli. Semin Cell Dev Biol 2013; 24:610-7. [PMID: 23685127 DOI: 10.1016/j.semcdb.2013.05.010] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 11/19/2022]
Abstract
The successful inheritance of genetic information across generations is a complex process requiring replication of the genome and its faithful segregation into two daughter cells. At each replication cycle there is a risk that new DNA strands incorporate genetic changes caused by miscopying of parental information. By contrast the parental strands retain the original information. This raises the intriguing possibility that specific cell lineages might inherit "immortal" parental DNA strands via non-random segregation. If so, this requires an understanding of the mechanisms of non-random segregation. Here, we review several aspects of asymmetry in the very symmetrical cell, Escherichia coli, in the interest of exploring the potential basis for non-random segregation of leading- and lagging-strand replicated chromosome arms. These considerations lead us to propose a model for DNA replication that integrates chromosome segregation and genomic localisation with non-random strand segregation.
Affiliation(s)
- Manuel A Lopez-Vernaza
- Institute of Cell Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JR, United Kingdom
30
|
Sobetzko P, Glinkowska M, Travers A, Muskhelishvili G. DNA thermodynamic stability and supercoil dynamics determine the gene expression program during the bacterial growth cycle. Mol Biosyst 2013; 9:1643-51. [PMID: 23493878 DOI: 10.1039/c3mb25515h] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.9] [Indexed: 01/20/2023]
Abstract
The chromosomal DNA polymer constituting the cellular genetic material is primarily a device for coding information. Whilst the gene sequences comprise the digital (discontinuous) linear code, physiological alterations of the DNA superhelical density generate in addition analog (continuous) three-dimensional information essential for regulation of both chromosome compaction and gene expression. Insight into the relationship between the DNA analog information and the digital linear code is of fundamental importance for understanding genetic regulation. Our previous study in the model organism Escherichia coli suggested that the chromosomal gene order and a spatiotemporal gradient of DNA superhelicity associated with DNA replication determine the growth phase-dependent gene transcription. In this study we reveal a general gradient of DNA thermodynamic stability correlated with the polarity of chromosomal replication and manifest in the spatiotemporal pattern of gene transcription during the bacterial growth cycle. Furthermore, by integrating the physical and dynamic features of the transcribed sequences with their functional content we identify spatiotemporal domains of gene expression encompassing different functions. We thus provide both an insight into the organisational principle of the bacterial growth program and a novel holistic methodology for exploring chromosomal dynamics.
Affiliation(s)
- Patrick Sobetzko
- Jacobs University Bremen, School of Engineering and Science, Campus Ring 1, D-28759 Bremen, Germany
31
|
Gao F, Luo H, Zhang CT. DoriC 5.0: an updated database of oriC regions in both bacterial and archaeal genomes. Nucleic Acids Res 2012; 41:D90-3. [PMID: 23093601 PMCID: PMC3531139 DOI: 10.1093/nar/gks990] [Citation(s) in RCA: 111] [Impact Index Per Article: 9.3] [Indexed: 11/15/2022] Open
Abstract
Replication of chromosomes is one of the central events in the cell cycle. Chromosome replication begins at specific sites, called origins of replication (oriCs), in all three domains of life. However, the origins of replication remain unknown in a considerable number of the bacterial and archaeal genomes completely sequenced so far. The increasing availability of complete bacterial and archaeal genomes has created challenges and opportunities for identification of their oriCs in silico, as well as in vivo. Based on the Z-curve theory, we have developed a web-based system, Ori-Finder, to predict oriCs in bacterial genomes with high accuracy and reliability by taking advantage of comparative genomics, and the predicted oriC regions have been organized into an online database, DoriC, which has been publicly available at http://tubic.tju.edu.cn/doric/ since 2007. Five years after we constructed DoriC, the number of bacterial genomes in the database has increased about 4-fold. Additionally, oriC regions in archaeal genomes identified by in vivo experiments, as well as in silico analyses, have also been added to the database. Consequently, the latest release of DoriC contains oriCs for >1500 bacterial genomes and 81 archaeal genomes.
Affiliation(s)
- Feng Gao
- Department of Physics, Tianjin University, Tianjin 300072, China.
32
|
Strobel T, Al-Dilaimi A, Blom J, Gessner A, Kalinowski J, Luzhetska M, Pühler A, Szczepanowski R, Bechthold A, Rückert C. Complete genome sequence of Saccharothrix espanaensis DSM 44229(T) and comparison to the other completely sequenced Pseudonocardiaceae. BMC Genomics 2012; 13:465. [PMID: 22958348 PMCID: PMC3469384 DOI: 10.1186/1471-2164-13-465] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Received: 04/27/2012] [Accepted: 08/30/2012] [Indexed: 12/04/2022] Open
Abstract
Background: The genus Saccharothrix is a representative of the family Pseudonocardiaceae, known to include producer strains of a wide variety of potent antibiotics. Saccharothrix espanaensis produces both saccharomicins A and B of the promising new class of heptadecaglycoside antibiotics, active against both bacteria and yeast.
Results: To better assess its capabilities, the complete genome sequence of S. espanaensis was established. With a size of 9,360,653 bp, coding for 8,501 genes, it stands alongside other Pseudonocardiaceae with large genomes. Besides a predicted core genome of 810 genes shared in the family, S. espanaensis has a large number of accessory genes: 2,967 singletons when compared to the family, of which 1,292 have no clear orthologs in the RefSeq database. The genome analysis revealed the presence of 26 biosynthetic gene clusters potentially encoding secondary metabolites. Among them, the cluster coding for the saccharomicins could be identified.
Conclusion: S. espanaensis is the first completely sequenced species of the genus Saccharothrix. The genome discloses the cluster responsible for the biosynthesis of the saccharomicins, the largest oligosaccharide antibiotic currently identified. Moreover, the genome revealed 25 additional putative secondary metabolite gene clusters further suggesting the strain’s potential for natural product synthesis.
Affiliation(s)
- Tina Strobel
- Department of Pharmaceutical Biology and Biotechnology, Institute of Pharmaceutical Sciences, Albert-Ludwigs-University, Freiburg 79104, Germany