1
Atre M, Joshi B, Babu J, Sawant S, Sharma S, Sankar TS. Origin, evolution, and maintenance of gene-strand bias in bacteria. Nucleic Acids Res 2024; 52:3493-3509. [PMID: 38442257 DOI: 10.1093/nar/gkae155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 02/06/2024] [Accepted: 02/19/2024] [Indexed: 03/07/2024]
Abstract
Gene-strand bias is a characteristic feature of bacterial genome organization wherein genes are preferentially encoded on the leading strand of replication, promoting co-orientation of replication and transcription. This co-orientation bias has evolved to protect gene essentiality, expression, and genomic stability from the harmful effects of head-on replication-transcription collisions. However, the origin, variation, and maintenance of gene-strand bias remain elusive. Here, we reveal that the frequency of inversions that alter gene orientation exhibits large variation across bacterial populations and negatively correlates with gene-strand bias. The density, distance, and distribution of inverted repeats show a similar negative relationship with gene-strand bias explaining the heterogeneity in inversions. Importantly, these observations are broadly evident across the entire bacterial kingdom uncovering inversions and inverted repeats as primary factors underlying the variation in gene-strand bias and its maintenance. The distinct catalytic subunits of replicative DNA polymerase have co-evolved with gene-strand bias, suggesting a close link between replication and the origin of gene-strand bias. Congruently, inversion frequencies and inverted repeats vary among bacteria with different DNA polymerases. In summary, we propose that the nature of replication determines the fitness cost of replication-transcription collisions, establishing a selection gradient on gene-strand bias by fine-tuning DNA sequence repeats and, thereby, gene inversions.
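The quantity at the centre of this abstract can be made concrete with a short sketch. Assuming a single circular chromosome with a known origin (`ori`) and terminus (`ter`), a gene counts as co-oriented when it is transcribed in the direction of fork movement on its replichore, and gene-strand bias is the fraction of co-oriented genes. The function names, the `(midpoint, strand)` gene representation, and the toy coordinates are illustrative assumptions, not taken from the paper.

```python
def on_clockwise_replichore(pos, ori, ter, genome_len):
    """True if pos lies on the arc travelled from ori to ter in the
    direction of increasing coordinates (circular chromosome assumed)."""
    return (pos - ori) % genome_len < (ter - ori) % genome_len


def gene_strand_bias(genes, ori, ter, genome_len):
    """Fraction of genes transcribed co-directionally with replication.

    genes: iterable of (midpoint, strand) pairs, strand being "+" or "-".
    On the clockwise replichore the fork moves towards higher coordinates,
    so "+" genes are co-oriented; on the other replichore "-" genes are.
    """
    genes = list(genes)
    if not genes:
        raise ValueError("no genes supplied")
    co_oriented = 0
    for midpoint, strand in genes:
        clockwise = on_clockwise_replichore(midpoint, ori, ter, genome_len)
        if (strand == "+") == clockwise:
            co_oriented += 1
    return co_oriented / len(genes)


# Hypothetical toy genome: 1000 bp, origin at 0, terminus at 500.
toy_genes = [(100, "+"), (200, "-"), (700, "-"), (900, "+")]
print(gene_strand_bias(toy_genes, ori=0, ter=500, genome_len=1000))
```

In this toy case two of the four genes are co-oriented, so the bias is 0.5. An inversion that flips a gene in place changes its strand but not its replichore, which is why inversions alter the value.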
Affiliation(s)
- Malhar Atre
  - School of Biology, Indian Institute of Science Education and Research, Thiruvananthapuram, Kerala 695551, India
- Bharat Joshi
  - School of Biology, Indian Institute of Science Education and Research, Thiruvananthapuram, Kerala 695551, India
- Jebin Babu
  - School of Biology, Indian Institute of Science Education and Research, Thiruvananthapuram, Kerala 695551, India
- Shabduli Sawant
  - School of Biology, Indian Institute of Science Education and Research, Thiruvananthapuram, Kerala 695551, India
- Shreya Sharma
  - School of Biology, Indian Institute of Science Education and Research, Thiruvananthapuram, Kerala 695551, India
- T Sabari Sankar
  - School of Biology, Indian Institute of Science Education and Research, Thiruvananthapuram, Kerala 695551, India
2
Liang Y, Luo H, Lin Y, Gao F. Recent advances in the characterization of essential genes and development of a database of essential genes. IMETA 2024; 3:e157. [PMID: 38868518 PMCID: PMC10989110 DOI: 10.1002/imt2.157] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Accepted: 10/09/2023] [Indexed: 06/14/2024]
Abstract
Over the past few decades, there has been a significant interest in the study of essential genes, which are crucial for the survival of an organism under specific environmental conditions and thus have practical applications in the fields of synthetic biology and medicine. An increasing amount of experimental data on essential genes has been obtained with the continuous development of technological methods. Meanwhile, various computational prediction methods, related databases and web servers have emerged accordingly. To facilitate the study of essential genes, we have established a database of essential genes (DEG), which has become popular with continuous updates to facilitate essential gene feature analysis and prediction, drug and vaccine development, as well as artificial genome design and construction. In this article, we summarized the studies of essential genes, overviewed the relevant databases, and discussed their practical applications. Furthermore, we provided an overview of the main applications of DEG and conducted comprehensive analyses based on its latest version. However, it should be noted that the essential gene is a dynamic concept instead of a binary one, which presents both opportunities and challenges for their future development.
Affiliation(s)
- Hao Luo
  - Department of Physics, Tianjin University, Tianjin, China
- Yan Lin
  - Department of Physics, Tianjin University, Tianjin, China
- Feng Gao
  - Department of Physics, Tianjin University, Tianjin, China
  - Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, China
  - SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), Tianjin, China
3
Abstract
The technology of recombineering, in vivo genetic engineering, was initially developed in Escherichia coli and uses bacteriophage-encoded homologous recombination proteins to efficiently recombine DNA at short homologies (35 to 50 nt). Because the technology is homology driven, genomic DNA can be modified precisely and independently of restriction site location. Recombineering uses linear DNA substrates that are introduced into the cell by electroporation; these can be PCR products, synthetic double-strand DNA (dsDNA), or single-strand DNA (ssDNA). Here we describe the applications, challenges, and factors affecting ssDNA and dsDNA recombineering in a variety of non-model bacteria, both Gram-negative and -positive, and recent breakthroughs in the field. We list different microbes in which the widely used phage λ Red and Rac RecET recombination systems have been used for in vivo genetic engineering. New homologous ssDNA and dsDNA recombineering systems isolated from non-model bacteria are also described. The Basic Protocol outlines a method for ssDNA recombineering in the non-model species of Shewanella. The Alternate Protocol describes the use of CRISPR/Cas as a counter-selection system in conjunction with recombineering to enhance recovery of recombinants. We provide additional background information, pertinent considerations for experimental design, and parameters critical for success. The design of ssDNA oligonucleotides (oligos) and various internet-based tools for oligo selection from genome sequences are also described, as is the use of oligo-mediated recombination. This simple form of genome editing uses only ssDNA oligo(s) and does not require an exogenous recombination system. The information presented here should help researchers identify a recombineering system suitable for their microbe(s) of interest. If no system has been characterized for a specific microbe, researchers can find guidance in developing a recombineering system from scratch. 
We provide a flowchart of decision-making paths for strategically applying annealase-dependent or oligo-mediated recombination in non-model and undomesticated bacteria. © 2022 Wiley Periodicals LLC. This article has been contributed to by U.S. Government employees and their work is in the public domain in the USA.
Basic Protocol: ssDNA recombineering in Shewanella species
Alternate Protocol: ssDNA recombineering coupled to CRISPR/Cas9 in Shewanella species
Affiliation(s)
- Anna Corts
  - Cultivarium, 490 Arsenal Way, Ste 110, Watertown, Massachusetts 02472
- Lynn C. Thomason
  - Molecular Control and Genetics Section, RNA Biology Laboratory, National Cancer Institute at Frederick, National Institutes of Health, Frederick, Maryland 21702
- Nina Costantino
  - Molecular Control and Genetics Section, RNA Biology Laboratory, National Cancer Institute at Frederick, National Institutes of Health, Frederick, Maryland 21702
- Donald L. Court
  - Emeritus, Molecular Control and Genetics Section, RNA Biology Laboratory, National Cancer Institute at Frederick, National Institutes of Health, Frederick, Maryland 21702
4
DELEAT: gene essentiality prediction and deletion design for bacterial genome reduction. BMC Bioinformatics 2021; 22:444. [PMID: 34537011 PMCID: PMC8449488 DOI: 10.1186/s12859-021-04348-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/12/2021] [Accepted: 08/26/2021] [Indexed: 11/10/2022]
Abstract
Background: The study of gene essentiality is fundamental to understanding the basic principles of life, as well as for applications in many fields. In recent decades, dozens of sets of essential genes have been determined using different experimental and bioinformatics approaches, and this information has been useful for genome reduction of model organisms. Multiple in silico strategies have been developed to predict gene essentiality, but no optimal algorithm or set of gene features has been found yet, especially for non-model organisms with incomplete functional annotation. Results: We have developed DELEAT v0.1 (DELetion design by Essentiality Analysis Tool), an easy-to-use bioinformatic tool which integrates an in silico gene essentiality classifier in a pipeline allowing automatic design of large-scale deletions in any bacterial genome. The essentiality classifier consists of a novel logistic regression model based on only six gene features which are not dependent on experimental data or functional annotation. As a proof of concept, we have applied this pipeline to the determination of dispensable regions in the genome of Bartonella quintana str. Toulouse. In this already reduced genome, 35 possible deletions have been delimited, spanning 29% of the genome. Conclusions: Built on in silico gene essentiality predictions, we have developed an analysis pipeline which assists researchers throughout multiple stages of bacterial genome reduction projects, and created a novel classifier which is simple, fast, and universally applicable to any bacterial organism with a GenBank annotation file. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-021-04348-5.
5
Liu T, Luo H, Gao F. Position preference of essential genes in prokaryotic operons. PLoS One 2021; 16:e0250380. [PMID: 33886641 PMCID: PMC8061932 DOI: 10.1371/journal.pone.0250380] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/05/2020] [Accepted: 04/05/2021] [Indexed: 11/19/2022]
Abstract
Essential genes, which form the basis of life activities, are crucial for the survival of organisms. Essential genes tend to be located in operons, but how they are distributed in operons is still unclear for most prokaryotes. In order to clarify the general rule of position preference of essential genes in operons, an index of the average position of genes in an operon was proposed, and the distributions of essential and non-essential genes in operons in 51 bacterial genomes and two archaeal genomes were analyzed based on this new index. Consequently, essential genes were found to preferentially occupy the front positions of the operons, which tend to be expressed at higher levels.
Affiliation(s)
- Tao Liu
  - Department of Physics, School of Science, Tianjin University, Tianjin, China
- Hao Luo
  - Department of Physics, School of Science, Tianjin University, Tianjin, China
- Feng Gao
  - Department of Physics, School of Science, Tianjin University, Tianjin, China
  - Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, China
  - SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), Tianjin, China
6
Dilucca M, Cimini G, Giansanti A. Bacterial Protein Interaction Networks: Connectivity is Ruled by Gene Conservation, Essentiality and Function. Curr Genomics 2021; 22:111-121. [PMID: 34220298 PMCID: PMC8188579 DOI: 10.2174/1389202922666210219110831] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/14/2020] [Revised: 08/13/2020] [Accepted: 08/27/2020] [Indexed: 11/22/2022]
Abstract
BACKGROUND Protein-protein interaction (PPI) networks are the backbone of all processes in living cells. In this work, we relate conservation, essentiality and functional repertoire of a gene to the connectivity k (i.e. the number of interactions, links) of the corresponding protein in the PPI network. METHODS On a set of 42 bacterial genomes of different sizes, and with reasonably separated evolutionary trajectories, we investigate three issues: i) whether the distribution of connectivities changes between PPI subnetworks of essential and nonessential genes; ii) how gene conservation, measured both by the evolutionary retention index (ERI) and by evolutionary pressures, is related to the connectivity of the corresponding protein; iii) how PPI connectivities are modulated by evolutionary and functional relationships, as represented by the Clusters of Orthologous Genes (COGs). RESULTS We show that conservation, essentiality and functional specialisation of genes constrain the connectivity of the corresponding proteins in bacterial PPI networks. In particular, we isolated a core of highly connected proteins (connectivities k≥40), which is ubiquitous among the species considered here, though mostly visible in the degree distributions of bacteria with small genomes (less than 1000 genes). CONCLUSION The genes that support this highly connected core are conserved, essential and, in most cases, belong to the COG cluster J, related to ribosomal functions and the processing of genetic information.
Affiliation(s)
- Maddalena Dilucca
  - Dipartimento di Fisica, Sapienza University of Rome, 00185, Rome, Italy
- Giulio Cimini
  - Dipartimento di Fisica, Tor Vergata University of Rome, 00133, Rome, Italy; Istituto dei Sistemi Complessi CNR UoS, Rome, Italy
- Andrea Giansanti
  - Dipartimento di Fisica, Sapienza University of Rome, 00185, Rome, Italy; INFN Roma1 Unit, Rome, Italy
7
Luo H, Lin Y, Liu T, Lai FL, Zhang CT, Gao F, Zhang R. DEG 15, an update of the Database of Essential Genes that includes built-in analysis tools. Nucleic Acids Res 2021; 49:D677-D686. [PMID: 33095861 PMCID: PMC7779065 DOI: 10.1093/nar/gkaa917] [Citation(s) in RCA: 90] [Impact Index Per Article: 30.0] [Received: 09/15/2020] [Revised: 09/30/2020] [Accepted: 10/06/2020] [Indexed: 12/21/2022]
Abstract
Essential genes refer to genes that are required by an organism to survive under specific conditions. Studies of the minimal-gene-set for bacteria have elucidated fundamental cellular processes that sustain life. The past five years have seen a significant progress in identifying human essential genes, primarily due to the successful use of CRISPR/Cas9 in various types of human cells. DEG 15, a new release of the Database of Essential Genes (www.essentialgene.org), has provided major advancements, compared to DEG 10. Specifically, the number of eukaryotic essential genes has increased by more than fourfold, and that of prokaryotic ones has more than doubled. Of note, the human essential-gene number has increased by more than tenfold. Moreover, we have developed built-in analysis modules by which users can perform various analyses, such as essential-gene distributions between bacterial leading and lagging strands, sub-cellular localization distribution, enrichment analysis of gene ontology and KEGG pathways, and generation of Venn diagrams to compare and contrast gene sets between experiments. Additionally, the database offers customizable BLAST tools for performing species- and experiment-specific BLAST searches. Therefore, DEG comprehensively harbors updated human-curated essential-gene records among prokaryotes and eukaryotes with built-in tools to enhance essential-gene analysis.
Affiliation(s)
- Hao Luo
  - Department of Physics, School of Science, Tianjin University, Tianjin 300072, China
- Yan Lin
  - Department of Physics, School of Science, Tianjin University, Tianjin 300072, China
- Tao Liu
  - Department of Physics, School of Science, Tianjin University, Tianjin 300072, China
- Fei-Liao Lai
  - Department of Physics, School of Science, Tianjin University, Tianjin 300072, China
- Chun-Ting Zhang
  - Department of Physics, School of Science, Tianjin University, Tianjin 300072, China
- Feng Gao
  - Department of Physics, School of Science, Tianjin University, Tianjin 300072, China; Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China
- Ren Zhang
  - Center for Molecular Medicine and Genetics, School of Medicine, Wayne State University, Detroit, MI 48201, USA
8
Chand Y, Alam MA, Singh S. Pan-genomic analysis of the species Salmonella enterica: Identification of core essential and putative essential genes. GENE REPORTS 2020. [DOI: 10.1016/j.genrep.2020.100669] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 01/30/2023]
9
Lin Y, Zhang FZ, Xue K, Gao YZ, Guo FB. Identifying Bacterial Essential Genes Based on a Feature-Integrated Method. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2019; 16:1274-1279. [PMID: 28212095 DOI: 10.1109/tcbb.2017.2669968] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 06/06/2023]
Abstract
Essential genes are those genes of an organism that are considered to be crucial for its survival. Identification of essential genes is therefore of great significance to advance our understanding of the principles of cellular life. We have developed a novel computational method, which can effectively predict bacterial essential genes by extracting and integrating homologous features, protein domain features, gene intrinsic features, and network topological features. By performing principal component regression (PCR) analysis for Escherichia coli MG1655, we established a classification model with an average area under the curve (AUC) value of 0.992 in ten repetitions of 5-fold cross-validation. Furthermore, when applying this new model to a distantly related organism, Streptococcus pneumoniae TIGR4, we still obtained a reliable AUC value of 0.788. These results indicate that our feature-integrated approach could have practical applications in accurately investigating essential genes from a broad range of bacterial species, and could also provide helpful guidelines for the minimal cell.
10
Uddin R, Masood F, Azam SS, Wadood A. Identification of putative non-host essential genes and novel drug targets against Acinetobacter baumannii by in silico comparative genome analysis. Microb Pathog 2018; 128:28-35. [PMID: 30550846 DOI: 10.1016/j.micpath.2018.12.015] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 01/10/2018] [Revised: 12/10/2018] [Accepted: 12/10/2018] [Indexed: 10/27/2022]
Abstract
Acinetobacter baumannii, a gram-negative bacterium, has emerged as an extremely critical pathogen causing nosocomial and other kinds of infections. A. baumannii exhibits resistance to various classes of antibiotics, which shows that there is a dire need to search for more drug targets by exploiting the full genome of the bacterium. To this end, a strategy was devised that combines computational biology, pathogen informatics and cheminformatics. Comparative genomics analysis, modeling and docking studies were performed for the prediction of non-host essential genes and novel drug candidates against A. baumannii. Among 37 unique and 82 common metabolic pathways, 92 genes were predicted as non-host genes. Similarly, using homology search between the A. baumannii genome and essential genes of different bacteria, 293 genes were predicted as essential genes of A. baumannii. Among these predicted non-host and essential genes, 86 genes were predicted as non-host essential genes which could serve as potential novel drug and vaccine targets. Additional drug-target-like physicochemical properties were estimated, such as the molecular weight, subcellular localization and druggability potential. On the structural side, crystal structures were found for all the non-host essential genes of A. baumannii except three. Of these three, a homology model of undecaprenyl-diphosphatase was built from a PDB template using MODELLER [version 9.18]. The quality of the model was assessed with ProSA and RAMPAGE. The built model was used as the receptor for molecular docking with adenosine diphosphate (ADP) as the ligand. The molecular docking was performed with AutoDock4, and the best conformation, with the lowest binding energy (-4.39 kcal/mol), was obtained. LigPlot was used to identify the close interactions between the ligand and the receptor's residues. This study will further aid the selection of putative inhibitors against a novel drug target identified in A. baumannii and hence could lead to better therapeutics.
Affiliation(s)
- Reaz Uddin
  - Dr. Panjwani Center for Molecular Medicine and Drug Research, International Center for Chemical and Biological Sciences, University of Karachi, Pakistan
- Fareha Masood
  - Dr. Panjwani Center for Molecular Medicine and Drug Research, International Center for Chemical and Biological Sciences, University of Karachi, Pakistan
- Syed Sikander Azam
  - National Centre for Bioinformatics, Quaid-i-Azam University, Islamabad, Pakistan
- Abdul Wadood
  - Department of Biochemistry, Abdul Wali Khan University, Mardan, Pakistan
11
Dilucca M, Cimini G, Giansanti A. Essentiality, conservation, evolutionary pressure and codon bias in bacterial genomes. Gene 2018; 663:178-188. [PMID: 29678658 DOI: 10.1016/j.gene.2018.04.017] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 10/31/2017] [Revised: 03/25/2018] [Accepted: 04/09/2018] [Indexed: 11/30/2022]
Abstract
Essential genes constitute the core of genes which cannot be mutated too much nor lost along the evolutionary history of a species. Natural selection is expected to be stricter on essential genes and on conserved (highly shared) genes, than on genes that are either nonessential or peculiar to a single or a few species. In order to further assess this expectation, we study here how essentiality of a gene is connected with its degree of conservation among several unrelated bacterial species, each one characterised by its own codon usage bias. Confirming previous results on E. coli, we show the existence of a universal exponential relation between gene essentiality and conservation in bacteria. Moreover, we show that, within each bacterial genome, there are at least two groups of functionally distinct genes, characterised by different levels of conservation and codon bias: i) a core of essential genes, mainly related to cellular information processing; ii) a set of less conserved nonessential genes with prevalent functions related to metabolism. In particular, the genes in the first group are more retained among species, are subject to a stronger purifying conservative selection and display a more limited repertoire of synonymous codons. The core of essential genes is close to the minimal bacterial genome, which is in the focus of recent studies in synthetic biology, though we confirm that orthologs of genes that are essential in one species are not necessarily essential in other species. We also list a set of highly shared genes which, reasonably, could constitute a reservoir of targets for new anti-microbial drugs.
Affiliation(s)
- Maddalena Dilucca
  - Dipartimento di Fisica, "Sapienza" University of Rome, Rome 00185, Italy
- Giulio Cimini
  - IMT School for Advanced Studies, Lucca 55100, Italy; Istituto dei Sistemi Complessi (ISC)-CNR, Rome 00185, Italy
- Andrea Giansanti
  - Dipartimento di Fisica, "Sapienza" University of Rome, Rome 00185, Italy; INFN Roma1 Unit, Rome 00185, Italy
12
Luo H, Quan CL, Peng C, Gao F. Recent development of Ori-Finder system and DoriC database for microbial replication origins. Brief Bioinform 2018; 20:1114-1124. [DOI: 10.1093/bib/bbx174] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 09/20/2017] [Revised: 12/04/2017] [Indexed: 01/28/2023]
Abstract
DNA replication begins at replication origins in all three domains of life. Identification and characterization of replication origins are important not only in providing insights into the structure and function of the replication origins but also in understanding the regulatory mechanisms of the initiation step in DNA replication. The Z-curve method has been used in the identification of replication origins in archaeal genomes successfully since 2002. Furthermore, the Web servers of Ori-Finder and Ori-Finder 2 have been developed to predict replication origins in both bacterial and archaeal genomes based on the Z-curve method, and the replication origins with manual curation have been collected into an online database, DoriC. Ori-Finder system and DoriC database are currently used in the research field of DNA replication origins in prokaryotes, including: (i) identification of oriC regions in bacterial and archaeal genomes; (ii) discovery and analysis of the conserved sequences within oriC regions; and (iii) strand-biased analysis of bacterial genomes.
To date, a growing number of predictions made by the Ori-Finder system have been supported by subsequent experiments, and the Ori-Finder system has been used to identify the replication origins in > 100 newly sequenced prokaryotes in their genome reports. In addition, the data in the DoriC database have been widely used in large-scale analyses of replication origins and strand bias in prokaryotic genomes. Here, we review the development of the Ori-Finder system and the DoriC database as well as their applications. Some future directions and aspects for extending the application of Ori-Finder and DoriC are also presented.
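For readers unfamiliar with the Z-curve method named in this abstract, the following is a minimal sketch of the standard Z-curve transform only, not of Ori-Finder itself. At each position n the cumulative base counts are combined into three components: purine vs pyrimidine (x), amino vs keto (y), and weak vs strong hydrogen bonding (z). The function name `z_curve` and the handling of ambiguous bases are assumptions made for illustration.

```python
def z_curve(seq):
    """Return the three Z-curve components (xs, ys, zs) of a DNA sequence.

    x_n = (A_n + G_n) - (C_n + T_n)   purine vs pyrimidine
    y_n = (A_n + C_n) - (G_n + T_n)   amino vs keto
    z_n = (A_n + T_n) - (C_n + G_n)   weak vs strong hydrogen bonds
    where A_n, C_n, G_n, T_n are cumulative counts up to position n.
    Bases other than A, C, G, T leave the counts unchanged (an assumption).
    """
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    xs, ys, zs = [], [], []
    for base in seq.upper():
        if base in counts:
            counts[base] += 1
        a, c, g, t = counts["A"], counts["C"], counts["G"], counts["T"]
        xs.append((a + g) - (c + t))
        ys.append((a + c) - (g + t))
        zs.append((a + t) - (c + g))
    return xs, ys, zs


xs, ys, zs = z_curve("ACGT")
print(xs, ys, zs)
```

Origin-prediction tools typically look for extrema in such cumulative curves along a whole chromosome; which components and additional signals a given tool uses is not specified in the abstract above.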
13
Peng C, Lin Y, Luo H, Gao F. A Comprehensive Overview of Online Resources to Identify and Predict Bacterial Essential Genes. Front Microbiol 2017; 8:2331. [PMID: 29230204 PMCID: PMC5711816 DOI: 10.3389/fmicb.2017.02331] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 07/20/2017] [Accepted: 11/13/2017] [Indexed: 12/15/2022]
Abstract
Genes critical for the survival or reproduction of an organism in certain circumstances are classified as essential genes. Essential genes play a significant role in deciphering the survival mechanism of life. They may be greatly applied to pharmaceutics and synthetic biology. The continuous progress of experimental method for essential gene identification has accelerated the accumulation of gene essentiality data which facilitates the study of essential genes in silico. In this article, we present some available online resources related to gene essentiality, including bioinformatic software tools for transposon sequencing (Tn-seq) analysis, essential gene databases and online services to predict bacterial essential genes. We review several computational approaches that have been used to predict essential genes, and summarize the features used for gene essentiality prediction. In addition, we evaluate the available online bacterial essential gene prediction servers based on the experimentally validated essential gene sets of 30 bacteria from DEG. This article is intended to be a quick reference guide for the microbiologists interested in the essential genes.
Affiliation(s)
- Chong Peng
  - Department of Physics, School of Science, Tianjin University, Tianjin, China
- Yan Lin
  - Department of Physics, School of Science, Tianjin University, Tianjin, China
- Hao Luo
  - Department of Physics, School of Science, Tianjin University, Tianjin, China
- Feng Gao
  - Department of Physics, School of Science, Tianjin University, Tianjin, China
  - Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, China
  - SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), Tianjin University, Tianjin, China
14
Sadhasivam A, Vetrivel U. Genome-wide codon usage profiling of ocular infective Chlamydia trachomatis serovars and drug target identification. J Biomol Struct Dyn 2017. [PMID: 28627970 DOI: 10.1080/07391102.2017.1343685] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 01/04/2023]
Abstract
Chlamydia trachomatis (C.t) is a Gram-negative obligate intracellular bacterium and a major cause of infectious blindness and sexually transmitted diseases. Among the varied serovars of this organism, A, B and C are reported as prominent ocular pathogens. Genomic studies of these strains should aid in deciphering potential drug targets and genomic influence on pathogenesis. Hence, in this study we performed deep statistical profiling of codon usage in these serovars. The overall base composition analysis reveals that these serovars are biased towards AU over GC. Similarly, relative synonymous codon usage also showed a preference for A/U-ending codons. Parity Rule 2 analysis inferred unequal distribution of AT and GC, indicative of other unknown factors acting along with mutational pressure to influence codon usage bias (CUB). Moreover, absolute quantification of CUB also revealed lower bias across these serovars. The effect of natural selection on CUB was also confirmed by the neutrality plot, reinforcing that natural selection under mutational pressure played a pivotal role in shaping the CUB in the strains studied. Correspondence analysis (COA) showed that C.t C/TW-3 has a unique trend in codon usage variation. Host influence analysis on shaping the codon usage pattern also inferred some speculative relativity. In a nutshell, our findings suggest that mutational pressure is the dominating factor in shaping CUB in the strains studied, followed by natural selection. We also propose potential drug targets based on cumulative analysis of strand bias, CUB and human non-homologue screening.
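As an illustration of the relative synonymous codon usage (RSCU) measure mentioned in this abstract, here is a minimal sketch using the usual definition: a codon's observed count divided by the mean count of all codons encoding the same amino acid. It assumes the standard genetic code and an in-frame coding sequence; the function name `rscu` and the toy sequence are hypothetical, and the paper's actual pipeline is not described in the abstract.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """RSCU per codon for an in-frame coding sequence (DNA or RNA).

    RSCU = count(codon) * (number of synonymous codons) / (total count
    for that amino acid). Stop codons and amino acids absent from the
    sequence are skipped.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if c in CODON_TABLE)

    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    result = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or total == 0:
            continue
        for codon in family:
            result[codon] = counts[codon] * len(family) / total
    return result


# Toy example: four alanine codons, three GCT and one GCC.
print(rscu("GCTGCTGCTGCC"))
```

A value above 1 marks a codon used more often than expected under equal usage of its synonyms, so a preference for A/U-ending codons would show up as RSCU values above 1 for codons ending in A or T.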
Affiliation(s)
- Anupriya Sadhasivam
  - Centre for Bioinformatics, Kamalnayan Bajaj Institute for Research in Vision and Ophthalmology, Vision Research Foundation, Sankara Nethralaya, Chennai 600 006, Tamil Nadu, India
- Umashankar Vetrivel
  - Centre for Bioinformatics, Kamalnayan Bajaj Institute for Research in Vision and Ophthalmology, Vision Research Foundation, Sankara Nethralaya, Chennai 600 006, Tamil Nadu, India
15
|
Zheng WX, Luo CS, Deng YY, Guo FB. Essentiality drives the orientation bias of bacterial genes in a continuous manner. Sci Rep 2015; 5:16431. [PMID: 26560889 PMCID: PMC4642330 DOI: 10.1038/srep16431] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/19/2015] [Accepted: 10/13/2015] [Indexed: 12/04/2022] Open
Abstract
Studies have found that bacterial genes are preferentially located on the leading strands. Subsequently, the preferences of essential genes and highly expressed genes were compared by classifying all genes into four groups, which showed that the former has an exclusive influence on orientation. However, only some functional classes of essential genes have this orientation bias. Nevertheless, previous studies only performed comparative analyses by differentiating the orientation bias extent of two types of genes. Thus, it is unclear whether the influence of essentiality on strand bias works continuously. Herein, we found a significant correlation between essentiality and orientation bias extent in 19 of 21 analyzed bacterial genomes, based on quantitative measurement of gene essentiality (or fitness). The correlation coefficient was much higher than that derived from binary essentiality measures (essential or non-essential). This suggested that genes with relatively lower essentiality, i.e., conditionally essential genes, also have some orientation bias, although it is weaker than that of absolutely essential genes. The results demonstrated the continuous influence of essentiality on orientation bias and provided details on this visible structural feature of bacterial genomes. It also proved that Geptop and IFIM could serve as useful resources of bacterial gene essentiality, particularly for quantitative analysis.
Affiliation(s)
- Wen-Xin Zheng: School of Biomedical Engineering, Capital Medical University, Beijing 100069, China; Beijing Key Laboratory of Fundamental Research on Biomechanics in Clinical Application, Capital Medical University, Beijing 100069, China
- Cheng-Si Luo: Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China; Center for Information in BioMedicine, University of Electronic Science and Technology of China, Chengdu, 610054, China; Key Laboratory for Neuro Information of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Yan-Yan Deng: Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China; Center for Information in BioMedicine, University of Electronic Science and Technology of China, Chengdu, 610054, China; Key Laboratory for Neuro Information of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Feng-Biao Guo: Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China; Center for Information in BioMedicine, University of Electronic Science and Technology of China, Chengdu, 610054, China; Key Laboratory for Neuro Information of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, 610054, China
|
16
|
Multiple Factors Drive Replicating Strand Composition Bias in Bacterial Genomes. Int J Mol Sci 2015; 16:23111-26. [PMID: 26404268 PMCID: PMC4613354 DOI: 10.3390/ijms160923111] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2015] [Revised: 08/18/2015] [Accepted: 09/18/2015] [Indexed: 11/18/2022] Open
Abstract
Composition bias from Chargaff’s second parity rule (PR2) has long been found in sequenced genomes, and is believed to relate strongly with the replication process in microbial genomes. However, some disagreement on the underlying reason for strand composition bias remains. We performed an integrative analysis of various genomic features that might influence composition bias using a large-scale dataset of 1111 genomes. Our results indicate (1) the bias was stronger in obligate intracellular bacteria than in other free-living species (p-value = 0.0305); (2) Fusobacteria and Firmicutes had the highest average bias among the 24 microbial phyla analyzed; (3) the strength of selected codon usage bias and generation times were not observably related to strand composition bias (p-value = 0.3247); (4) significant negative relationships were found between GC content, genome size, rearrangement frequency, Clusters of Orthologous Groups (COG) functional subcategories A, C, I, Q, and composition bias (p-values < 1.0 × 10−8); (5) gene density and COG functional subcategories D, F, J, L, and V were positively related with composition bias (p-value < 2.2 × 10−16); and (6) gene density made the most important contribution to composition bias, indicating transcriptional bias was associated strongly with strand composition bias. Therefore, strand composition bias was found to be influenced by multiple factors with varying weights.
|
17
|
Luo H, Gao F, Lin Y. Evolutionary conservation analysis between the essential and nonessential genes in bacterial genomes. Sci Rep 2015; 5:13210. [PMID: 26272053 PMCID: PMC4536490 DOI: 10.1038/srep13210] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2015] [Accepted: 07/22/2015] [Indexed: 11/20/2022] Open
Abstract
Essential genes are thought to be critical for the survival of the organisms under certain circumstances, and the natural selection acting on essential genes is expected to be stricter than on nonessential ones. Up to now, essential genes have been identified in approximately thirty bacterial organisms by experimental methods. In this paper, we performed a comprehensive comparison between the essential and nonessential genes in the genomes of 23 bacterial species based on the Ka/Ks ratio, and found that essential genes are more evolutionarily conserved than nonessential genes in most of the bacteria examined. Furthermore, we also analyzed the conservation by functional clusters with the clusters of orthologous groups (COGs), and found that the essential genes in the functional categories of G (Carbohydrate transport and metabolism), H (Coenzyme transport and metabolism), I (Transcription), J (Translation, ribosomal structure and biogenesis), K (Lipid transport and metabolism) and L (Replication, recombination and repair) tend to be more evolutionarily conserved than the corresponding nonessential genes in bacteria. The results suggest that the essential genes in these subcategories are subject to stronger selective pressure than the nonessential genes, and therefore, provide more insights of the evolutionary conservation for the essential and nonessential genes in complex biological processes.
Affiliation(s)
- Hao Luo: Department of Physics, Tianjin University, Tianjin 300072, China
- Feng Gao: [1] Department of Physics, Tianjin University, Tianjin 300072, China; [2] Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China; [3] SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering, Tianjin 300072, China
- Yan Lin: Department of Physics, Tianjin University, Tianjin 300072, China
|
18
|
Abstract
Essential genes are thought to encode proteins that carry out the basic functions to sustain a cellular life, and genomic islands (GIs) usually contain clusters of horizontally transferred genes. It has been assumed that essential genes are not likely to be located in GIs, but systematical analysis of essential genes in GIs has not been explored before. Here, we have analyzed the essential genes in 28 prokaryotes by statistical method and reached a conclusion that essential genes in GIs are significantly fewer than those outside GIs. The function of 362 essential genes found in GIs has been explored further by BLAST against the Virulence Factor Database (VFDB) and the phage/prophage sequence database of PHAge Search Tool (PHAST). Consequently, 64 and 60 eligible essential genes are found to share the sequence similarity with the virulence factors and phage/prophages-related genes, respectively. Meanwhile, we find several toxin-related proteins and repressors encoded by these essential genes in GIs. The comparative analysis of essential genes in genomic islands will not only shed new light on the development of the prediction algorithm of essential genes, but also give a clue to detect the functionality of essential genes in genomic islands.
Affiliation(s)
- Xi Zhang: Department of Physics, Tianjin University, Tianjin 300072, China
- Chong Peng: Department of Physics, Tianjin University, Tianjin 300072, China
- Ge Zhang: Department of Physics, Tianjin University, Tianjin 300072, China
- Feng Gao: [1] Department of Physics, Tianjin University, Tianjin 300072, China; [2] Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China; [3] SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering, Tianjin 300072, China
|
19
|
Jin DJ, Cagliero C, Martin CM, Izard J, Zhou YN. The dynamic nature and territory of transcriptional machinery in the bacterial chromosome. Front Microbiol 2015; 6:497. [PMID: 26052320 PMCID: PMC4440401 DOI: 10.3389/fmicb.2015.00497] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2015] [Accepted: 05/06/2015] [Indexed: 11/16/2022] Open
Abstract
Our knowledge of the regulation of genes involved in bacterial growth and stress responses is extensive; however, we have only recently begun to understand how environmental cues influence the dynamic, three-dimensional distribution of RNA polymerase (RNAP) in Escherichia coli on the level of single cell, using wide-field fluorescence microscopy and state-of-the-art imaging techniques. Live-cell imaging using either an agarose-embedding procedure or a microfluidic system further underscores the dynamic nature of the distribution of RNAP in response to changes in the environment and highlights the challenges in the study. A general agreement between live-cell and fixed-cell images has validated the formaldehyde-fixing procedure, which is a technical breakthrough in the study of the cell biology of RNAP. In this review we use a systems biology perspective to summarize the advances in the cell biology of RNAP in E. coli, including the discoveries of the bacterial nucleolus, the spatial compartmentalization of the transcription machinery at the periphery of the nucleoid, and the segregation of the chromosome territories for the two major cellular functions of transcription and replication in fast-growing cells. Our understanding of the coupling of transcription and bacterial chromosome (or nucleoid) structure is also summarized. Using E. coli as a simple model system, co-imaging of RNAP with DNA and other factors during growth and stress responses will continue to be a useful tool for studying bacterial growth and adaptation in changing environment.
Affiliation(s)
- Ding J Jin: Transcription Control Section, Gene Regulation and Chromosome Biology Laboratory, National Cancer Institute, National Institutes of Health, Frederick, MD, USA
- Cedric Cagliero: Transcription Control Section, Gene Regulation and Chromosome Biology Laboratory, National Cancer Institute, National Institutes of Health, Frederick, MD, USA
- Carmen M Martin: Transcription Control Section, Gene Regulation and Chromosome Biology Laboratory, National Cancer Institute, National Institutes of Health, Frederick, MD, USA
- Jerome Izard: Transcription Control Section, Gene Regulation and Chromosome Biology Laboratory, National Cancer Institute, National Institutes of Health, Frederick, MD, USA
- Yan N Zhou: Transcription Control Section, Gene Regulation and Chromosome Biology Laboratory, National Cancer Institute, National Institutes of Health, Frederick, MD, USA
|
20
|
Abstract
The database of essential genes (DEG, available at http://www.essentialgene.org), constructed in 2003, has been timely updated to harbor essential-gene records of bacteria, archaea, and eukaryotes. DEG 10, the current release, includes not only essential protein-coding genes determined by genome-wide gene essentiality screens but also essential noncoding RNAs, promoters, regulatory sequences, and replication origins. Therefore, DEG 10 includes essential genomic elements under different conditions in three domains of life, with customizable BLAST tools. Based on the analysis of DEG 10, we show that the percentage of essential genes in bacterial genomes exhibits an exponential decay with increasing genome sizes. The functions, ATP binding (GO:0005524), GTP binding (GO:0005525), and DNA-directed RNA polymerase activity (GO:0003899), are likely required for organisms across life domains.
|
21
|
Gao F. Recent Advances in the Identification of Replication Origins Based on the Z-curve Method. Curr Genomics 2014; 15:104-12. [PMID: 24822028 PMCID: PMC4009838 DOI: 10.2174/1389202915999140328162938] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/25/2013] [Revised: 11/04/2013] [Accepted: 11/05/2013] [Indexed: 12/19/2022] Open
Abstract
Precise DNA replication is critical for the maintenance of genetic integrity in all organisms. In all three domains of life, DNA replication starts at a specialized locus, termed as the replication origin, oriC or ORI, and its identification is vital to understanding the complex replication process. In bacteria and eukaryotes, replication initiates from single and multiple origins, respectively, while archaea can adopt either of the two modes. The Z-curve method has been successfully used to identify replication origins in genomes of various species, including multiple oriCs in some archaea. Based on the Z-curve method and comparative genomics analysis, we have developed a web-based system, Ori-Finder, for finding oriCs in bacterial genomes with high accuracy. Predicted oriC regions in bacterial genomes are organized into an online database, DoriC. Recently, archaeal oriC regions identified by both in vivo and in silico methods have also been included in the database. Here, we summarize the recent advances of in silico prediction of oriCs in bacterial and archaeal genomes using the Z-curve based method.
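As an illustration only, the coordinate transform at the heart of the Z-curve representation can be sketched in a few lines of Python. The sketch computes the three cumulative components (purine vs. pyrimidine, amino vs. keto, weak vs. strong hydrogen bonding) for a DNA string. The function name `z_curve` and the handling of ambiguous bases are assumptions made here; this is not the Ori-Finder procedure, which the abstract describes as also drawing on comparative genomics.

```python
def z_curve(seq):
    """Return the cumulative Z-curve points (x_n, y_n, z_n) for a DNA sequence.

    x_n = (A_n + G_n) - (C_n + T_n)   purine vs. pyrimidine
    y_n = (A_n + C_n) - (G_n + T_n)   amino vs. keto
    z_n = (A_n + T_n) - (G_n + C_n)   weak vs. strong hydrogen bonding
    where A_n, C_n, G_n, T_n are base counts in the first n positions.
    """
    # Contribution of each base to (x, y, z), read off the definitions above.
    steps = {
        "A": (1, 1, 1),
        "C": (-1, 1, -1),
        "G": (1, -1, -1),
        "T": (-1, -1, 1),
    }
    x = y = z = 0
    points = []
    for base in seq.upper():
        # Assumption: ambiguous symbols (e.g. N) leave the curve unchanged.
        dx, dy, dz = steps.get(base, (0, 0, 0))
        x, y, z = x + dx, y + dy, z + dz
        points.append((x, y, z))
    return points
```

Plotting any one component against position for a whole chromosome is a common way to look for the turning points that such methods associate with candidate origins and termini.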
Affiliation(s)
- Feng Gao: Department of Physics, Tianjin University, Tianjin 300072, China
|
22
|
Jin DJ, Cagliero C, Zhou YN. Role of RNA polymerase and transcription in the organization of the bacterial nucleoid. Chem Rev 2013; 113:8662-82. [PMID: 23941620 PMCID: PMC3830623 DOI: 10.1021/cr4001429] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022]
Affiliation(s)
- Ding Jun Jin: Transcription Control Section, Gene Regulation and Chromosome Biology Laboratory, National Cancer Institute, NIH, P.O. Box B, Frederick, MD 21702
- Cedric Cagliero: Transcription Control Section, Gene Regulation and Chromosome Biology Laboratory, National Cancer Institute, NIH, P.O. Box B, Frederick, MD 21702
- Yan Ning Zhou: Transcription Control Section, Gene Regulation and Chromosome Biology Laboratory, National Cancer Institute, NIH, P.O. Box B, Frederick, MD 21702
|
23
|
Exoproteome and secretome derived broad spectrum novel drug and vaccine candidates in Vibrio cholerae targeted by Piper betel derived compounds. PLoS One 2013; 8:e52773. [PMID: 23382822 PMCID: PMC3559646 DOI: 10.1371/journal.pone.0052773] [Citation(s) in RCA: 83] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2012] [Accepted: 11/21/2012] [Indexed: 01/18/2023] Open
Abstract
Vibrio cholerae is the causal organism of the cholera epidemic, which is mostly prevalent in developing and underdeveloped countries. However, incidences of cholera in developed countries are also alarming. Because of the emergence of new drug-resistant strains, even though several generic drugs and vaccines have been developed over time, Vibrio infections remain a global health problem that calls for the development of novel drugs and vaccines against the pathogen. Here, applying comparative proteomic and reverse vaccinology approaches to the exoproteome and secretome of the pathogen, we have identified three candidate targets (ompU, uppP and yajC) for most of the pathogenic Vibrio strains. Two targets (uppP and yajC) are novel to Vibrio, and two targets (uppP and ompU) can be used to develop both drugs and vaccines (dual targets) against broad spectrum Vibrio serotypes. Using our novel computational approach, we have identified three peptide vaccine candidates that have high potential to induce both B- and T-cell-mediated immune responses from our identified two dual targets. These two targets were modeled and subjected to virtual screening against natural compounds derived from Piper betel. Seven compounds from Piper betel were identified for the first time as highly effective against these targets, marking them as emerging potential drugs against Vibrio. Our preliminary validation suggests that these identified peptide vaccines and betel compounds are highly effective against Vibrio cholerae. Currently we are exhaustively validating these targets, candidate peptide vaccines, and betel derived lead compounds against a number of Vibrio species.
|
24
|
Wang J, Peng W, Wu FX. Computational approaches to predicting essential proteins: A survey. Proteomics Clin Appl 2013; 7:181-92. [DOI: 10.1002/prca.201200068] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2012] [Revised: 09/12/2012] [Accepted: 11/06/2012] [Indexed: 12/13/2022]
Affiliation(s)
- Jianxin Wang: School of Information Science and Engineering, Central South University, Changsha, China
- Wei Peng: School of Information Science and Engineering, Central South University, Changsha, China
- Fang-Xiang Wu: Department of Mechanical Engineering and Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK, Canada
|
25
|
Klein BA, Tenorio EL, Lazinski DW, Camilli A, Duncan MJ, Hu LT. Identification of essential genes of the periodontal pathogen Porphyromonas gingivalis. BMC Genomics 2012; 13:578. [PMID: 23114059 PMCID: PMC3547785 DOI: 10.1186/1471-2164-13-578] [Citation(s) in RCA: 114] [Impact Index Per Article: 9.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2012] [Accepted: 10/24/2012] [Indexed: 01/09/2023] Open
Abstract
Background: Porphyromonas gingivalis is a Gram-negative anaerobic bacterium associated with periodontal disease onset and progression. Genetic tools for the manipulation of bacterial genomes allow for in-depth mechanistic studies of metabolism, physiology, interspecies and host-pathogen interactions. Analysis of the essential genes, protein-coding sequences necessary for survival of P. gingivalis by transposon mutagenesis has not previously been attempted due to the limitations of available transposon systems for the organism. We adapted a Mariner transposon system for mutagenesis of P. gingivalis and created an insertion mutant library. By analyzing the location of insertions using massively-parallel sequencing technology we used this mutant library to define genes essential for P. gingivalis survival under in vitro conditions. Results: In mutagenesis experiments we identified 463 genes in P. gingivalis strain ATCC 33277 that are putatively essential for viability in vitro. Comparing the 463 P. gingivalis essential genes with previous essential gene studies, 364 of the 463 are homologues to essential genes in other species; 339 are shared with more than one other species. Twenty-five genes are known to be essential in P. gingivalis and B. thetaiotaomicron only. Significant enrichment of essential genes within Cluster of Orthologous Groups ‘D’ (cell division), ‘I’ (lipid transport and metabolism) and ‘J’ (translation/ribosome) were identified. Previously, the P. gingivalis core genome was shown to encode 1,476 proteins out of a possible 1,909; 434 of 463 essential genes are contained within the core genome. Thus, for the species P. gingivalis twenty-two, seventy-seven and twenty-three percent of the genome respectively are devoted to essential, core and accessory functions. Conclusions: A Mariner transposon system can be adapted to create mutant libraries in P. gingivalis amenable to analysis by next-generation sequencing technologies. In silico analysis of genes essential for in vitro growth demonstrates that although the majority are homologous across bacterial species as a whole, species and strain-specific subsets are apparent. Understanding the putative essential genes of P. gingivalis will provide insights into metabolic pathways and niche adaptations as well as clinical therapeutic strategies.
Affiliation(s)
- Brian A Klein: Department of Molecular Biology and Microbiology, Tufts University Sackler School of Biomedical Sciences, Boston, MA 02111, USA
|
26
|
Gao F, Luo H, Zhang CT. DoriC 5.0: an updated database of oriC regions in both bacterial and archaeal genomes. Nucleic Acids Res 2012; 41:D90-3. [PMID: 23093601 PMCID: PMC3531139 DOI: 10.1093/nar/gks990] [Citation(s) in RCA: 111] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open
Abstract
Replication of chromosomes is one of the central events in the cell cycle. Chromosome replication begins at specific sites, called origins of replication (oriCs), for all three domains of life. However, the origins of replication still remain unknown in a considerably large number of bacterial and archaeal genomes completely sequenced so far. The availability of increasing complete bacterial and archaeal genomes has created challenges and opportunities for identification of their oriCs in silico, as well as in vivo. Based on the Z-curve theory, we have developed a web-based system Ori-Finder to predict oriCs in bacterial genomes with high accuracy and reliability by taking advantage of comparative genomics, and the predicted oriC regions have been organized into an online database DoriC, which is publicly available at http://tubic.tju.edu.cn/doric/ since 2007. Five years after we constructed DoriC, the database has significant advances over the number of bacterial genomes, increasing about 4-fold. Additionally, oriC regions in archaeal genomes identified by in vivo experiments, as well as in silico analyses, have also been added to the database. Consequently, the latest release of DoriC contains oriCs for >1500 bacterial genomes and 81 archaeal genomes, respectively.
Affiliation(s)
- Feng Gao: Department of Physics, Tianjin University, Tianjin 300072, China
|
27
|
Butt AM, Nasrullah I, Tahir S, Tong Y. Comparative genomics analysis of Mycobacterium ulcerans for the identification of putative essential genes and therapeutic candidates. PLoS One 2012; 7:e43080. [PMID: 22912793 PMCID: PMC3418265 DOI: 10.1371/journal.pone.0043080] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2012] [Accepted: 07/16/2012] [Indexed: 11/18/2022] Open
Abstract
Mycobacterium ulcerans, the causative agent of Buruli ulcer, is the third most common mycobacterial disease after tuberculosis and leprosy. The present treatment options are limited and emergence of treatment resistant isolates represents a serious concern and a need for better therapeutics. Conventional drug discovery methods are time consuming and labor-intensive. Unfortunately, the slow growing nature of M. ulcerans in experimental conditions is also a barrier for drug discovery and development. In contrast, recent advancements in complete genome sequencing, in combination with cheminformatics and computational biology, represent an attractive alternative approach for the identification of therapeutic candidates worthy of experimental research. A computational, comparative genomics workflow was defined for the identification of novel therapeutic candidates against M. ulcerans, with the aim that a selected target should be essential to the pathogen, and have no homology in the human host. Initially, a total of 424 genes were predicted as essential from the M. ulcerans genome, via homology searching of essential genome content from 20 different bacteria. Metabolic pathway analysis showed that the most essential genes are associated with carbohydrate and amino acid metabolism. Among these, 236 proteins were identified as non-host and essential, and could serve as potential drug and vaccine candidates. Several drug target prioritization parameters including druggability were also calculated. Enzymes from several pathways are discussed as potential drug targets, including those from cell wall synthesis, thiamine biosynthesis, protein biosynthesis, and histidine biosynthesis. It is expected that our data will facilitate selection of M. ulcerans proteins for successful entry into drug design pipelines.
Affiliation(s)
- Azeem Mehmood Butt (corresponding author): National Centre of Excellence in Molecular Biology (CEMB), University of the Punjab, Lahore, Pakistan
- Izza Nasrullah: Department of Biochemistry, Faculty of Biological Sciences, Quaid-i-Azam University, Islamabad, Pakistan
- Shifa Tahir: National Center for Bioinformatics, Faculty of Biological Sciences, Quaid-i-Azam University, Islamabad, Pakistan
- Yigang Tong (corresponding author): State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, People's Republic of China
|
28
|
Wu H, Qu H, Wan N, Zhang Z, Hu S, Yu J. Strand-biased gene distribution in bacteria is related to both horizontal gene transfer and strand-biased nucleotide composition. GENOMICS PROTEOMICS & BIOINFORMATICS 2012; 10:186-96. [PMID: 23084774 PMCID: PMC5054707 DOI: 10.1016/j.gpb.2012.08.001] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/19/2012] [Accepted: 07/29/2012] [Indexed: 11/18/2022]
Abstract
Although strand-biased gene distribution (SGD) was described some two decades ago, the underlying molecular mechanisms and their relationship remain elusive. Its facets include, but are not limited to, the degree of biases, the strand-preference of genes, and the influence of background nucleotide composition variations. Using a dataset composed of 364 non-redundant bacterial genomes, we sought to illustrate our current understanding of SGD. First, when we divided the collection of bacterial genomes into non-polC and polC groups according to their possession of DnaE isoforms that correlate closely with taxonomy, the SGD of the polC group stood out more significantly than that of the non-polC group. Second, when examining horizontal gene transfer, coupled with gene functional conservation (essentiality) and expressivity (level of expression), we realized that they all contributed to SGD. Third, we further demonstrated a weaker G-dominance on the leading strand of the non-polC group but strong purine dominance (both G and A) on the leading strand of the polC group. We propose that strand-biased nucleotide composition plays a decisive role for SGD since the polC-bearing genomes are not only AT-rich but also have pronounced purine-rich leading strands, and we believe that a special mutation spectrum that leads to a strong purine asymmetry and a strong strand-biased nucleotide composition coupled with functional selections for genes and their functions are both at work.
|
29
|
Mao X, Zhang H, Yin Y, Xu Y. The percentage of bacterial genes on leading versus lagging strands is influenced by multiple balancing forces. Nucleic Acids Res 2012; 40:8210-8. [PMID: 22735706 PMCID: PMC3458553 DOI: 10.1093/nar/gks605] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/29/2023] Open
Abstract
The majority of bacterial genes are located on the leading strand, and the percentage of such genes has a large variation across different bacteria. Although some explanations have been proposed, these are at most partial explanations as they cover only small percentages of the genes and do not even consider the ones biased toward the lagging strand. We have carried out a computational study on 725 bacterial genomes, aiming to elucidate other factors that may have influenced the strand location of genes in a bacterium. Our analyses suggest that (i) genes of some functional categories such as ribosome have higher preferences to be on the leading strands; (ii) genes of some functional categories such as transcription factor have higher preferences on the lagging strands; (iii) there is a balancing force that tends to keep genes from all moving to the leading and more efficient strand and (iv) the percentage of leading-strand genes in a bacterium can be accurately explained based on the numbers of genes in the functional categories outlined in (i) and (ii), genome size and gene density, indicating that these numbers implicitly contain the information about the percentage of genes on the leading versus lagging strand in a genome.
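The quantity discussed in this abstract, the percentage of genes on the leading strand, can be illustrated with a minimal Python sketch. It assumes a single circular chromosome with one known origin (`ori`) and terminus (`ter`), and assigns each gene to a replichore by its start coordinate. The function name, the input layout and these simplifications are assumptions made for illustration and are not taken from the study.

```python
def leading_strand_fraction(genes, ori, ter, genome_len):
    """Fraction of genes co-oriented with replication (leading-strand genes).

    genes: iterable of (start, strand) pairs, strand is '+' or '-',
           coordinates are 0-based positions on a circular chromosome.
    ori, ter: positions of the replication origin and terminus (assumed known).
    """
    # Length of the replichore that runs from ori to ter in increasing coordinates.
    right_len = (ter - ori) % genome_len

    leading = 0
    total = 0
    for start, strand in genes:
        # Assumption: a gene belongs to the replichore containing its start.
        on_right = (start - ori) % genome_len < right_len
        # On the right replichore the fork moves toward increasing coordinates,
        # so '+' genes are co-oriented; on the left replichore '-' genes are.
        if (on_right and strand == "+") or (not on_right and strand == "-"):
            leading += 1
        total += 1
    return leading / total if total else 0.0
```

For example, with `ori=0`, `ter=50` and `genome_len=100`, a `'+'` gene at position 10 and a `'-'` gene at position 60 both count as leading-strand genes.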
Affiliation(s)
- Xizeng Mao: Computational Systems Biology Lab, Department of Biochemistry and Molecular Biology and Institute of Bioinformatics, University of Georgia, Athens, GA 30605, USA
|
30
|
Abstract
DNA replication and transcription use the same template and occur concurrently in bacteria. The lack of temporal and spatial separation of these two processes leads to their conflict, and failure to deal with this conflict can result in genome alterations and reduced fitness. In recent years major advances have been made in understanding how cells avoid conflicts between replication and transcription and how such conflicts are resolved when they do occur. In this Review, we summarize these findings, which shed light on the significance of the problem and on how bacterial cells deal with unwanted encounters between the replication and transcription machineries.
|
31
|
[Current status of theoretical studies on essential genes in microbes]. YI CHUAN = HEREDITAS 2012; 34:420-30. [PMID: 22522159 DOI: 10.3724/sp.j.1005.2012.00420] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/31/2022]
Abstract
Essential genes are indispensable for the survival of an organism under optimal conditions. The study of essential genes has recently become a hot topic in microbiology, genomics, and bioinformatics. This paper describes experiments that determined essential genes in several microbes and reviews theoretical research on essential genes. The main topics are the comparison of essential and non-essential genes based on evolutionary conservation and sequence composition, in silico prediction of essential genes, and analysis of the chromosomal distribution of essential genes. Finally, related progress is summarized and open problems are pointed out.
|
32
|
Lin Y, Zhang RR. Putative essential and core-essential genes in Mycoplasma genomes. Sci Rep 2011; 1:53. [PMID: 22355572 PMCID: PMC3216540 DOI: 10.1038/srep00053] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Received: 06/17/2011] [Accepted: 07/19/2011] [Indexed: 01/01/2023] Open
Abstract
Mycoplasma, which was used to create the first “synthetic life”, has been an important species in the emerging field of synthetic biology. However, essential genes, an important concept in synthetic biology, are still unknown for both M. mycoides and M. capricolum, as well as for 14 other Mycoplasma with available genomes. We have developed a gene essentiality prediction algorithm that incorporates information on biased gene strand distribution, homology search and codon adaptation index. The algorithm, which achieved accuracies of 80.8% and 78.9% in self-consistency and cross-validation tests, respectively, predicted 5880 essential genes in the 16 Mycoplasma genomes. The intersection set of essential genes in available Mycoplasma genomes consists of 153 core essential genes. The predicted essential genes (available from pDEG, tubic.tju.edu.cn/pdeg) and the proposed algorithm can be helpful for studying minimal Mycoplasma genomes as well as essential genes in other genomes.
Affiliation(s)
- Yan Lin
- Department of Physics, Tianjin University, Tianjin 300072, China
|