Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Peng C, Lin Y, Luo H, Gao F. A Comprehensive Overview of Online Resources to Identify and Predict Bacterial Essential Genes. Front Microbiol 2017;8:2331. [PMID: 29230204 PMCID: PMC5711816 DOI: 10.3389/fmicb.2017.02331] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 07/20/2017] [Accepted: 11/13/2017] [Indexed: 12/15/2022] Open

Cited by Other Article(s)

Rahiminejad S, De Sanctis B, Pevzner P, Mushegian A. Synthetic lethality and the minimal genome size problem. mSphere 2024;9:e0013924. [PMID: 38904396 PMCID: PMC11288024 DOI: 10.1128/msphere.00139-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2024] [Accepted: 05/13/2024] [Indexed: 06/22/2024] Open

Abstract

Gene knockout studies suggest that ~300 genes in a bacterial genome and ~1,100 genes in a yeast genome cannot be deleted without loss of viability. These single-gene knockout experiments do not account for negative genetic interactions, when two or more genes can each be deleted without effect, but their joint deletion is lethal. Thus, large-scale single-gene deletion studies underestimate the size of a minimal gene set compatible with cell survival. In yeast Saccharomyces cerevisiae, the viability of all possible deletions of gene pairs (2-tuples), and of some deletions of gene triplets (3-tuples), has been experimentally tested. To estimate the size of a yeast minimal genome from that data, we first established that finding the size of a minimal gene set is equivalent to finding the minimum vertex cover in the lethality (hyper)graph, where the vertices are genes and (hyper)edges connect k-tuples of genes whose joint deletion is lethal. Using the Lovász-Johnson-Chvátal greedy approximation algorithm, we computed the minimum vertex cover of the synthetic-lethal 2-tuples graph to be 1,723 genes. We next simulated the genetic interactions in 3-tuples, extrapolating from the existing triplet sample, and again estimated minimum vertex covers. The size of a minimal gene set in yeast rapidly approaches the size of the entire genome even when considering only synthetic lethalities in k-tuples with small k. In contrast, several studies reported successful experimental reductions of yeast and bacterial genomes by simultaneous deletions of hundreds of genes, without eliciting synthetic lethality. We discuss possible reasons for this apparent contradiction.

IMPORTANCE

How can we estimate the smallest number of genes sufficient for a unicellular organism to survive on a rich medium? One approach is to remove genes one at a time and count how many of these deletion strains are unable to grow. However, the single-gene knockout data are insufficient, because joint gene deletions may result in negative genetic interactions, also known as synthetic lethality. We used a technique from graph theory to estimate the size of a minimal yeast genome from partial data on synthetic lethality. The number of potential synthetic lethal interactions grows very fast when multiple genes are deleted, revealing a paradoxical contrast with the experimental reductions of the yeast genome by ~100 genes, and of bacterial genomes by several hundreds of genes.
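
The vertex-cover formulation in this abstract can be illustrated with a small greedy sketch. It is not the authors' code: the function name `greedy_vertex_cover`, the toy gene labels, and the tie-breaking rule are assumptions made for illustration. Each tuple lists genes whose joint deletion is lethal, so a viable gene set must keep at least one gene from every tuple; the greedy rule repeatedly keeps the gene that appears in the most tuples not yet covered.

```python
from collections import Counter


def greedy_vertex_cover(lethal_tuples):
    """Greedy approximation of a minimum vertex cover of a lethality (hyper)graph.

    lethal_tuples: iterable of gene tuples whose joint deletion is lethal
    (a 1-tuple is a singly essential gene). Returns a set of genes that
    contains at least one member of every tuple.
    """
    uncovered = [frozenset(t) for t in lethal_tuples if t]
    cover = set()
    while uncovered:
        # Count how many still-uncovered tuples each gene participates in.
        counts = Counter(gene for edge in uncovered for gene in edge)
        # Keep the most frequent gene; ties are broken alphabetically
        # (an arbitrary choice, made only so the result is deterministic).
        best = min(counts, key=lambda g: (-counts[g], g))
        cover.add(best)
        uncovered = [edge for edge in uncovered if best not in edge]
    return cover


# Hypothetical toy data: gene names are placeholders, not real loci.
toy_tuples = [("A", "B"), ("A", "C"), ("A", "D"), ("E", "F"), ("G",), ("B", "E", "H")]
toy_cover = greedy_vertex_cover(toy_tuples)
```

On the toy input the sketch keeps three genes (A, E and G). The greedy rule gives an approximation rather than a guaranteed minimum, which is why the abstract describes it as an approximation algorithm.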

Ma S, Su T, Lu X, Qi Q. Bacterial genome reduction for optimal chassis of synthetic biology: a review. Crit Rev Biotechnol 2024;44:660-673. [PMID: 37380345 DOI: 10.1080/07388551.2023.2208285] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 07/27/2022] [Revised: 10/13/2022] [Accepted: 02/20/2023] [Indexed: 06/30/2023]

Liang Y, Luo H, Lin Y, Gao F. Recent advances in the characterization of essential genes and development of a database of essential genes. iMeta 2024;3:e157. [PMID: 38868518 PMCID: PMC10989110 DOI: 10.1002/imt2.157] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Accepted: 10/09/2023] [Indexed: 06/14/2024]

Aromolaran OT, Isewon I, Adedeji E, Oswald M, Adebiyi E, Koenig R, Oyelade J. Heuristic-enabled active machine learning: A case study of predicting essential developmental stage and immune response genes in Drosophila melanogaster. PLoS One 2023;18:e0288023. [PMID: 37556452 PMCID: PMC10411809 DOI: 10.1371/journal.pone.0288023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Accepted: 06/18/2023] [Indexed: 08/11/2023] Open

Abstract

Computational prediction of absolute essential genes using machine learning has gained wide attention in recent years. However, essential genes are mostly conditional and not absolute. Experimental techniques provide a reliable approach to identifying conditionally essential genes; however, experimental methods are laborious and time- and resource-consuming, hence computational techniques have been used to complement the experimental methods. Computational techniques such as supervised machine learning or flux balance analysis are grossly limited by the unavailability of the data required for training the model or simulating the conditions for gene essentiality. This study developed a heuristic-enabled active machine learning method based on a light gradient boosting model to predict essential immune response and embryonic developmental genes in Drosophila melanogaster. We proposed a new sampling selection technique and introduced a heuristic function which replaces the human component in traditional active learning models. The heuristic function dynamically selects the unlabelled samples to improve the performance of the classifier in the next iteration. When tested with four benchmark datasets, the proposed model showed superior performance compared to traditional active learning models (random sampling and uncertainty sampling). Applying the model to identify conditionally essential genes, four novel essential immune response genes and a list of 48 novel genes that are essential in the embryonic developmental condition were identified. We performed functional enrichment analysis of the predicted genes to elucidate their biological processes, and the results support our predictions. Immune response and embryonic development related processes were significantly enriched in the essential immune response and embryonic developmental genes, respectively. Finally, we propose the predicted essential genes for future experimental studies and use of the developed tool accessible at http://heal.covenantuniversity.edu.ng for conditional essentiality predictions.
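
For readers unfamiliar with the baseline named in this abstract, uncertainty sampling can be sketched in a few lines. This illustrates only the traditional baseline, not the authors' heuristic function or their tool; the function name `uncertainty_sampling`, the gene labels, and the probabilities are hypothetical. Given a classifier's predicted probability of essentiality for each unlabelled gene, the rule queries the genes whose probability is closest to 0.5, where a binary classifier is least certain.

```python
def uncertainty_sampling(probabilities, batch_size):
    """Pick the unlabelled samples a binary classifier is least certain about.

    probabilities: dict mapping sample id -> predicted probability of the
    positive class (here, "essential"). Returns the ids of the batch_size
    samples whose probability is closest to 0.5, most uncertain first.
    """
    ranked = sorted(probabilities, key=lambda s: (abs(probabilities[s] - 0.5), s))
    return ranked[:batch_size]


# Hypothetical classifier outputs for an unlabelled pool.
pool = {"gene_a": 0.97, "gene_b": 0.52, "gene_c": 0.10, "gene_d": 0.45, "gene_e": 0.80}
picked = uncertainty_sampling(pool, 2)
```

In a full active learning loop the selected samples would be labelled, added to the training set, and the classifier retrained before the next round of selection.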

Saxena P, Rauniyar S, Thakur P, Singh RN, Bomgni A, Alaba MO, Tripathi AK, Gnimpieba EZ, Lushbough C, Sani RK. Integration of text mining and biological network analysis: Identification of essential genes in sulfate-reducing bacteria. Front Microbiol 2023;14:1086021. [PMID: 37125195 PMCID: PMC10133479 DOI: 10.3389/fmicb.2023.1086021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2022] [Accepted: 03/23/2023] [Indexed: 05/02/2023] Open

Abstract

The growth and survival of an organism in a particular environment depend heavily on certain indispensable genes, termed essential genes. Sulfate-reducing bacteria (SRB) are obligate anaerobes that thrive on sulfate reduction for their energy requirements. The present study used Oleidesulfovibrio alaskensis G20 (OA G20) as a model SRB to categorize the essential genes based on their key metabolic pathways. Herein, we report a feedback loop framework for gene-of-interest discovery, from bio-problem to gene set of interest, leveraging expert annotation with computational prediction. The defined bio-problem was applied to retrieve the genes of SRB from literature databases (PubMed and PubMed Central), and these genes were annotated to the genome of OA G20. The retrieved gene list was further used to enrich protein-protein interactions and was corroborated with the pangenome analysis to categorize the enriched gene sets and the respective pathways as essential or non-essential. Interestingly, the sat gene (dde_2265) from sulfur metabolism was the bridging gene between all the enriched pathways. Gene clusters involved in essential pathways were linked with genes from seleno-compound metabolism, amino acid metabolism, secondary metabolite synthesis, and cofactor biosynthesis. Furthermore, pangenome analysis demonstrated the gene distribution, where 69.83% of the 116 enriched genes were mapped under "persistent," suggesting the essentiality of these genes. Likewise, 21.55% of the enriched genes, which include in particular the formate dehydrogenases and metallic hydrogenases, appeared under "shell." Our methodology suggests that semi-automated text mining and network analysis may play a crucial role in deciphering previously unexplored genes and key mechanisms, which can help to generate a baseline prior to performing any experimental studies.

Affiliation(s)

Priya Saxena Department of Chemical and Biological Engineering, South Dakota School of Mines and Technology, Rapid City, SD, United States Data Driven Material Discovery Center for Bioengineering Innovation, South Dakota School of Mines and Technology, Rapid City, SD, United States
Shailabh Rauniyar Department of Chemical and Biological Engineering, South Dakota School of Mines and Technology, Rapid City, SD, United States 2-Dimensional Materials for Biofilm Engineering, Science and Technology, South Dakota School of Mines and Technology, Rapid City, SD, United States
Payal Thakur Department of Chemical and Biological Engineering, South Dakota School of Mines and Technology, Rapid City, SD, United States Data Driven Material Discovery Center for Bioengineering Innovation, South Dakota School of Mines and Technology, Rapid City, SD, United States
Ram Nageena Singh Department of Chemical and Biological Engineering, South Dakota School of Mines and Technology, Rapid City, SD, United States 2-Dimensional Materials for Biofilm Engineering, Science and Technology, South Dakota School of Mines and Technology, Rapid City, SD, United States
Alain Bomgni Department of Biomedical Engineering, University of South Dakota, Sioux Falls, SD, United States
Mathew O. Alaba Department of Biomedical Engineering, University of South Dakota, Sioux Falls, SD, United States
Abhilash Kumar Tripathi Department of Chemical and Biological Engineering, South Dakota School of Mines and Technology, Rapid City, SD, United States 2-Dimensional Materials for Biofilm Engineering, Science and Technology, South Dakota School of Mines and Technology, Rapid City, SD, United States
Etienne Z. Gnimpieba Department of Biomedical Engineering, University of South Dakota, Sioux Falls, SD, United States *Correspondence: Etienne Z. Gnimpieba,
Carol Lushbough Department of Biomedical Engineering, University of South Dakota, Sioux Falls, SD, United States
Rajesh Kumar Sani Department of Chemical and Biological Engineering, South Dakota School of Mines and Technology, Rapid City, SD, United States Data Driven Material Discovery Center for Bioengineering Innovation, South Dakota School of Mines and Technology, Rapid City, SD, United States 2-Dimensional Materials for Biofilm Engineering, Science and Technology, South Dakota School of Mines and Technology, Rapid City, SD, United States BuG ReMeDEE Consortium, South Dakota School of Mines and Technology, Rapid City, SD, United States Rajesh Kumar Sani,

LeBlanc N, Charles TC. Bacterial genome reductions: Tools, applications, and challenges. Front Genome Ed 2022;4:957289. [PMID: 36120530 PMCID: PMC9473318 DOI: 10.3389/fgeed.2022.957289] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 05/30/2022] [Accepted: 07/29/2022] [Indexed: 11/16/2022] Open

Chowdhury ZM, Bhattacharjee A, Ahammad I, Hossain MU, Jaber AA, Rahman A, Dev PC, Salimullah M, Keya CA. Exploration of Streptococcus core genome to reveal druggable targets and novel therapeutics against S. pneumoniae. PLoS One 2022;17:e0272945. [PMID: 35980906 PMCID: PMC9387852 DOI: 10.1371/journal.pone.0272945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2022] [Accepted: 07/29/2022] [Indexed: 11/18/2022] Open

Abstract

Streptococcus pneumoniae (S. pneumoniae), the major etiological agent of community-acquired pneumonia (CAP), contributes significantly to the global burden of infectious diseases and is becoming increasingly resistant. Nearly 30% of the S. pneumoniae genomes encode hypothetical proteins (HPs), and a better understanding of the roles of these HPs in virulence and pathogenicity could plausibly reveal new treatments. Some of the HPs are present across many Streptococcus species; a systematic assessment of these unexplored HPs will disclose prospective drug targets. In this study, through a stringent bioinformatics analysis of the core genome and proteome of S. pneumoniae PCS8235, we identified and analyzed 28 HPs that are common in many Streptococcus species and might have a potential role in the virulence or pathogenesis of the bacteria. Functional annotations of the proteins were conducted based on the physicochemical properties, subcellular localization, virulence prediction, protein-protein interactions, and identification of essential genes, to find potentially druggable proteins among the 28 HPs. The majority of the HPs are involved in bacterial transcription and translation. In addition, some of them were homologs of enzymes, binding proteins, transporters, and regulators. Protein-protein interaction analysis revealed that HP PCS8235_RS05845 made the highest number of interactions with other HPs and also has a TRP structural motif along with virulent and pathogenic properties, indicating that it has critical cellular functions and might undergo unconventional protein secretion. The second-highest interacting protein, HP PCS8235_RS02595, interacts with the Regulator of chromosomal segregation (RocS), which participates in chromosome segregation and nucleoid protection in S. pneumoniae. In this interaction network, 54% of protein members have virulent properties and 40% have pathogenic properties. Most of these proteins are located in the cytoplasm and have hydrophilic properties. Finally, molecular docking and dynamics simulation demonstrated that the antimalarial drug Artenimol can act as a drug repurposing candidate against HP PCS8235_RS04650 of S. pneumoniae. Hence, the present study could aid in the development of drugs against S. pneumoniae.

de Crécy-lagard V, Amorin de Hegedus R, Arighi C, Babor J, Bateman A, Blaby I, Blaby-Haas C, Bridge AJ, Burley SK, Cleveland S, Colwell LJ, Conesa A, Dallago C, Danchin A, de Waard A, Deutschbauer A, Dias R, Ding Y, Fang G, Friedberg I, Gerlt J, Goldford J, Gorelik M, Gyori BM, Henry C, Hutinet G, Jaroch M, Karp PD, Kondratova L, Lu Z, Marchler-Bauer A, Martin MJ, McWhite C, Moghe GD, Monaghan P, Morgat A, Mungall CJ, Natale DA, Nelson WC, O’Donoghue S, Orengo C, O’Toole KH, Radivojac P, Reed C, Roberts RJ, Rodionov D, Rodionova IA, Rudolf JD, Saleh L, Sheynkman G, Thibaud-Nissen F, Thomas PD, Uetz P, Vallenet D, Carter EW, Weigele PR, Wood V, Wood-Charlson EM, Xu J. A roadmap for the functional annotation of protein families: a community perspective. Database (Oxford) 2022;2022:baac062. [PMID: 35961013 PMCID: PMC9374478 DOI: 10.1093/database/baac062] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 06/07/2022] [Revised: 06/28/2022] [Accepted: 08/03/2022] [Indexed: 12/23/2022]

Affiliation(s)

Valérie de Crécy-lagard Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
Rocio Amorin de Hegedus Genetics Institute, University of Florida, Gainesville, FL 32611, USA
Cecilia Arighi Department of Computer and Information Sciences, University of Delaware, Newark, DE 19713, USA
Jill Babor Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
Alex Bateman European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
Ian Blaby US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
Crysten Blaby-Haas Biology Department, Brookhaven National Laboratory, Upton, NY 11973, USA
Alan J Bridge Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, Geneva 4 CH-1211, Switzerland
Stephen K Burley RCSB Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Stacey Cleveland Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
Lucy J Colwell Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK
Ana Conesa Spanish National Research Council, Institute for Integrative Systems Biology, Paterna, Valencia 46980, Spain
Christian Dallago TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology, i12, Boltzmannstr. 3, Garching/Munich 85748, Germany
Antoine Danchin School of Biomedical Sciences, Li KaShing Faculty of Medicine, The University of Hong Kong, 21 Sassoon Road, Pokfulam, SAR Hong Kong 999077, China
Anita de Waard Research Collaboration Unit, Elsevier, Jericho, VT 05465, USA
Adam Deutschbauer Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
Raquel Dias Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
Yousong Ding Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, University of Florida, Gainesville, FL 32610, USA
Gang Fang NYU-Shanghai, Shanghai 200120, China
Iddo Friedberg Department of Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA 50011, USA
John Gerlt Institute for Genomic Biology and Departments of Biochemistry and Chemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
Joshua Goldford Physics of Living Systems, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
Mark Gorelik Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
Benjamin M Gyori Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA 02115, USA
Christopher Henry Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, USA
Geoffrey Hutinet Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
Marshall Jaroch Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
Peter D Karp Bioinformatics Research Group, SRI International, Menlo Park, CA 94025, USA
Liudmyla Kondratova Genetics Institute, University of Florida, Gainesville, FL 32611, USA
Zhiyong Lu National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), 8600 Rockville Pike, Bethesda, MD 20817, USA
Aron Marchler-Bauer National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), 8600 Rockville Pike, Bethesda, MD 20817, USA
Maria-Jesus Martin European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
Claire McWhite Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08540, USA
Gaurav D Moghe Plant Biology Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
Paul Monaghan Department of Agricultural Education and Communication, University of Florida, Gainesville, FL 32611, USA
Anne Morgat Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, Geneva 4 CH-1211, Switzerland
Christopher J Mungall Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
Darren A Natale Georgetown University Medical Center, Washington, DC 20007, USA
William C Nelson Biological Sciences Division, Pacific Northwest National Laboratories, Richland, WA 99354, USA
Seán O’Donoghue School of Biotechnology and Biomolecular Sciences, University of NSW, Sydney, NSW 2052, Australia
Christine Orengo Department of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
Katherine H O’Toole New England Biolabs, Ipswich, MA 01938, USA
Predrag Radivojac Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, USA
Colbie Reed Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
Richard J Roberts New England Biolabs, Ipswich, MA 01938, USA
Dmitri Rodionov Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA 92037, USA
Irina A Rodionova Department of Bioengineering, Division of Engineering, University of California at San Diego, La Jolla, CA 92093-0412, USA
Jeffrey D Rudolf Department of Chemistry, University of Florida, Gainesville, FL 32611, USA
Lana Saleh New England Biolabs, Ipswich, MA 01938, USA
Gloria Sheynkman Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA
Francoise Thibaud-Nissen National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), 8600 Rockville Pike, Bethesda, MD 20817, USA
Paul D Thomas Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA 90033, USA
Peter Uetz Center for Biological Data Science, Virginia Commonwealth University, Richmond, VA 23284, USA
David Vallenet LABGeM, Génomique Métabolique, CEA, Genoscope, Institut François Jacob, Université d’Évry, Université Paris-Saclay, CNRS, Evry 91057, France
Erica Watson Carter Department of Plant Pathology, University of Florida Citrus Research and Education Center, 700 Experiment Station Rd., Lake Alfred, FL 33850, USA
Peter R Weigele New England Biolabs, Ipswich, MA 01938, USA
Valerie Wood Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
Elisha M Wood-Charlson Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
Jin Xu Department of Plant Pathology, University of Florida Citrus Research and Education Center, 700 Experiment Station Rd., Lake Alfred, FL 33850, USA

In silico Methods for Identification of Potential Therapeutic Targets. Interdiscip Sci 2022;14:285-310. [PMID: 34826045 PMCID: PMC8616973 DOI: 10.1007/s12539-021-00491-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 05/19/2021] [Revised: 10/19/2021] [Accepted: 11/01/2021] [Indexed: 11/01/2022]

Damas MSF, Mazur FG, Freire CCDM, da Cunha AF, Pranchevicius MCDS. A Systematic Immuno-Informatic Approach to Design a Multiepitope-Based Vaccine Against Emerging Multiple Drug Resistant Serratia marcescens. Front Immunol 2022;13:768569. [PMID: 35371033 PMCID: PMC8967166 DOI: 10.3389/fimmu.2022.768569] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/31/2021] [Accepted: 02/14/2022] [Indexed: 11/24/2022] Open

Abstract

Serratia marcescens is now an important opportunistic pathogen that can cause serious infections in hospitalized or immunocompromised patients. Here, we used extensive bioinformatic analyses based on reverse vaccinology and a subtractive proteomics-based approach to predict potential vaccine candidates against S. marcescens. We analyzed the complete proteome sequences of 49 isolates of Serratia marcescens and identified 5 proteins that were conserved, non-homologous to human and gut flora proteins, extracellular or exported to the outer membrane, and antigenic. The identified proteins were used to select 5 CTL, 12 HTL, and 12 BCL epitopes that were antigenic, non-allergenic, conserved, hydrophilic, and non-toxic. In addition, the HTL epitopes were able to induce an interferon-gamma immune response. The selected peptides were used to design 4 multi-epitope vaccine constructs (SMV1, SMV2, SMV3 and SMV4) with immune-modulating adjuvants, a PADRE sequence, and linkers. Peptide cleavage analysis showed that the vaccine antigens are processed and presented via MHC class molecules. Several physicochemical and immunological analyses revealed that all multiepitope vaccines were non-allergenic, stable, hydrophilic, and soluble and induced immunity with high antigenicity. The secondary structure analysis revealed that the designed vaccines contain mainly coil and alpha helix structures. 3D analyses showed high-quality structures. Molecular docking analyses revealed SMV4 as the best vaccine construct among the four constructed vaccines, demonstrating high affinity with the immune receptor. Molecular dynamics simulation confirmed the low deformability and stability of the vaccine candidate. Discontinuous epitope residue analyses of SMV4 revealed that they are flexible and can interact with antibodies. In silico immune simulation indicated that the designed SMV4 vaccine triggers an effective immune response. In silico codon optimization and cloning in an expression vector indicate that the SMV4 vaccine can be efficiently expressed in an E. coli system. Overall, we showed that the SMV4 multi-epitope vaccine successfully elicited antigen-specific humoral and cellular immune responses and may be a potential vaccine candidate against S. marcescens. Further experimental validation could confirm its exact efficacy, safety, and immunogenicity profile. Our findings bring a valuable addition to the development of new strategies to prevent and control the spread of multidrug-resistant Gram-negative bacteria with high clinical relevance.

Marques de Castro G, Hastenreiter Z, Silva Monteiro TA, Martins da Silva TT, Pereira Lobo F. Cross-species prediction of essential genes in insects. Bioinformatics 2022;38:1504-1513. [PMID: 34999756 DOI: 10.1093/bioinformatics/btac009] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/30/2021] [Revised: 11/12/2021] [Accepted: 01/04/2022] [Indexed: 02/03/2023] Open

Abstract

MOTIVATION

Insects possess a vast phenotypic diversity and key ecological roles. Several insect species also have medical, agricultural and veterinary importance as parasites and disease vectors. Therefore, strategies to identify potential essential genes in insects may reduce the resources needed to find molecular players in central processes of insect biology. However, most predictors of essential genes in multicellular eukaryotes using machine learning rely on expensive and laborious experimental data to be used as gene features, such as gene expression profiles or protein-protein interactions, even though some of this information may not be available for the majority of insect species with genomic sequences available.

RESULTS

Here, we present and validate a machine learning strategy to predict essential genes in insects using sequence-based intrinsic attributes (statistical and physicochemical data) together with the predictions of subcellular location and transcriptomic data, if available. We gathered information available in public databases describing essential and non-essential genes for Drosophila melanogaster (fruit fly, Diptera) and Tribolium castaneum (red flour beetle, Coleoptera). We proceeded by computing intrinsic and extrinsic attributes that were used to train statistical models in one species and tested by their capability of predicting essential genes in the other. Even models trained using only intrinsic attributes are capable of predicting genes in the other insect species, including the prediction of lineage-specific essential genes. Furthermore, the inclusion of RNA-Seq data is a major factor to increase classifier performance.

AVAILABILITY AND IMPLEMENTATION

The code, data and final models produced in this study are freely available at https://github.com/g1o/GeneEssentiality/.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Kania A. Harnessing the information theory and chaos game representation for pattern searching among essential and non-essential genes in Bacteria. J Theor Biol 2021;531:110917. [PMID: 34563550 DOI: 10.1016/j.jtbi.2021.110917] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/28/2021] [Revised: 08/19/2021] [Accepted: 09/21/2021] [Indexed: 11/29/2022]

Geptop 2.0: Accurately Select Essential Genes from the List of Protein-Coding Genes in Prokaryotic Genomes. Methods Mol Biol 2021. [PMID: 34709630 DOI: 10.1007/978-1-0716-1720-5_23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/02/2023]

Sharma M, Singh DN, Budhraja R, Sood U, Rawat CD, Adrian L, Richnow HH, Singh Y, Negi RK, Lal R. Comparative proteomics unravelled the hexachlorocyclohexane (HCH) isomers specific responses in an archetypical HCH degrading bacterium Sphingobium indicum B90A. Environ Sci Pollut Res Int 2021;28:41380-41395. [PMID: 33783707 DOI: 10.1007/s11356-021-13073-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/27/2020] [Accepted: 02/17/2021] [Indexed: 06/12/2023]

Nlebedim VU, Chaudhuri RR, Walters K. Probabilistic Identification of Bacterial Essential Genes via insertion density using TraDIS Data with Tn5 libraries. Bioinformatics 2021;37:4343-4349. [PMID: 34255819 PMCID: PMC8652038 DOI: 10.1093/bioinformatics/btab508] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2021] [Revised: 06/24/2021] [Accepted: 07/23/2021] [Indexed: 11/29/2022] Open

Abstract

Motivation

Probabilistic Identification of bacterial essential genes using transposon-directed insertion-site sequencing (TraDIS) data based on Tn5 libraries has received relatively little attention in the literature; most methods are designed for mariner transposon insertions. Analysis of Tn5 transposon-based genomic data is challenging due to the high insertion density and genomic resolution. We present a novel probabilistic Bayesian approach for classifying bacterial essential genes using transposon insertion density derived from transposon insertion sequencing data. We implement a Markov chain Monte Carlo sampling procedure to estimate the posterior probability that any given gene is essential. We implement a Bayesian decision theory approach to selecting essential genes. We assess the effectiveness of our approach via analysis of both simulated data and three previously published Escherichia coli, Salmonella Typhimurium and Staphylococcus aureus datasets. These three bacteria have relatively well characterized essential genes which allows us to test our classification procedure using receiver operating characteristic curves and area under the curves. We compare the classification performance with that of Bio-Tradis, a standard tool for bacterial gene classification.

Results

Our method is able to classify genes in the three datasets with areas under the curves between 0.967 and 0.983. Our simulated synthetic datasets show that both the number of insertions and the extent to which insertions are tolerated in the distal regions of essential genes are both important in determining classification accuracy. Importantly our method gives the user the option of classifying essential genes based on the user-supplied costs of false discovery and false non-discovery.

Availability and implementation

An R package that implements the method presented in this paper is available for download from https://github.com/Kevin-walters/insdens.

Supplementary information

Supplementary data are available at Bioinformatics online.
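
The decision-theoretic step described in this abstract can be illustrated with a minimal sketch. It assumes a posterior probability of essentiality is already available for each gene (for example from an MCMC run) and applies the standard two-action Bayes rule: a gene is called essential when the expected cost of a false discovery is lower than the expected cost of a false non-discovery. The function names, gene labels, and cost values are hypothetical, and this is not the interface of the insdens R package.

```python
def decision_threshold(cost_fd, cost_fnd):
    """Posterior probability above which calling a gene essential is cheaper.

    cost_fd:  assumed cost of a false discovery (non-essential called essential)
    cost_fnd: assumed cost of a false non-discovery (essential gene missed)
    """
    if cost_fd <= 0 or cost_fnd <= 0:
        raise ValueError("costs must be positive")
    # Call essential when (1 - p) * cost_fd < p * cost_fnd,
    # which rearranges to p > cost_fd / (cost_fd + cost_fnd).
    return cost_fd / (cost_fd + cost_fnd)


def classify_essential(posteriors, cost_fd=1.0, cost_fnd=1.0):
    """Map each gene to True (essential) or False given its posterior probability."""
    threshold = decision_threshold(cost_fd, cost_fnd)
    return {gene: p > threshold for gene, p in posteriors.items()}


if __name__ == "__main__":
    # Hypothetical posterior probabilities, for illustration only.
    toy_posteriors = {"geneA": 0.9, "geneB": 0.6, "geneC": 0.1}
    print(classify_essential(toy_posteriors))
    print(classify_essential(toy_posteriors, cost_fd=4.0, cost_fnd=1.0))
```

Raising the cost of a false discovery relative to a false non-discovery raises the threshold, so fewer genes are called essential; equal costs reduce to a 0.5 cutoff on the posterior.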

Kuang S, Wei Y, Wang L. Expression-based prediction of human essential genes and candidate lncRNAs in cancer cells. Bioinformatics 2021;37:396-403. [PMID: 32790840 DOI: 10.1093/bioinformatics/btaa717] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/06/2020] [Revised: 07/21/2020] [Accepted: 08/06/2020] [Indexed: 01/12/2023] Open

Aromolaran O, Aromolaran D, Isewon I, Oyelade J. Machine learning approach to gene essentiality prediction: a review. Brief Bioinform 2021;22:6219158. [PMID: 33842944 DOI: 10.1093/bib/bbab128] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 01/18/2021] [Revised: 03/04/2021] [Accepted: 03/17/2021] [Indexed: 12/17/2022] Open

Abstract

Essential genes are critical for the growth and survival of any organism. The machine learning approach complements the experimental methods to minimize the resources required for essentiality assays. Previous studies revealed the need to discover relevant features that significantly classify essential genes, improve on the generalizability of prediction models across organisms, and construct a robust gold standard as the class label for the train data to enhance prediction. Findings also show that a significant limitation of the machine learning approach is predicting conditionally essential genes. The essentiality status of a gene can change due to a specific condition of the organism. This review examines various methods applied to essential gene prediction task, their strengths, limitations and the factors responsible for effective computational prediction of essential genes. We discussed categories of features and how they contribute to the classification performance of essentiality prediction models. Five categories of features, namely, gene sequence, protein sequence, network topology, homology and gene ontology-based features, were generated for Caenorhabditis elegans to perform a comparative analysis of their essentiality prediction capacity. Gene ontology-based feature category outperformed other categories of features majorly due to its high correlation with the genes' biological functions. However, the topology feature category provided the highest discriminatory power making it more suitable for essentiality prediction. The major limiting factor of machine learning to predict essential genes conditionality is the unavailability of labeled data for interest conditions that can train a classifier. Therefore, cooperative machine learning could further exploit models that can perform well in conditional essentiality predictions.

SHORT ABSTRACT

Identification of essential genes is imperative because it provides an understanding of the core structure and function, accelerating drug targets' discovery, among other functions. Recent studies have applied machine learning to complement the experimental identification of essential genes. However, several factors are limiting the performance of machine learning approaches. This review aims to present the standard procedure and resources available for predicting essential genes in organisms, and also highlight the factors responsible for the current limitation in using machine learning for conditional gene essentiality prediction. The choice of features and ML technique was identified as an important factor to predict essential genes effectively.

Pan-genomics, drug candidate mining and ADMET profiling of natural product inhibitors screened against Yersinia pseudotuberculosis. Genomics 2020;113:238-244. [PMID: 33321204 DOI: 10.1016/j.ygeno.2020.12.015] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 06/21/2020] [Revised: 11/13/2020] [Accepted: 12/10/2020] [Indexed: 12/12/2022]

Liu S, Wang SX, Liu W, Wang C, Zhang FZ, Ye YN, Wu CS, Zheng WX, Rao N, Guo FB. CEG 2.0: an updated database of clusters of essential genes including eukaryotic organisms. Database (Oxford) 2020;2020:6031000. [PMID: 33306800 PMCID: PMC7731928 DOI: 10.1093/database/baaa112] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/24/2020] [Revised: 11/12/2020] [Accepted: 12/02/2020] [Indexed: 02/06/2023]

Nandi S, Ganguli P, Sarkar RR. Essential gene prediction using limited gene essentiality information-An integrative semi-supervised machine learning strategy. PLoS One 2020;15:e0242943. [PMID: 33253254 PMCID: PMC7703937 DOI: 10.1371/journal.pone.0242943] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 06/10/2020] [Accepted: 11/12/2020] [Indexed: 11/24/2022] Open

Abstract

Essential gene prediction helps to find minimal genes indispensable for the survival of any organism. Machine learning (ML) algorithms have been useful for the prediction of gene essentiality. However, currently available ML pipelines perform poorly for organisms with limited experimental data. The objective is the development of a new ML pipeline to help in the annotation of essential genes of less explored disease-causing organisms for which minimal experimental data is available. The proposed strategy combines unsupervised feature selection technique, dimension reduction using the Kamada-Kawai algorithm, and semi-supervised ML algorithm employing Laplacian Support Vector Machine (LapSVM) for prediction of essential and non-essential genes from genome-scale metabolic networks using very limited labeled dataset. A novel scoring technique, Semi-Supervised Model Selection Score, equivalent to area under the ROC curve (auROC), has been proposed for the selection of the best model when supervised performance metrics calculation is difficult due to lack of data. The unsupervised feature selection followed by dimension reduction helped to observe a distinct circular pattern in the clustering of essential and non-essential genes. LapSVM then created a curve that dissected this circle for the classification and prediction of essential genes with high accuracy (auROC > 0.85) even with 1% labeled data for model training. After successful validation of this ML pipeline on both Eukaryotes and Prokaryotes that show high accuracy even when the labeled dataset is very limited, this strategy is used for the prediction of essential genes of organisms with inadequate experimentally known data, such as Leishmania sp. Using a graph-based semi-supervised machine learning scheme, a novel integrative approach has been proposed for essential gene prediction that shows universality in application to both Prokaryotes and Eukaryotes with limited labeled data. 
The essential genes predicted using the pipeline provide an important lead for the prediction of gene essentiality and identification of novel therapeutic targets for antibiotic and vaccine development against disease-causing parasites.
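
The auROC values quoted in this abstract can be read as the probability that a randomly chosen essential gene receives a higher score than a randomly chosen non-essential one. A minimal sketch of that pairwise definition follows, with ties counted as half a win. The function name and toy scores are hypothetical, and this is not the authors' pipeline or their Semi-Supervised Model Selection Score.

```python
def au_roc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    scores: classifier scores, higher meaning "more likely essential"
    labels: 1 for essential, 0 for non-essential
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one essential and one non-essential gene")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # essential gene ranked above non-essential gene
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical scores and labels, for illustration only.
    print(au_roc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

The double loop is quadratic in the number of genes, which is acceptable for a small illustration; a rank-based formulation would be the usual choice for genome-scale score lists.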

Rajamanickam K, Yang J, Chidambaram SB, Sakharkar MK. Enhancing Drug Efficacy against Mastitis Pathogens-An In Vitro Pilot Study in Staphylococcus aureus and Staphylococcus epidermidis. Animals (Basel) 2020;10:E2117. [PMID: 33203170 PMCID: PMC7696410 DOI: 10.3390/ani10112117] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/05/2020] [Revised: 11/04/2020] [Accepted: 11/09/2020] [Indexed: 11/24/2022] Open

Yu X, Weng T, Gu C, Yang H. Comparison of gene regulatory networks to identify pathogenic genes for lymphoma. J Bioinform Comput Biol 2020;18:2050029. [PMID: 33131362 DOI: 10.1142/s0219720020500298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]

Yan F, Gao F. A systematic strategy for the investigation of vaccines and drugs targeting bacteria. Comput Struct Biotechnol J 2020;18:1525-1538. [PMID: 32637049 PMCID: PMC7327267 DOI: 10.1016/j.csbj.2020.06.008] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 02/28/2020] [Revised: 06/02/2020] [Accepted: 06/03/2020] [Indexed: 02/07/2023] Open

Ferreira LM, Sáfadi T, Ferreira JL. Evaluation of genome similarities using a wavelet-domain approach. Rev Soc Bras Med Trop 2020;53:e20190470. [PMID: 32428175 PMCID: PMC7269520 DOI: 10.1590/0037-8682-0470-2019] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/11/2019] [Accepted: 03/10/2020] [Indexed: 11/21/2022] Open

Delineating Novel Therapeutic Drug and Vaccine Targets for Staphylococcus cornubiensis NW1T Through Computational Analysis. Int J Pept Res Ther 2020. [DOI: 10.1007/s10989-020-10076-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 10/24/2022]

Wen QF, Liu S, Dong C, Guo HX, Gao YZ, Guo FB. Geptop 2.0: An Updated, More Precise, and Faster Geptop Server for Identification of Prokaryotic Essential Genes. Front Microbiol 2019;10:1236. [PMID: 31214154 PMCID: PMC6558110 DOI: 10.3389/fmicb.2019.01236] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 03/02/2019] [Accepted: 05/17/2019] [Indexed: 12/16/2022] Open

Shields RC, Jensen PA. The bare necessities: Uncovering essential and condition-critical genes with transposon sequencing. Mol Oral Microbiol 2019;34:39-50. [PMID: 30739386 DOI: 10.1111/omi.12256] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 12/02/2018] [Revised: 01/18/2019] [Accepted: 02/06/2019] [Indexed: 12/11/2022]

Li X, Li W, Zeng M, Zheng R, Li M. Network-based methods for predicting essential genes or proteins: a survey. Brief Bioinform 2019;21:566-583. [DOI: 10.1093/bib/bbz017] [Citation(s) in RCA: 55] [Impact Index Per Article: 11.0] [Received: 10/31/2018] [Revised: 01/21/2019] [Accepted: 01/22/2019] [Indexed: 12/14/2022] Open

Waman VP, Vedithi SC, Thomas SE, Bannerman BP, Munir A, Skwark MJ, Malhotra S, Blundell TL. Mycobacterial genomics and structural bioinformatics: opportunities and challenges in drug discovery. Emerg Microbes Infect 2019;8:109-118. [PMID: 30866765 PMCID: PMC6334779 DOI: 10.1080/22221751.2018.1561158] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 10/15/2018] [Revised: 12/03/2018] [Accepted: 12/09/2018] [Indexed: 01/08/2023]

Martínez-Carranza E, Barajas H, Alcaraz LD, Servín-González L, Ponce-Soto GY, Soberón-Chávez G. Variability of Bacterial Essential Genes Among Closely Related Bacteria: The Case of Escherichia coli. Front Microbiol 2018;9:1059. [PMID: 29910775 PMCID: PMC5992433 DOI: 10.3389/fmicb.2018.01059] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 02/22/2018] [Accepted: 05/04/2018] [Indexed: 11/23/2022] Open

Abstract

The definition of bacterial essential genes has been widely pursued using different approaches. Their study has impacted several fields of research such as synthetic biology, the construction of bacteria with minimal chromosomes, the search for new antibiotic targets, or the design of strains with biotechnological applications. Bacterial genomes are mosaics that only share a small subset of gene-sequences (core genome) even among members of the same species. It has been reported that the presence of essential genes is highly variable between closely related bacteria and even among members of the same species, due to the phenomenon known as “non-orthologous gene displacement” that refers to the coding for an essential function by genes with no sequence homology due to horizontal gene transfer (HGT). The existence of dormant forms among bacteria and the high incidence of HGT have been proposed to be driving forces of bacterial evolution, and they might have a role in the low level of conservation of essential genes among related bacteria by non-orthologous gene displacement, but this correlation has not been recognized. The aim of this mini-review is to give a brief overview of the approaches that have been taken to define and study essential genes, and the implications of non-orthologous gene displacement in bacterial evolution, focusing mainly in the case of Escherichia coli. To this end, we reviewed the available literature, and we searched for the presence of the essential genes defined by mutagenesis in the genomes of the 63 best-sequenced E. coli genomes that are available in NCBI database. We could not document specific cases of non-orthologous gene displacement among the E. coli strains analyzed, but we found that the quality of the genome-sequences in the database is not enough to make accurate predictions about the conservation of essential-genes among members of this bacterial species.
