1
Harten T, Nimzyk R, Gawlick VEA, Reinhold-Hurek B. Elucidation of Essential Genes and Mutant Fitness during Adaptation toward Nitrogen Fixation Conditions in the Endophyte Azoarcus olearius BH72 Revealed by Tn-Seq. Microbiol Spectr 2022; 10:e0216222. [PMID: 36416558 PMCID: PMC9769520 DOI: 10.1128/spectrum.02162-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2022] [Accepted: 11/05/2022] [Indexed: 11/24/2022]
Abstract
Azoarcus olearius BH72 is a diazotrophic model endophyte that contributes fixed nitrogen to its host plant, Kallar grass, and expresses nitrogenase genes endophytically. Despite extensive studies on biological nitrogen fixation (BNF) of diazotrophic endophytes, little is known about global genetic players involved in survival under respective physiological conditions. Here, we report a global genomic screen for putatively essential genes of A. olearius employing Tn5 transposon mutagenesis with a modified transposon combined with high-throughput sequencing (Tn-Seq). A large Tn5 master library of ~6 × 10^5 insertion mutants of strain BH72 was obtained. Next-generation sequencing identified 183,437 unique insertion sites into the 4,376,040-bp genome, displaying one insertion every 24 bp on average. Applying stringent criteria, we describe 616 genes as putatively essential for growth on rich medium. COG (Clusters of Orthologous Groups) assignment of the 564 identified protein-coding genes revealed enrichment of genes related to core cellular functions and cell viability. To mimic gradual adaptations toward BNF conditions, the Tn5 mutant library was grown aerobically in synthetic medium or microaerobically on either combined or atmospheric nitrogen. Enrichment and depletion analysis of Tn5 mutants not only demonstrated the role of BNF- and metabolism-related proteins but also revealed that, strikingly, many genes relevant for plant-microbe interactions decrease bacterial competitiveness in pure culture, such as type IV pilus- and bacterial envelope-associated genes.
IMPORTANCE A constantly growing world population and the daunting challenge of climate change demand new strategies in agricultural crop production. Intensive usage of chemical fertilizers, overloading the world's fields with organic input, threatens terrestrial and marine ecosystems as well as human health. Long overlooked, the beneficial interaction of endophytic bacteria and grasses has attracted ever-growing interest in research in the last decade. Capable of biological nitrogen fixation, diazotrophic endophytes not only provide a valuable source of combined nitrogen but also are known for diverse plant growth-promoting effects, thereby contributing to plant productivity. Elucidation of an essential gene set for a prominent model endophyte such as A. olearius BH72 provides us with powerful insights into its basic lifestyle. Knowledge about genes detrimental or advantageous under defined physiological conditions may point out a way of manipulating key steps in the bacterium's lifestyle and plant interaction toward a more sustainable agriculture.
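The insertion density quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the figures stated in the abstract; the variable and function names are illustrative and do not come from the paper.

```python
# Figures quoted in the abstract above.
GENOME_LENGTH_BP = 4_376_040
UNIQUE_INSERTION_SITES = 183_437
PUTATIVELY_ESSENTIAL_GENES = 616
ESSENTIAL_PROTEIN_CODING_GENES = 564


def mean_spacing_bp(genome_length_bp: int, insertion_sites: int) -> float:
    """Average distance between unique insertion sites, in base pairs."""
    return genome_length_bp / insertion_sites


spacing = mean_spacing_bp(GENOME_LENGTH_BP, UNIQUE_INSERTION_SITES)

# Putatively essential genes that are not among the protein-coding ones.
non_protein_coding = PUTATIVELY_ESSENTIAL_GENES - ESSENTIAL_PROTEIN_CODING_GENES

print(f"mean spacing between insertion sites: {spacing:.1f} bp")
print(f"putatively essential genes outside the protein-coding set: {non_protein_coding}")
```

The mean spacing rounds to the "one insertion every 24 bp" stated in the abstract. It is an average over the whole genome, so it says nothing about how evenly the insertions are distributed.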
Affiliation(s)
- Theresa Harten
  - University of Bremen, Faculty of Biology and Chemistry, CBIB Center for Biomolecular Interactions, Department of Microbe-Plant Interactions, Bremen, Germany
- Rolf Nimzyk
  - University of Bremen, Faculty of Biology and Chemistry, CBIB Center for Biomolecular Interactions, Department of Microbe-Plant Interactions, Bremen, Germany
  - University of Bremen, Faculty of Biology and Chemistry, CBIB Center for Biomolecular Interactions, Nucleic Acid Analysis Facility (NAA), Bremen, Germany
- Vivian E. A. Gawlick
  - University of Bremen, Faculty of Biology and Chemistry, CBIB Center for Biomolecular Interactions, Department of Microbe-Plant Interactions, Bremen, Germany
- Barbara Reinhold-Hurek
  - University of Bremen, Faculty of Biology and Chemistry, CBIB Center for Biomolecular Interactions, Department of Microbe-Plant Interactions, Bremen, Germany
2
Lo Sciuto A, Spinnato MC, Pasqua M, Imperi F. Generation of Stable and Unmarked Conditional Mutants in Pseudomonas aeruginosa. Methods Mol Biol 2022; 2548:21-35. [PMID: 36151489 DOI: 10.1007/978-1-0716-2581-1_2] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/16/2023]
Abstract
The functional and physiological characterization of bacterial genes required for growth and/or cell survival is limited by the inability to generate deletion mutants lacking the specific gene of interest. This limitation can be circumvented by generating conditional mutants in which the loss of the endogenous copy of the gene is compensated by the introduction of the wild-type allele under the control of an inducible promoter, which allows for tightly regulated expression of the gene of interest. Besides the confirmation and/or functional investigation of essential genes, conditional mutants can also be useful to investigate the effect of finely controlled expression of nonessential genes. In this chapter, we describe a method that can be used to generate stable and unmarked conditional mutants in Pseudomonas aeruginosa.
Affiliation(s)
- Martina Pasqua
  - Department of Biology and Biotechnology "Charles Darwin", Sapienza University of Rome, Rome, Italy
- Francesco Imperi
  - Department of Science, University Roma Tre, Rome, Italy
  - IRCCS Fondazione Santa Lucia, Rome, Italy
3
Selection or drift: The population biology underlying transposon insertion sequencing experiments. Comput Struct Biotechnol J 2020; 18:791-804. [PMID: 32280434 PMCID: PMC7138912 DOI: 10.1016/j.csbj.2020.03.021] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 10/15/2019] [Revised: 03/06/2020] [Accepted: 03/22/2020] [Indexed: 01/23/2023]
Abstract
Transposon insertion sequencing methods such as Tn-seq revolutionized microbiology by allowing the identification of genomic loci that are critical for viability in a specific environment on a genome-wide scale. While powerful, transposon insertion sequencing suffers from limited reproducibility when different analysis methods are compared. From the perspective of population biology, this may be explained by changes in mutant frequency due to chance (drift) rather than differential fitness (selection). Here, we develop a mathematical model of the population biology of transposon insertion sequencing experiments, i.e. the changes in size and composition of the transposon-mutagenized population during the experiment. We use this model to investigate mutagenesis, the growth of the mutant library, and its passage through bottlenecks. Specifically, we study how these processes can lead to extinction of individual mutants depending on their fitness and the distribution of fitness effects (DFE) of the entire mutant population. We find that in typical in vitro experiments few mutants with high fitness go extinct. However, bottlenecks of a size that is common in animal infection models lead to so much random extinction that a large number of viable mutants would be misclassified. While mutants with low fitness are more likely to be lost during the experiment, mutants with intermediate fitness are expected to be much more abundant and can constitute a large proportion of detected hits, i.e. false positives. Thus, incorporating the DFEs of randomly generated mutations in the analysis may improve the reproducibility of transposon insertion experiments, especially when strong bottlenecks are encountered.
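To illustrate why tight bottlenecks cause random extinction, here is a minimal sketch of neutral drift only. It is not the authors' model: it assumes every mutant has the same fitness and the same number of cells, and that each cell passes the bottleneck independently with probability bottleneck size / population size. All function names and numbers are hypothetical.

```python
import random


def loss_probability(mutant_copies: int, population_size: int, bottleneck_size: int) -> float:
    """Chance that no cell of a given mutant passes the bottleneck.

    Assumes each cell passes independently with probability
    bottleneck_size / population_size (a binomial approximation).
    """
    pass_chance = bottleneck_size / population_size
    return (1.0 - pass_chance) ** mutant_copies


def simulated_loss_fraction(n_mutants: int, copies_per_mutant: int,
                            bottleneck_size: int, seed: int = 0) -> float:
    """Fraction of equally fit mutants lost in one simulated bottleneck."""
    rng = random.Random(seed)
    population_size = n_mutants * copies_per_mutant
    pass_chance = bottleneck_size / population_size
    lost = 0
    for _ in range(n_mutants):
        survivors = sum(rng.random() < pass_chance for _ in range(copies_per_mutant))
        if survivors == 0:
            lost += 1
    return lost / n_mutants


# Hypothetical library: 1,000 mutants with 100 cells each (100,000 cells).
for bottleneck in (50_000, 1_000):
    expected = loss_probability(100, 100_000, bottleneck)
    observed = simulated_loss_fraction(1_000, 100, bottleneck)
    print(f"bottleneck {bottleneck}: expected loss {expected:.3g}, simulated {observed:.3g}")
```

With the wide bottleneck essentially no mutant is lost, while the tight one removes roughly a third of the library even though every mutant is equally fit. In a screen, such losses would be indistinguishable from fitness defects unless drift is accounted for.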
4
Statistical analysis of variability in TnSeq data across conditions using zero-inflated negative binomial regression. BMC Bioinformatics 2019; 20:603. [PMID: 31752678 PMCID: PMC6873424 DOI: 10.1186/s12859-019-3156-z] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 05/20/2019] [Accepted: 10/14/2019] [Indexed: 12/12/2022]
Abstract
Background: Deep sequencing of transposon mutant libraries (or TnSeq) is a powerful method for probing essentiality of genomic loci under different environmental conditions. Various analytical methods have been described for identifying conditionally essential genes whose tolerance for insertions varies between two conditions. However, for large-scale experiments involving many conditions, a method is needed for identifying genes that exhibit significant variability in insertions across multiple conditions.
Results: In this paper, we introduce a novel statistical method for identifying genes with significant variability of insertion counts across multiple conditions based on Zero-Inflated Negative Binomial (ZINB) regression. Using likelihood ratio tests, we show that the ZINB distribution fits TnSeq data better than either ANOVA or a Negative Binomial (in a generalized linear model). We use ZINB regression to identify genes required for infection of M. tuberculosis H37Rv in C57BL/6 mice. We also use ZINB to perform an analysis of genes conditionally essential in H37Rv cultures exposed to multiple antibiotics.
Conclusions: Our results show that, not only does ZINB generally identify most of the genes found by pairwise resampling (and vastly outperforms ANOVA), but it also identifies additional genes where variability is detectable only when the magnitudes of insertion counts are treated separately from local differences in saturation, as in the ZINB model.
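The ZINB distribution keeps two effects apart: the fraction of sites with no insertion at all (saturation) and the size of the counts where insertions occur. The sketch below shows only the distribution itself, with a negative binomial parameterized by its mean and a dispersion parameter. It is not the paper's regression method, which additionally relates these parameters to experimental conditions. The function names, parameter values, and example counts are illustrative assumptions.

```python
import math


def nb_log_pmf(count: int, mean: float, dispersion: float) -> float:
    """Log-probability of `count` under a negative binomial (mean/dispersion form)."""
    return (math.lgamma(count + dispersion) - math.lgamma(count + 1)
            - math.lgamma(dispersion)
            + dispersion * math.log(dispersion / (dispersion + mean))
            + count * math.log(mean / (dispersion + mean)))


def zinb_pmf(count: int, zero_prob: float, mean: float, dispersion: float) -> float:
    """Mixture: a structural zero with probability zero_prob, else a negative binomial draw."""
    nb = math.exp(nb_log_pmf(count, mean, dispersion))
    if count == 0:
        return zero_prob + (1.0 - zero_prob) * nb
    return (1.0 - zero_prob) * nb


def zinb_log_likelihood(counts, zero_prob: float, mean: float, dispersion: float) -> float:
    """Sum of log-probabilities over the insertion counts at a gene's sites."""
    return sum(math.log(zinb_pmf(c, zero_prob, mean, dispersion)) for c in counts)


# Hypothetical read counts at the insertion sites of one gene: many empty sites.
site_counts = [0, 0, 0, 0, 0, 0, 12, 30, 45, 8, 0, 0, 60, 25, 0, 0]
print(zinb_log_likelihood(site_counts, zero_prob=0.6, mean=30.0, dispersion=2.0))
print(zinb_log_likelihood(site_counts, zero_prob=0.0, mean=30.0, dispersion=2.0))
```

Setting zero_prob to 0 reduces the mixture to a plain negative binomial, so the two printed values compare the two models at the same mean and dispersion.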
5
Martin RE. The transportome of the malaria parasite. Biol Rev Camb Philos Soc 2019; 95:305-332. [PMID: 31701663 DOI: 10.1111/brv.12565] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 04/19/2019] [Revised: 10/02/2019] [Accepted: 10/04/2019] [Indexed: 12/15/2022]
Abstract
Membrane transport proteins, also known as transporters, control the movement of ions, nutrients, metabolites, and waste products across the membranes of a cell and are central to its biology. Proteins of this type also serve as drug targets and are key players in the phenomenon of drug resistance. The malaria parasite has a relatively reduced transportome, with only approximately 2.5% of its genes encoding transporters. Even so, assigning functions and physiological roles to these proteins, and ascertaining their contributions to drug action and drug resistance, has been very challenging. This review presents a detailed critique and synthesis of the disruption phenotypes, protein subcellular localisations, protein functions (observed or predicted), and links to antimalarial drug resistance for each of the parasite's transporter genes. The breadth and depth of the gene disruption data are particularly impressive, with at least one phenotype determined in the parasite's asexual blood stage for each transporter gene, and multiple phenotypes available for 76% of the genes. Analysis of the curated data set revealed there to be relatively little redundancy in the Plasmodium transportome; almost two-thirds of the parasite's transporter genes are essential or required for normal growth in the asexual blood stage of the parasite, and this proportion increased to 78% when the disruption phenotypes available for the other parasite life stages were included in the analysis. These observations, together with the finding that 22% of the transportome is implicated in the parasite's resistance to existing antimalarials and/or drugs within the development pipeline, indicate that transporters are likely to serve, or are already serving, as drug targets. 
Integration of the different biological and bioinformatic data sets also enabled the selection of candidates for transport processes known to be essential for parasite survival, but for which the underlying proteins have thus far remained undiscovered. These include potential transporters of pantothenate, isoleucine, or isopentenyl diphosphate, as well as putative anion-selective channels that may serve as the pore component of the parasite's 'new permeation pathways'. Other novel insights into the parasite's biology included the identification of transporters for the potential development of antimalarial treatments, transmission-blocking drugs, prophylactics, and genetically attenuated vaccines. The syntheses presented herein set a foundation for elucidating the functions and physiological roles of key members of the Plasmodium transportome and, ultimately, to explore and realise their potential as therapeutic targets.
Affiliation(s)
- Rowena E Martin
  - Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
6
Ruiz L, Bottacini F, Boinett CJ, Cain AK, O'Connell-Motherway M, Lawley TD, van Sinderen D. The essential genomic landscape of the commensal Bifidobacterium breve UCC2003. Sci Rep 2017; 7:5648. [PMID: 28717159 PMCID: PMC5514069 DOI: 10.1038/s41598-017-05795-y] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 01/10/2017] [Accepted: 06/02/2017] [Indexed: 01/15/2023]
Abstract
Bifidobacteria are common gut commensals with purported health-promoting effects. This has encouraged scientific research into bifidobacteria, though recalcitrance to genetic manipulation and scarcity of molecular tools has hampered our knowledge on the precise molecular determinants of their health-promoting attributes and gut adaptation. To overcome this problem and facilitate functional genomic analyses in bifidobacteria, we created a large Tn5 transposon mutant library of the commensal Bifidobacterium breve UCC2003 that was further characterized by means of a Transposon Directed Insertion Sequencing (TraDIS) approach. Statistical analysis of transposon insertion distribution revealed a set of 453 genes that are essential for or markedly contribute to growth of this strain under laboratory conditions. These essential genes encode functions involved in the so-called bifid-shunt, most enzymes related to nucleotide biosynthesis and a range of housekeeping functions. Comparison to the Bifidobacterium and B. breve core genomes highlights a high degree of conservation of essential genes at the species and genus level, while comparison to essential gene datasets from other gut bacteria identified essential genes that appear specific to bifidobacteria. This work establishes a useful molecular tool for scientific discovery of bifidobacteria and identifies targets for further studies aimed at characterizing essential functions not previously examined in bifidobacteria.
Affiliation(s)
- Lorena Ruiz
  - School of Microbiology and APC Microbiome Institute, National University of Ireland, Cork, Western Road, Ireland; Department of Nutrition, Bromatology and Food Technology, Complutense University, Avda Puerta de Hierro s/n, 28040, Madrid, Spain
- Francesca Bottacini
  - School of Microbiology and APC Microbiome Institute, National University of Ireland, Cork, Western Road, Ireland
- Amy K Cain
  - Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
- Mary O'Connell-Motherway
  - School of Microbiology and APC Microbiome Institute, National University of Ireland, Cork, Western Road, Ireland
- Douwe van Sinderen
  - School of Microbiology and APC Microbiome Institute, National University of Ireland, Cork, Western Road, Ireland
7
A Noise Trimming and Positional Significance of Transposon Insertion System to Identify Essential Genes in Yersinia pestis. Sci Rep 2017; 7:41923. [PMID: 28165493 PMCID: PMC5292949 DOI: 10.1038/srep41923] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 05/10/2016] [Accepted: 12/30/2016] [Indexed: 01/14/2023]
Abstract
Massively parallel sequencing technology coupled with saturation mutagenesis has provided new and global insights into gene functions and roles. At a simplistic level, the frequency of mutations within genes can indicate the degree of essentiality. However, this approach neglects to take account of the positional significance of mutations - the function of a gene is less likely to be disrupted by a mutation close to the distal ends. Therefore, a systematic bioinformatics approach to improve the reliability of essential gene identification is desirable. We report here a parametric model which introduces a novel mutation feature together with a noise trimming approach to predict the biological significance of Tn5 mutations. We show improved performance of essential gene prediction in the bacterium Yersinia pestis, the causative agent of plague. This method would have broad applicability to other organisms and to the identification of genes which are essential for competitiveness or survival under a broad range of stresses.
8
Chao MC, Abel S, Davis BM, Waldor MK. The design and analysis of transposon insertion sequencing experiments. Nat Rev Microbiol 2016; 14:119-28. [PMID: 26775926 DOI: 10.1038/nrmicro.2015.7] [Citation(s) in RCA: 144] [Impact Index Per Article: 18.0] [Indexed: 01/27/2023]
Abstract
Transposon insertion sequencing (TIS) is a powerful approach that can be extensively applied to the genome-wide definition of loci that are required for bacterial growth under diverse conditions. However, experimental design choices and stochastic biological processes can heavily influence the results of TIS experiments and affect downstream statistical analysis. In this Opinion article, we discuss TIS experimental parameters and how these factors relate to the benefits and limitations of the various statistical frameworks that can be applied to the computational analysis of TIS data.
Affiliation(s)
- Michael C Chao
  - Department of Microbiology and Immunobiology, Harvard Medical School, Boston, Massachusetts 02115, USA; the Division of Infectious Disease, Brigham and Women's Hospital, Boston, Massachusetts 02115, USA; and the Howard Hughes Medical Institute, Boston, Massachusetts 02115, USA
- Sören Abel
  - Department of Pharmacy, University of Tromsø, The Arctic University of Norway, 9019 Tromsø, Norway
- Brigid M Davis
  - Department of Microbiology and Immunobiology, Harvard Medical School, Boston, Massachusetts 02115, USA; the Division of Infectious Disease, Brigham and Women's Hospital, Boston, Massachusetts 02115, USA; and the Howard Hughes Medical Institute, Boston, Massachusetts 02115, USA
- Matthew K Waldor
  - Department of Microbiology and Immunobiology, Harvard Medical School, Boston, Massachusetts 02115, USA; the Division of Infectious Disease, Brigham and Women's Hospital, Boston, Massachusetts 02115, USA; and the Howard Hughes Medical Institute, Boston, Massachusetts 02115, USA
9
Liu F, Wang C, Wu Z, Zhang Q, Liu P. A zero-inflated Poisson model for insertion tolerance analysis of genes based on Tn-seq data. Bioinformatics 2016; 32:1701-8. [PMID: 26833344 DOI: 10.1093/bioinformatics/btw061] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 05/21/2015] [Accepted: 01/25/2016] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Transposon insertion sequencing (Tn-seq) is an emerging technology that combines transposon mutagenesis with next-generation sequencing technologies for the identification of genes related to bacterial survival. The resulting data from Tn-seq experiments consist of sequence reads mapped to millions of potential transposon insertion sites, and a large portion of insertion sites have zero mapped reads. A novel statistical method for Tn-seq data analysis is needed to infer the functions of genes in bacterial growth.
RESULTS: In this article, we propose a zero-inflated Poisson model for analyzing Tn-seq data that are high-dimensional and have an excess of zeros. Maximum likelihood estimates of model parameters are obtained using an expectation-maximization (EM) algorithm, and pseudogenes are utilized to construct appropriate statistical tests for the transposon insertion tolerance of normal genes of interest. We propose a multiple testing procedure that categorizes genes into each of the three states, hypo-tolerant, tolerant and hyper-tolerant, while controlling the false discovery rate. We evaluate the proposed method with simulation studies and apply it to a real Tn-seq dataset from an experiment that studied the bacterial pathogen Campylobacter jejuni.
AVAILABILITY AND IMPLEMENTATION: We provide R code for implementing our proposed method at http://github.com/ffliu/TnSeq. A user's guide with example data analysis is also available there.
CONTACT: pliu@iastate.edu
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
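As an illustration of the estimation step named in this abstract, the following is a textbook EM fit of a two-parameter zero-inflated Poisson, written as a sketch under simplifying assumptions. It is not the authors' R implementation and omits their per-gene structure, pseudogene-based tests, and multiple testing procedure. All names and simulation settings are hypothetical.

```python
import math
import random


def fit_zip_em(counts, iterations: int = 200):
    """Estimate (zero_prob, rate) of a zero-inflated Poisson by EM."""
    n = len(counts)
    total = sum(counts)
    n_zeros = sum(1 for c in counts if c == 0)
    zero_prob = 0.5
    rate = max(total / n, 1e-6)
    for _ in range(iterations):
        # E-step: probability that an observed zero is a structural zero.
        resp = zero_prob / (zero_prob + (1.0 - zero_prob) * math.exp(-rate))
        structural = n_zeros * resp
        # M-step: update the mixture weight and the Poisson rate.
        zero_prob = structural / n
        rate = total / max(n - structural, 1e-12)
    return zero_prob, rate


def sample_poisson(rng: random.Random, rate: float) -> int:
    """Knuth's multiplication method; adequate for small rates."""
    threshold = math.exp(-rate)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


# Simulated read counts: 30% of sites are structural zeros, the rest Poisson(4).
rng = random.Random(1)
simulated = [0 if rng.random() < 0.3 else sample_poisson(rng, 4.0) for _ in range(5000)]
estimated_zero_prob, estimated_rate = fit_zip_em(simulated)
print(f"zero_prob ~ {estimated_zero_prob:.3f}, rate ~ {estimated_rate:.3f}")
```

Only zero counts can be structural zeros, so the E-step needs a single responsibility value shared by all zeros, and the M-step divides the total count by the expected number of sites that are not structural zeros.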
Affiliation(s)
- Chong Wang
  - Department of Statistics, Iowa State University, Department of Veterinary Diagnostic and Production Animal Medicine and
- Zuowei Wu
  - Department of Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA 50010, USA
- Qijing Zhang
  - Department of Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA 50010, USA
- Peng Liu
  - Department of Statistics, Iowa State University
10
Fernández-Piñar R, Lo Sciuto A, Rossi A, Ranucci S, Bragonzi A, Imperi F. In vitro and in vivo screening for novel essential cell-envelope proteins in Pseudomonas aeruginosa. Sci Rep 2015; 5:17593. [PMID: 26621210 PMCID: PMC4665194 DOI: 10.1038/srep17593] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 06/26/2015] [Accepted: 11/03/2015] [Indexed: 12/30/2022]
Abstract
The Gram-negative bacterium Pseudomonas aeruginosa represents a prototype of multi-drug resistant opportunistic pathogens for which novel therapeutic options are urgently required. In order to identify new candidates as potential drug targets, we combined large-scale transposon mutagenesis data analysis and bioinformatics predictions to retrieve a set of putative essential genes which are conserved in P. aeruginosa and predicted to encode cell envelope or secreted proteins. By generating unmarked deletion or conditional mutants, we confirmed the in vitro essentiality of two periplasmic proteins, LptH and LolA, responsible for lipopolysaccharide and lipoproteins transport to the outer membrane respectively, and confirmed that they are important for cell envelope stability. LptH was also found to be essential for P. aeruginosa ability to cause infection in different animal models. Conversely, LolA-depleted cells appeared only partially impaired in pathogenicity, indicating that this protein likely plays a less relevant role during bacterial infection. Finally, we ruled out any involvement of the other six proteins under investigation in P. aeruginosa growth, cell envelope stability and virulence. Besides proposing LptH as a very promising drug target in P. aeruginosa, this study confirms the importance of in vitro and in vivo validation of potential essential genes identified through random transposon mutagenesis.
Affiliation(s)
- Regina Fernández-Piñar
  - Laboratory of Molecular Microbiology, Department of Biology and Biotechnology "Charles Darwin", Sapienza University of Rome, Rome, Italy
- Alessandra Lo Sciuto
  - Laboratory of Molecular Microbiology, Department of Biology and Biotechnology "Charles Darwin", Sapienza University of Rome, Rome, Italy
- Alice Rossi
  - Division of Immunology, Transplantation and Infectious Diseases, San Raffaele Scientific Institute, Milan, Italy
- Serena Ranucci
  - Division of Immunology, Transplantation and Infectious Diseases, San Raffaele Scientific Institute, Milan, Italy
- Alessandra Bragonzi
  - Division of Immunology, Transplantation and Infectious Diseases, San Raffaele Scientific Institute, Milan, Italy
- Francesco Imperi
  - Laboratory of Molecular Microbiology, Department of Biology and Biotechnology "Charles Darwin", Sapienza University of Rome, Rome, Italy; Pasteur Institute-Cenci Bolognetti Foundation, Sapienza University of Rome, Rome, Italy
11
DeJesus MA, Ambadipudi C, Baker R, Sassetti C, Ioerger TR. TRANSIT--A Software Tool for Himar1 TnSeq Analysis. PLoS Comput Biol 2015; 11:e1004401. [PMID: 26447887 PMCID: PMC4598096 DOI: 10.1371/journal.pcbi.1004401] [Citation(s) in RCA: 113] [Impact Index Per Article: 12.6] [Received: 04/27/2015] [Accepted: 06/10/2015] [Indexed: 02/07/2023]
Abstract
TnSeq has become a popular technique for determining the essentiality of genomic regions in bacterial organisms. Several methods have been developed to analyze the wealth of data that has been obtained through TnSeq experiments. We developed a tool for analyzing Himar1 TnSeq data called TRANSIT. TRANSIT provides a graphical interface to three different statistical methods for analyzing TnSeq data. These methods cover a variety of approaches capable of identifying essential genes in individual datasets as well as comparative analysis between conditions. We demonstrate the utility of this software by analyzing TnSeq datasets of M. tuberculosis grown on glycerol and cholesterol. We show that TRANSIT can be used to discover genes which have been previously implicated for growth on these carbon sources. TRANSIT is written in Python, and thus can be run on Windows, OSX and Linux platforms. The source code is distributed under the GNU GPL v3 license and can be obtained from the following GitHub repository: https://github.com/mad-lab/transit.
Affiliation(s)
- Michael A. DeJesus
  - Department of Computer Science, Texas A&M University, College Station, Texas, United States of America
- Chaitra Ambadipudi
  - Department of Computer Science, Texas A&M University, College Station, Texas, United States of America
- Richard Baker
  - Department of Microbiology and Physiological Systems, University of Massachusetts Medical School, Worcester, Massachusetts, United States of America
- Christopher Sassetti
  - Department of Microbiology and Physiological Systems, University of Massachusetts Medical School, Worcester, Massachusetts, United States of America
- Thomas R. Ioerger
  - Department of Computer Science, Texas A&M University, College Station, Texas, United States of America
12
Lu Y, Lu Y, Deng J, Peng H, Lu H, Lu LJ. A novel essential domain perspective for exploring gene essentiality. Bioinformatics 2015; 31:2921-9. [PMID: 26002906 DOI: 10.1093/bioinformatics/btv312] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 01/11/2015] [Accepted: 05/13/2015] [Indexed: 02/05/2023]
Abstract
MOTIVATION: Genes with indispensable functions are identified as essential; however, traditional gene-level studies of essentiality have several limitations. In this study, we characterized gene essentiality from the new perspective of protein domains, the independent structural or functional units of a polypeptide chain.
RESULTS: To identify such essential domains, we have developed an Expectation-Maximization (EM) algorithm-based Essential Domain Prediction (EDP) Model. With simulated datasets, the model provided convergent results given different initial values and offered accurate predictions even with noise. We then applied the EDP model to six microbial species and predicted 1879 domains to be essential in at least one species, ranging from 10 to 23% in each species. The predicted essential domains were more conserved than either non-essential domains or essential genes. Comparing essential domains in prokaryotes and eukaryotes revealed an evolutionary distance consistent with that inferred from ribosomal RNA. When we used these essential domains to reproduce the annotation of essential genes, we obtained accurate results, which suggests that protein domains are more basic units for the essentiality of genes. Furthermore, we presented several examples to illustrate how the combination of essential and non-essential domains can lead to genes with divergent essentiality. In summary, we have described the first systematic analysis of gene essentiality at the level of domains.
CONTACT: huilu.bioinfo@gmail.com or Long.Lu@cchmc.org
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yao Lu
  - Shanghai Institute of Medical Genetics, Shanghai Children's Hospital, Shanghai Jiao Tong University, 24/1400 Beijing (W) Road, Shanghai 200040, People's Republic of China
- Yulan Lu
  - State Key Laboratory of Genetic Engineering Institute of Biostatistics, School of Life Science, Fudan University, Shanghai 200433, People's Republic of China
- Jingyuan Deng
  - Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
- Hai Peng
  - Institute for Systems Biology, Jianghan University, Wuhan, Hubei, People's Republic of China
- Hui Lu
  - Shanghai Institute of Medical Genetics, Shanghai Children's Hospital, Shanghai Jiao Tong University, 24/1400 Beijing (W) Road, Shanghai 200040, People's Republic of China, Department of Bioengineering (MC 063), University of Illinois at Chicago, Chicago, IL 60607-7052, USA and Collaborative Innovation Center for Biotherapy, West China Hospital, Sichuan University, Chengdu, China
- Long Jason Lu
  - Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA, Institute for Systems Biology, Jianghan University, Wuhan, Hubei, People's Republic of China
13
Deng J. A statistical framework for improving genomic annotations of transposon mutagenesis (TM) assigned essential genes. Methods Mol Biol 2015; 1279:153-65. [PMID: 25636618 DOI: 10.1007/978-1-4939-2398-4_10] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/28/2023]
Abstract
A whole-genome transposon mutagenesis (TM) experiment followed by sequence-based identification of insertion sites is the most popular genome-wide approach to identifying essential genes in prokaryotes. However, owing to the limitations of high-throughput techniques, this approach yields substantial systematic biases that result in incorrect assignments of many essential genes. To obtain unbiased and accurate annotations of essential genes from TM experiments, we developed a novel Poisson model-based statistical framework to refine these TM assignments. In the model, we first identified several potential factors that may cause TM assignment biases, such as gene length and TM insertion information, and incorporated them into the basic Poisson model. We then calculated the conditional probability that a gene is essential given the observed number of TM insertions. By factorizing this probability through the introduction of a latent variable, the real insertion number, we formalized the statistical framework. We finalized the model by iteratively updating and optimizing its parameters to maximize the goodness of fit to the observed TM insertion data. Using this model, we are able to assign a probability score of essentiality to each individual gene given its TM assignment, which subsequently corrects the experimental biases. To make our model widely usable, we established a user-friendly Web server that is accessible to the public: http://research.cchmc.org/essentialgene/.
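The idea of a conditional probability of essentiality given the observed insertion count can be sketched with a two-class Poisson mixture and Bayes' rule. This is a deliberate simplification and not the chapter's framework, which adds a latent "real insertion number" and iteratively fitted parameters. The prior and both per-bp insertion rates below are hypothetical values chosen only to show how gene length changes the conclusion.

```python
import math


def poisson_log_pmf(count: int, mean: float) -> float:
    """Log-probability of `count` events under a Poisson with the given mean."""
    return count * math.log(mean) - mean - math.lgamma(count + 1)


def essentiality_posterior(observed_insertions: int, gene_length_bp: int,
                           prior_essential: float = 0.1,
                           rate_nonessential: float = 0.02,
                           rate_essential: float = 0.001) -> float:
    """P(essential | observed insertions) via Bayes' rule.

    Assumed model: insertions in a gene are Poisson with mean rate * length,
    with a low residual rate for essential genes and a higher background rate
    for non-essential genes (all three defaults are hypothetical).
    """
    log_ess = math.log(prior_essential) + poisson_log_pmf(
        observed_insertions, rate_essential * gene_length_bp)
    log_non = math.log(1.0 - prior_essential) + poisson_log_pmf(
        observed_insertions, rate_nonessential * gene_length_bp)
    shift = max(log_ess, log_non)  # subtract the maximum for numerical stability
    ess, non = math.exp(log_ess - shift), math.exp(log_non - shift)
    return ess / (ess + non)


# The same observation (no insertions) is strong evidence in a long gene
# and weak evidence in a short one.
print(essentiality_posterior(0, gene_length_bp=1000))
print(essentiality_posterior(0, gene_length_bp=60))
print(essentiality_posterior(20, gene_length_bp=1000))
```

Under these assumptions a 1,000-bp gene with no insertions is almost certainly essential, whereas a 60-bp gene with no insertions remains more likely non-essential than essential, which is one way gene length can bias naive essentiality calls.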
Affiliation(s)
- Jingyuan Deng
- Division of Epidemiology and Biostatistics, Department of Environmental Health, University of Cincinnati Medical Center, 3223 Eden Av. ML 56, Cincinnati, OH, 45267-0056, USA
|
14
|
Abstract
Essential genes are those genes indispensable for the survival of any living cell. Bacterial essential genes constitute the cornerstones of synthetic biology and are often attractive targets in the development of antibiotics and vaccines. Because identifying essential genes by wet-lab methods is often expensive and labor-intensive, scientists have turned to computational prediction as an alternative. To help address this issue, our research group (CEFG: group of Computational, Comparative, Evolutionary and Functional Genomics, http://cefg.uestc.edu.cn) has constructed three online services to predict essential genes in bacterial genomes. These freely available tools are applicable to single gene sequences without annotated functions, single genes with definite names, and complete genomes of bacterial strains. To ensure reliable predictions, the investigated species should belong to the same family (for EGP) or phylum (for CEG_Match and Geptop) as one of the reference species. As pilot software for this problem, the tools' prediction accuracies have been assessed and compared with those of existing algorithms; note that none of the other published algorithms is available as an online service. We hope these services at CEFG will help scientists and researchers in the field of essential genes.
|
15
|
Suk S, Kim Y, Bak G, Lee Y. Identification of a Temperature-Sensitive Mutation in the ribE Gene of an Escherichia coli Keio Collection Strain. Bull Korean Chem Soc 2014. [DOI: 10.5012/bkcs.2014.35.7.2175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
|
16
|
Rusmini R, Vecchietti D, Macchi R, Vidal-Aroca F, Bertoni G. A shotgun antisense approach to the identification of novel essential genes in Pseudomonas aeruginosa. BMC Microbiol 2014; 14:24. [PMID: 24499134 PMCID: PMC3922391 DOI: 10.1186/1471-2180-14-24] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 07/02/2013] [Accepted: 01/23/2014] [Indexed: 12/29/2022] Open
Abstract
Background Antibiotics in current use target a surprisingly small number of cellular functions: cell wall, DNA, RNA, and protein biosynthesis. Targeting of novel essential pathways is expected to play an important role in the discovery of new antibacterial agents against bacterial pathogens, such as Pseudomonas aeruginosa, that are difficult to control because of their ability to develop resistance, often multiple, to all current classes of clinical antibiotics. Results We aimed to identify novel essential genes in P. aeruginosa by shotgun antisense screening. This technique was developed in Staphylococcus aureus and, following a period of limited success in Gram-negative bacteria, has recently been used effectively in Escherichia coli. To also target low expressed essential genes, we included some variant steps that were expected to overcome the non-stringent regulation of the promoter carried by the expression vector used for the shotgun antisense libraries. Our antisense screenings identified 33 growth-impairing single-locus genomic inserts that allowed us to generate a list of 28 “essential-for-growth” genes: five were “classical” essential genes involved in DNA replication, transcription, translation, and cell division; seven were already reported as essential in other bacteria; and 16 were “novel” essential genes with no homologs reported to have an essential role in other bacterial species. Interestingly, the essential genes in our panel were suggested to take part in a broader range of cellular functions than those currently targeted by extant antibiotics, namely protein secretion, biosynthesis of cofactors, prosthetic groups and carriers, energy metabolism, central intermediary metabolism, transport of small molecules, translation, post-translational modification, non-ribosomal peptide synthesis, lipopolysaccharide synthesis/modification, and transcription regulation. 
This study also identified 43 growth-impairing inserts carrying multiple loci targeting 105 genes, of which 25 have homologs reported as essential in other bacteria. Finally, four multigenic growth-impairing inserts belonged to operons that have never been reported to play an essential role. Conclusions For the first time in P. aeruginosa, we applied regulated antisense RNA expression and showed the feasibility of this technology for the identification of novel essential genes.
Affiliation(s)
- Giovanni Bertoni
- Department of Life Sciences, Università degli Studi di Milano, via Celoria 26, 20133 Milan, Italy.
|
17
|
Lu Y, Deng J, Rhodes JC, Lu H, Lu LJ. Predicting essential genes for identifying potential drug targets in Aspergillus fumigatus. Comput Biol Chem 2014; 50:29-40. [PMID: 24569026 DOI: 10.1016/j.compbiolchem.2014.01.011] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Accepted: 12/23/2013] [Indexed: 12/31/2022]
Abstract
BACKGROUND Aspergillus fumigatus (Af) is a ubiquitous and opportunistic pathogen capable of causing acute, invasive pulmonary disease in susceptible hosts. Despite current therapeutic options, mortality associated with invasive Af infections remains unacceptably high, having increased 357% since 1980. Therefore, there is an urgent need for the development of novel therapeutic strategies, including more efficacious drugs acting on new targets. Thus, as noted in a recent review, "the identification of essential genes in fungi represents a crucial step in the development of new antifungal drugs". Expanding the target space by rapidly identifying new essential genes has thus been described as "the most important task of genomics-based target validation". RESULTS In previous research, we were the first to show that essential gene annotation can be reliably transferred between four distantly related prokaryotic species. In this study, we extend our machine learning approach to the much more complex eukaryotic fungal species. A compendium of essential genes is predicted in Af by transferring known essential gene annotations from another filamentous fungus, Neurospora crassa. This approach predicts essential genes by integrating diverse types of intrinsic and context-dependent genomic features encoded in microbial genomes. The predicted essential dataset contained 1674 genes. We validated our results by comparing our predictions with known essential genes in Af, comparing our predictions with those obtained by homology mapping, and conducting experiments with conditionally expressed alleles. We applied several layers of filters and selected a set of potential drug targets from the predicted essential genes. Finally, we conducted wet-lab knockout experiments to verify our predictions, which further validates the accuracy and wide applicability of the machine learning approach.
CONCLUSIONS The approach presented here significantly extends our ability to predict essential genes beyond orthologs and makes it possible to predict an inventory of essential genes in eukaryotic fungal species, from which a preferred subset of suitable drug targets may be selected. By selecting the best new targets, we believe that the resulting drugs would exhibit an unparalleled clinical impact against a naive pathogen population. A compendium of essential genes can also provide important information on cell function and evolutionary biology. Furthermore, mapping essential genes to pathways may reveal critical checkpoints in the pathogen's metabolism. Finally, this approach is highly reproducible and portable, and can be easily applied to predict essential genes in many more pathogenic microbes, especially those that are unculturable.
Affiliation(s)
- Yao Lu
- Shanghai Institute of Medical Genetics, Shanghai Children's Hospital, Shanghai Jiao Tong University, 24/1400 Beijing (W) Road, Shanghai 200040, PR China
- Jingyuan Deng
- Division of Biomedical Informatics, Cincinnati Children's Hospital Research Foundation, 3333 Burnet Avenue, MLC7024, Cincinnati, OH 45229, USA
- Judith C Rhodes
- Department of Pathology and Laboratory Medicine, University of Cincinnati, 2600 Clifton Avenue, Cincinnati, OH 45221, USA
- Hui Lu
- Shanghai Institute of Medical Genetics, Shanghai Children's Hospital, Shanghai Jiao Tong University, 24/1400 Beijing (W) Road, Shanghai 200040, PR China; Department of Bioengineering (MC 063), University of Illinois at Chicago, 851 S Morgan St, 218 SEO, Chicago, IL 60607, USA.
- Long Jason Lu
- Division of Biomedical Informatics, Cincinnati Children's Hospital Research Foundation, 3333 Burnet Avenue, MLC7024, Cincinnati, OH 45229, USA; Division of Epidemiology and Biostatistics, Cincinnati Children's Hospital Research Foundation, 3333 Burnet Avenue, MLC7024, Cincinnati, OH 45229, USA; Department of Computer Science, University of Cincinnati, 2600 Clifton Avenue, Cincinnati, OH 45221, USA; Department of Environmental Health, University of Cincinnati, 2600 Clifton Avenue, Cincinnati, OH 45221, USA; Department of Biomedical Engineering, University of Cincinnati, 2600 Clifton Avenue, Cincinnati, OH 45221, USA.
|