1
|
Zhao Y, Feng L, Zhou B, Zhang X, Yao Z, Wang L, Wang Z, Zhou T, Chen L. A newly isolated bacteriophage vB8388 and its synergistic effect with aminoglycosides against multi-drug resistant Klebsiella oxytoca strain FK-8388. Microb Pathog 2023; 174:105906. [PMID: 36494020 DOI: 10.1016/j.micpath.2022.105906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2022] [Revised: 11/24/2022] [Accepted: 11/24/2022] [Indexed: 12/12/2022]
Abstract
The bacteriophage vB8388 can lyse multi-drug resistant Klebsiella oxytoca strain FK-8388 and remains stable across a wide range of temperatures (4 °C to 80 °C) and pH values (3-11). Bioinformatics analysis showed that vB8388 is a linear double-stranded DNA virus with a 39,750 bp genome, 50.65% G + C content, and 44 putative open reading frames (ORFs). Phage vB8388 belongs to the family Autographiviridae and possesses a non-contractile tail. The latent period of vB8388 was approximately 20 min. The combination of phage vB8388 with gentamicin, amikacin, or tobramycin effectively inhibited the growth of K. oxytoca strain FK-8388, with a decrease of more than 4 log units within 12 h in vitro. Phage vB8388 showed strong synergy with gentamicin, which enhanced the anti-biofilm effect of the phage. The phage + gentamicin combination also showed synergy in vivo in the Galleria mellonella larval infection model. In conclusion, the findings of this study suggest that phage + antibiotic combination therapy has potential as an alternative therapeutic approach for treating infectious diseases caused by multidrug-resistant bacteria.
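The "decrease of more than 4 log units" reported above is a base-10 logarithmic reduction in viable count. A minimal sketch of that arithmetic follows; the function name and the example counts are illustrative assumptions, not values or code from the study.

```python
import math


def log_reduction(initial_cfu_per_ml: float, final_cfu_per_ml: float) -> float:
    """Return the log10 reduction in viable count between two measurements."""
    if initial_cfu_per_ml <= 0 or final_cfu_per_ml <= 0:
        # log10 is undefined at 0; a real assay would substitute its detection limit.
        raise ValueError("counts must be positive")
    return math.log10(initial_cfu_per_ml / final_cfu_per_ml)


# Hypothetical counts for illustration only (not data from the paper).
reduction = log_reduction(1e8, 5e3)
print(f"{reduction:.2f} log units")
```

A reduction of 4 log units corresponds to a 10,000-fold drop in colony-forming units.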
Affiliation(s)
- Yining Zhao
- Department of Clinical Laboratory, The First Affiliated Hospital of Wenzhou Medical University, Key Laboratory of Clinical Laboratory Diagnosis and Translational Research of Zhejiang Province, Wenzhou, Zhejiang Province, China.
- Luozhu Feng
- Department of Medical Lab Science, School of Laboratory Medicine and Life Science, Wenzhou Medical University, Wenzhou, Zhejiang Province, China.
- Beibei Zhou
- Department of Clinical Laboratory, The First Affiliated Hospital of Wenzhou Medical University, Key Laboratory of Clinical Laboratory Diagnosis and Translational Research of Zhejiang Province, Wenzhou, Zhejiang Province, China.
- Xiaodong Zhang
- Department of Clinical Laboratory, The First Affiliated Hospital of Wenzhou Medical University, Key Laboratory of Clinical Laboratory Diagnosis and Translational Research of Zhejiang Province, Wenzhou, Zhejiang Province, China.
- Zhuocheng Yao
- Department of Medical Lab Science, School of Laboratory Medicine and Life Science, Wenzhou Medical University, Wenzhou, Zhejiang Province, China.
- Lingbo Wang
- Department of Clinical Laboratory, The First Affiliated Hospital of Wenzhou Medical University, Key Laboratory of Clinical Laboratory Diagnosis and Translational Research of Zhejiang Province, Wenzhou, Zhejiang Province, China.
- Zhongyong Wang
- Department of Clinical Laboratory, The First Affiliated Hospital of Wenzhou Medical University, Key Laboratory of Clinical Laboratory Diagnosis and Translational Research of Zhejiang Province, Wenzhou, Zhejiang Province, China.
- Tieli Zhou
- Department of Clinical Laboratory, The First Affiliated Hospital of Wenzhou Medical University, Key Laboratory of Clinical Laboratory Diagnosis and Translational Research of Zhejiang Province, Wenzhou, Zhejiang Province, China.
- Lijiang Chen
- Department of Clinical Laboratory, The First Affiliated Hospital of Wenzhou Medical University, Key Laboratory of Clinical Laboratory Diagnosis and Translational Research of Zhejiang Province, Wenzhou, Zhejiang Province, China.
|
2
|
Sridhar S, Ajo-Franklin CM, Masiello CA. A Framework for the Systematic Selection of Biosensor Chassis for Environmental Synthetic Biology. ACS Synth Biol 2022; 11:2909-2916. [PMID: 35961652 PMCID: PMC9486965 DOI: 10.1021/acssynbio.2c00079] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 01/24/2023]
Abstract
Microbial biosensors sense and report exposures to stimuli, thereby facilitating our understanding of environmental processes. Successful design and deployment of biosensors hinge on the persistence of the microbial host of the genetic circuit, termed the chassis. However, model chassis organisms may persist poorly in environmental conditions. In contrast, non-model organisms persist better in environmental conditions but are limited by other challenges, such as genetic intractability and part unavailability. Here we identify ecological, metabolic, and genetic constraints for chassis development and propose a conceptual framework for the systematic selection of environmental biosensor chassis. We identify key challenges with using current model chassis and delineate major points of conflict in choosing the most suitable organisms as chassis for environmental biosensing. This framework provides a way forward in the selection of biosensor chassis for environmental synthetic biology.
Affiliation(s)
- Swetha Sridhar
- Systems, Synthetic, and Physical Biology Graduate Program, Rice University, 6100 Main Street, MS-180, Houston, Texas 77005, United States, Tel: 713-348-2565.
- Caroline M. Ajo-Franklin
- Department of BioSciences, Rice University, 6100 Main Street, MS-140, Houston, Texas 77005, United States
- Caroline A. Masiello
- Department of BioSciences, Rice University, 6100 Main Street, MS-140, Houston, Texas 77005, United States; Department of Earth, Environmental, and Planetary Sciences, Rice University, 6100 Main St, MS-126, Houston, Texas 77005, United States
|
3
|
Bardou P, Laguerre S, Maman Haddad S, Legoueix Rodriguez S, Laville E, Dumon C, Potocki-Veronese G, Klopp C. MINTIA: a metagenomic INserT integrated assembly and annotation tool. PeerJ 2021; 9:e11885. [PMID: 34692239 PMCID: PMC8483015 DOI: 10.7717/peerj.11885] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2020] [Accepted: 07/09/2021] [Indexed: 11/29/2022] Open
Abstract
The earth harbors trillions of bacterial species adapted to very diverse ecosystems thanks to specific metabolic function acquisition. Most of the genes responsible for these functions belong to uncultured bacteria and are still to be discovered. Functional metagenomics based on activity screening is a classical way to retrieve these genes from microbiomes. This approach is based on the insertion of large metagenomic DNA fragments into a vector and transformation of a host to express heterologous genes. Metagenomic libraries are then screened for activities of interest, and the metagenomic DNA inserts of active clones are extracted to be sequenced and analysed to identify genes that are responsible for the detected activity. Hundreds of metagenomics sequences found using this strategy have already been published in public databases. Here we present the MINTIA software package enabling biologists to easily generate and analyze large metagenomic sequence sets, retrieved after activity-based screening. It filters reads, performs assembly, removes cloning vector, annotates open reading frames and generates user friendly reports as well as files ready for submission to international sequence repositories. The software package can be downloaded from https://github.com/Bios4Biol/MINTIA.
Affiliation(s)
- Philippe Bardou
- Sigenae, GenPhySE, Université de Toulouse, INRAE, ENVT, Castanet Tolosan, France
- Sarah Maman Haddad
- Sigenae, GenPhySE, Université de Toulouse, INRAE, ENVT, Castanet Tolosan, France
- Claire Dumon
- TBI, Université de Toulouse, CNRS, INRAE, INSA, Toulouse, France
- Christophe Klopp
- Sigenae, Genotoul Bioinfo, MIAT UR875, INRAE, Castanet Tolosan, France
|
4
|
Arginine-Rich Small Proteins with a Domain of Unknown Function, DUF1127, Play a Role in Phosphate and Carbon Metabolism of Agrobacterium tumefaciens. J Bacteriol 2020; 202:JB.00309-20. [PMID: 33093235 DOI: 10.1128/jb.00309-20] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 05/22/2020] [Accepted: 07/21/2020] [Indexed: 02/06/2023] Open
Abstract
In any given organism, approximately one-third of all proteins have a yet-unknown function. A widely distributed domain of unknown function is DUF1127. Approximately 17,000 proteins with such an arginine-rich domain are found in 4,000 bacteria. Most of them are single-domain proteins, and a large fraction qualifies as small proteins with fewer than 50 amino acids. We systematically identified and characterized the seven DUF1127 members of the plant pathogen Agrobacterium tumefaciens. They all give rise to authentic proteins and are differentially expressed as shown at the RNA and protein levels. The seven proteins fall into two subclasses on the basis of their length, sequence, and reciprocal regulation by the LysR-type transcription factor LsrB. The absence of all three short DUF1127 proteins caused a striking phenotype in later growth phases and increased cell aggregation and biofilm formation. Protein profiling and transcriptome sequencing (RNA-seq) analysis of the wild type and triple mutant revealed a large number of differentially regulated genes in late exponential and stationary growth. The most affected genes are involved in phosphate uptake, glycine/serine homeostasis, and nitrate respiration. The results suggest a redundant function of the small DUF1127 paralogs in nutrient acquisition and central carbon metabolism of A. tumefaciens. They may be required for diauxic switching between carbon sources when sugar from the medium is depleted. We end by discussing how DUF1127 might confer such a global impact on cell physiology and gene expression. IMPORTANCE Despite being prevalent in numerous ecologically and clinically relevant bacterial species, the biological role of proteins with a domain of unknown function, DUF1127, is unclear. Experimental models are needed to approach their elusive function. We used the phytopathogen Agrobacterium tumefaciens, a natural genetic engineer that causes crown gall disease, and focused on its three small DUF1127 proteins. They have redundant and pervasive roles in nutrient acquisition, cellular metabolism, and biofilm formation. The study shows that small proteins have important, previously missed biological functions. How small basic proteins can have such a broad impact is a fascinating prospect for future research.
|
5
|
Abstract
Microbial communities are widespread in the environment, and to isolate and identify species or to determine relations among microorganisms, some 'omics methods like metagenomics, proteomics, and metabolomics have been used. When combined with various 'omics data, models known as artificial microbial ecosystems (AME) are powerful methods that can make functional predictions about microbial communities. Reconstruction of an AME model is the first step for model analysis. Many techniques have been applied to the construction of AME models, e.g., the compartmentalization approach, community objectives method, and dynamic analysis approach. Of these approaches, species compartmentalization is the most relevant to genetics. Besides, some algorithms have been developed for the analysis of AME models. In this chapter, we present a general protocol for the use of the species compartmentalization method to reconstruct a model of microbial communities. Then, the analysis of an AME is discussed.
|
6
|
Kalkatawi M, Alam I, Bajic VB. BEACON: automated tool for Bacterial GEnome Annotation ComparisON. BMC Genomics 2015; 16:616. [PMID: 26283419 PMCID: PMC4539851 DOI: 10.1186/s12864-015-1826-4] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 01/30/2015] [Accepted: 08/07/2015] [Indexed: 11/25/2022] Open
Abstract
Background: Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). Results: The presence of many AMs and the development of new ones create a need to (a) compare different annotations for a single genome, and (b) generate an annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. To illustrate BEACON's utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated with putative functions by up to 27%, while the number of genes without any function assignment is reduced. Conclusions: We developed BEACON, a fast tool for automated and systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/.
Affiliation(s)
- Manal Kalkatawi
- Computational Bioscience Research Centre (CBRC), King Abdullah University of Science and Technology (KAUST), 23955-6900, Thuwal, Kingdom of Saudi Arabia.
- Intikhab Alam
- Computational Bioscience Research Centre (CBRC), King Abdullah University of Science and Technology (KAUST), 23955-6900, Thuwal, Kingdom of Saudi Arabia.
- Vladimir B Bajic
- Computational Bioscience Research Centre (CBRC), King Abdullah University of Science and Technology (KAUST), 23955-6900, Thuwal, Kingdom of Saudi Arabia.
|
7
|
Joice R, Yasuda K, Shafquat A, Morgan XC, Huttenhower C. Determining microbial products and identifying molecular targets in the human microbiome. Cell Metab 2014; 20:731-741. [PMID: 25440055 PMCID: PMC4254638 DOI: 10.1016/j.cmet.2014.10.003] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.8] [Indexed: 02/08/2023]
Abstract
Human-associated microbes are the source of many bioactive microbial products (proteins and metabolites) that play key functions both in human host pathways and in microbe-microbe interactions. Culture-independent studies now provide an accelerated means of exploring novel bioactives in the human microbiome; however, intriguingly, a substantial fraction of the microbial metagenome cannot be mapped to annotated genes or isolate genomes and is thus of unknown function. Meta'omic approaches, including metagenomic sequencing, metatranscriptomics, metabolomics, and integration of multiple assay types, represent an opportunity to efficiently explore this large pool of potential therapeutics. In combination with appropriate follow-up validation, high-throughput culture-independent assays can be combined with computational approaches to identify and characterize novel and biologically interesting microbial products. Here we briefly review the state of microbial product identification and characterization and discuss possible next steps to catalog and leverage the large uncharted fraction of the microbial metagenome.
Affiliation(s)
- Regina Joice
- Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Koji Yasuda
- Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Afrah Shafquat
- Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Xochitl C Morgan
- Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Curtis Huttenhower
- Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
|
8
|
Toby IT, Widmer J, Dyer DW. Divergence of protein-coding capacity and regulation in the Bacillus cereus sensu lato group. BMC Bioinformatics 2014; 15 Suppl 11:S8. [PMID: 25350501 PMCID: PMC4251056 DOI: 10.1186/1471-2105-15-s11-s8] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 11/23/2022] Open
Abstract
BACKGROUND The Bacillus cereus sensu lato group contains ubiquitous facultative anaerobic soil-borne Gram-positive spore-forming bacilli. Molecular phylogeny and comparative genome sequencing have suggested that these organisms should be classified as a single species. While clonal in nature, there do not appear to be species-specific clonal lineages, excepting B. anthracis, in spite of the wide array of phenotypes displayed by these organisms. RESULTS We compared the protein-coding content of 201 B. cereus sensu lato genomes to characterize differences and understand the consequences of these differences on biological function. From this larger group we selected a subset consisting of 25 whole genomes for deeper analysis. Cluster analysis of orthologous proteins grouped these genomes into five distinct clades. Each clade could be characterized by unique genes shared among the group, with consequences for the phenotype of each clade. Surprisingly, this population structure recapitulates our recent observations on the divergence of the generalized stress response (SigB) regulons in these organisms. Divergence of the SigB regulon among these organisms is primarily due to the placement of SigB-dependent promoters that bring genes from a common gene pool into/out of the SigB regulon. CONCLUSIONS Collectively, our observations suggest the hypothesis that the evolution of these closely related bacteria is a consequence of two distinct processes. Horizontal gene transfer, gene duplication/divergence and deletion dictate the underlying coding capacity in these genomes. Regulatory divergence overlays this protein coding reservoir and shapes the expression of both the unique and shared coding capacity of these organisms, resulting in phenotypic divergence. Data from other organisms suggests that this is likely a common pattern in prokaryotic evolution.
Affiliation(s)
- Inimary T Toby
- University of Oklahoma Health Sciences Center, 975 NE 10th Street, BRC-1106, Oklahoma City, OK 73104, USA
- Jonah Widmer
- University of Oklahoma Health Sciences Center, 975 NE 10th Street, BRC-1106, Oklahoma City, OK 73104, USA
- David W Dyer
- University of Oklahoma Health Sciences Center, 975 NE 10th Street, BRC-1106, Oklahoma City, OK 73104, USA
|
9
|
SearchDOGS bacteria, software that provides automated identification of potentially missed genes in annotated bacterial genomes. J Bacteriol 2014; 196:2030-42. [PMID: 24659774 DOI: 10.1128/jb.01368-13] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Indexed: 01/15/2023] Open
Abstract
We report the development of SearchDOGS Bacteria, software to automatically detect missing genes in annotated bacterial genomes by combining BLAST searches with comparative genomics. Having successfully applied the approach to yeast genomes, we redeveloped SearchDOGS to function as a standalone, downloadable package, requiring only a set of GenBank annotation files as input. The software automatically generates a homology structure using reciprocal BLAST and a synteny-based method; this is followed by a scan of the entire genome of each species for unannotated genes. Results are presented in an HTML interface that gives coordinates, BLAST results, syntenic location, omega values (Ka/Ks, where Ks is the number of synonymous substitutions per synonymous site and Ka is the number of nonsynonymous substitutions per nonsynonymous site) for protein conservation estimates, and other information for each candidate gene. Using SearchDOGS Bacteria, we identified 155 gene candidates in the Shigella boydii sb227 genome, including 56 candidates of length < 60 codons. SearchDOGS Bacteria has two major advantages over currently available annotation software. First, it outperforms current methods in terms of sensitivity and is highly effective at identifying small or highly diverged genes. Second, as a freely downloadable package, it can be used with unpublished or confidential data.
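The omega value mentioned above is simply the ratio Ka/Ks. The sketch below shows the ratio and the conventional reading of it (values well below 1 suggest purifying selection, which is why it can serve as a conservation estimate). The function names and the sample numbers are hypothetical and are not taken from SearchDOGS, which estimates Ka and Ks from aligned sequences.

```python
def omega(ka: float, ks: float) -> float:
    """Return the Ka/Ks ratio for precomputed substitution rates."""
    if ks <= 0:
        # With no synonymous substitutions the ratio is undefined.
        raise ValueError("Ks must be positive")
    return ka / ks


def interpret(value: float) -> str:
    """Conventional reading of omega; the cutoff at 1 is the standard convention."""
    if value < 1:
        return "purifying selection (protein sequence conserved)"
    if value > 1:
        return "positive selection"
    return "neutral evolution"


# Hypothetical rates for illustration only.
print(interpret(omega(ka=0.02, ks=0.2)))
```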
|
10
|
Ely B, Scott LE. Correction of the Caulobacter crescentus NA1000 genome annotation. PLoS One 2014; 9:e91668. [PMID: 24621776 PMCID: PMC3951458 DOI: 10.1371/journal.pone.0091668] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 04/15/2013] [Accepted: 02/14/2014] [Indexed: 11/18/2022] Open
Abstract
Bacterial genome annotations are accumulating rapidly in the GenBank database and the use of automated annotation technologies to create these annotations has become the norm. However, these automated methods commonly result in a small but significant percentage of genome annotation errors. To improve accuracy and reliability, we analyzed the Caulobacter crescentus NA1000 genome using the computer programs Artemis and MICheck to manually examine the third codon position GC content, alignment to a third codon position GC frame plot peak, and matches in the GenBank database. We identified 11 new genes, modified the start site of 113 genes, and changed the reading frame of 38 genes that had been incorrectly annotated. Furthermore, our manual method of identifying protein-coding genes allowed us to remove 112 non-coding regions that had been designated as coding regions. The improved NA1000 genome annotation resulted in a reduction in the use of rare codons since noncoding regions with atypical codon usage were removed from the annotation and 49 new coding regions were added to the annotation. Thus, a more accurate codon usage table was generated as well. These results demonstrate that comparing the location of peaks in third codon position GC content with the location of protein-coding regions could be used to verify the annotation of any genome that has a GC content greater than 60%.
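Third codon position GC content (often called GC3) can be computed directly from a sequence. The sketch below is a minimal illustration, assuming a plain DNA string; it is not the procedure implemented in Artemis or MICheck, and the function names are hypothetical. In a high-GC genome, the true reading frame of a coding region tends to show a markedly higher GC3 than the other two frames.

```python
def gc3_content(cds: str) -> float:
    """Fraction of G or C at third codon positions, reading from the first base."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3      # ignore a trailing partial codon
    third = seq[2:usable:3]               # bases at positions 3, 6, 9, ...
    if not third:
        raise ValueError("sequence shorter than one codon")
    return sum(base in "GC" for base in third) / len(third)


def gc3_by_frame(dna: str) -> list[float]:
    """GC3 for each of the three forward frames (offsets 0, 1, 2)."""
    return [gc3_content(dna[offset:]) for offset in range(3)]


# Toy sequence for illustration only: codons ATG, GCC, AAG.
print(gc3_by_frame("ATGGCCAAG"))
```

A frame plot repeats this calculation in a sliding window along the genome, one curve per frame.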
Affiliation(s)
- Bert Ely
- Department of Biological Sciences, University of South Carolina, Columbia, South Carolina, United States of America
- LaTia Etheredge Scott
- Department of Biological Sciences, University of South Carolina, Columbia, South Carolina, United States of America
|
11
|
Privé F, Kaderbhai NN, Girdwood S, Worgan HJ, Pinloche E, Scollan ND, Huws SA, Newbold CJ. Identification and characterization of three novel lipases belonging to families II and V from Anaerovibrio lipolyticus 5ST. PLoS One 2013; 8:e69076. [PMID: 23950883 PMCID: PMC3741291 DOI: 10.1371/journal.pone.0069076] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 02/11/2013] [Accepted: 06/04/2013] [Indexed: 11/19/2022] Open
Abstract
Since the isolation, cultivation and characterization of the rumen bacterium Anaerovibrio lipolyticus in the 1960s, it has been recognized as one of the major species involved in lipid hydrolysis in ruminant animals. However, there has been limited characterization of the lipases from the bacterium, despite the importance of understanding lipolysis and its impact on subsequent biohydrogenation of polyunsaturated fatty acids by rumen microbes. This study describes the draft genome of Anaerovibrio lipolytica 5ST, and the characterization of three lipolytic genes and their translated proteins. The incomplete draft genome was 2.83 Mbp and comprised 2,673 coding sequences with a G+C content of 43.3%. Three putative lipase genes, alipA, alipB and alipC, encoding peptides of 492, 438 and 248 amino acids respectively, were identified using RAST. Phylogenetic analysis indicated that alipA and alipB clustered with the GDSL/SGNH family II, and alipC clustered with lipolytic enzymes from family V. Subsequent expression and purification of the enzymes showed that they were thermally unstable and had higher activities at neutral to alkaline pH. Substrate specificity assays indicated that the enzymes had higher hydrolytic activity against caprylate (C8), laurate (C12) and myristate (C14).
Affiliation(s)
- Florence Privé
- Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, Ceredigion, United Kingdom
- Naheed N. Kaderbhai
- Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, Ceredigion, United Kingdom
- Susan Girdwood
- Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, Ceredigion, United Kingdom
- Hilary J. Worgan
- Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, Ceredigion, United Kingdom
- Eric Pinloche
- Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, Ceredigion, United Kingdom
- Nigel D. Scollan
- Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, Ceredigion, United Kingdom
- Sharon A. Huws
- Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, Ceredigion, United Kingdom
- C. Jamie Newbold
- Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, Ceredigion, United Kingdom
|
12
|
Quantification of endospore-forming firmicutes by quantitative PCR with the functional gene spo0A. Appl Environ Microbiol 2013; 79:5302-12. [PMID: 23811505 DOI: 10.1128/aem.01376-13] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Indexed: 11/20/2022] Open
Abstract
Bacterial endospores are highly specialized cellular forms that allow endospore-forming Firmicutes (EFF) to tolerate harsh environmental conditions. EFF are considered ubiquitous in natural environments, in particular, those subjected to stress conditions. In addition to natural habitats, EFF are often the cause of contamination problems in anthropogenic environments, such as industrial production plants or hospitals. It is therefore desirable to assess their prevalence in environmental and industrial fields. To this end, a high-sensitivity detection method is still needed. The aim of this study was to develop and evaluate an approach based on quantitative PCR (qPCR). For this, the suitability of functional genes specific for and common to all EFF was evaluated. Seven genes were considered, but only spo0A was retained to identify conserved regions for qPCR primer design. An approach based on multivariate analysis was developed for primer design. Two primer sets were obtained and evaluated with 16 pure cultures, including representatives of the genera Bacillus, Paenibacillus, Brevibacillus, Geobacillus, Alicyclobacillus, Sulfobacillus, Clostridium, and Desulfotomaculum, as well as with environmental samples. The primer sets developed gave a reliable quantification when tested on laboratory strains, with the exception of Sulfobacillus and Desulfotomaculum. A test using sediment samples with a diverse EFF community also gave a reliable quantification compared to 16S rRNA gene pyrosequencing. A detection limit of about 10^4 cells (or spores) per gram of initial material was calculated, indicating that this method has promising potential for the detection of EFF over a wide range of applications.
|
13
|
A semi-automated genome annotation comparison and integration scheme. BMC Bioinformatics 2013; 14:172. [PMID: 23725374 PMCID: PMC3680241 DOI: 10.1186/1471-2105-14-172] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 09/24/2012] [Accepted: 05/23/2013] [Indexed: 02/02/2023] Open
Abstract
Background: Different genome annotation services have been developed in recent years and widely used. However, the functional annotation results from different services are often not the same and a scheme to obtain consensus functional annotations by integrating different results is in demand. Results: This article presents a semi-automated scheme that is capable of comparing functional annotations from different sources and consequently obtaining a consensus genome functional annotation result. In this study, we used four automated annotation services to annotate a newly sequenced genome, Arcobacter butzleri ED-1. Our scheme is divided into annotation comparison and annotation determination sections. In the functional annotation comparison section, we employed gene synonym lists to tackle term difference problems. Multiple techniques from information retrieval were used to preprocess the functional annotations. Based on the functional annotation comparison results, we designed a decision tree to obtain a consensus functional annotation result. Experimental results show that our approach can greatly reduce the workload of manual comparison by automatically comparing 87% of the functional annotations. In addition, it automatically determined 87% of the functional annotations, leaving only 13% of the genes for manual curation. We applied this approach across six phylogenetically different genomes in order to assess the performance consistency. The results showed that our scheme is able to automatically perform, on average, 73% and 86% of the annotation comparison and determination tasks, respectively. Conclusions: We propose a semi-automated and effective scheme to compare and determine genome functional annotations. It greatly reduces the manual work required in genome functional annotation. As this scheme does not require any specific biological knowledge, it is readily applicable for genome annotation comparison and genome re-annotation projects.
|
14
|
Jimenez-Lopez JC, Gachomo EW, Sharma S, Kotchoni SO. Genome sequencing and next-generation sequence data analysis: A comprehensive compilation of bioinformatics tools and databases. American Journal of Molecular Biology 2013. [DOI: 10.4236/ajmb.2013.32016] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 11/20/2022]
|
15
|
Lei Y, Kang SK, Gao J, Jia XS, Chen LL. Improved annotation of a plant pathogen genome Xanthomonas oryzae pv. oryzae PXO99A. J Biomol Struct Dyn 2012; 31:342-50. [PMID: 22849520 DOI: 10.1080/07391102.2012.698218] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 10/28/2022]
Abstract
Many bacterial genomes have been sequenced and stored in public databases now, of which Reference Sequence (RefSeq) is the most widely used one. However, the annotation in RefSeq is still unsatisfactory. The present analysis is focused on the re-annotation of an important plant pathogen genome Xanthomonas oryzae pv. oryzae PXO99A (Xoo PXO99A), which is the causal agent of bacterial blight on rice. Based on the parameters of 28 nucleotide frequencies and support vector machine algorithm, 41 originally annotated hypothetical genes were recognized as noncoding sequences, which were further supported by principal component analysis and other evidence. Ten of them were tested with reverse transcription-polymerase chain reaction experiments (RT-PCR), and all of them were confirmed to be noncoding sequences. Furthermore, 197 potential new genes not annotated in RefSeq were both recognized by two ab initio gene finding programs. Most of them only have sequence similarities with part of the known genes in other species, so they are unlikely to be protein-coding genes. Twelve potential new genes have high full-length sequence similarities with function-known genes, which are very likely to be true protein-coding genes. All the 12 potential genes were tested with RT-PCR, and 11 of them (92%) were successfully amplified in cDNA template. The RT-PCR experiments confirm that our theoretical prediction has high accuracy. The improvement of Xoo PXO99A annotation is helpful for the research of lifestyle, metabolism, and pathogenicity of this important plant pathogen. The improved annotation can be obtained from http://211.69.128.148/Xoo .
Affiliation(s)
- Yang Lei
- National Key Laboratory of Agricultural Microbiology, College of Life Science and Technology, Center for Bioinformatics, Huazhong Agricultural University, No. 1 Shizishan Street, Hongshan District, Wuhan, 430070, P.R. China
|
16
|
Abstract
With the development of ultra-high-throughput technologies, the cost of sequencing bacterial genomes has been vastly reduced. As more genomes are sequenced, less time can be spent manually annotating those genomes, resulting in an increased reliance on automatic annotation pipelines. However, automatic pipelines can produce inaccurate genome annotation and their results often require manual curation. Here, we discuss the automatic and manual annotation of bacterial genomes, identify common problems introduced by the current genome annotation process and suggest potential solutions.
Affiliation(s)
- Emily J Richardson
- The Roslin Institute, University of Edinburgh, Easter Bush, EH25 9RG, UK
|
17
|
D'Angelo S, Velappan N, Mignone F, Santoro C, Sblattero D, Kiss C, Bradbury ARM. Filtering "genic" open reading frames from genomic DNA samples for advanced annotation. BMC Genomics 2011; 12 Suppl 1:S5. [PMID: 21810207 PMCID: PMC3223728 DOI: 10.1186/1471-2164-12-s1-s5] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Indexed: 11/10/2022] Open
Abstract
Background: In order to carry out experimental gene annotation, DNA encoding open reading frames (ORFs) derived from real genes (termed "genic") in the correct frame is required. When genes are correctly assigned, isolation of genic DNA for functional annotation can be carried out by PCR. However, not all genes are correctly assigned, and even when correctly assigned, gene products are often incorrectly folded when expressed in heterologous hosts. This is a problem that can sometimes be overcome by the expression of protein fragments encoding domains, rather than full-length proteins. One possible method to isolate DNA encoding such domains would be to "filter" complex DNA (cDNA libraries, genomic and metagenomic DNA) for gene fragments that confer a selectable phenotype relying on correct folding, with all such domains present in a complex DNA sample, termed the “domainome”. Results: In this paper we discuss the preparation of diverse genic ORF libraries from randomly fragmented genomic DNA using β-lactamase to filter out the open reading frames. By cloning DNA fragments between leader sequences and the mature β-lactamase gene, colonies can be selected for resistance to ampicillin, conferred by correct folding of the lactamase gene. Our experiments demonstrate that the majority of surviving colonies contain genic open reading frames, suggesting that β-lactamase is acting as a selectable folding reporter. Furthermore, different leaders (Sec, TAT and SRP), normally translocating different protein classes, filter different genic fragment subsets, indicating that their use increases the fraction of the “domainome” that is accessible. Conclusions: The availability of ORF libraries, obtained with the filtering method described here, combined with screening methods such as phage display and protein-protein interaction studies, or with protein structure determination projects, can lead to the identification and structural determination of functional genic ORFs. ORF libraries represent, moreover, a useful tool to proceed towards high-throughput functional annotation of newly sequenced genomes.
Affiliation(s)
- Sara D'Angelo
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, USA
|
18
|
Abstract
Vaccine informatics is an emerging research area that focuses on development and applications of bioinformatics methods that can be used to facilitate every aspect of the preclinical, clinical, and postlicensure vaccine enterprises. Many immunoinformatics algorithms and resources have been developed to predict T- and B-cell immune epitopes for epitope vaccine development and protective immunity analysis. Vaccine protein candidates are predictable in silico from genome sequences using reverse vaccinology. Systematic transcriptomics and proteomics gene expression analyses facilitate rational vaccine design and identification of gene responses that are correlates of protection in vivo. Mathematical simulations have been used to model host-pathogen interactions and improve vaccine production and vaccination protocols. Computational methods have also been used for development of immunization registries or immunization information systems, assessment of vaccine safety and efficacy, and immunization modeling. Computational literature mining and databases effectively process, mine, and store large amounts of vaccine literature and data. Vaccine Ontology (VO) has been initiated to integrate various vaccine data and support automated reasoning.
|
19
|
Segerman B, De Medici D, Ehling Schulz M, Fach P, Fenicia L, Fricker M, Wielinga P, Van Rotterdam B, Knutsson R. Bioinformatic tools for using whole genome sequencing as a rapid high resolution diagnostic typing tool when tracing bioterror organisms in the food and feed chain. Int J Food Microbiol 2011; 145 Suppl 1:S167-76. [DOI: 10.1016/j.ijfoodmicro.2010.06.027] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Received: 04/08/2010] [Revised: 06/23/2010] [Accepted: 06/27/2010] [Indexed: 10/19/2022]
|
20
|
Larsen PE, Trivedi G, Sreedasyam A, Lu V, Podila GK, Collart FR. Using deep RNA sequencing for the structural annotation of the Laccaria bicolor mycorrhizal transcriptome. PLoS One 2010; 5:e9780. [PMID: 20625404 PMCID: PMC2897884 DOI: 10.1371/journal.pone.0009780] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Received: 01/14/2010] [Accepted: 02/26/2010] [Indexed: 11/18/2022] Open
Abstract
BACKGROUND: Accurate structural annotation is important for prediction of function and required for in vitro approaches to characterize or validate the gene expression products. Despite significant efforts in the field, determination of the gene structure from genomic data alone is a challenging and inaccurate process. The ease of acquisition of transcriptomic sequence provides a direct route to identify expressed sequences and determine the correct gene structure. METHODOLOGY: We developed methods to utilize RNA-seq data to correct errors in the structural annotation and extend the boundaries of current gene models using assembly approaches. The methods were validated with a transcriptomic data set derived from the fungus Laccaria bicolor, which develops a mycorrhizal symbiotic association with the roots of many tree species. Our analysis focused on the subset of 1501 gene models that are differentially expressed in the free living vs. mycorrhizal transcriptome and are expected to be important elements related to carbon metabolism, membrane permeability and transport, and intracellular signaling. Of the set of 1501 gene models, 1439 (96%) successfully generated modified gene models in which all error flags were successfully resolved and the sequences aligned to the genomic sequence. The remaining 4% (62 gene models) either had deviations from transcriptomic data that could not be spanned or generated sequence that did not align to genomic sequence. The outcome of this process is a set of high confidence gene models that can be reliably used for experimental characterization of protein function. CONCLUSIONS: 69% of expressed mycorrhizal JGI "best" gene models deviated from the transcript sequence derived by this method. The transcriptomic sequence enabled correction of a majority of the structural inconsistencies and resulted in a set of validated models for 96% of the mycorrhizal genes. The method described here can be applied to improve gene structural annotation in other species, provided that there is a sequenced genome and a set of gene models.
Affiliation(s)
- Peter E. Larsen
- Biosciences Division, Argonne National Laboratory, Lemont, Illinois, United States of America
- Geetika Trivedi
- Department of Biological Sciences, University of Alabama in Huntsville, Huntsville, Alabama, United States of America
- Avinash Sreedasyam
- Department of Biological Sciences, University of Alabama in Huntsville, Huntsville, Alabama, United States of America
- Vincent Lu
- Biosciences Division, Argonne National Laboratory, Lemont, Illinois, United States of America
- Gopi K. Podila
- Department of Biological Sciences, University of Alabama in Huntsville, Huntsville, Alabama, United States of America
- Frank R. Collart
- Biosciences Division, Argonne National Laboratory, Lemont, Illinois, United States of America
|
21
|
Poptsova MS, Gogarten JP. Using comparative genome analysis to identify problems in annotated microbial genomes. Microbiology (Reading) 2010; 156:1909-1917. [DOI: 10.1099/mic.0.033811-0] [Citation(s) in RCA: 80] [Impact Index Per Article: 5.7] [Indexed: 11/18/2022] Open
Abstract
Genome annotation is a tedious task that is mostly done by automated methods; however, the accuracy of these approaches has been questioned since the beginning of the sequencing era. Genome annotation is a multilevel process, and errors can emerge at different stages: during sequencing, as a result of gene-calling procedures, and in the process of assigning gene functions. Missed or wrongly annotated genes differentially impact different types of analyses. Here we discuss and demonstrate how the methods of comparative genome analysis can refine annotations by locating missing orthologues. We also discuss possible reasons for errors and show that the second-generation annotation systems, which combine multiple gene-calling programs with similarity-based methods, perform much better than the first annotation tools. Since old errors may propagate to the newly sequenced genomes, we emphasize that the problem of continuously updating popular public databases is an urgent and unresolved one. Due to the progress in genome-sequencing technologies, automated annotation techniques will remain the main approach in the future. Researchers need to be aware of the existing errors in the annotation of even well-studied genomes, such as Escherichia coli, and consider additional quality control for their results.
Affiliation(s)
- Maria S. Poptsova
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269-3125, USA
- J. Peter Gogarten
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269-3125, USA
|
22
|
Genome-wide analysis of intergenic regions of Mycobacterium tuberculosis H37Rv using Affymetrix GeneChips. EURASIP JOURNAL ON BIOINFORMATICS & SYSTEMS BIOLOGY 2010:23054. [PMID: 18253472 DOI: 10.1155/2007/23054] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 04/24/2007] [Accepted: 08/14/2007] [Indexed: 11/17/2022]
Abstract
Sequencing the complete genome of Mycobacterium tuberculosis H37Rv is a major milestone in the genome project and it sheds new light on our fight against tuberculosis. The genome contains around 4000 genes (protein-coding sequences) in the original genome annotation. A subsequent reannotation of the genome has added 80 more genes. However, we have found that the intergenic regions can exhibit expression signals, as evidenced by microarray hybridization. It is then reasonable to suspect that there are unidentified genes in these regions. We conducted a genome-wide analysis using the Affymetrix GeneChip to explore genes contained in the intergenic sequences of the M. tuberculosis H37Rv genome. A working criterion for potential protein-coding genes was based on bioinformatics, consisting of the gene structure, protein coding potential, and presence of ortholog evidence. The bioinformatics criteria in conjunction with transcriptional evidence revealed potential genes with a specific function, such as a DNA-binding protein in the CopG family and a nickel binding GTPase, as well as hypothetical proteins that had not been reported in the H37Rv genome. This study further demonstrated that microarray-based transcriptional evidence would facilitate genome-wide gene finding, and is also the first report concerning intergenic expression in the M. tuberculosis genome.
|
23
|
Senger RS. Biofuel production improvement with genome-scale models: The role of cell composition. Biotechnol J 2010; 5:671-85. [DOI: 10.1002/biot.201000007] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Indexed: 11/09/2022]
|
24
|
Fournier PE, Raoult D. Bacterial genomes. Infect Dis (Lond) 2010. [DOI: 10.1016/b978-0-323-04579-7.00007-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022] Open
|
25
|
Sintchenko V. Informatics for Infectious Disease Research and Control. INFECTIOUS DISEASE INFORMATICS 2010. [PMCID: PMC7120928 DOI: 10.1007/978-1-4419-1327-2_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
The goal of infectious disease informatics is to optimize the clinical and public health management of infectious diseases through improvements in the development and use of antimicrobials, the design of more effective vaccines, the identification of biomarkers for life-threatening infections, a better understanding of host-pathogen interactions, and biosurveillance and clinical decision support. Infectious disease informatics can lead to more targeted and effective approaches for the prevention, diagnosis and treatment of infections through a comprehensive review of the genetic repertoire and metabolic profiles of a pathogen. The developments in informatics have been critical in boosting the translational science and in supporting both reductionist and integrative research paradigms.
|
26
|
Zaremba S, Ramos-Santacruz M, Hampton T, Shetty P, Fedorko J, Whitmore J, Greene JM, Perna NT, Glasner JD, Plunkett G, Shaker M, Pot D. Text-mining of PubMed abstracts by natural language processing to create a public knowledge base on molecular mechanisms of bacterial enteropathogens. BMC Bioinformatics 2009; 10:177. [PMID: 19515247 PMCID: PMC2704210 DOI: 10.1186/1471-2105-10-177] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 10/02/2008] [Accepted: 06/10/2009] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The Enteropathogen Resource Integration Center (ERIC; http://www.ericbrc.org) has a goal of providing bioinformatics support for the scientific community researching enteropathogenic bacteria such as Escherichia coli and Salmonella spp. Rapid and accurate identification of experimental conclusions from the scientific literature is critical to support research in this field. Natural Language Processing (NLP), and in particular Information Extraction (IE) technology, can be a significant aid to this process. DESCRIPTION: We have trained a powerful, state-of-the-art IE technology on a corpus of abstracts from the microbial literature in PubMed to automatically identify and categorize biologically relevant entities and predicative relations. These relations include: Genes/Gene Products and their Roles; Gene Mutations and the resulting Phenotypes; and Organisms and their associated Pathogenicity. Evaluations on blind datasets show an F-measure average of greater than 90% for entities (genes, operons, etc.) and over 70% for relations (gene/gene product to role, etc.). This IE capability, combined with text indexing and relational database technologies, constitutes the core of our recently deployed text mining application. CONCLUSION: Our Text Mining application is available online on the ERIC website (http://www.ericbrc.org/portal/eric/articles). The information retrieval interface displays a list of recently published enteropathogen literature abstracts, and also provides a search interface to execute custom queries by keyword, date range, etc. Upon selection, processed abstracts and the entities and relations extracted from them are retrieved from a relational database and marked up to highlight the entities and relations. The abstract also provides links from extracted genes and gene products to the ERIC Annotations database, thus providing access to comprehensive genomic annotations and adding value to both the text-mining and annotations systems.
Affiliation(s)
- Sam Zaremba
- ERIC-BRC, SRA International Inc, Global Health Sector, Rockville MD, 20852, USA
- Panna Shetty
- ERIC-BRC, SRA International Inc, Global Health Sector, Rockville MD, 20852, USA
- Joel Fedorko
- ERIC-BRC, SRA International Inc, Global Health Sector, Rockville MD, 20852, USA
- Jon Whitmore
- ERIC-BRC, SRA International Inc, Global Health Sector, Rockville MD, 20852, USA
- John M Greene
- ERIC-BRC, SRA International Inc, Global Health Sector, Rockville MD, 20852, USA
- Nicole T Perna
- Genome Center, University of Wisconsin, Madison WI, 53706, USA
- Laboratory of Genetics, University of Wisconsin, Madison WI, 53706, USA
- Guy Plunkett
- Laboratory of Genetics, University of Wisconsin, Madison WI, 53706, USA
- Matthew Shaker
- ERIC-BRC, SRA International Inc, Global Health Sector, Rockville MD, 20852, USA
- David Pot
- ERIC-BRC, SRA International Inc, Global Health Sector, Rockville MD, 20852, USA
|
27
|
Giuliani SE, Frank AM, Collart FR. Functional assignment of solute-binding proteins of ABC transporters using a fluorescence-based thermal shift assay. Biochemistry 2009; 47:13974-84. [PMID: 19063603 DOI: 10.1021/bi801648r] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.8] [Indexed: 11/30/2022]
Abstract
We have used a fluorescence-based thermal shift (FTS) assay to identify amino acids that bind to solute-binding proteins in the bacterial ABC transporter family. The assay was validated with a set of six proteins with known binding specificity and was consistently able to map proteins with their known binding ligands. The assay also identified additional candidate binding ligands for several of the amino acid-binding proteins in the validation set. We extended this approach to additional targets and demonstrated the ability of the FTS assay to unambiguously identify preferential binding for several homologues of amino acid-binding proteins with known specificity and to functionally annotate proteins of unknown binding specificity. The assay is implemented in a microwell plate format and provides a rapid approach to validate an anticipated function or to screen proteins of unknown function. The ABC-type transporter family is ubiquitous and transports a variety of biological compounds, but the current annotation of the ligand-binding proteins is limited to mostly generic descriptions of function. The results illustrate the feasibility of the FTS assay to improve the functional annotation of binding proteins associated with ABC-type transporters and suggest that this approach can also be extended to other protein families.
Affiliation(s)
- Sarah E Giuliani
- Biosciences Division, Argonne National Laboratory, Lemont, Illinois 60439, USA
|
28
|
Lima T, Auchincloss AH, Coudert E, Keller G, Michoud K, Rivoire C, Bulliard V, de Castro E, Lachaize C, Baratin D, Phan I, Bougueleret L, Bairoch A. HAMAP: a database of completely sequenced microbial proteome sets and manually curated microbial protein families in UniProtKB/Swiss-Prot. Nucleic Acids Res 2008; 37:D471-8. [PMID: 18849571 PMCID: PMC2686602 DOI: 10.1093/nar/gkn661] [Citation(s) in RCA: 116] [Impact Index Per Article: 7.3] [Indexed: 11/12/2022] Open
Abstract
The growth in the number of completely sequenced microbial genomes (bacterial and archaeal) has generated a need for a procedure that provides UniProtKB/Swiss-Prot-quality annotation to as many protein sequences as possible. We have devised a semi-automated system, HAMAP (High-quality Automated and Manual Annotation of microbial Proteomes), that uses manually built annotation templates for protein families to propagate annotation to all members of manually defined protein families, using very strict criteria. The HAMAP system is composed of two databases, the proteome database and the family database, and of an automatic annotation pipeline. The proteome database comprises biological and sequence information for each completely sequenced microbial proteome, and it offers several tools for CDS searches, BLAST options and retrieval of specific sets of proteins. The family database currently comprises more than 1500 manually curated protein families and their annotation templates that are used to annotate proteins that belong to one of the HAMAP families. On the HAMAP website, individual sequences as well as whole genomes can be scanned against all HAMAP families. The system provides warnings for the absence of conserved amino acid residues, unusual sequence length, etc. Thanks to the implementation of HAMAP, more than 200,000 microbial proteins have been fully annotated in UniProtKB/Swiss-Prot (HAMAP website: http://www.expasy.org/sprot/hamap).
Affiliation(s)
- Tania Lima
- Swiss-Prot Group, Swiss Institute of Bioinformatics, 1 rue Michel-Servet, 1211 Geneva 4, Switzerland.
|
29
|
High-throughput phenotypic characterization of Pseudomonas aeruginosa membrane transport genes. PLoS Genet 2008; 4:e1000211. [PMID: 18833300 PMCID: PMC2542419 DOI: 10.1371/journal.pgen.1000211] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.6] [Received: 05/06/2008] [Accepted: 08/29/2008] [Indexed: 11/26/2022] Open
Abstract
The deluge of data generated by genome sequencing has led to an increasing reliance on bioinformatic predictions, since the traditional experimental approach of characterizing gene function one at a time cannot possibly keep pace with the sequence-based discovery of novel genes. We have utilized Biolog phenotype MicroArrays to identify phenotypes of gene knockout mutants in the opportunistic pathogen and versatile soil bacterium Pseudomonas aeruginosa in a relatively high-throughput fashion. Seventy-eight P. aeruginosa mutants defective in predicted sugar and amino acid membrane transporter genes were screened and clear phenotypes were identified for 27 of these. In all cases, these phenotypes were confirmed by independent growth assays on minimal media. Using qRT-PCR, we demonstrate that the expression levels of 11 of these transporter genes were induced from 4- to 90-fold by their substrates identified via phenotype analysis. Overall, the experimental data showed the bioinformatic predictions to be largely correct in 22 out of 27 cases, and led to the identification of novel transporter genes and a potentially new histamine catabolic pathway. Thus, rapid phenotype identification assays are an invaluable tool for confirming and extending bioinformatic predictions.
Genome sequencing has led to the identification of literally millions of new genes, for which there is no experimental evidence concerning their function. This limits our knowledge of these genes to computational predictions; however, the accuracy of such bioinformatic predictions is essentially unknown. We have focused on investigating the accuracy of bioinformatic predictions for a specific class of genes—those encoding membrane transporters. Our approach used Biolog phenotype MicroArrays to screen transporter gene knockout mutants in the bacterium P. aeruginosa for the ability to metabolize hundreds of different compounds. We were able to identify functions for 27 out of 78 genes, all of which were confirmed through independent growth assays. For 80% of these genes, the computationally predicted and experimentally determined functions were either identical or generically similar. Additionally, this led to the discovery of entirely new types of transporters and a novel potential histamine metabolic pathway.
|
30
|
Montgomerie S, Cruz JA, Shrivastava S, Arndt D, Berjanskii M, Wishart DS. PROTEUS2: a web server for comprehensive protein structure prediction and structure-based annotation. Nucleic Acids Res 2008; 36:W202-9. [PMID: 18483082 PMCID: PMC2447806 DOI: 10.1093/nar/gkn255] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.3] [Indexed: 02/01/2023] Open
Abstract
PROTEUS2 is a web server designed to support comprehensive protein structure prediction and structure-based annotation. PROTEUS2 accepts either single sequences (for directed studies) or multiple sequences (for whole proteome annotation) and predicts the secondary and, if possible, tertiary structure of the query protein(s). Unlike most other tools or servers, PROTEUS2 bundles signal peptide identification, transmembrane helix prediction, transmembrane β-strand prediction, secondary structure prediction (for soluble proteins) and homology modeling (i.e. 3D structure generation) into a single prediction pipeline. Using a combination of progressive multi-sequence alignment, structure-based mapping, hidden Markov models, multi-component neural nets and up-to-date databases of known secondary structure assignments, PROTEUS is able to achieve among the highest reported levels of predictive accuracy for signal peptides (Q2 = 94%), membrane spanning helices (Q2 = 87%) and secondary structure (Q3 score of 81.3%). PROTEUS2's homology modeling services also provide high quality 3D models that compare favorably with those generated by SWISS-MODEL and 3D JigSaw (within 0.2 Å RMSD). The average PROTEUS2 prediction takes ∼3 min per query sequence. The PROTEUS2 server along with source code for many of its modules is accessible at http://wishart.biology.ualberta.ca/proteus2.
Affiliation(s)
- Scott Montgomerie
- Department of Computing Science and Department of Biological Sciences, University of Alberta and National Research Council, National Institute for Nanotechnology (NINT), Edmonton, AB, Canada T6G 2E8
|
31
|
Grant JR, Stothard P. The CGView Server: a comparative genomics tool for circular genomes. Nucleic Acids Res 2008; 36:W181-4. [PMID: 18411202 PMCID: PMC2447734 DOI: 10.1093/nar/gkn179] [Citation(s) in RCA: 965] [Impact Index Per Article: 60.3] [Indexed: 12/01/2022] Open
Abstract
The CGView Server generates graphical maps of circular genomes that show sequence features, base composition plots, analysis results and sequence similarity plots. Sequences can be supplied in raw, FASTA, GenBank or EMBL format. Additional feature or analysis information can be submitted in the form of GFF (General Feature Format) files. The server uses BLAST to compare the primary sequence to up to three comparison genomes or sequence sets. The BLAST results and feature information are converted to a graphical map showing the entire sequence, or an expanded and more detailed view of a region of interest. Several options are included to control which types of features are displayed and how the features are drawn. The CGView Server can be used to visualize features associated with any bacterial, plasmid, chloroplast or mitochondrial genome, and can aid in the identification of conserved genome segments, instances of horizontal gene transfer, and differences in gene copy number. Because a collection of sequences can be used in place of a comparison genome, maps can also be used to visualize regions of a known genome covered by newly obtained sequence reads. The CGView Server can be accessed at http://stothard.afns.ualberta.ca/cgview_server/
Affiliation(s)
- Jason R Grant
- Department of Agricultural, Food and Nutritional Science, University of Alberta, Canada
|
32
|
Annotation, comparison and databases for hundreds of bacterial genomes. Res Microbiol 2007; 158:724-36. [DOI: 10.1016/j.resmic.2007.09.009] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.7] [Received: 07/29/2007] [Revised: 09/21/2007] [Accepted: 09/26/2007] [Indexed: 11/20/2022]
|
33
|
Raes J, Foerstner KU, Bork P. Get the most out of your metagenome: computational analysis of environmental sequence data. Curr Opin Microbiol 2007; 10:490-8. [DOI: 10.1016/j.mib.2007.09.001] [Citation(s) in RCA: 130] [Impact Index Per Article: 7.6] [Received: 07/02/2007] [Revised: 08/27/2007] [Accepted: 09/03/2007] [Indexed: 11/28/2022]
|