Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Haft DH, Selengut JD, Brinkac LM, Zafar N, White O. Genome Properties: a system for the investigation of prokaryotic genetic content for microbiology, genome annotation and comparative genomics. Bioinformatics 2004;21:293-306. [PMID: 15347579 DOI: 10.1093/bioinformatics/bti015] [Citation(s) in RCA: 67] [Impact Index Per Article: 3.4] [Indexed: 11/14/2022]

Cited by Other Article(s)

Haroon M, Wang X, Afzal R, Zafar MM, Idrees F, Batool M, Khan AS, Imran M. Novel Plant Breeding Techniques Shake Hands with Cereals to Increase Production. Plants (Basel, Switzerland) 2022;11:1052. [PMID: 35448780 PMCID: PMC9025237 DOI: 10.3390/plants11081052] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/09/2021] [Revised: 04/07/2022] [Accepted: 04/10/2022] [Indexed: 06/01/2023]

Lin M, Xiong Q, Chung M, Daugherty SC, Nagaraj S, Sengamalay N, Ott S, Godinez A, Tallon LJ, Sadzewicz L, Fraser C, Dunning Hotopp JC, Rikihisa Y. Comparative Analysis of Genome of Ehrlichia sp. HF, a Model Bacterium to Study Fatal Human Ehrlichiosis. BMC Genomics 2021;22:11. [PMID: 33407096 PMCID: PMC7789307 DOI: 10.1186/s12864-020-07309-z] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 06/09/2020] [Accepted: 12/07/2020] [Indexed: 12/16/2022]

Abstract

BACKGROUND

The genus Ehrlichia consists of tick-borne obligatory intracellular bacteria that can cause deadly diseases of medical and agricultural importance. Ehrlichia sp. HF, isolated from Ixodes ovatus ticks in Japan [also referred to as I. ovatus Ehrlichia (IOE) agent], causes acute fatal infection in laboratory mice that resembles acute fatal human monocytic ehrlichiosis caused by Ehrlichia chaffeensis. As there is no small laboratory animal model to study fatal human ehrlichiosis, Ehrlichia sp. HF provides a needed disease model. However, the inability to culture Ehrlichia sp. HF and the lack of genomic information have been a barrier to advance this animal model. In addition, Ehrlichia sp. HF has several designations in the literature as it lacks a taxonomically recognized name.

RESULTS

We stably cultured Ehrlichia sp. HF in canine histiocytic leukemia DH82 cells from the HF strain-infected mice, and determined its complete genome sequence. Ehrlichia sp. HF has a single double-stranded circular chromosome of 1,148,904 bp, which encodes 866 proteins with a similar metabolic potential as E. chaffeensis. Ehrlichia sp. HF encodes homologs of all virulence factors identified in E. chaffeensis, including 23 paralogs of P28/OMP-1 family outer membrane proteins, type IV secretion system apparatus and effector proteins, two-component systems, ankyrin-repeat proteins, and tandem repeat proteins. Ehrlichia sp. HF is a novel species in the genus Ehrlichia, as demonstrated through whole genome comparisons with six representative Ehrlichia species, subspecies, and strains, using average nucleotide identity, digital DNA-DNA hybridization, and core genome alignment sequence identity.

CONCLUSIONS

The genome of Ehrlichia sp. HF encodes all known virulence factors found in E. chaffeensis, substantiating it as a model Ehrlichia species to study fatal human ehrlichiosis. Comparisons between Ehrlichia sp. HF and E. chaffeensis will enable identification of in vivo virulence factors that are related to host specificity, disease severity, and host inflammatory responses. We propose to name Ehrlichia sp. HF as Ehrlichia japonica sp. nov. (type strain HF), to denote the geographic region where this bacterium was initially isolated.

Affiliation(s)

Mingqun Lin Department of Veterinary Biosciences, The Ohio State University, 1925 Coffey Road, Columbus, OH, 43210, USA
Qingming Xiong Department of Veterinary Biosciences, The Ohio State University, 1925 Coffey Road, Columbus, OH, 43210, USA
Matthew Chung Institute for Genome Sciences, University of Maryland School of Medicine, 801 W. Baltimore St, Baltimore, MD, 21201, USA
Sean C Daugherty Institute for Genome Sciences, University of Maryland School of Medicine, 801 W. Baltimore St, Baltimore, MD, 21201, USA
Sushma Nagaraj Institute for Genome Sciences, University of Maryland School of Medicine, 801 W. Baltimore St, Baltimore, MD, 21201, USA
Naomi Sengamalay Institute for Genome Sciences, University of Maryland School of Medicine, 801 W. Baltimore St, Baltimore, MD, 21201, USA
Sandra Ott Institute for Genome Sciences, University of Maryland School of Medicine, 801 W. Baltimore St, Baltimore, MD, 21201, USA
Al Godinez Institute for Genome Sciences, University of Maryland School of Medicine, 801 W. Baltimore St, Baltimore, MD, 21201, USA
Luke J Tallon Institute for Genome Sciences, University of Maryland School of Medicine, 801 W. Baltimore St, Baltimore, MD, 21201, USA
Lisa Sadzewicz Institute for Genome Sciences, University of Maryland School of Medicine, 801 W. Baltimore St, Baltimore, MD, 21201, USA
Claire Fraser Institute for Genome Sciences, University of Maryland School of Medicine, 801 W. Baltimore St, Baltimore, MD, 21201, USA; Department of Medicine, University of Maryland School of Medicine, 801 W. Baltimore St, Baltimore, MD, 21201, USA
Julie C Dunning Hotopp Institute for Genome Sciences, University of Maryland School of Medicine, 801 W. Baltimore St, Baltimore, MD, 21201, USA; Department of Microbiology and Immunology, University of Maryland School of Medicine, 801 W. Baltimore St, Baltimore, MD, 21201, USA; Greenebaum Cancer Center, University of Maryland School of Medicine, 801 W. Baltimore St, Baltimore, MD, 21201, USA
Yasuko Rikihisa Department of Veterinary Biosciences, The Ohio State University, 1925 Coffey Road, Columbus, OH, 43210, USA

Richardson LJ, Rawlings ND, Salazar GA, Almeida A, Haft DR, Ducq G, Sutton GG, Finn RD. Genome properties in 2019: a new companion database to InterPro for the inference of complete functional attributes. Nucleic Acids Res 2019;47:D564-D572. [PMID: 30364992 PMCID: PMC6323913 DOI: 10.1093/nar/gky1013] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 09/15/2018] [Revised: 10/09/2018] [Accepted: 10/10/2018] [Indexed: 11/14/2022]

Mercier J, Josso A, Médigue C, Vallenet D. GROOLS: reactive graph reasoning for genome annotation through biological processes. BMC Bioinformatics 2018;19:132. [PMID: 29642842 PMCID: PMC5896057 DOI: 10.1186/s12859-018-2126-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 07/06/2017] [Accepted: 03/22/2018] [Indexed: 11/22/2022]

Bhardwaj T, Somvanshi P. Pan-genome analysis of Clostridium botulinum reveals unique targets for drug development. Gene 2017;623:48-62. [DOI: 10.1016/j.gene.2017.04.019] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 11/02/2016] [Revised: 03/29/2017] [Accepted: 04/12/2017] [Indexed: 10/19/2022]

Lin M, Bachman K, Cheng Z, Daugherty SC, Nagaraj S, Sengamalay N, Ott S, Godinez A, Tallon LJ, Sadzewicz L, Fraser C, Dunning Hotopp JC, Rikihisa Y. Analysis of complete genome sequence and major surface antigens of Neorickettsia helminthoeca, causative agent of salmon poisoning disease. Microb Biotechnol 2017;10:933-957. [PMID: 28585301 PMCID: PMC5481527 DOI: 10.1111/1751-7915.12731] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 11/16/2016] [Revised: 03/09/2017] [Accepted: 04/25/2017] [Indexed: 12/31/2022]

Affiliation(s)

Mingqun Lin Department of Veterinary Biosciences, The Ohio State University, 1925 Coffey Road, Columbus, OH, 43210, USA
Katherine Bachman Department of Veterinary Biosciences, The Ohio State University, 1925 Coffey Road, Columbus, OH, 43210, USA
Zhihui Cheng Department of Veterinary Biosciences, The Ohio State University, 1925 Coffey Road, Columbus, OH, 43210, USA
Sean C Daugherty Institute for Genome Sciences, University of Maryland School of Medicine, 801 W. Baltimore St, Baltimore, MD, 21201, USA
Sushma Nagaraj Institute for Genome Sciences, University of Maryland School of Medicine, 801 W. Baltimore St, Baltimore, MD, 21201, USA
Naomi Sengamalay Institute for Genome Sciences, University of Maryland School of Medicine, 801 W. Baltimore St, Baltimore, MD, 21201, USA
Sandra Ott Institute for Genome Sciences, University of Maryland School of Medicine, 801 W. Baltimore St, Baltimore, MD, 21201, USA
Al Godinez Institute for Genome Sciences, University of Maryland School of Medicine, 801 W. Baltimore St, Baltimore, MD, 21201, USA
Luke J Tallon Institute for Genome Sciences, University of Maryland School of Medicine, 801 W. Baltimore St, Baltimore, MD, 21201, USA
Lisa Sadzewicz Institute for Genome Sciences, University of Maryland School of Medicine, 801 W. Baltimore St, Baltimore, MD, 21201, USA
Claire Fraser Institute for Genome Sciences, University of Maryland School of Medicine, 801 W. Baltimore St, Baltimore, MD, 21201, USA; Department of Medicine, University of Maryland School of Medicine, 801 W. Baltimore St, Baltimore, MD, 21201, USA
Julie C Dunning Hotopp Institute for Genome Sciences, University of Maryland School of Medicine, 801 W. Baltimore St, Baltimore, MD, 21201, USA; Department of Microbiology and Immunology, University of Maryland School of Medicine, 801 W. Baltimore St, Baltimore, MD, 21201, USA
Yasuko Rikihisa Department of Veterinary Biosciences, The Ohio State University, 1925 Coffey Road, Columbus, OH, 43210, USA

Quantifying the Importance of the Rare Biosphere for Microbial Community Response to Organic Pollutants in a Freshwater Ecosystem. Appl Environ Microbiol 2017;83:AEM.03321-16. [PMID: 28258138 DOI: 10.1128/aem.03321-16] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Received: 12/07/2016] [Accepted: 02/01/2017] [Indexed: 01/01/2023]

Abstract

A single liter of water contains hundreds, if not thousands, of bacterial and archaeal species, each of which typically makes up a very small fraction of the total microbial community (<0.1%), the so-called "rare biosphere." How often, and via what mechanisms, e.g., clonal amplification versus horizontal gene transfer, the rare taxa and genes contribute to microbial community response to environmental perturbations represent important unanswered questions toward better understanding the value and modeling of microbial diversity. We tested whether rare species frequently responded to changing environmental conditions by establishing 20-liter planktonic mesocosms with water from Lake Lanier (Georgia, USA) and perturbing them with organic compounds that are rarely detected in the lake, including 2,4-dichlorophenoxyacetic acid (2,4-D), 4-nitrophenol (4-NP), and caffeine. The populations of the degraders of these compounds were initially below the detection limit of quantitative PCR (qPCR) or metagenomic sequencing methods, but they increased substantially in abundance after perturbation. Sequencing of several degraders (isolates) and time-series metagenomic data sets revealed distinct cooccurring alleles of degradation genes, frequently carried on transmissible plasmids, especially for the 2,4-D mesocosms, and distinct species dominating the post-enrichment microbial communities from each replicated mesocosm. This diversity of species and genes also underlies distinct degradation profiles among replicated mesocosms. 
Collectively, these results supported the hypothesis that the rare biosphere can serve as a genetic reservoir, which can be frequently missed by metagenomics but enables community response to changing environmental conditions caused by organic pollutants, and they provided insights into the size of the pool of rare genes and species.

IMPORTANCE

A single liter of water or gram of soil contains hundreds of low-abundance bacterial and archaeal species, the so-called rare biosphere. The value of this astonishing biodiversity for ecosystem functioning remains poorly understood, primarily due to the fact that microbial community analysis frequently focuses on abundant organisms. Using a combination of culture-dependent and culture-independent (metagenomics) techniques, we showed that rare taxa and genes commonly contribute to the microbial community response to organic pollutants. Our findings should have implications for future studies that aim to study the role of rare species in environmental processes, including environmental bioremediation efforts of oil spills or other contaminants.

Haft DR, Haft DH. A comprehensive software suite for protein family construction and functional site prediction. PLoS One 2017;12:e0171758. [PMID: 28182651 PMCID: PMC5300114 DOI: 10.1371/journal.pone.0171758] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 08/29/2016] [Accepted: 01/25/2017] [Indexed: 11/18/2022]

Abstract

In functionally diverse protein families, conservation in short signature regions may outperform full-length sequence comparisons for identifying proteins that belong to a subgroup within which one specific aspect of their function is conserved. The SIMBAL workflow (Sites Inferred by Metabolic Background Assertion Labeling) is a data-mining procedure for finding such signature regions. It begins by using clues from genomic context, such as co-occurrence or conserved gene neighborhoods, to build a useful training set from a large number of uncharacterized but mutually homologous proteins. When training set construction is successful, the YES partition is enriched in proteins that share function with the user’s query sequence, while the NO partition is depleted. A selected query sequence is then mined for short signature regions whose closest matches overwhelmingly favor proteins from the YES partition. High-scoring signature regions typically contain key residues critical to functional specificity, so proteins with the highest sequence similarity across these regions tend to share the same function. The SIMBAL algorithm was described previously, but significant manual effort, expertise, and a supporting software infrastructure were required to prepare the requisite training sets. Here, we describe a new, distributable software suite that speeds up and simplifies the process for using SIMBAL, most notably by providing tools that automate training set construction. These tools have broad utility for comparative genomics, allowing for flexible collection of proteins or protein domains based on genomic context as well as homology, a capability that can greatly assist in protein family construction. Armed with this new software suite, SIMBAL can serve as a fast and powerful in silico alternative to direct experimentation for characterizing proteins and their functional interactions.

Brbić M, Piškorec M, Vidulin V, Kriško A, Šmuc T, Supek F. The landscape of microbial phenotypic traits and associated genes. Nucleic Acids Res 2016;44:10074-10090. [PMID: 27915291 PMCID: PMC5137458 DOI: 10.1093/nar/gkw964] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Received: 06/12/2016] [Revised: 09/21/2016] [Accepted: 10/11/2016] [Indexed: 12/31/2022]

Biofilms on Hospital Shower Hoses: Characterization and Implications for Nosocomial Infections. Appl Environ Microbiol 2016;82:2872-2883. [PMID: 26969701 DOI: 10.1128/aem.03529-15] [Citation(s) in RCA: 65] [Impact Index Per Article: 8.1] [Received: 10/30/2015] [Accepted: 02/23/2016] [Indexed: 11/20/2022]

Abstract

Although the source of drinking water (DW) used in hospitals is commonly disinfected, biofilms forming on water pipelines are a refuge for bacteria, including possible pathogens that survive different disinfection strategies. These biofilm communities are only beginning to be explored by culture-independent techniques that circumvent the limitations of conventional monitoring efforts. Hence, theories regarding the frequency of opportunistic pathogens in DW biofilms and how biofilm members withstand high doses of disinfectants and/or chlorine residuals in the water supply remain speculative. The aim of this study was to characterize the composition of microbial communities growing on five hospital shower hoses using both 16S rRNA gene sequencing of bacterial isolates and whole-genome shotgun metagenome sequencing. The resulting data revealed a Mycobacterium-like population, closely related to Mycobacterium rhodesiae and Mycobacterium tusciae, to be the predominant taxon in all five samples, and its nearly complete draft genome sequence was recovered. In contrast, the fraction recovered by culture was mostly affiliated with Proteobacteria, including members of the genera Sphingomonas, Blastomonas, and Porphyrobacter. The biofilm community harbored genes related to disinfectant tolerance (2.34% of the total annotated proteins) and a lower abundance of virulence determinants related to colonization and evasion of the host immune system. Additionally, genes potentially conferring resistance to β-lactam, aminoglycoside, amphenicol, and quinolone antibiotics were detected. Collectively, our results underscore the need to understand the microbiome of DW biofilms using metagenomic approaches. This information might lead to more robust management practices that minimize the risks associated with exposure to opportunistic pathogens in hospitals.

Haft DH. Using comparative genomics to drive new discoveries in microbiology. Curr Opin Microbiol 2015;23:189-96. [PMID: 25617609 DOI: 10.1016/j.mib.2014.11.017] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 09/27/2014] [Revised: 11/19/2014] [Accepted: 11/20/2014] [Indexed: 01/17/2023]

Signal correlations in ecological niches can shape the organization and evolution of bacterial gene regulatory networks. Adv Microb Physiol 2013;61:1-36. [PMID: 23046950 DOI: 10.1016/b978-0-12-394423-8.00001-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 02/06/2023]

Haft DH, Selengut JD, Richter RA, Harkins D, Basu MK, Beck E. TIGRFAMs and Genome Properties in 2013. Nucleic Acids Res 2012. [PMID: 23197656 PMCID: PMC3531188 DOI: 10.1093/nar/gks1234] [Citation(s) in RCA: 400] [Impact Index Per Article: 33.3] [Indexed: 12/01/2022]

Ricaldi JN, Fouts DE, Selengut JD, Harkins DM, Patra KP, Moreno A, Lehmann JS, Purushe J, Sanka R, Torres M, Webster NJ, Vinetz JM, Matthias MA. Whole genome analysis of Leptospira licerasiae provides insight into leptospiral evolution and pathogenicity. PLoS Negl Trop Dis 2012;6:e1853. [PMID: 23145189 PMCID: PMC3493377 DOI: 10.1371/journal.pntd.0001853] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.5] [Received: 04/09/2012] [Accepted: 08/25/2012] [Indexed: 12/25/2022]

Abstract

The whole genome analysis of two strains of the first intermediately pathogenic leptospiral species to be sequenced (Leptospira licerasiae strains VAR010 and MMD0835) provides insight into their pathogenic potential and deepens our understanding of leptospiral evolution. Comparative analysis of eight leptospiral genomes shows the existence of a core leptospiral genome comprising 1547 genes and 452 conserved genes restricted to infectious species (including L. licerasiae) that are likely to be pathogenicity-related. Comparisons of the functional content of the genomes suggest that L. licerasiae retains several proteins related to nitrogen, amino acid and carbohydrate metabolism which might help to explain why these Leptospira grow well in artificial media compared with pathogenic species. L. licerasiae strains VAR010^T and MMD0835 possess two prophage elements. While one element is circular and shares homology with LE1 of L. biflexa, the second is cryptic and homologous to a previously identified but unnamed region in L. interrogans serovars Copenhageni and Lai. We also report a unique O-antigen locus in L. licerasiae comprised of a 6-gene cluster that is unexpectedly short compared with L. interrogans in which analogous regions may include >90 such genes. Sequence homology searches suggest that these genes were acquired by lateral gene transfer (LGT). Furthermore, seven putative genomic islands ranging in size from 5 to 36 kb are present also suggestive of antecedent LGT. How Leptospira become naturally competent remains to be determined, but considering the phylogenetic origins of the genes comprising the O-antigen cluster and other putative laterally transferred genes, L. licerasiae must be able to exchange genetic material with non-invasive environmental bacteria. The data presented here demonstrate that L. licerasiae is genetically more closely related to pathogenic than to saprophytic Leptospira and provide insight into the genomic bases for its infectiousness and its unique antigenic characteristics.

Leptospirosis is one of the most common diseases transmitted by animals worldwide and is important because it is a major cause of febrile illness in tropical areas and also occurs in epidemic form associated with natural disasters and flooding. The mechanisms through which Leptospira cause disease are not well understood. In this study we have sequenced the genomes of two strains of Leptospira licerasiae isolated from a person and a marsupial in the Peruvian Amazon. These strains were thought to be able to cause only mild disease in humans. We have compared these genomes with other leptospires that can cause severe illness and death and another leptospire that does not infect humans or animals. These comparisons have allowed us to demonstrate similarities among the disease-causing Leptospira. Studying genes that are common among infectious strains will allow us to identify genetic factors necessary for infecting, causing disease and determining the severity of disease. We have also found that L. licerasiae seems to be able to uptake and incorporate genetic information from other bacteria found in the environment. This information will allow us to begin to understand how Leptospira species have evolved.

Affiliation(s)

Jessica N. Ricaldi Instituto de Medicina Tropical Alexander von Humboldt, Universidad Peruana Cayetano Heredia, Lima, Peru; Division of Infectious Diseases, Department of Medicine, University of California San Diego School of Medicine, La Jolla, California, United States of America
Derrick E. Fouts J. Craig Venter Institute, Rockville, Maryland, United States of America
Jeremy D. Selengut J. Craig Venter Institute, Rockville, Maryland, United States of America
Derek M. Harkins J. Craig Venter Institute, Rockville, Maryland, United States of America
Kailash P. Patra Division of Infectious Diseases, Department of Medicine, University of California San Diego School of Medicine, La Jolla, California, United States of America
Angelo Moreno Division of Infectious Diseases, Department of Medicine, University of California San Diego School of Medicine, La Jolla, California, United States of America
Jason S. Lehmann Division of Infectious Diseases, Department of Medicine, University of California San Diego School of Medicine, La Jolla, California, United States of America
Janaki Purushe J. Craig Venter Institute, Rockville, Maryland, United States of America
Ravi Sanka J. Craig Venter Institute, Rockville, Maryland, United States of America
Michael Torres Departamento de Ciencias Celulares y Moleculares, Laboratorio de Investigación y Desarrollo, Facultad de Ciencias, Universidad Peruana Cayetano Heredia, Lima, Peru
Nicholas J. Webster Department of Medicine, University of California San Diego School of Medicine, La Jolla, California, United States of America
Joseph M. Vinetz Instituto de Medicina Tropical Alexander von Humboldt, Universidad Peruana Cayetano Heredia, Lima, Peru; Division of Infectious Diseases, Department of Medicine, University of California San Diego School of Medicine, La Jolla, California, United States of America; Departamento de Ciencias Celulares y Moleculares, Laboratorio de Investigación y Desarrollo, Facultad de Ciencias, Universidad Peruana Cayetano Heredia, Lima, Peru
Michael A. Matthias Division of Infectious Diseases, Department of Medicine, University of California San Diego School of Medicine, La Jolla, California, United States of America

Paralanov V, Lu J, Duffy LB, Crabb DM, Shrivastava S, Methé BA, Inman J, Yooseph S, Xiao L, Cassell GH, Waites KB, Glass JI. Comparative genome analysis of 19 Ureaplasma urealyticum and Ureaplasma parvum strains. BMC Microbiol 2012;12:88. [PMID: 22646228 PMCID: PMC3511179 DOI: 10.1186/1471-2180-12-88] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.4] [Received: 11/23/2011] [Accepted: 05/02/2012] [Indexed: 11/10/2022]

Abstract

Background

Ureaplasma urealyticum (UUR) and Ureaplasma parvum (UPA) are sexually transmitted bacteria among humans implicated in a variety of disease states including but not limited to: nongonococcal urethritis, infertility, adverse pregnancy outcomes, chorioamnionitis, and bronchopulmonary dysplasia in neonates. There are 10 distinct serotypes of UUR and 4 of UPA. Efforts to determine whether difference in pathogenic potential exists at the ureaplasma serovar level have been hampered by limitations of antibody-based typing methods, multiple cross-reactions and poor discriminating capacity in clinical samples containing two or more serovars.

Results

We determined the genome sequences of the American Type Culture Collection (ATCC) type strains of all UUR and UPA serovars as well as four clinical isolates of UUR for which we were not able to determine serovar designation. UPA serovars had 0.75−0.78 Mbp genomes and UUR serovars were 0.84−0.95 Mbp. The original classification of ureaplasma isolates into distinct serovars was largely based on differences in the major ureaplasma surface antigen called the multiple banded antigen (MBA) and reactions of human and animal sera to the organisms. Whole genome analysis of the 14 serovars and the 4 clinical isolates showed the mba gene was part of a large superfamily, which is a phase variable gene system, and that some serovars have identical sets of mba genes. Most of the differences among serovars are hypothetical genes, and in general the two species and 14 serovars are extremely similar at the genome level.

Conclusions

Comparative genome analysis suggests UUR is more capable of acquiring genes horizontally, which may contribute to its greater virulence for some conditions. The overwhelming evidence of extensive horizontal gene transfer among these organisms from our previous studies combined with our comparative analysis indicates that ureaplasmas exist as quasi-species rather than as stable serovars in their native environment. Therefore, differential pathogenicity and clinical outcome of a ureaplasmal infection is most likely not on the serovar level, but rather may be due to the presence or absence of potential pathogenicity factors in an individual ureaplasma clinical isolate and/or patient to patient differences in terms of autoimmunity and microbiome.

Connecting genotype to phenotype in the era of high-throughput sequencing. Biochim Biophys Acta Gen Subj 2011;1810:967-77. [PMID: 21421023 DOI: 10.1016/j.bbagen.2011.03.010] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Received: 10/10/2010] [Revised: 02/17/2011] [Accepted: 03/13/2011] [Indexed: 12/25/2022]

Lintner NG, Frankel KA, Tsutakawa SE, Alsbury DL, Copié V, Young MJ, Tainer JA, Lawrence CM. The structure of the CRISPR-associated protein Csa3 provides insight into the regulation of the CRISPR/Cas system. J Mol Biol 2011;405:939-55. [PMID: 21093452 PMCID: PMC4507800 DOI: 10.1016/j.jmb.2010.11.019] [Citation(s) in RCA: 77] [Impact Index Per Article: 5.9] [Received: 08/26/2010] [Revised: 11/01/2010] [Accepted: 11/09/2010] [Indexed: 01/07/2023]

Haft DH. Bioinformatic evidence for a widely distributed, ribosomally produced electron carrier precursor, its maturation proteins, and its nicotinoprotein redox partners. BMC Genomics 2011;12:21. [PMID: 21223593 PMCID: PMC3023750 DOI: 10.1186/1471-2164-12-21] [Citation(s) in RCA: 74] [Impact Index Per Article: 5.7] [Received: 07/22/2010] [Accepted: 01/11/2011] [Indexed: 11/10/2022]

Abstract

Background

Enzymes in the radical SAM (rSAM) domain family serve in a wide variety of biological processes, including RNA modification, enzyme activation, bacteriocin core peptide maturation, and cofactor biosynthesis. Evolutionary pressures and relationships to other cellular constituents impose recognizable grammars on each class of rSAM-containing system, shaping patterns in results obtained through various comparative genomics analyses.

Results

An uncharacterized gene cluster found in many Actinobacteria and sporadically in Firmicutes, Chloroflexi, Deltaproteobacteria, and one Archaeal plasmid contains a PqqE-like rSAM protein family that includes Rv0693 from Mycobacterium tuberculosis. Members occur clustered with a strikingly well-conserved small polypeptide we designate "mycofactocin," similar in size to bacteriocins and PqqA, precursor of pyrroloquinoline quinone (PQQ). Partial Phylogenetic Profiling (PPP) based on the distribution of these markers identifies the mycofactocin cluster, but also a second tier of high-scoring proteins. This tier, strikingly, is filled with up to thirty-one members per genome from three variant subfamilies that occur, one each, in three unrelated classes of nicotinoproteins. The pattern suggests these variant enzymes require not only NAD(P), but also the novel gene cluster. Further study was conducted using SIMBAL, a PPP-like tool, to search these nicotinoproteins for subsequences best correlated across multiple genomes to the presence of mycofactocin. For both the short chain dehydrogenase/reductase (SDR) and iron-containing dehydrogenase families, aligning SIMBAL's top-scoring sequences to homologous solved crystal structures shows signals centered over NAD(P)-binding sites rather than over substrate-binding or active site residues. Previous studies on some of these proteins have revealed a non-exchangeable NAD cofactor, such that enzymatic activity in vitro requires an artificial electron acceptor such as N,N-dimethyl-4-nitrosoaniline (NDMA) for the enzyme to cycle.

Conclusions

Taken together, these findings suggest that the mycofactocin precursor is modified by the Rv0693 family rSAM protein and other enzymes in its cluster. It becomes an electron carrier molecule that serves in vivo as NDMA and other artificial electron acceptors do in vitro. Subclasses from three different nicotinoprotein families show "only-if" relationships to mycofactocin because they require its presence. This framework suggests a segregated redox pool in which mycofactocin mediates communication among enzymes with non-exchangeable cofactors.

Determination of the G+C Content of Prokaryotes. J Microbiol Methods 2011. [DOI: 10.1016/b978-0-12-387730-7.00014-0] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022]

InterPro protein classification. Methods Mol Biol 2011;694:37-47. [PMID: 21082426 DOI: 10.1007/978-1-60761-977-2_3] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023]

Abstract

Improvements in nucleotide sequencing technology have resulted in an ever increasing number of nucleotide and protein sequences being deposited in databases. Unfortunately, the ability to manually classify and annotate these sequences cannot keep pace with their rapid generation, resulting in an increased bias toward unannotated sequence. Automatic annotation tools can help redress the balance. There are a number of different groups working to produce protein signatures that describe protein families, functional domains or conserved sites within related groups of proteins. Protein signature databases include CATH-Gene3D, HAMAP, PANTHER, Pfam, PIRSF, PRINTS, ProDom, PROSITE, SMART, SUPERFAMILY, and TIGRFAMs. Their approaches range from characterising small conserved motifs that can identify members of a family or subfamily, to the use of hidden Markov models that describe the conservation of residues over entire domains or whole proteins. To increase their value as protein classification tools, protein signatures from these 11 databases have been combined into one, powerful annotation tool: the InterPro database (http://www.ebi.ac.uk/interpro/) (Hunter et al., Nucleic Acids Res 37:D211-D215, 2009). InterPro is an open-source protein resource used for the automatic annotation of proteins, and is scalable to the analysis of entire new genomes through the use of a downloadable version of InterProScan, which can be incorporated into an existing local pipeline. InterPro provides structural information from PDB (Kouranov et al., Nucleic Acids Res 34:D302-D305, 2006), its classification in CATH (Cuff et al., Nucleic Acids Res 37:D310-D314, 2009) and SCOP (Andreeva et al., Nucleic Acids Res 36:D419-D425, 2008), as well as homology models from ModBase (Pieper et al., Nucleic Acids Res 37:D347-D354, 2009) and SwissModel (Kiefer et al., Nucleic Acids Res 37:D387-D392, 2009), allowing a direct comparison of the protein signatures with the available structural information. 
This chapter reviews the signature methods found in the InterPro database, and provides an overview of the InterPro resource itself.

Madupu R, Brinkac LM, Harrow J, Wilming LG, Böhme U, Lamesch P, Hannick LI. Meeting report: a workshop on Best Practices in Genome Annotation. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2010;2010:baq001. [PMID: 20428316 PMCID: PMC2860899 DOI: 10.1093/database/baq001] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/12/2009] [Revised: 01/08/2010] [Accepted: 01/11/2010] [Indexed: 01/28/2023]

Selengut JD, Rusch DB, Haft DH. Sites Inferred by Metabolic Background Assertion Labeling (SIMBAL): adapting the Partial Phylogenetic Profiling algorithm to scan sequences for signatures that predict protein function. BMC Bioinformatics 2010;11:52. [PMID: 20102603 PMCID: PMC3098086 DOI: 10.1186/1471-2105-11-52] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/04/2009] [Accepted: 01/26/2010] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Comparative genomics methods such as phylogenetic profiling can mine powerful inferences from inherently noisy biological data sets. We introduce Sites Inferred by Metabolic Background Assertion Labeling (SIMBAL), a method that applies the Partial Phylogenetic Profiling (PPP) approach locally within a protein sequence to discover short sequence signatures associated with functional sites. The approach is based on the basic scoring mechanism employed by PPP, namely the use of binomial distribution statistics to optimize sequence similarity cutoffs during searches of partitioned training sets.
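The scoring idea named above (binomial statistics used to choose a similarity cutoff over a partitioned training set) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `binomial_tail` and `best_cutoff`, the toy hit list, and the background rate of 0.25 are all assumptions made for the example. Hits are assumed to be already ranked by similarity and labeled by whether they belong to the TRUE partition.

```python
from math import comb


def binomial_tail(k, n, p):
    """Probability of at least k successes in n trials with success rate p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def best_cutoff(ranked_labels, background_rate):
    """Scan every depth of a similarity-ranked hit list.

    ranked_labels: booleans, best hit first; True means the hit is in the
    TRUE partition (hypothetical input format for this sketch).
    Returns (depth, hits, p_value) for the depth whose count of TRUE hits is
    least likely under the background rate.
    """
    best = None
    hits = 0
    for depth, in_true_set in enumerate(ranked_labels, start=1):
        hits += in_true_set
        p_value = binomial_tail(hits, depth, background_rate)
        if best is None or p_value < best[2]:
            best = (depth, hits, p_value)
    return best


# Toy ranked hit list: four TRUE-set hits at the top, then a mixture.
ranked = [True, True, True, True, False, True, False, False]
depth, hits, p_value = best_cutoff(ranked, background_rate=0.25)
print(depth, hits, p_value)
```

In this toy case the most surprising depth is the first four hits, so no similarity threshold has to be fixed in advance; the cutoff falls out of the scan.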

RESULTS

Here we illustrate and validate the ability of the SIMBAL method to find functionally relevant short sequence signatures by application to two well-characterized protein families. In the first example, we partitioned a family of ABC permeases using a metabolic background property (urea utilization). Thus, the TRUE set for this family comprised members whose genome of origin encoded a urea utilization system. By moving a sliding window across the sequence of a permease, and searching each subsequence in turn against the full set of partitioned proteins, the method found which local sequence signatures best correlated with the urea utilization trait. Mapping of SIMBAL "hot spots" onto crystal structures of homologous permeases reveals that the significant sites are gating determinants on the cytosolic face rather than, say, docking sites for the substrate-binding protein on the extracellular face. In the second example, we partitioned a protein methyltransferase family using gene proximity as a criterion. In this case, the TRUE set comprised those methyltransferases encoded near the gene for the substrate RF-1. SIMBAL identifies sequence regions that map onto the substrate-binding interface while ignoring regions involved in the methyltransferase reaction mechanism in general. Neither method for training set construction requires any prior experimental characterization.
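The sliding-window step described above can be sketched in a few lines. This is an illustrative outline under stated assumptions, not the published SIMBAL code: `sliding_windows`, `rank_windows`, the window width, and the toy scoring function are hypothetical. In a real run, the scoring function would search each subsequence against the partitioned protein set and return a significance score; here a stand-in that simply counts one residue is used so the example runs on its own.

```python
from typing import Callable, Iterator, List, Tuple


def sliding_windows(sequence: str, width: int, step: int = 1) -> Iterator[Tuple[int, str]]:
    """Yield (start, subsequence) for each full-width window along the sequence."""
    for start in range(0, len(sequence) - width + 1, step):
        yield start, sequence[start:start + width]


def rank_windows(
    sequence: str,
    width: int,
    score_window: Callable[[str], float],
    step: int = 1,
) -> List[Tuple[float, int, str]]:
    """Score every window and return (score, start, subsequence), best first.

    Lower scores are treated as stronger signals, as with a p-value.
    """
    scored = [(score_window(sub), start, sub) for start, sub in sliding_windows(sequence, width, step)]
    return sorted(scored)


# Stand-in scorer (assumption): more G residues gives a lower, "better" score.
toy_score = lambda sub: -sub.count("G")
ranking = rank_windows("AAGGGAA", width=3, score_window=toy_score)
print(ranking[0])
```

The top-ranked windows play the role of the "hot spots" that are then mapped onto a homologous structure.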

CONCLUSIONS

SIMBAL shows that, in functionally divergent protein families, selected short sequences often significantly outperform their full-length parent sequence for making functional predictions by sequence similarity, suggesting avenues for improved functional classifiers. When combined with structural data, SIMBAL affords the ability to localize and model functional sites.

Ho Sui SJ, Fedynak A, Hsiao WWL, Langille MGI, Brinkman FSL. The association of virulence factors with genomic islands. PLoS One 2009;4:e8094. [PMID: 19956607 PMCID: PMC2779486 DOI: 10.1371/journal.pone.0008094] [Citation(s) in RCA: 96] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/06/2009] [Accepted: 11/07/2009] [Indexed: 01/20/2023] Open

Abstract

BACKGROUND

It has been noted that many bacterial virulence factor genes are located within genomic islands (GIs; clusters of genes in a prokaryotic genome of probable horizontal origin). However, such studies have been limited to single genera or isolated observations. We have performed the first large-scale analysis of multiple diverse pathogens to examine this association. We additionally identified genes found predominantly in pathogens, but not non-pathogens, across multiple genera using 631 complete bacterial genomes, and we identified common trends in virulence for genes in GIs. Furthermore, we examined the relationship between GIs and clustered regularly interspaced palindromic repeats (CRISPRs) proposed to confer resistance to phage.

METHODOLOGY/PRINCIPAL FINDINGS

We show quantitatively that GIs disproportionately contain more virulence factors than the rest of a given genome (p<1E-40 using three GI datasets) and that CRISPRs are also over-represented in GIs. Virulence factors in GIs and pathogen-associated virulence factors are enriched for proteins having more "offensive" functions, e.g. active invasion of the host, and are disproportionately components of type III/IV secretion systems or toxins. Numerous hypothetical pathogen-associated genes were identified, meriting further study.

CONCLUSIONS/SIGNIFICANCE

This is the first systematic analysis across diverse genera indicating that virulence factors are disproportionately associated with GIs. "Offensive" virulence factors, as opposed to host-interaction factors, may more often be a recently acquired trait (on an evolutionary time scale detected by GI analysis). Newly identified pathogen-associated genes warrant further study. We discuss the implications of these results, which cement the significant role of GIs in the evolution of many pathogens.

Davidsen T, Beck E, Ganapathy A, Montgomery R, Zafar N, Yang Q, Madupu R, Goetz P, Galinsky K, White O, Sutton G. The comprehensive microbial resource. Nucleic Acids Res 2009;38:D340-5. [PMID: 19892825 PMCID: PMC2808947 DOI: 10.1093/nar/gkp912] [Citation(s) in RCA: 82] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Brinkac LM, Davidsen T, Beck E, Ganapathy A, Caler E, Dodson RJ, Durkin AS, Harkins DM, Lorenzi H, Madupu R, Sebastian Y, Shrivastava S, Thiagarajan M, Orvis J, Sundaram JP, Crabtree J, Galens K, Zhao Y, Inman JM, Montgomery R, Schobel S, Galinsky K, Tanenbaum DM, Resnick A, Zafar N, White O, Sutton G. Pathema: a clade-specific bioinformatics resource center for pathogen research. Nucleic Acids Res 2009;38:D408-14. [PMID: 19843611 PMCID: PMC2808925 DOI: 10.1093/nar/gkp850] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Merhej V, Royer-Carenzi M, Pontarotti P, Raoult D. Massive comparative genomic analysis reveals convergent evolution of specialized bacteria. Biol Direct 2009;4:13. [PMID: 19361336 PMCID: PMC2688493 DOI: 10.1186/1745-6150-4-13] [Citation(s) in RCA: 169] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2009] [Accepted: 04/10/2009] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Genome size and gene content in bacteria are associated with their lifestyles. Obligate intracellular bacteria (i.e., mutualists and parasites) have small genomes that derived from larger free-living bacterial ancestors; however, the different steps of bacterial specialization from free-living to intracellular lifestyle have not been studied comprehensively. The growing number of available sequenced genomes makes it possible to perform a statistical comparative analysis of 317 genomes from bacteria with different lifestyles.

RESULTS

Compared to free-living bacteria, host-dependent bacteria exhibit fewer rRNA genes, more split rRNA operons and fewer transcriptional regulators, linked to slower growth rates. We found a function-dependent and non-random loss of the same 100 orthologous genes in all obligate intracellular bacteria. Thus, we showed that obligate intracellular bacteria from different phyla are converging according to their lifestyle. Their specialization is an irreversible phenomenon characterized by translation modification and massive gene loss, including the loss of transcriptional regulators. Although both mutualists and parasites converge by genome reduction, these obligate intracellular bacteria have lost distinct sets of genes in the context of their specific host associations: mutualists have significantly more genes that enable nutrient provisioning whereas parasites have genes that encode Types II, IV, and VI secretion pathways.

CONCLUSION

Our findings suggest that gene loss, rather than acquisition of virulence factors, has been a driving force in the adaptation of parasites to eukaryotic cells. This comparative genomic analysis helps to explore the strategies by which obligate intracellular genomes specialize to particular host associations and helps advance our knowledge of the mechanisms of bacterial evolution.

Kastenmüller G, Schenk ME, Gasteiger J, Mewes HW. Uncovering metabolic pathways relevant to phenotypic traits of microbial genomes. Genome Biol 2009;10:R28. [PMID: 19284550 PMCID: PMC2690999 DOI: 10.1186/gb-2009-10-3-r28] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2008] [Revised: 02/12/2009] [Accepted: 03/10/2009] [Indexed: 01/20/2023] Open

Field D, Garrity G, Gray T, Morrison N, Selengut J, Sterk P, Tatusova T, Thomson N, Allen MJ, Angiuoli SV, Ashburner M, Axelrod N, Baldauf S, Ballard S, Boore J, Cochrane G, Cole J, Dawyndt P, De Vos P, DePamphilis C, Edwards R, Faruque N, Feldman R, Gilbert J, Gilna P, Glöckner FO, Goldstein P, Guralnick R, Haft D, Hancock D, Hermjakob H, Hertz-Fowler C, Hugenholtz P, Joint I, Kagan L, Kane M, Kennedy J, Kowalchuk G, Kottmann R, Kolker E, Kravitz S, Kyrpides N, Leebens-Mack J, Lewis SE, Li K, Lister AL, Lord P, Maltsev N, Markowitz V, Martiny J, Methe B, Mizrachi I, Moxon R, Nelson K, Parkhill J, Proctor L, White O, Sansone SA, Spiers A, Stevens R, Swift P, Taylor C, Tateno Y, Tett A, Turner S, Ussery D, Vaughan B, Ward N, Whetzel T, San Gil I, Wilson G, Wipat A. The minimum information about a genome sequence (MIGS) specification. Nat Biotechnol 2008;26:541-7. [PMID: 18464787 PMCID: PMC2409278 DOI: 10.1038/nbt1360] [Citation(s) in RCA: 969] [Impact Index Per Article: 60.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/01/2023]

Choi K, Kim S. ComPath: comparative enzyme analysis and annotation in pathway/subsystem contexts. BMC Bioinformatics 2008;9:145. [PMID: 18325116 PMCID: PMC2277404 DOI: 10.1186/1471-2105-9-145] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/08/2007] [Accepted: 03/06/2008] [Indexed: 11/16/2022] Open

Abstract

Background

Once a new genome is sequenced, one of the important questions is which biological pathways are present and which are absent. Analysis of biological pathways in a genome is a complicated task, since a number of biological entities are involved in pathways and biological pathways in different organisms are not identical. Computational pathway identification and analysis thus involves a number of computational tools and databases and is typically done in comparison with pathways in other organisms. This computational requirement is well beyond the capability of biologists, so information systems for reconstructing, annotating, and analyzing biological pathways are much needed. We introduce a new comparative pathway analysis workbench, ComPath, which integrates various resources and computational tools using an interactive spreadsheet-style web interface for reliable pathway analyses.

Results

ComPath allows users to compare biological pathways in multiple genomes using a spreadsheet-style web interface in which various sequence-based analyses can be performed to compare enzymes (e.g. sequence clustering) and pathways (e.g. pathway hole identification), to search a genome for de novo prediction of enzymes, or to annotate a genome in comparison with reference genomes of choice. To fill in pathway holes or make de novo enzyme predictions, multiple computational methods such as FASTA, Whole-HMM, CSR-HMM (a method of our own introduced in this paper), and PDB-domain search are integrated in ComPath. Our experiments show that the FASTA and CSR-HMM search methods generally outperform the Whole-HMM and PDB-domain search methods in terms of sensitivity, but FASTA search performs poorly in terms of specificity, detecting more false positives as the E-value cutoff increases. Overall, the CSR-HMM search method performs best in terms of both sensitivity and specificity. Gene neighborhood and pathway neighborhood (global network) visualization tools can be used to obtain context information that is complementary to the conventional KEGG map representation.

Conclusion

ComPath is an interactive workbench for pathway reconstruction, annotation, and analysis in which experts can perform various sequence, domain, and context analyses using an intuitive and interactive spreadsheet-style interface.

Replacement of the Arginine Biosynthesis Operon in Xanthomonadales by Lateral Gene Transfer. J Mol Evol 2008;66:266-75. [DOI: 10.1007/s00239-008-9082-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2007] [Revised: 07/23/2007] [Accepted: 01/25/2008] [Indexed: 11/30/2022]

Haft DH, Self WT. Orphan SelD proteins and selenium-dependent molybdenum hydroxylases. Biol Direct 2008;3:4. [PMID: 18289380 PMCID: PMC2276186 DOI: 10.1186/1745-6150-3-4] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/11/2008] [Accepted: 02/20/2008] [Indexed: 11/24/2022] Open

Annotation, comparison and databases for hundreds of bacterial genomes. Res Microbiol 2007;158:724-36. [DOI: 10.1016/j.resmic.2007.09.009] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/29/2007] [Revised: 09/21/2007] [Accepted: 09/26/2007] [Indexed: 11/20/2022]

Markowitz VM. Microbial genome data resources. Curr Opin Biotechnol 2007;18:267-72. [PMID: 17467973 DOI: 10.1016/j.copbio.2007.04.005] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/26/2007] [Revised: 03/18/2007] [Accepted: 04/18/2007] [Indexed: 11/17/2022]

Greene JM, Collins F, Lefkowitz EJ, Roos D, Scheuermann RH, Sobral B, Stevens R, White O, Di Francesco V. National Institute of Allergy and Infectious Diseases bioinformatics resource centers: new assets for pathogen informatics. Infect Immun 2007;75:3212-9. [PMID: 17420237 PMCID: PMC1932942 DOI: 10.1128/iai.00105-07] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open

Lussier YA, Liu Y. Computational approaches to phenotyping: high-throughput phenomics. Ann Am Thorac Soc 2007;4:18-25. [PMID: 17202287 PMCID: PMC2647609 DOI: 10.1513/pats.200607-142jg] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open

Osterman AL, Begley TP. A subsystems-based approach to the identification of drug targets in bacterial pathogens. PROGRESS IN DRUG RESEARCH. FORTSCHRITTE DER ARZNEIMITTELFORSCHUNG. PROGRES DES RECHERCHES PHARMACEUTIQUES 2007;64:131, 133-70. [PMID: 17195474 DOI: 10.1007/978-3-7643-7567-6_6] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/10/2023]

de Crécy-Lagard V. Identification of genes encoding tRNA modification enzymes by comparative genomics. Methods Enzymol 2007;425:153-83. [PMID: 17673083 PMCID: PMC3034448 DOI: 10.1016/s0076-6879(07)25007-4] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]

Selengut JD, Haft DH, Davidsen T, Ganapathy A, Gwinn-Giglio M, Nelson WC, Richter AR, White O. TIGRFAMs and Genome Properties: tools for the assignment of molecular function and biological process in prokaryotic genomes. Nucleic Acids Res 2006;35:D260-4. [PMID: 17151080 PMCID: PMC1781115 DOI: 10.1093/nar/gkl1043] [Citation(s) in RCA: 229] [Impact Index Per Article: 12.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Mattes WB. Cross-species comparative toxicogenomics as an aid to safety assessment. Expert Opin Drug Metab Toxicol 2006;2:859-74. [PMID: 17125406 DOI: 10.1517/17425255.2.6.859] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/28/2023]

Badger JH, Hoover TR, Brun YV, Weiner RM, Laub MT, Alexandre G, Mrázek J, Ren Q, Paulsen IT, Nelson KE, Khouri HM, Radune D, Sosa J, Dodson RJ, Sullivan SA, Rosovitz MJ, Madupu R, Brinkac LM, Durkin AS, Daugherty SC, Kothari SP, Giglio MG, Zhou L, Haft DH, Selengut JD, Davidsen TM, Yang Q, Zafar N, Ward NL. Comparative genomic evidence for a close relationship between the dimorphic prosthecate bacteria Hyphomonas neptunium and Caulobacter crescentus. J Bacteriol 2006;188:6841-50. [PMID: 16980487 PMCID: PMC1595504 DOI: 10.1128/jb.00111-06] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open

Liu Y, Li J, Sam L, Goh CS, Gerstein M, Lussier YA. An integrative genomic approach to uncover molecular mechanisms of prokaryotic traits. PLoS Comput Biol 2006;2:e159. [PMID: 17112314 PMCID: PMC1636675 DOI: 10.1371/journal.pcbi.0020159] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2006] [Accepted: 10/10/2006] [Indexed: 11/18/2022] Open

Abstract

With the mounting availability of genomic and phenotypic databases, data integration and mining become increasingly challenging. While efforts have been put forward to analyze prokaryotic phenotypes, current computational technologies either lack high throughput capacity for genomic scale analysis, or are limited in their capability to integrate and mine data across different scales of biology. Consequently, simultaneous analysis of associations among genomes, phenotypes, and gene functions is prohibited. Here, we developed a high throughput computational approach, and demonstrated for the first time the feasibility of integrating large quantities of prokaryotic phenotypes along with genomic datasets for mining across multiple scales of biology (protein domains, pathways, molecular functions, and cellular processes). Applying this method over 59 fully sequenced prokaryotic species, we identified the genetic basis and molecular mechanisms underlying the phenotypes in bacteria. We identified 3,711 significant correlations between 1,499 distinct Pfam and 63 phenotypes, with 2,650 correlations and 1,061 anti-correlations. Manual evaluation of a random sample of these significant correlations showed a minimal precision of 30% (95% confidence interval: 20%-42%; n = 50). We stratified the most significant 478 predictions and subjected 100 to manual evaluation, of which 60 were corroborated in the literature. We furthermore unveiled 10 significant correlations between phenotypes and KEGG pathways, eight of which were corroborated in the evaluation, and 309 significant correlations between phenotypes and 166 GO concepts evaluated using a random sample (minimal precision = 72%; 95% confidence interval: 60%-80%; n = 50). Additionally, we conducted a novel large-scale phenomic visualization analysis to provide insight into the modular nature of common molecular mechanisms spanning multiple biological scales and reused by related phenotypes (metaphenotypes). We propose that this method elucidates which classes of molecular mechanisms are associated with phenotypes or metaphenotypes and holds promise in facilitating a computable systems biology approach to genomic and biomedical research.

Seshadri R, Joseph SW, Chopra AK, Sha J, Shaw J, Graf J, Haft D, Wu M, Ren Q, Rosovitz MJ, Madupu R, Tallon L, Kim M, Jin S, Vuong H, Stine OC, Ali A, Horneman AJ, Heidelberg JF. Genome sequence of Aeromonas hydrophila ATCC 7966T: jack of all trades. J Bacteriol 2006;188:8272-82. [PMID: 16980456 PMCID: PMC1698176 DOI: 10.1128/jb.00621-06] [Citation(s) in RCA: 259] [Impact Index Per Article: 14.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open

Field D, Wilson G, van der Gast C. How do we compare hundreds of bacterial genomes? Curr Opin Microbiol 2006;9:499-504. [PMID: 16942900 DOI: 10.1016/j.mib.2006.08.008] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/11/2006] [Accepted: 08/16/2006] [Indexed: 11/26/2022]

Haft DH, Paulsen IT, Ward N, Selengut JD. Exopolysaccharide-associated protein sorting in environmental organisms: the PEP-CTERM/EpsH system. Application of a novel phylogenetic profiling heuristic. BMC Biol 2006;4:29. [PMID: 16930487 PMCID: PMC1569441 DOI: 10.1186/1741-7007-4-29] [Citation(s) in RCA: 70] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/22/2006] [Accepted: 08/24/2006] [Indexed: 11/13/2022] Open

Abstract

Background

Protein translocation to the proper cellular destination may be guided by various classes of sorting signals recognizable in the primary sequence. Detection in some genomes, but not others, may reveal sorting system components by comparison of the phylogenetic profile of the class of sorting signal to that of various protein families.

Results

We describe a short C-terminal homology domain, sporadically distributed in bacteria, with several key characteristics of protein sorting signals. The domain includes a near-invariant motif Pro-Glu-Pro (PEP). This possible recognition or processing site is followed by a predicted transmembrane helix and a cluster rich in basic amino acids. We designate this domain PEP-CTERM. It tends to occur multiple times in a genome if it occurs at all, with a median count of eight instances; Verrucomicrobium spinosum has sixty-five. PEP-CTERM-containing proteins generally contain an N-terminal signal peptide and exhibit high diversity and little homology to known proteins. All bacteria with PEP-CTERM have both an outer membrane and exopolysaccharide (EPS) production genes. By a simple heuristic for screening phylogenetic profiles in the absence of pre-formed protein families, we discovered that a homolog of the membrane protein EpsH (exopolysaccharide locus protein H) occurs in a species when PEP-CTERM domains are found. The EpsH family contains invariant residues consistent with a transpeptidase function. Most PEP-CTERM proteins are encoded by single-gene operons preceded by large intergenic regions. In the Proteobacteria, most of these upstream regions share a DNA sequence, a probable cis-regulatory site that contains a sigma-54 binding motif. The phylogenetic profile for this DNA sequence exactly matches that of three proteins: a sigma-54-interacting response regulator (PrsR), a transmembrane histidine kinase (PrsK), and a TPR protein (PrsT).

Conclusion

These findings are consistent with the hypothesis that PEP-CTERM and EpsH form a protein export sorting system, analogous to the LPXTG/sortase system of Gram-positive bacteria, and correlated to EPS expression. It occurs preferentially in bacteria from sediments, soils, and biofilms. The novel method that led to these findings, partial phylogenetic profiling, requires neither global sequence clustering nor arbitrary similarity cutoffs and appears to be a rapid, effective alternative to other profiling methods.

Mormann S, Lömker A, Rückert C, Gaigalat L, Tauch A, Pühler A, Kalinowski J. Random mutagenesis in Corynebacterium glutamicum ATCC 13032 using an IS6100-based transposon vector identified the last unknown gene in the histidine biosynthesis pathway. BMC Genomics 2006;7:205. [PMID: 16901339 PMCID: PMC1590026 DOI: 10.1186/1471-2164-7-205] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2006] [Accepted: 08/10/2006] [Indexed: 12/02/2022] Open

Abstract

BACKGROUND

Corynebacterium glutamicum, a Gram-positive bacterium of the class Actinobacteria, is an industrially relevant producer of amino acids. Several methods for the targeted genetic manipulation of this organism and rational strain improvement have been developed. An efficient transposon mutagenesis system for the completely sequenced type strain ATCC 13032 would significantly advance functional genome analysis in this bacterium.

RESULTS

A comprehensive transposon mutant library comprising 10,080 independent clones was constructed by electrotransformation of the restriction-deficient derivative of strain ATCC 13032, C. glutamicum RES167, with an IS6100-containing non-replicative plasmid. Transposon mutants had stable cointegrates between the transposon vector and the chromosome. Altogether 172 transposon integration sites have been determined by sequencing of the chromosomal inserts, revealing that each integration occurred at a different locus. Statistical target site analyses revealed an apparent absence of a target site preference. From the library, auxotrophic mutants were obtained with a frequency of 2.9%. By auxanography analyses nearly two thirds of the auxotrophs were further characterized, including mutants with single, double and alternative nutritional requirements. In most cases the nutritional requirement observed could be correlated to the annotation of the mutated gene involved in the biosynthesis of an amino acid, a nucleotide or a vitamin. One notable exception was a clone mutagenized by transposition into the gene cg0910, which exhibited an auxotrophy for histidine. The protein sequence deduced from cg0910 showed high sequence similarities to inositol-1(or 4)-monophosphatases (EC 3.1.3.25). Subsequent genetic deletion of cg0910 delivered the same histidine-auxotrophic phenotype. Genetic complementation of the mutants as well as supplementation by histidinol suggests that cg0910 encodes the hitherto unknown essential L-histidinol-phosphate phosphatase (EC 3.1.3.15) in C. glutamicum. The cg0910 gene, renamed hisN, and its encoded enzyme have putative orthologs in almost all Actinobacteria, including mycobacteria and streptomycetes.

CONCLUSION

The absence of regional and sequence preferences of IS6100 transposition demonstrates that the established system is suitable for efficient genome-scale random mutagenesis in the sequenced type strain C. glutamicum ATCC 13032. The identification of the hisN gene encoding histidinol-phosphate phosphatase in C. glutamicum closed the last gap in histidine synthesis in the Actinobacteria. The system might be a valuable genetic tool also in other bacteria due to the broad host-spectrum of IS6100.

Schmidt T, Frishman D. PROMPT: a protein mapping and comparison tool. BMC Bioinformatics 2006;7:331. [PMID: 16817977 PMCID: PMC1569443 DOI: 10.1186/1471-2105-7-331] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Received: 04/28/2006] [Accepted: 07/04/2006] [Indexed: 11/12/2022] Open

Abstract

Background

Comparison of large protein datasets has become a standard task in bioinformatics. Typically researchers wish to know whether one group of proteins is enriched in certain annotation attributes or sequence properties compared to another group, and whether this enrichment is statistically significant. To conduct such comparisons it is often necessary to integrate molecular sequence data and experimental information from disparate, incompatible sources. While many specialized programs exist for comparisons of this kind in individual problem domains, such as expression data analysis, no generic software solution capable of addressing a wide spectrum of routine tasks in comparative proteomics is currently available.

Results

PROMPT is a comprehensive bioinformatics software environment which enables the user to compare arbitrary protein sequence sets, revealing statistically significant differences in their annotation features. It allows automatic retrieval and integration of data from a multitude of molecular biological databases as well as from a custom XML format. Similarity-based mapping of sequence IDs makes it possible to link experimental information obtained from different sources despite discrepancies in gene identifiers and minor sequence variation. PROMPT provides a full set of statistical procedures to address the following four use cases: i) comparison of the frequencies of categorical annotations between two sets, ii) enrichment of nominal features in one set with respect to another one, iii) comparison of numeric distributions, and iv) correlation of numeric variables. Analysis results can be visualized in the form of plots and spreadsheets and exported in various formats, including Microsoft Excel.

Conclusion

PROMPT is a versatile, platform-independent, easily expandable, stand-alone application designed to be a practical workhorse in analysing and mining protein sequences and associated annotation. The availability of the Java Application Programming Interface and scripting capabilities on one hand, and the intuitive Graphical User Interface with context-sensitive help system on the other, make it equally accessible to professional bioinformaticians and biologically-oriented users. PROMPT is freely available for academic users from .
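The enrichment use case (ii) named in the abstract above can be illustrated with a one-sided Fisher's exact test. The following is a minimal sketch in Python, not PROMPT's own Java API and not necessarily the test PROMPT applies; the function name `enrichment_p_value` and the counts in the demo are hypothetical and chosen only for illustration.

```python
from math import comb


def enrichment_p_value(hits_a: int, size_a: int, hits_b: int, size_b: int) -> float:
    """One-sided Fisher's exact test for enrichment of a feature in set A.

    hits_a / size_a: proteins carrying the feature / total proteins in set A.
    hits_b / size_b: the same counts for the comparison set B.
    Returns P(X >= hits_a) when the feature-carrying proteins are distributed
    at random between the two sets (hypergeometric upper tail).
    """
    total = size_a + size_b
    total_hits = hits_a + hits_b
    denominator = comb(total, size_a)
    upper = min(size_a, total_hits)
    # Sum the probabilities of every outcome at least as extreme as observed.
    tail = sum(
        comb(total_hits, k) * comb(total - total_hits, size_a - k)
        for k in range(hits_a, upper + 1)
    )
    return tail / denominator


if __name__ == "__main__":
    # Hypothetical counts: 30 of 100 proteins annotated in set A, 10 of 100 in set B.
    p = enrichment_p_value(30, 100, 10, 100)
    print(f"one-sided enrichment p-value: {p:.3g}")
```

A small p-value indicates that the feature is more frequent in set A than random assignment would be expected to produce. When many features are tested at once, a multiple-testing correction would normally be applied on top of this.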

Gerdes SY, Kurnasov OV, Shatalin K, Polanuyer B, Sloutsky R, Vonstein V, Overbeek R, Osterman AL. Comparative genomics of NAD biosynthesis in cyanobacteria. J Bacteriol 2006;188:3012-23. [PMID: 16585762 PMCID: PMC1446974 DOI: 10.1128/jb.188.8.3012-3023.2006] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.2] [Received: 10/17/2005] [Accepted: 01/23/2006] [Indexed: 11/20/2022] Open

Abstract

Biosynthesis of NAD(P) cofactors is of special importance for cyanobacteria due to their role in photosynthesis and respiration. Despite significant progress in understanding NAD(P) biosynthetic machinery in some model organisms, relatively little is known about its implementation in cyanobacteria. We addressed this problem by a combination of comparative genome analysis with verification experiments in the model system of Synechocystis sp. strain PCC 6803. A detailed reconstruction of the NAD(P) metabolic subsystem using the SEED genomic platform (http://theseed.uchicago.edu/FIG/index.cgi) helped us accurately annotate respective genes in the entire set of 13 cyanobacterial species with completely sequenced genomes available at the time. Comparative analysis of operational variants implemented in this divergent group allowed us to elucidate both conserved (de novo and universal pathways) and variable (recycling and salvage pathways) aspects of this subsystem. Focused genetic and biochemical experiments confirmed several conjectures about the key aspects of this subsystem. (i) The product of the slr1691 gene, a homolog of Escherichia coli gene nadE containing an additional nitrilase-like N-terminal domain, is a NAD synthetase capable of utilizing glutamine as an amide donor in vitro. (ii) The product of the sll1916 gene, a homolog of E. coli gene nadD, is a nicotinic acid mononucleotide-preferring adenylyltransferase. This gene is essential for survival and cannot be compensated for by an alternative nicotinamide mononucleotide (NMN)-preferring adenylyltransferase (slr0787 gene). (iii) The product of the slr0788 gene is a nicotinamide-preferring phosphoribosyltransferase involved in the first step of the two-step non-deamidating utilization of nicotinamide (NMN shunt). (iv) The physiological role of this pathway encoded by a conserved gene cluster, slr0787-slr0788, is likely in the recycling of endogenously generated nicotinamide, as supported by the inability of this organism to utilize exogenously provided niacin. Positional clustering and the co-occurrence profile of the respective genes across a diverse collection of cellular organisms provide evidence of horizontal transfer events in the evolutionary history of this pathway.

Affiliation(s)

Svetlana Y. Gerdes, Oleg V. Kurnasov, Konstantin Shatalin, Boris Polanuyer, Roman Sloutsky, Veronika Vonstein, Ross Overbeek, Andrei L. Osterman
Fellowship for Interpretation of Genomes, Burr Ridge, Illinois 60527; Burnham Institute for Medical Research, La Jolla, California 92037; Department of Biochemistry, New York University School of Medicine, New York, New York 10016; Rohm and Haas Company, Advanced Biosciences Division, Spring House, Pennsylvania 19477; Department of Molecular Virology, Immunology, and Medical Genetics, Ohio State University, Columbus, Ohio 43210

Liolios K, Tavernarakis N, Hugenholtz P, Kyrpides NC. The Genomes On Line Database (GOLD) v.2: a monitor of genome projects worldwide. Nucleic Acids Res 2006;34:D332-4. [PMID: 16381880 PMCID: PMC1347507 DOI: 10.1093/nar/gkj145] [Citation(s) in RCA: 196] [Impact Index Per Article: 10.9] [Indexed: 01/11/2023] Open

Wang HC, Susko E, Roger AJ. On the correlation between genomic G+C content and optimal growth temperature in prokaryotes: data quality and confounding factors. Biochem Biophys Res Commun 2006;342:681-4. [PMID: 16499870 DOI: 10.1016/j.bbrc.2006.02.037] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.3] [Received: 02/01/2006] [Accepted: 02/08/2006] [Indexed: 11/30/2022]

Dunning Hotopp JC, Lin M, Madupu R, Crabtree J, Angiuoli SV, Eisen JA, Eisen J, Seshadri R, Ren Q, Wu M, Utterback TR, Smith S, Lewis M, Khouri H, Zhang C, Niu H, Lin Q, Ohashi N, Zhi N, Nelson W, Brinkac LM, Dodson RJ, Rosovitz MJ, Sundaram J, Daugherty SC, Davidsen T, Durkin AS, Gwinn M, Haft DH, Selengut JD, Sullivan SA, Zafar N, Zhou L, Benahmed F, Forberger H, Halpin R, Mulligan S, Robinson J, White O, Rikihisa Y, Tettelin H. Comparative genomics of emerging human ehrlichiosis agents. PLoS Genet 2006;2:e21. [PMID: 16482227 PMCID: PMC1366493 DOI: 10.1371/journal.pgen.0020021] [Citation(s) in RCA: 341] [Impact Index Per Article: 18.9] [Received: 10/20/2005] [Accepted: 01/09/2006] [Indexed: 11/25/2022] Open

Abstract

Anaplasma (formerly Ehrlichia) phagocytophilum, Ehrlichia chaffeensis, and Neorickettsia (formerly Ehrlichia) sennetsu are intracellular vector-borne pathogens that cause human ehrlichiosis, an emerging infectious disease. We present the complete genome sequences of these organisms along with comparisons to other organisms in the Rickettsiales order. Ehrlichia spp. and Anaplasma spp. display a unique large expansion of immunodominant outer membrane proteins facilitating antigenic variation. All Rickettsiales have a diminished ability to synthesize amino acids compared to their closest free-living relatives. Unlike members of the Rickettsiaceae family, these pathogenic Anaplasmataceae are capable of making all major vitamins, cofactors, and nucleotides, which could confer a beneficial role in the invertebrate vector or the vertebrate host. Further analysis identified proteins potentially involved in vacuole confinement of the Anaplasmataceae, a life cycle involving a hematophagous vector, vertebrate pathogenesis, human pathogenesis, and lack of transovarial transmission. These discoveries provide significant insights into the biology of these obligate intracellular pathogens.

Ehrlichiosis is an acute disease that triggers flu-like symptoms in both humans and animals. It is caused by a range of bacteria transmitted by ticks or flukes. Because these bacteria are difficult to culture, however, the organisms are poorly understood. The genomes of three emerging human pathogens causing ehrlichiosis were sequenced. A database was designed to allow the comparison of these three genomes to sixteen other bacteria with similar lifestyles. Analysis from this database reveals new species-specific and disease-specific genes indicating niche adaptations, pathogenic traits, and other features. In particular, one of the organisms contains more than 100 copies of a single gene involved in interactions with the host(s). These comparisons also enabled a reconstruction of the metabolic potential of five representative genomes from these bacteria and their close relatives. With this work, scientists can study these emerging pathogens in earnest.
