1
Gabor CE, Hazen TH, Delaine-Elias BC, Rasko DA, Barry EM. Genomic, transcriptomic, and phenotypic differences among archetype Shigella flexneri strains of serotypes 2a, 3a, and 6. mSphere 2023; 8:e0040823. [PMID: 37830809 PMCID: PMC10732043 DOI: 10.1128/msphere.00408-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Accepted: 08/30/2023] [Indexed: 10/14/2023]
Abstract
IMPORTANCE Given the genomic diversity among S. flexneri serotypes and the paucity of data to support serotype-specific phenotypic differences, we applied in silico and in vitro functional analyses to the archetype strains 2457T (Sf2a), J17B (Sf3a), and CH060 (Sf6). These archetype strains represent the three leading S. flexneri serotypes recommended for inclusion in multivalent vaccines. Characterizing the genomic and phenotypic variation among these clinically prevalent serotypes is an important step toward understanding serotype-specific host-pathogen interactions to optimize the efficacy of multivalent vaccines and therapeutics. This study underscores the need for further large-scale serotype-targeted analyses.
Affiliation(s)
- Caitlin E. Gabor
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland, USA
  - Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, Maryland, USA
  - Center for Vaccine Development and Global Health, University of Maryland School of Medicine, Baltimore, Maryland, USA
- Tracy H. Hazen
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland, USA
  - Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, Maryland, USA
- BreOnna C. Delaine-Elias
  - Center for Vaccine Development and Global Health, University of Maryland School of Medicine, Baltimore, Maryland, USA
- David A. Rasko
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland, USA
  - Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, Maryland, USA
  - Center for Pathogen Research, University of Maryland School of Medicine, Baltimore, Maryland, USA
- Eileen M. Barry
  - Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, Maryland, USA
  - Center for Vaccine Development and Global Health, University of Maryland School of Medicine, Baltimore, Maryland, USA
2
Genomic diversity of non-diarrheagenic fecal Escherichia coli from children in sub-Saharan Africa and south Asia and their relatedness to diarrheagenic E. coli. Nat Commun 2023; 14:1400. [PMID: 36918537 PMCID: PMC10011798 DOI: 10.1038/s41467-023-36337-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2022] [Accepted: 01/23/2023] [Indexed: 03/16/2023]
Abstract
Escherichia coli is a frequent member of the healthy human gastrointestinal microbiota, as well as an important human pathogen. Previous studies have focused on the genomic diversity of the pathogenic E. coli and much remains unknown about the non-diarrheagenic E. coli residing in the human gut, particularly among young children in low and middle income countries. Also, gaining additional insight into non-diarrheagenic E. coli is important for understanding gut health as non-diarrheagenic E. coli can prevent infection by diarrheagenic bacteria. In this study we examine the genomic diversity of non-diarrheagenic fecal E. coli from male and female children with or without diarrhea from countries in sub-Saharan Africa and south Asia as part of the Global Enteric Multicenter Study (GEMS). We find that these E. coli exhibit considerable genetic diversity as they were identified in all E. coli phylogroups and an Escherichia cryptic clade. Although these fecal E. coli lack the characteristic virulence factors of diarrheagenic E. coli pathotypes, many exhibit remarkable genomic similarity to previously described diarrheagenic isolates with differences attributed to mobile elements. This raises an important question of whether these non-diarrheagenic fecal E. coli may have at one time possessed the mobile element-encoded virulence factors of diarrheagenic pathotypes or may have the potential to acquire these virulence factors.
3
Iquebal MA, Jagannadham J, Jaiswal S, Prabha R, Rai A, Kumar D. Potential Use of Microbial Community Genomes in Various Dimensions of Agriculture Productivity and Its Management: A Review. Front Microbiol 2022; 13:708335. [PMID: 35655999 PMCID: PMC9152772 DOI: 10.3389/fmicb.2022.708335] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/11/2021] [Accepted: 03/17/2022] [Indexed: 12/12/2022]
Abstract
Agricultural productivity is highly influenced by its associated microbial community. With advancements in omics technology, metagenomics is known to play a vital role in microbial world studies by unlocking the uncultured microbial populations present in the environment. Metagenomics is a diagnostic tool to target unique signature loci of plant and animal pathogens as well as beneficial microorganisms from samples. Here, we review various aspects of metagenomics, from experimental methods to sequencing techniques, as well as diverse computational resources, including databases and software tools. We focus in depth on the application of metagenomics in agriculture, covering areas including pathogen and plant disease identification, disease resistance breeding, plant pest control, weed management, abiotic stress management, post-harvest management, discoveries in agriculture, sources of novel molecules/compounds, biosurfactants and natural products, identification of biosynthetic molecules, use in genetically modified crops, and antibiotic resistance genes. Metagenome-wide association studies in agriculture, covering crop productivity rates, intercropping analysis, and the agronomic field, are also analyzed. This article is the first comprehensive review of its kind from an agricultural perspective, focusing on a wide range of applications of metagenomics and its association studies.
Affiliation(s)
- Mir Asif Iquebal
  - Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Jaisri Jagannadham
  - Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Sarika Jaiswal
  - Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Ratna Prabha
  - Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Anil Rai
  - Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Dinesh Kumar
  - Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
  - School of Interdisciplinary and Applied Sciences, Central University of Haryana, Mahendergarh, Haryana, India
4
Contribution of Noncanonical Antigens to Virulence and Adaptive Immunity in Human Infection with Enterotoxigenic E. coli. Infect Immun 2021; 89:IAI.00041-21. [PMID: 33558320 DOI: 10.1128/iai.00041-21] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/22/2021] [Accepted: 01/27/2021] [Indexed: 12/12/2022]
Abstract
Enterotoxigenic Escherichia coli (ETEC) contributes significantly to the substantial burden of infectious diarrhea among children living in low- and middle-income countries. In the absence of a vaccine for ETEC, children succumb to acute dehydration as well as nondiarrheal sequelae related to these infections, including malnutrition. The considerable diversity of ETEC genomes has complicated canonical vaccine development approaches defined by a subset of ETEC pathovar-specific antigens known as colonization factors (CFs). To identify additional conserved immunogens unique to this pathovar, we employed an "open-aperture" approach to capture all potential conserved ETEC surface antigens, in which we mined the genomic sequences of 89 ETEC isolates, bioinformatically selected potential surface-exposed pathovar-specific antigens conserved in more than 40% of the genomes (n = 118), and assembled the representative proteins onto microarrays, complemented with known or putative colonization factor subunit molecules (n = 52) and toxin subunits. These arrays were then used to interrogate samples from individuals with acute symptomatic ETEC infections. Surprisingly, in this approach, we found that immune responses were largely constrained to a small number of antigens, including individual colonization factor antigens and EtpA, an extracellular adhesin. In a Bangladeshi cohort of naturally infected children <2 years of age, both EtpA and a second antigen, EatA, elicited significant serologic responses that were associated with protection from symptomatic illness. In addition, children infected with ETEC isolates bearing either etpA or eatA genes were significantly more likely to develop symptomatic disease. These studies support a role for antigens not presently targeted by vaccines (noncanonical) in virulence and the development of adaptive immune responses during ETEC infections. These findings may inform vaccine design efforts to complement existing approaches.
5
Pal S, Mondal S, Das G, Khatua S, Ghosh Z. Big data in biology: The hope and present-day challenges in it. Gene Reports 2020. [DOI: 10.1016/j.genrep.2020.100869] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/23/2022]
6
Complete Genome Sequence of Francisella halioticida Type Strain DSM 23729 (FSC1005). Microbiol Resour Announc 2020; 9:9/37/e00541-20. [PMID: 32912905 PMCID: PMC7484064 DOI: 10.1128/mra.00541-20] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/20/2022]
Abstract
Here, we announce the complete genome sequence of the Francisella halioticida type strain DSM 23729 (FSC1005), isolated from a diseased cultured giant abalone in Japan in 2005. The genome is composed of a 2,197,430-bp-long circular chromosome, with a G+C content of 31.2%.
7
Hernandes RT, Hazen TH, dos Santos LF, Richter TKS, Michalski JM, Rasko DA. Comparative genomic analysis provides insight into the phylogeny and virulence of atypical enteropathogenic Escherichia coli strains from Brazil. PLoS Negl Trop Dis 2020; 14:e0008373. [PMID: 32479541 PMCID: PMC7289442 DOI: 10.1371/journal.pntd.0008373] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/06/2019] [Revised: 06/11/2020] [Accepted: 05/07/2020] [Indexed: 12/21/2022]
Abstract
Background: Atypical enteropathogenic Escherichia coli (aEPEC) are one of the most frequent intestinal E. coli pathotypes isolated from diarrheal patients in Brazil. Isolates of aEPEC contain the locus of enterocyte effacement, but lack the genes of the bundle-forming pilus of typical EPEC, and the Shiga toxin of enterohemorrhagic E. coli (EHEC). The objective of this study was to evaluate the phylogeny and the gene content of Brazilian aEPEC genomes compared to a global aEPEC collection.
Methodology: Single nucleotide polymorphism (SNP)-based phylogenomic analysis was used to compare 106 sequenced Brazilian aEPEC with 221 aEPEC obtained from other geographic origins. Additionally, Large-Scale BLAST Score Ratio was used to determine the shared versus unique gene content of the aEPEC studied.
Principal Findings: Phylogenomic analysis demonstrated the 106 Brazilian aEPEC were present in phylogroups B1 (47.2%, 50/106), B2 (23.6%, 25/106), A (22.6%, 24/106), and E (6.6%, 7/106). Identification of EPEC and EHEC phylogenomic lineages demonstrated that 42.5% (45/106) of the Brazilian aEPEC were in four of the previously defined lineages: EPEC10 (17.9%, 19/106), EPEC9 (10.4%, 11/106), EHEC2 (7.5%, 8/106) and EPEC7 (6.6%, 7/106). Interestingly, an additional 28.3% (30/106) of the Brazilian aEPEC were identified in five novel lineages: EPEC11 (14.2%, 15/106), EPEC12 (4.7%, 5/106), EPEC13 (1.9%, 2/106), EPEC14 (5.7%, 6/106) and EPEC15 (1.9%, 2/106). We identified 246 genes that were more frequent among the aEPEC isolates from Brazil compared to the global aEPEC collection, including espG2, espT and espC (P<0.001). Moreover, the nleF gene was more frequently identified among Brazilian aEPEC isolates obtained from patients with diarrhea when compared to healthy subjects (69.7% vs 41.2%, P<0.05).
Conclusion: The current study demonstrates significant genomic diversity among aEPEC from Brazil, including the assignment of Brazilian aEPEC isolates to five novel EPEC lineages. The greater prevalence of some virulence genes among Brazilian aEPEC genomes could be important to the specific virulence strategies used by aEPEC in Brazil to cause diarrheal disease.
Atypical EPEC (aEPEC) is one of the most frequent diarrheagenic Escherichia coli pathotypes isolated from patients in Brazil and is associated with diarrheal outbreaks. This study is the first to sequence the genomes of a collection of aEPEC isolates from a South American country, Brazil, and compare their phylogenetic relationships and gene content with a global collection of aEPEC. This approach identified Brazilian aEPEC genomes in previously characterized EPEC/EHEC phylogenomic lineages and resulted in the identification of five novel EPEC phylogenomic lineages, designated EPEC11 to EPEC15. We also observed that virulence genes, such as espG2, espT and espC, were more frequently identified among the Brazilian aEPEC genomes, demonstrating potential differences in the virulence repertoire of this pathogen in Brazil.
Affiliation(s)
- Rodrigo T. Hernandes
  - Departamento de Microbiologia e Imunologia, Instituto de Biociências, Universidade Estadual Paulista “Júlio de Mesquita Filho” (UNESP), Botucatu, SP, Brasil
- Tracy H. Hazen
  - Institute for Genome Sciences, Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, Maryland, United States of America
- Taylor K. S. Richter
  - Institute for Genome Sciences, Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, Maryland, United States of America
- Jane M. Michalski
  - Institute for Genome Sciences, Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, Maryland, United States of America
- David A. Rasko
  - Institute for Genome Sciences, Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, Maryland, United States of America
8
D'Mello A, Ahearn CP, Murphy TF, Tettelin H. ReVac: a reverse vaccinology computational pipeline for prioritization of prokaryotic protein vaccine candidates. BMC Genomics 2019; 20:981. [PMID: 31842745 PMCID: PMC6916091 DOI: 10.1186/s12864-019-6195-y] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 07/19/2019] [Accepted: 10/16/2019] [Indexed: 12/24/2022]
Abstract
Background: Reverse vaccinology accelerates the discovery of potential vaccine candidates (PVCs) prior to experimental validation. Current programs typically use one bacterial proteome to identify PVCs through a filtering architecture using feature prediction programs or a machine learning approach. Filtering approaches may eliminate potential antigens based on limitations in the accuracy of prediction tools used. Machine learning approaches are heavily dependent on the selection of training datasets with experimentally validated antigens (positive control) and non-protective antigens (negative control). The use of one or few bacterial proteomes does not assess PVC conservation among strains, an important feature of vaccine antigens.
Results: We present ReVac, which implements both a panoply of feature prediction programs without filtering out proteins, and scoring of candidates based on predictions made on curated positive and negative control PVC datasets. ReVac surveys several genomes assessing protein conservation, as well as DNA and protein repeats, which may result in variable expression of PVCs. ReVac’s orthologous clustering of conserved genes identifies core and dispensable genome components. This is useful for determining the degree of conservation of PVCs among the population of isolates for a given pathogen. Potential vaccine candidates are then prioritized based on conservation and overall feature-based scoring. We present the application of ReVac to 69 Moraxella catarrhalis and 270 non-typeable Haemophilus influenzae genomes, prioritizing 64 and 29 proteins as PVCs, respectively.
Conclusion: ReVac’s use of a scoring scheme ranks PVCs for subsequent experimental testing. It employs a redundancy-based approach in its predictions of features using several prediction tools. The protein’s features are collated, and each protein is ranked based on the scoring scheme. Multi-genome analyses performed in ReVac allow for a comprehensive overview of PVCs from a pan-genome perspective, as an essential prerequisite for any bacterial subunit vaccine design. ReVac prioritized PVCs of two human respiratory pathogens, identifying both novel and previously validated PVCs.
Affiliation(s)
- Adonis D'Mello
  - Department of Microbiology and Immunology, Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, 21201, USA
- Christian P Ahearn
  - Department of Microbiology and Immunology, University at Buffalo, the State University of New York, Buffalo, NY, USA
  - Clinical and Translational Research Center, University at Buffalo, the State University of New York, Buffalo, NY, USA
- Timothy F Murphy
  - Department of Microbiology and Immunology, University at Buffalo, the State University of New York, Buffalo, NY, USA
  - Clinical and Translational Research Center, University at Buffalo, the State University of New York, Buffalo, NY, USA
  - Division of Infectious Disease, Department of Medicine, University at Buffalo, the State University of New York, Buffalo, NY, 14203, USA
- Hervé Tettelin
  - Department of Microbiology and Immunology, Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, 21201, USA
9
Rasko DA, Del Canto F, Luo Q, Fleckenstein JM, Vidal R, Hazen TH. Comparative genomic analysis and molecular examination of the diversity of enterotoxigenic Escherichia coli isolates from Chile. PLoS Negl Trop Dis 2019; 13:e0007828. [PMID: 31747410 PMCID: PMC6901236 DOI: 10.1371/journal.pntd.0007828] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 06/19/2019] [Revised: 12/09/2019] [Accepted: 10/04/2019] [Indexed: 02/02/2023]
Abstract
Enterotoxigenic Escherichia coli (ETEC) is one of the most common diarrheal pathogens in the low- and middle-income regions of the world; however, a systematic examination of the genomic content of isolates from Chile has not yet been undertaken. Whole genome sequencing and comparative analysis of a collection of 125 ETEC isolates from three geographic locations in Chile allowed the interrogation of phylogenomic groups, sequence types and genes specific to isolates from the different geographic locations. A total of 80.8% (101/125) of the ETEC isolates were identified in E. coli phylogroup A, 15.2% (19/125) in phylogroup B, and 4.0% (5/125) in phylogroup E. The over-representation of genomes in phylogroup A was significantly different from other global ETEC genomic studies. The Chilean ETEC isolates could be further subdivided into sub-clades similar to previously defined global ETEC reference lineages that had conserved multi-locus sequence types and toxin profiles. Comparison of the gene content of the Chilean ETEC identified genes that were unique based on geographic location within Chile, phylogenomic classifications or sequence type. Completion of a limited number of genomes provided insight into the ETEC plasmid content, which is conserved in some phylogenomic groups and not conserved in others. These findings suggest that the Chilean ETEC isolates contain unique virulence factor combinations and genomic content compared to global reference ETEC isolates.
Affiliation(s)
- David A. Rasko
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland, United States of America
  - Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, Maryland, United States of America
- Felipe Del Canto
  - Programa de Microbiología y Micología, Instituto de Ciencias Biomédicas, Facultad de Medicina, Universidad de Chile, Santiago, Chile
- Qingwei Luo
  - Department of Medicine, Division of Infectious Diseases, Washington University School of Medicine, Saint Louis, Missouri, United States of America
- James M. Fleckenstein
  - Department of Medicine, Division of Infectious Diseases, Washington University School of Medicine, Saint Louis, Missouri, United States of America
  - Veterans Affairs Medical Center, Saint Louis, Missouri, United States of America
- Roberto Vidal
  - Programa de Microbiología y Micología, Instituto de Ciencias Biomédicas, Facultad de Medicina, Universidad de Chile, Santiago, Chile
  - Instituto Milenio de Inmunología e Inmunoterapia, Facultad de Medicina, Universidad de Chile, Santiago, Chile
- Tracy H. Hazen
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland, United States of America
  - Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, Maryland, United States of America
10
Xie G, Cheng Q, Daligault H, Davenport K, Gleasner C, Jacobs L, Kubicek-Sutherland J, LeCuyer T, Otieno V, Raballah E, Doggett N, Mukundan H, Perkins DJ, McMahon B. Draft Genome Sequences of Two Staphylococcus warneri Clinical Isolates, Strains SMA0023-04 (UGA3) and SMA0670-05 (UGA28), from Siaya County Referral Hospital, Siaya, Kenya. Microbiol Resour Announc 2019; 8:e01595-18. [PMID: 30975813 PMCID: PMC6460036 DOI: 10.1128/mra.01595-18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2019] [Accepted: 03/15/2019] [Indexed: 11/20/2022]
Abstract
We report the complete draft genome sequences of two Staphylococcus warneri clinical isolates, strains SMA0023-04 (UGA3) and SMA0670-05 (UGA28), each of which contains one chromosome and at least one plasmid. Isolate SMA0023-04 (UGA3) contains tetracycline efflux major facilitator superfamily (MFS) transporter (tetK), macrolide resistance (msrC and mphC), and beta-lactamase (blaZ) genes on its plasmids.
Affiliation(s)
- Gary Xie
  - Biosecurity and Public Health, Bioscience Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA
- Qiuying Cheng
  - Center for Global Health, Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, New Mexico, USA
- Hajnalka Daligault
  - Biosecurity and Public Health, Bioscience Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA
- Karen Davenport
  - Biosecurity and Public Health, Bioscience Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA
- Cheryl Gleasner
  - Biosecurity and Public Health, Bioscience Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA
- Lindsey Jacobs
  - Theoretical Biology and Biophysics, Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA
- Jessica Kubicek-Sutherland
  - Physical Chemistry and Applied Spectroscopy, Chemistry Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA
- Tessa LeCuyer
  - Center for Global Health, Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, New Mexico, USA
- Vincent Otieno
  - University of New Mexico Laboratories of Parasitic and Viral Diseases, Kisumu, Kenya
- Evans Raballah
  - Department of Medical Laboratory Sciences, School of Public Health, Biomedical Sciences and Technology, Masinde Muliro University of Science and Technology, Kakamega, Kenya
- Norman Doggett
  - Biosecurity and Public Health, Bioscience Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA
- Harshini Mukundan
  - Physical Chemistry and Applied Spectroscopy, Chemistry Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA
- Douglas J Perkins
  - Center for Global Health, Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, New Mexico, USA
- Benjamin McMahon
  - Theoretical Biology and Biophysics, Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA
11
Genome and Functional Characterization of Colonization Factor Antigen I- and CS6-Encoding Heat-Stable Enterotoxin-Only Enterotoxigenic Escherichia coli Reveals Lineage and Geographic Variation. mSystems 2019; 4:mSystems00329-18. [PMID: 30944874 PMCID: PMC6446980 DOI: 10.1128/msystems.00329-18] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 12/17/2018] [Accepted: 12/17/2018] [Indexed: 12/23/2022]
Abstract
Enterotoxigenic Escherichia coli (ETEC) is a significant cause of childhood diarrhea and is a leading cause of traveler’s diarrhea. ETEC strains encoding the heat-stable enterotoxin (ST) are more often associated with childhood diarrhea than ETEC strains that encode only the heat-labile enterotoxin (LT). Colonization factors (CFs) also have a demonstrated role in ETEC virulence, and two of the most prevalent CFs among ETEC that have caused diarrhea are colonization factor antigen I (CFA/I) and CS6. In the current report, we describe the genomes of 269 CS6- or CFA/I-encoding ST-only ETEC isolates that were associated with human diarrhea. While the CS6 and CFA/I ETEC were identified in at least 13 different ETEC genomic lineages, a majority (85%; 229/269) were identified in only six lineages. 
Complete genome sequencing of selected isolates demonstrated that a conserved plasmid contributed to the dissemination of CFA/I whereas at least five distinct plasmids were involved in the dissemination of ST and/or CS6. Additionally, there were differences in gene content between CFA/I and CS6 ETEC at the phylogroup and lineage levels and in association with their geographic location of isolation as well as lineage-related differences in ST production. Thus, we demonstrate that genomically diverse E. coli strains have acquired ST, as well as CFA/I or CS6, via one or more plasmids and that, in some cases, isolates of a particular lineage or geographic location have undergone additional modifications to their genome content. These findings will aid investigations of virulence and the development of improved diagnostics and vaccines against this important human diarrheal pathogen. IMPORTANCE Comparative genomics and functional characterization were used to analyze a global collection of CFA/I and CS6 ST-only ETEC isolates associated with human diarrhea, demonstrating differences in the genomic content of CFA/I and CS6 isolates related to CF type, lineage, and geographic location of isolation and also lineage-related differences in ST production. Complete genome sequencing of selected CFA/I and CS6 isolates enabled descriptions of a highly conserved ST-positive (ST+) CFA/I plasmid and of at least five diverse ST and/or CS6 plasmids among the CS6 ETEC isolates. There is currently no approved vaccine for ST-only ETEC, or for any ETEC for that matter, and as such, the current report provides functional verification of ST and CF production and antimicrobial susceptibility testing and an in-depth genomic characterization of a collection of isolates that could serve as representatives of CFA/I- or CS6-encoding ST-only ETEC strains for future studies of ETEC pathogenesis, vaccine studies, and/or clinical trials.
12
Temporal Variability of Escherichia coli Diversity in the Gastrointestinal Tracts of Tanzanian Children with and without Exposure to Antibiotics. mSphere 2018; 3:3/6/e00558-18. [PMID: 30404930 PMCID: PMC6222053 DOI: 10.1128/msphere.00558-18] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 11/20/2022]
Abstract
The stability of the Escherichia coli populations in the human gastrointestinal tract is not fully appreciated, and represents a significant knowledge gap regarding gastrointestinal community structure, as well as resistance to incoming pathogenic bacterial species and antibiotic treatment. The current study examines the genomic content of 240 Escherichia coli isolates from 30 children, aged 2 to 35 months old, in Tanzania. The E. coli strains were isolated from three time points spanning a six-month time period, with and without antibiotic treatment. The resulting isolates were sequenced, and the genomes compared. The findings in this study highlight the transient nature of E. coli strains in the gastrointestinal tract of these children, as during a six-month interval, no one individual contained phylogenomically related isolates at all three time points. While the majority of the isolates at any one time point were phylogenomically similar, most individuals did not contain phylogenomically similar isolates at more than two time points. 
Examination of global genome content, canonical E. coli virulence factors, multilocus sequence type, serotype, and antimicrobial resistance genes identified diversity even among phylogenomically similar strains. There was no apparent increase in the antimicrobial resistance gene content after antibiotic treatment. The examination of the E. coli from longitudinal samples from multiple children in Tanzania provides insight into the genomic diversity and population variability of resident E. coli within the rapidly changing environment of the gastrointestinal tract of these children. IMPORTANCE This study increases the number of resident Escherichia coli genome sequences, and explores E. coli diversity through longitudinal sampling. We investigate the genomes of E. coli isolated from human gastrointestinal tracts as part of an antibiotic treatment program among rural Tanzanian children. Phylogenomics demonstrates that resident E. coli are diverse, even within a single host. Though the E. coli isolates of the gastrointestinal community tend to be phylogenomically similar at a given time, they differed across the interrogated time points, demonstrating the variability of the members of the E. coli community in these subjects. Exposure to antibiotic treatment did not have an apparent impact on the E. coli community or the presence of resistance and virulence genes within E. coli genomes. The findings of this study highlight the variable nature of specific bacterial members of the human gastrointestinal tract.
|
13
|
A comparative analysis of library prep approaches for sequencing low input translatome samples. BMC Genomics 2018; 19:696. [PMID: 30241496 PMCID: PMC6151020 DOI: 10.1186/s12864-018-5066-2] [Citation(s) in RCA: 49] [Impact Index Per Article: 8.2] [Received: 11/10/2017] [Accepted: 09/11/2018] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND Cell type-specific ribosome-pulldown has become an increasingly popular method for analysis of gene expression. It allows for expression analysis from intact tissues and monitoring of protein synthesis in vivo. However, while its utility has been assessed, technical aspects related to sequencing of these samples, which often start with a smaller amount of RNA, have not been reported. In this study, we evaluated the performance of five library prep protocols for ribosome-associated mRNAs when only 250 pg-4 ng of total RNA are used. RESULTS We obtained total and RiboTag-IP RNA, in three biological replicates. We compared five methods of library preparation for Illumina Next Generation sequencing: NuGEN Ovation RNA-Seq system V2 Kit, TaKaRa SMARTer Stranded Total RNA-Seq Kit, TaKaRa SMART-Seq v4 Ultra Low Input RNA Kit, Illumina TruSeq RNA Library Prep Kit v2 and NEBNext® Ultra™ Directional RNA Library Prep Kit, using slightly modified protocols, each with 4 ng of total RNA. An additional set of samples was processed using the TruSeq kit with 70 ng, as a 'gold standard' control, and the SMART-Seq v4 kit with 250 pg of total RNA. TruSeq-processed samples had the best metrics overall, with similar results for the 4 ng and 70 ng samples. The results of the SMART-Seq v4 processed samples were similar to TruSeq (Spearman correlation > 0.8) despite using a lower amount of input RNA. All RiboTag-IP samples had an increase in intronic reads compared with the corresponding whole tissue, suggesting that the IP captures some immature mRNAs. The SMARTer-processed samples had a higher representation of ribosomal and non-coding RNAs, leading to lower representation of protein-coding mRNA. The enrichment or depletion of IP samples compared to corresponding input RNA was similar across all kits except for the SMARTer kit.
CONCLUSION RiboTag-seq can be performed successfully with as little as 250 pg of total RNA when using the SMART-Seq v4 kit and 4 ng when using the modified protocols of other library preparation kits. The SMART-Seq v4 and TruSeq kits resulted in the highest quality libraries. RiboTag IP RNA contains some immature transcripts.
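The similarity measure quoted in the abstract above (Spearman correlation > 0.8) is the Pearson correlation of ranks. The abstract does not state exactly which quantities were correlated, so the following is only a minimal sketch that assumes per-gene expression values from two library preps; the function names and the example numbers are hypothetical and are not data from the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]]
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical expression values for the same five genes in two preps
prep_a = [10.0, 25.0, 40.0, 55.0, 90.0]
prep_b = [30.0, 12.0, 70.0, 45.0, 95.0]
print(spearman(prep_a, prep_b))
```

Because only ranks enter the calculation, the measure is insensitive to monotone scale differences between kits, which is one plausible reason to prefer it when comparing libraries built from very different input amounts.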
|
14
|
Ma ZS, Li L. Measuring metagenome diversity and similarity with Hill numbers. Mol Ecol Resour 2018; 18:1339-1355. [PMID: 29985552 DOI: 10.1111/1755-0998.12923] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 10/13/2017] [Revised: 01/31/2018] [Accepted: 02/17/2018] [Indexed: 11/27/2022]
Abstract
The first step of any metagenome sequencing project is to obtain the inventory of OTU (operational taxonomic unit) abundances and/or metagenomic gene abundances. The former is generated with 16S-rRNA-tagged amplicon sequencing technology, and the latter can be generated from either gene-targeted or whole-sample shotgun metagenomics technologies. With 16S-rRNA data sets, measuring community diversity with diversity indexes such as species richness and Shannon's index has been a de facto standard analysis; nevertheless, similarly comprehensive approaches to metagenomic gene abundances are still largely missing, even though both OTU and gene abundances are DNA reads. Here, we adapt the Hill numbers, which were reintroduced to macrocommunity ecology recently and are now widely regarded as a most appropriate measure system for ecological diversity, for measuring metagenome alpha-, beta- and gamma-diversities, and similarity. Our proposal includes the following: (a) Metagenomic gene (MG) diversity measures the single-gene-level metagenome diversity; (b) Type-I metagenome functional gene cluster (MFGC) diversity measures the diversity of functional gene clusters but ignores within-cluster gene abundance information; (c) Type-II MFGC diversity considers within-cluster gene abundance information and integrates gene-cluster-level metagenome diversity and functional gene redundancy information; and (d) four classes of Hill-numbers-based similarity metrics, including local gene overlap, regional gene overlap, gene homogeneity measure and gene turnover complement, are introduced in terms of MG and MFGC, respectively. We demonstrate the proposal with the gut metagenomes from healthy and IBD (inflammatory bowel disease) cohorts. The Hill numbers offer a unified approach to cohesively and comprehensively measuring the ecological and metagenome diversities of microbiomes.
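For readers unfamiliar with Hill numbers, the standard definition for relative abundances p_i is D(q) = (sum_i p_i^q)^(1/(1-q)), with the q -> 1 limit equal to the exponential of Shannon entropy. The sketch below computes only this basic alpha-diversity quantity; it does not reproduce the paper's MG or MFGC definitions or its beta- and gamma-diversity and similarity metrics, and the function name and example counts are hypothetical.

```python
import math


def hill_number(abundances, q):
    """Hill number (effective number of types) of order q.

    abundances: non-negative counts or abundances, e.g. reads per gene.
    """
    total = sum(abundances)
    # Relative abundances; absent types (zero counts) are dropped
    p = [a / total for a in abundances if a > 0]
    if math.isclose(q, 1.0):
        # Limit q -> 1: exponential of Shannon entropy
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))


# Hypothetical gene abundance counts for a single metagenome sample
counts = [50, 30, 15, 5]
for q in (0, 1, 2):
    print(q, hill_number(counts, q))
```

Order q = 0 counts the types present (richness), q = 1 weights types in proportion to their abundance, and q = 2 emphasizes dominant types (the inverse Simpson index), so a profile over q summarizes how evenly abundance is distributed.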
Affiliation(s)
- Zhanshan Sam Ma
- Computational Biology and Medical Ecology Lab, State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China.,Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China
| | - Lianwei Li
- Computational Biology and Medical Ecology Lab, State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
| |
|
15
|
Responses of the Human Gut Escherichia coli Population to Pathogen and Antibiotic Disturbances. mSystems 2018; 3:mSystems00047-18. [PMID: 30057943 PMCID: PMC6060285 DOI: 10.1128/msystems.00047-18] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 04/03/2018] [Accepted: 07/06/2018] [Indexed: 11/23/2022] Open
Abstract
Studies of Escherichia coli in the human gastrointestinal tract have focused on pathogens, such as diarrhea-causing enterotoxigenic E. coli (ETEC), while overlooking the resident, nonpathogenic E. coli community. Relatively few genomes of nonpathogenic E. coli strains are available for comparative genomic analysis, and the ecology of these strains is poorly understood. This study examined the diversity and dynamics of resident human gastrointestinal E. coli communities in the face of the ecological challenges presented by pathogen (ETEC) challenge, as well as of antibiotic treatment. Whole-genome sequences obtained from E. coli isolates from before, during, and after ETEC challenge were used in phylogenomic and comparative genomic analyses to examine the diversity of the resident E. coli communities, as well as the dynamics of the challenge strain, H10407, a well-studied ETEC strain (serotype O78:H11) that produces both heat-labile and heat-stable enterotoxins. ETEC failed to become the dominant E. coli clone in two of the six challenge subjects, each of whom exhibited limited or no clinical presentation of diarrhea. The E. coli communities of the remaining four subjects became ETEC dominant during the challenge but reverted to their original, subject-specific populations following antibiotic treatment, suggesting resiliency of the resident E. coli population following major ecological disruptions. This resiliency is likely due in part to the abundance of antibiotic-resistant ST131 E. coli strains in the resident populations. This report provides valuable insights into the potential interactions of members of the gastrointestinal microbiome and its responses to challenge by an external pathogen and by antibiotic exposure. IMPORTANCE Research on human-associated E. coli tends to focus on pathogens, such as enterotoxigenic E. coli (ETEC) strains, which are a leading cause of diarrhea in developing countries. However, the severity of disease caused by these pathogens is thought to be influenced by the microbiome. The nonpathogenic E. coli community that resides in the human gastrointestinal tract may play a role in pathogen colonization and disease severity and may become a reservoir for virulence and antibiotic resistance genes. Our study used whole-genome sequencing of E. coli before, during, and after challenge with an archetype ETEC isolate, H10407, and antibiotic treatment to explore the diversity and resiliency of the resident E. coli population in response to the ecological disturbances caused by pathogen invasion and antibiotic treatment.
|
16
|
Nickerson KP, Senger S, Zhang Y, Lima R, Patel S, Ingano L, Flavahan WA, Kumar DKV, Fraser CM, Faherty CS, Sztein MB, Fiorentino M, Fasano A. Salmonella Typhi Colonization Provokes Extensive Transcriptional Changes Aimed at Evading Host Mucosal Immune Defense During Early Infection of Human Intestinal Tissue. EBioMedicine 2018; 31:92-109. [PMID: 29735417 PMCID: PMC6013756 DOI: 10.1016/j.ebiom.2018.04.005] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 11/01/2017] [Revised: 04/02/2018] [Accepted: 04/05/2018] [Indexed: 12/29/2022] Open
Abstract
Commensal microorganisms influence a variety of host functions in the gut, including immune response, glucose homeostasis, metabolic pathways and oxidative stress, among others. This study describes how Salmonella Typhi, the pathogen responsible for typhoid fever, uses similar strategies to escape immune defense responses and survive within its human host. To elucidate the early mechanisms of typhoid fever, we performed studies using healthy human intestinal tissue samples and "mini-guts," organoids grown from intestinal tissue taken from biopsy specimens. We analyzed gene expression changes in human intestinal specimens and bacterial cells both separately and after colonization. Our results showed mechanistic strategies that S. Typhi uses to rearrange the cellular machinery of the host cytoskeleton to successfully invade the intestinal epithelium, promote polarized cytokine release and evade immune system activation by downregulating genes involved in antigen sampling and presentation during infection. This work adds novel information regarding S. Typhi infection pathogenesis in humans, by replicating work shown in traditional cell models, and providing new data that can be applied to future vaccine development strategies.
Affiliation(s)
- K P Nickerson
- Department of Pediatric Gastroenterology, Mucosal Immunology and Biology Research Center, Massachusetts General Hospital, Boston, MA, United States; Department of Pediatrics, Harvard Medical School, Harvard University, Boston, MA, United States.
| | - S Senger
- Department of Pediatric Gastroenterology, Mucosal Immunology and Biology Research Center, Massachusetts General Hospital, Boston, MA, United States; Department of Pediatrics, Harvard Medical School, Harvard University, Boston, MA, United States
| | - Y Zhang
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, United States
| | - R Lima
- Department of Pediatric Gastroenterology, Mucosal Immunology and Biology Research Center, Massachusetts General Hospital, Boston, MA, United States
| | - S Patel
- Department of Pediatric Gastroenterology, Mucosal Immunology and Biology Research Center, Massachusetts General Hospital, Boston, MA, United States
| | - L Ingano
- Department of Pediatric Gastroenterology, Mucosal Immunology and Biology Research Center, Massachusetts General Hospital, Boston, MA, United States
| | - W A Flavahan
- Department of Pathology, Massachusetts General Hospital, Boston, MA, United States
| | - D K V Kumar
- Department for the Neuroscience of Genetics and Aging, Massachusetts General Hospital, Boston, MA, United States
| | - C M Fraser
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, United States
| | - C S Faherty
- Department of Pediatric Gastroenterology, Mucosal Immunology and Biology Research Center, Massachusetts General Hospital, Boston, MA, United States; Department of Pediatrics, Harvard Medical School, Harvard University, Boston, MA, United States
| | - M B Sztein
- Center for Vaccine Development, Department of Pediatrics, University of Maryland, Baltimore, MD, United States
| | - M Fiorentino
- Department of Pediatric Gastroenterology, Mucosal Immunology and Biology Research Center, Massachusetts General Hospital, Boston, MA, United States; Department of Pediatrics, Harvard Medical School, Harvard University, Boston, MA, United States
| | - A Fasano
- Department of Pediatric Gastroenterology, Mucosal Immunology and Biology Research Center, Massachusetts General Hospital, Boston, MA, United States; Department of Pediatrics, Harvard Medical School, Harvard University, Boston, MA, United States.
| |
|
17
|
Shuman J, Giles TX, Carroll L, Tabata K, Powers A, Suh SJ, Silo-Suh L. Transcriptome analysis of a Pseudomonas aeruginosa sn-glycerol-3-phosphate dehydrogenase mutant reveals a disruption in bioenergetics. MICROBIOLOGY-SGM 2018. [PMID: 29533746 DOI: 10.1099/mic.0.000646] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 01/09/2023]
Abstract
Pseudomonas aeruginosa causes acute and chronic human infections and is the major cause of morbidity and mortality in cystic fibrosis (CF) patients. We previously determined that the sn-glycerol-3-phosphate dehydrogenase encoded by glpD plays a larger role in P. aeruginosa physiology beyond its role in glycerol metabolism. To better understand the effect of a glpD mutation on P. aeruginosa physiology we compared the transcriptomes of P. aeruginosa strain PAO1 and the PAO1ΔglpD mutant using RNA-seq analysis. We determined that a null mutation of glpD significantly altered amino acid metabolism in P. aeruginosa and affected the production of intermediates that are channelled into the tricarboxylic acid cycle. Moreover, the loss of glpD induced a general stress response mediated by RpoS in P. aeruginosa. Several other phenotypes observed for the P. aeruginosa glpD mutant include increased persister cell formation, reduced extracellular ATP accumulation and increased heat output. Taken together, these findings implicate sn-glycerol-3-phosphate dehydrogenase as a key player in energy metabolism in P. aeruginosa.
Affiliation(s)
- Jon Shuman
- Department of Basic Medical Sciences, Mercer University, School of Medicine, Macon, GA 31207, USA
| | - Tyler Xavier Giles
- Department of Basic Medical Sciences, Mercer University, School of Medicine, Macon, GA 31207, USA
| | - Leslie Carroll
- Department of Basic Medical Sciences, Mercer University, School of Medicine, Macon, GA 31207, USA
| | - Kenji Tabata
- Daiichi University of Pharmacy, 22-1, Tamagawa-cho, Minami-ku, Fukuoka 815-8511, Japan
| | - Austin Powers
- Department of Basic Medical Sciences, Mercer University, School of Medicine, Macon, GA 31207, USA
| | - Sang-Jin Suh
- Department of Biological Sciences, Auburn University, AL 36849, USA
| | - Laura Silo-Suh
- Department of Basic Medical Sciences, Mercer University, School of Medicine, Macon, GA 31207, USA
| |
|
18
|
Sintsova A, Smith S, Subashchandrabose S, Mobley HL. Role of Ethanolamine Utilization Genes in Host Colonization during Urinary Tract Infection. Infect Immun 2018; 86:e00542-17. [PMID: 29229730 PMCID: PMC5820945 DOI: 10.1128/iai.00542-17] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 07/31/2017] [Accepted: 12/05/2017] [Indexed: 11/20/2022] Open
Abstract
Urinary tract infection (UTI) is the second most common infection in humans, making it a global health priority. Nearly half of all women will experience a symptomatic UTI, with uropathogenic Escherichia coli (UPEC) being the major causative agent of the infection. Although there has been extensive research on UPEC virulence determinants, the importance of host-specific metabolism remains understudied. We report here that UPEC upregulates the expression of ethanolamine utilization genes during uncomplicated UTIs in humans. We further show that UPEC ethanolamine metabolism is required for effective bladder colonization in the mouse model of ascending UTI and is dispensable for bladder colonization in an immunocompromised mouse model of UTI. We demonstrate that although ethanolamine metabolism mutants do not show increased susceptibility to antimicrobial responses of neutrophils, this metabolic pathway is important for surviving the innate immune system during UTI. This study reveals a novel aspect of UPEC metabolism in the host and provides evidence for an underappreciated link between bacterial metabolism and the host immune response.
Affiliation(s)
- Anna Sintsova
- Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, USA
| | - Sara Smith
- Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, USA
| | | | - Harry L Mobley
- Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, USA
| |
|
19
|
Abstract
Enteric pathogens with low infectious doses rely on the ability to orchestrate the expression of virulence and metabolism-associated genes in response to environmental cues for successful infection. Accordingly, the human pathogen enterohemorrhagic Escherichia coli (EHEC) employs a complex multifaceted regulatory network to link the expression of type III secretion system (T3SS) components to nutrient availability. While phosphorylation of histidine and aspartate residues on two-component system response regulators is recognized as an integral part of bacterial signaling, the involvement of phosphotyrosine-mediated control is minimally explored in Gram-negative pathogens. Our recent phosphotyrosine profiling study of E. coli identified 342 phosphorylated proteins, indicating that phosphotyrosine modifications in bacteria are more prevalent than previously anticipated. The present study demonstrates that tyrosine phosphorylation of a metabolite-responsive LacI/GalR family regulator, Cra, negatively affects T3SS expression under glycolytic conditions that are typical for the colonic lumen environment where production of the T3SS is unnecessary. Our data suggest that Cra phosphorylation affects T3SS expression by modulating the expression of ler, which encodes the major activator of EHEC virulence gene expression. Phosphorylation of the Cra Y47 residue diminishes DNA binding to fine-tune the expression of virulence-associated genes, including those of the locus of enterocyte effacement pathogenicity island that encode the T3SS, and thereby negatively affects the formation of attaching and effacing lesions. Our data indicate that tyrosine phosphorylation provides an additional mechanism to control the DNA binding of Cra and other LacI/GalR family regulators, including LacI and PurR. This study describes an initial effort to unravel the role of global phosphotyrosine signaling in the control of EHEC virulence potential. 
Enterohemorrhagic Escherichia coli (EHEC) causes outbreaks of hemorrhagic colitis and the potentially fatal hemolytic-uremic syndrome. Successful host colonization by EHEC relies on the ability to coordinate the expression of virulence factors in response to environmental cues. A complex network that integrates environmental signals at multiple regulatory levels tightly controls virulence gene expression. We demonstrate that EHEC utilizes a previously uncharacterized phosphotyrosine signaling pathway through Cra to fine-tune the expression of virulence-associated genes to effectively control T3SS production. This study demonstrates that tyrosine phosphorylation negatively affects the DNA-binding capacity of Cra, which affects the expression of genes related to virulence and metabolism. We demonstrate for the first time that phosphotyrosine-mediated control affects global transcription in EHEC. Our data provide insight into a hitherto unexplored regulatory level of the global network controlling EHEC virulence gene expression.
|
20
|
Ahmed Z, Ucar D. I-ATAC: interactive pipeline for the management and pre-processing of ATAC-seq samples. PeerJ 2017; 5:e4040. [PMID: 29181276 PMCID: PMC5702251 DOI: 10.7717/peerj.4040] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 04/21/2017] [Accepted: 10/25/2017] [Indexed: 11/20/2022] Open
Abstract
Assay for Transposase Accessible Chromatin (ATAC-seq) is an open chromatin profiling assay that is adapted to interrogate chromatin accessibility from small cell numbers. ATAC-seq surmounted a major technical barrier and enabled epigenome profiling of clinical samples. With this advancement in technology, we are now accumulating ATAC-seq data from clinical samples at an unprecedented rate. These epigenomic profiles hold the key to uncovering how transcriptional programs are established in diverse human cells and are disrupted by genetic or environmental factors. Thus, the barrier to deriving important clinical insights from clinical epigenomic samples is no longer one of data generation but of data analysis. Specifically, we are still missing easy-to-use software tools that will enable non-computational scientists to analyze their own ATAC-seq samples. To facilitate systematic pre-processing and management of ATAC-seq samples, we developed an interactive, cross-platform, user-friendly and customized desktop application: interactive-ATAC (I-ATAC). I-ATAC integrates command-line data processing tools (FASTQC, Trimmomatic, BWA, Picard, ATAC_BAM_shiftrt_gappedAlign.pl, Bedtools and Macs2) into an easy-to-use platform with a user interface to automatically pre-process ATAC-seq samples with parallelized and customizable pipelines. Its performance has been tested using public ATAC-seq datasets in GM12878 and CD4+ T cells, and a feature-based comparison is performed with some available interactive LIMS (Galaxy, SMITH, SeqBench, Wasp, NG6, openBIS). I-ATAC is designed to empower non-computational scientists to process their own datasets and to break the exclusivity of data analysis to computational scientists. Additionally, I-ATAC is capable of processing WGS and ChIP-seq samples, and can be customized by the user for one-independent or multiple-sequential operations.
Affiliation(s)
- Zeeshan Ahmed
- Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT, United States of America
| | - Duygu Ucar
- The Jackson Laboratory For Genomic Medicine, Farmington, CT, United States of America
| |
|
21
|
Lloyd-Price J, Mahurkar A, Rahnavard G, Crabtree J, Orvis J, Hall AB, Brady A, Creasy HH, McCracken C, Giglio MG, McDonald D, Franzosa EA, Knight R, White O, Huttenhower C. Strains, functions and dynamics in the expanded Human Microbiome Project. Nature 2017; 550:61-66. [PMID: 28953883 PMCID: PMC5831082 DOI: 10.1038/nature23889] [Citation(s) in RCA: 727] [Impact Index Per Article: 103.9] [Received: 11/30/2016] [Accepted: 08/08/2017] [Indexed: 12/29/2022]
Abstract
The characterization of baseline microbial and functional diversity in the human microbiome has enabled studies of microbiome-related disease, diversity, biogeography, and molecular function. The National Institutes of Health Human Microbiome Project has provided one of the broadest such characterizations so far. Here we introduce a second wave of data from the study, comprising 1,631 new metagenomes (2,355 total) targeting diverse body sites with multiple time points in 265 individuals. We applied updated profiling and assembly methods to provide new characterizations of microbiome personalization. Strain identification revealed subspecies clades specific to body sites; it also quantified species with phylogenetic diversity under-represented in isolate genomes. Body-wide functional profiling classified pathways into universal, human-enriched, and body site-enriched subsets. Finally, temporal analysis decomposed microbial variation into rapidly variable, moderately variable, and stable subsets. This study furthers our knowledge of baseline human microbial diversity and enables an understanding of personalized microbiome function and dynamics. Updates from the Human Microbiome Project analyse the largest known body-wide metagenomic profile of human microbiome personalization. The National Institutes of Health Human Microbiome Project, published in 2012, provided a broad overview of the baseline microbiome in healthy individuals using samples from 18 different body sites. In this second installment, the authors expand this dataset with new whole-metagenome sequences and additional time points to assess the diversity and spatiotemporal distributions of the microbiota at six of these body sites. Using a combination of strain profiling, species-level metagenomic functional profiling and longitudinal analyses, this study delivers deeper insights into human microbial communities and provides an important resource for understanding what constitutes a 'healthy' microbiota.
Affiliation(s)
- Jason Lloyd-Price
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts 02115, USA.,The Broad Institute, Cambridge, Massachusetts 02142, USA
| | - Anup Mahurkar
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA
| | - Gholamali Rahnavard
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts 02115, USA.,The Broad Institute, Cambridge, Massachusetts 02142, USA
| | - Jonathan Crabtree
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA
| | - Joshua Orvis
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA
| | | | - Arthur Brady
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA
| | - Heather H Creasy
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA
| | - Carrie McCracken
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA
| | - Michelle G Giglio
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA
| | - Daniel McDonald
- Department of Pediatrics, University of California San Diego, La Jolla, California 92093, USA
| | - Eric A Franzosa
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts 02115, USA.,The Broad Institute, Cambridge, Massachusetts 02142, USA
| | - Rob Knight
- Department of Pediatrics, University of California San Diego, La Jolla, California 92093, USA.,Department of Computer Science & Engineering, University of California San Diego, La Jolla, California 92093, USA
| | - Owen White
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA
| | - Curtis Huttenhower
- Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts 02115, USA.,The Broad Institute, Cambridge, Massachusetts 02142, USA
| |
|
22
|
Transcriptional Variation of Diverse Enteropathogenic Escherichia coli Isolates under Virulence-Inducing Conditions. mSystems 2017; 2:mSystems00024-17. [PMID: 28766584 PMCID: PMC5527300 DOI: 10.1128/msystems.00024-17] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 03/28/2017] [Accepted: 05/06/2017] [Indexed: 12/23/2022] Open
Abstract
Enteropathogenic Escherichia coli (EPEC) bacteria are a diverse group of pathogens that cause moderate to severe diarrhea in young children in developing countries. EPEC isolates can be further subclassified as typical EPEC (tEPEC) isolates that contain the bundle-forming pilus (BFP) or as atypical EPEC (aEPEC) isolates that do not contain BFP. Comparative genomics studies have recently highlighted the considerable genomic diversity among EPEC isolates. In the current study, we used RNA sequencing (RNA-Seq) to characterize the global transcriptomes of eight tEPEC isolates representing the identified genomic diversity, as well as one aEPEC isolate. The global transcriptomes were determined for the EPEC isolates under conditions of laboratory growth that are known to induce expression of virulence-associated genes. The findings demonstrate that unique genes of EPEC isolates from diverse phylogenomic lineages contribute to variation in their global transcriptomes. There were also phylogroup-specific differences in the global transcriptomes, including genes involved in iron acquisition, which had significant differential expression in the EPEC isolates belonging to phylogroup B2. Also, three EPEC isolates from the same phylogenomic lineage (EPEC8) had greater levels of similarity in their genomic content and exhibited greater similarities in their global transcriptomes than EPEC from other lineages; however, even among closely related isolates there were isolate-specific differences among their transcriptomes. These findings highlight the transcriptional variability that correlates with the previously unappreciated genomic diversity of EPEC. IMPORTANCE Recent studies have demonstrated that there is considerable genomic diversity among EPEC isolates; however, it is unknown if this genomic diversity leads to differences in their global transcription. This study used RNA-Seq to compare the global transcriptomes of EPEC isolates from diverse phylogenomic lineages. 
We demonstrate that there are lineage- and isolate-specific differences in the transcriptomes of genomically diverse EPEC isolates during growth under in vitro virulence-inducing conditions. This study addressed biological variation among isolates of a single pathovar in an effort to demonstrate that while each of these isolates is considered an EPEC isolate, there is significant transcriptional diversity among members of this pathovar. Future studies should consider whether this previously undescribed transcriptional variation may play a significant role in isolate-specific variability of EPEC clinical presentations.
|
23
|
Comparative genomics and transcriptomics of Escherichia coli isolates carrying virulence factors of both enteropathogenic and enterotoxigenic E. coli. Sci Rep 2017; 7:3513. [PMID: 28615618 PMCID: PMC5471185 DOI: 10.1038/s41598-017-03489-z] [Citation(s) in RCA: 39] [Impact Index Per Article: 5.6] [Received: 12/22/2016] [Accepted: 04/28/2017] [Indexed: 12/21/2022] Open
Abstract
Escherichia coli that are capable of causing human disease are often classified into pathogenic variants (pathovars) based on their virulence gene content. However, disease-associated hybrid E. coli, containing unique combinations of multiple canonical virulence factors have also been described. Such was the case of the E. coli O104:H4 outbreak in 2011, which caused significant morbidity and mortality. Among the pathovars of diarrheagenic E. coli that cause significant human disease are the enteropathogenic E. coli (EPEC) and enterotoxigenic E. coli (ETEC). In the current study we use comparative genomics, transcriptomics, and functional studies to characterize isolates that contain virulence factors of both EPEC and ETEC. Based on phylogenomic analysis, these hybrid isolates are more genomically-related to EPEC, but appear to have acquired ETEC virulence genes. Global transcriptional analysis using RNA sequencing, demonstrated that the EPEC and ETEC virulence genes of these hybrid isolates were differentially-expressed under virulence-inducing laboratory conditions, similar to reference isolates. Immunoblot assays further verified that the virulence gene products were produced and that the T3SS effector EspB of EPEC, and heat-labile toxin of ETEC were secreted. These findings document the existence and virulence potential of an E. coli pathovar hybrid that blurs the distinction between E. coli pathovars.
|
24
|
Agrawal S, Arze C, Adkins RS, Crabtree J, Riley D, Vangala M, Galens K, Fraser CM, Tettelin H, White O, Angiuoli SV, Mahurkar A, Fricke WF. CloVR-Comparative: automated, cloud-enabled comparative microbial genome sequence analysis pipeline. BMC Genomics 2017; 18:332. [PMID: 28449639 PMCID: PMC5408420 DOI: 10.1186/s12864-017-3717-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 03/27/2017] [Accepted: 04/21/2017] [Indexed: 11/11/2022] Open
Abstract
Background The benefit of increasing genomic sequence data to the scientific community depends on easy-to-use, scalable bioinformatics support. CloVR-Comparative combines commonly used bioinformatics tools into an intuitive, automated, and cloud-enabled analysis pipeline for comparative microbial genomics. Results CloVR-Comparative runs on annotated complete or draft genome sequences that are uploaded by the user or selected via a taxonomic tree-based user interface and downloaded from NCBI. CloVR-Comparative runs reference-free multiple whole-genome alignments to determine unique, shared and core coding sequences (CDSs) and single nucleotide polymorphisms (SNPs). Output includes short summary reports and detailed text-based results files, graphical visualizations (phylogenetic trees, circular figures), and a database file linked to the Sybil comparative genome browser. Data up- and download, pipeline configuration and monitoring, and access to Sybil are managed through CloVR-Comparative web interface. CloVR-Comparative and Sybil are distributed as part of the CloVR virtual appliance, which runs on local computers or the Amazon EC2 cloud. Representative datasets (e.g. 40 draft and complete Escherichia coli genomes) are processed in <36 h on a local desktop or at a cost of <$20 on EC2. Conclusions CloVR-Comparative allows anybody with Internet access to run comparative genomics projects, while eliminating the need for on-site computational resources and expertise.
Affiliation(s)
| | - Cesar Arze
- Institute for Genome Sciences, Baltimore, MD, USA
| | | | | | - David Riley
- Institute for Genome Sciences, Baltimore, MD, USA
| | | | - Kevin Galens
- Institute for Genome Sciences, Baltimore, MD, USA
| | - Claire M Fraser
- Institute for Genome Sciences, Baltimore, MD, USA; Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Hervé Tettelin
- Institute for Genome Sciences, Baltimore, MD, USA; Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Owen White
- Institute for Genome Sciences, Baltimore, MD, USA; Department of Epidemiology, University of Maryland School of Medicine, Baltimore, MD, USA
| | | | | | - W Florian Fricke
- Institute for Genome Sciences, Baltimore, MD, USA; Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD, USA; Department of Nutrigenomics, University of Hohenheim, Stuttgart, Germany
|
25
|
Crawl D, Singh A, Altintas I. Kepler WebView: A Lightweight, Portable Framework for Constructing Real-time Web Interfaces of Scientific Workflows. Procedia Comput Sci 2017; 80:673-679. [PMID: 28232853 DOI: 10.1016/j.procs.2016.05.361] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 11/19/2022]
Abstract
Modern web technologies facilitate the creation of high-quality data visualizations, and rich, interactive components across a wide variety of devices. Scientific workflow systems can greatly benefit from these technologies by giving scientists a better understanding of their data or model leading to new insights. While several projects have enabled web access to scientific workflow systems, they are primarily organized as a large portal server encapsulating the workflow engine. In this vision paper, we propose the design for Kepler WebView, a lightweight framework that integrates web technologies with the Kepler Scientific Workflow System. By embedding a web server in the Kepler process, Kepler WebView enables a wide variety of usage scenarios that would be difficult or impossible using the portal model.
Affiliation(s)
- Daniel Crawl
- San Diego Supercomputer Center, University of California, San Diego
| | - Alok Singh
- San Diego Supercomputer Center, University of California, San Diego
| | - Ilkay Altintas
- San Diego Supercomputer Center, University of California, San Diego
|
26
|
Gonzalez S, Clavijo B, Rivarola M, Moreno P, Fernandez P, Dopazo J, Paniego N. ATGC transcriptomics: a web-based application to integrate, explore and analyze de novo transcriptomic data. BMC Bioinformatics 2017; 18:121. [PMID: 28222698 PMCID: PMC5320735 DOI: 10.1186/s12859-017-1494-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 10/21/2016] [Accepted: 01/21/2017] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND In the last years, applications based on massively parallelized RNA sequencing (RNA-seq) have become valuable approaches for studying non-model species, e.g., without a fully sequenced genome. RNA-seq is a useful tool for detecting novel transcripts and genetic variations and for evaluating differential gene expression by digital measurements. The large and complex datasets resulting from functional genomic experiments represent a challenge in data processing, management, and analysis. This problem is especially significant for small research groups working with non-model species. RESULTS We developed a web-based application, called ATGC transcriptomics, with a flexible and adaptable interface that allows users to work with new generation sequencing (NGS) transcriptomic analysis results using an ontology-driven database. This new application simplifies data exploration, visualization, and integration for a better comprehension of the results. CONCLUSIONS ATGC transcriptomics provides access to non-expert computer users and small research groups to a scalable storage option and simple data integration, including database administration and management. The software is freely available under the terms of GNU public license at http://atgcinta.sourceforge.net .
Affiliation(s)
- Sergio Gonzalez
- Instituto de Biotecnología, Centro Investigación en Ciencias Veterinarias y Agronómicas (CICVyA) INTA, Hurlingham, Buenos Aires, Argentina
| | | | - Máximo Rivarola
- Instituto de Biotecnología, Centro Investigación en Ciencias Veterinarias y Agronómicas (CICVyA) INTA, Hurlingham, Buenos Aires, Argentina
- Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Godoy Cruz 2290, Buenos Aires, C1425FQB, Argentina
| | - Patricio Moreno
- Instituto de Ingeniería Biomédica, Facultad de Ingeniería, Universidad de Buenos Aires, Buenos Aires, Argentina
| | - Paula Fernandez
- Instituto de Biotecnología, Centro Investigación en Ciencias Veterinarias y Agronómicas (CICVyA) INTA, Hurlingham, Buenos Aires, Argentina
- Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Godoy Cruz 2290, Buenos Aires, C1425FQB, Argentina
- Escuela de Ciencia y Tecnología, Universidad Nacional de San Martín, San Martín, Buenos Aires, Argentina
| | - Joaquín Dopazo
- Computational Genomics Department, Centro de Investigación Príncipe Felipe, Valencia, Spain
| | - Norma Paniego
- Instituto de Biotecnología, Centro Investigación en Ciencias Veterinarias y Agronómicas (CICVyA) INTA, Hurlingham, Buenos Aires, Argentina
- Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Godoy Cruz 2290, Buenos Aires, C1425FQB Argentina
|
27
|
Investigating the Relatedness of Enteroinvasive Escherichia coli to Other E. coli and Shigella Isolates by Using Comparative Genomics. Infect Immun 2016; 84:2362-2371. [PMID: 27271741 DOI: 10.1128/iai.00350-16] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 04/22/2016] [Accepted: 05/31/2016] [Indexed: 12/17/2022] Open
Abstract
Enteroinvasive Escherichia coli (EIEC) is a unique pathovar that has a pathogenic mechanism nearly indistinguishable from that of Shigella species. In contrast to isolates of the four Shigella species, which are widespread and can be frequent causes of human illness, EIEC causes far fewer reported illnesses each year. In this study, we analyzed the genome sequences of 20 EIEC isolates, including 14 first described in this study. Phylogenomic analysis of the EIEC genomes demonstrated that 17 of the isolates are present in three distinct lineages that contained only EIEC genomes, compared to reference genomes from each of the E. coli pathovars and Shigella species. Comparative genomic analysis identified genes that were unique to each of the three identified EIEC lineages. While many of the EIEC lineage-specific genes have unknown functions, those with predicted functions included a colicin and putative proteins involved in transcriptional regulation or carbohydrate metabolism. In silico detection of the Shigella virulence plasmid (pINV), which is essential for the invasion of host cells, demonstrated that a form of pINV was present in nearly all EIEC genomes, but the Mxi-Spa-Ipa region of the plasmid that encodes the invasion-associated proteins was absent from several of the EIEC isolates. The comparative genomic findings in this study support the hypothesis that multiple EIEC lineages have evolved independently from multiple distinct lineages of E. coli via the acquisition of the Shigella virulence plasmid and, in some cases, the Shigella pathogenicity islands.
|
28
|
Kania DA, Hazen TH, Hossain A, Nataro JP, Rasko DA. Genome diversity of Shigella boydii. Pathog Dis 2016; 74:ftw027. [PMID: 27056949 DOI: 10.1093/femspd/ftw027] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Accepted: 04/04/2016] [Indexed: 11/13/2022] Open
Abstract
Shigella boydii is one of the four Shigella species that causes disease worldwide; however, there are few published studies that examine the genomic variation of this species. This study compares genomes of 72 total isolates; 28 S. boydii from Bangladesh and The Gambia that were recently isolated as part of the Global Enteric Multicenter Study (GEMS), 14 historical S. boydii genomes in the public domain and 30 Escherichia coli and Shigella reference genomes that represent the genomic diversity of these pathogens. This comparative analysis of these 72 genomes identified that the S. boydii isolates separate into three phylogenomic clades, each with specific gene content. Each of the clades contains S. boydii isolates from geographic and temporally distant sources, indicating that the S. boydii isolates from the GEMS are representative of S. boydii. This study describes the genome sequences of a collection of novel S. boydii isolates and provides insight into the diversity of this species in comparison to the E. coli and other Shigella species.
Affiliation(s)
- Dane A Kania
- Institute for Genome Sciences, Department of Microbiology and Immunology, University of Maryland School of Medicine, 801 W. Baltimore Street, Suite 600, Baltimore, MD 21201, USA
| | - Tracy H Hazen
- Institute for Genome Sciences, Department of Microbiology and Immunology, University of Maryland School of Medicine, 801 W. Baltimore Street, Suite 600, Baltimore, MD 21201, USA
| | | | - James P Nataro
- Department of Pediatrics, University of Virginia School of Medicine, Charlottesville, VA 22908, USA
| | - David A Rasko
- Institute for Genome Sciences, Department of Microbiology and Immunology, University of Maryland School of Medicine, 801 W. Baltimore Street, Suite 600, Baltimore, MD 21201, USA
|
29
|
Davidson RL, Weber RJM, Liu H, Sharma-Oates A, Viant MR. Galaxy-M: a Galaxy workflow for processing and analyzing direct infusion and liquid chromatography mass spectrometry-based metabolomics data. Gigascience 2016; 5:10. [PMID: 26913198 PMCID: PMC4765054 DOI: 10.1186/s13742-016-0115-8] [Citation(s) in RCA: 71] [Impact Index Per Article: 8.9] [Received: 03/05/2015] [Accepted: 02/06/2016] [Indexed: 12/05/2022] Open
Abstract
BACKGROUND Metabolomics is increasingly recognized as an invaluable tool in the biological, medical and environmental sciences yet lags behind the methodological maturity of other omics fields. To achieve its full potential, including the integration of multiple omics modalities, the accessibility, standardization and reproducibility of computational metabolomics tools must be improved significantly. RESULTS Here we present our end-to-end mass spectrometry metabolomics workflow in the widely used platform, Galaxy. Named Galaxy-M, our workflow has been developed for both direct infusion mass spectrometry (DIMS) and liquid chromatography mass spectrometry (LC-MS) metabolomics. The range of tools presented spans from processing of raw data, e.g. peak picking and alignment, through data cleansing, e.g. missing value imputation, to preparation for statistical analysis, e.g. normalization and scaling, and principal components analysis (PCA) with associated statistical evaluation. We demonstrate the ease of using these Galaxy workflows via the analysis of DIMS and LC-MS datasets, and provide PCA scores and associated statistics to help other users to ensure that they can accurately repeat the processing and analysis of these two datasets. Galaxy and data are all provided pre-installed in a virtual machine (VM) that can be downloaded from the GigaDB repository. Additionally, source code, executables and installation instructions are available from GitHub. CONCLUSIONS The Galaxy platform has enabled us to produce an easily accessible and reproducible computational metabolomics workflow. More tools could be added by the community to expand its functionality. We recommend that Galaxy-M workflow files are included within the supplementary information of publications, enabling metabolomics studies to achieve greater reproducibility.
Affiliation(s)
- Robert L. Davidson
- GigaScience, BGI-Hong Kong Co. Ltd, Tai Po Industrial Estate, 16 Dai Fu Street, Tai Po, NT Hong Kong
- School of Biosciences, University of Birmingham, Birmingham, B15 2TT UK
| | - Ralf J. M. Weber
- School of Biosciences, University of Birmingham, Birmingham, B15 2TT UK
| | - Haoyu Liu
- School of Biosciences, University of Birmingham, Birmingham, B15 2TT UK
| | | | - Mark R. Viant
- School of Biosciences, University of Birmingham, Birmingham, B15 2TT UK
|
30
|
Genomic diversity of EPEC associated with clinical presentations of differing severity. Nat Microbiol 2016; 1:15014. [PMID: 27571975 DOI: 10.1038/nmicrobiol.2015.14] [Citation(s) in RCA: 57] [Impact Index Per Article: 7.1] [Received: 07/16/2015] [Accepted: 11/06/2015] [Indexed: 01/01/2023]
Abstract
Enteropathogenic Escherichia coli (EPEC) are diarrhoeagenic E. coli, and are a significant cause of gastrointestinal illness among young children in developing countries. Typical EPEC are identified by the presence of the bundle-forming pilus encoded by a virulence plasmid, which has been linked to an increased severity of illness, while atypical EPEC lack this feature. Comparative genomics of 70 total EPEC from lethal (LI), non-lethal symptomatic (NSI) or asymptomatic (AI) cases of diarrhoeal illness in children enrolled in the Global Enteric Multicenter Study was used to investigate the genomic differences in EPEC isolates obtained from individuals with various clinical outcomes. A comparison of the genomes of isolates from different clinical outcomes identified genes that were significantly more prevalent in EPEC isolates of symptomatic and lethal outcomes than in EPEC isolates of asymptomatic outcomes. These EPEC isolates exhibited previously unappreciated phylogenomic diversity and combinations of virulence factors. These comparative results highlight the diversity of the pathogen, as well as the complexity of the EPEC virulence factor repertoire.
|
31
|
Mariette J, Escudié F, Bardou P, Nabihoudine I, Noirot C, Trotard MS, Gaspin C, Klopp C. Jflow: a workflow management system for web applications. Bioinformatics 2015; 32:456-8. [PMID: 26454273 PMCID: PMC5859998 DOI: 10.1093/bioinformatics/btv589] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/30/2015] [Accepted: 10/07/2015] [Indexed: 11/14/2022] Open
Abstract
SUMMARY Biologists produce large data sets and are in demand of rich and simple web portals in which they can upload and analyze their files. Providing such tools requires to mask the complexity induced by the needed High Performance Computing (HPC) environment. The connection between interface and computing infrastructure is usually specific to each portal. With Jflow, we introduce a Workflow Management System (WMS), composed of jQuery plug-ins which can easily be embedded in any web application and a Python library providing all requested features to setup, run and monitor workflows. AVAILABILITY AND IMPLEMENTATION Jflow is available under the GNU General Public License (GPL) at http://bioinfo.genotoul.fr/jflow. The package is coming with full documentation, quick start and a running test portal. CONTACT Jerome.Mariette@toulouse.inra.fr.
Affiliation(s)
- Jérôme Mariette
- Plate-forme Bio-informatique Genotoul, INRA, UR875 Mathématiques et Informatique Appliquées Toulouse, Castanet-Tolosan, France
| | - Frédéric Escudié
- Plate-forme Bio-informatique Genotoul, INRA, UR875 Mathématiques et Informatique Appliquées Toulouse, Castanet-Tolosan, France
| | - Philippe Bardou
- Plate-forme SIGENAE, INRA, GenPhyse, Castanet-Tolosan Cedex, France
| | - Ibouniyamine Nabihoudine
- Plate-forme Bio-informatique Genotoul, INRA, UR875 Mathématiques et Informatique Appliquées Toulouse, Castanet-Tolosan, France
| | - Céline Noirot
- Plate-forme Bio-informatique Genotoul, INRA, UR875 Mathématiques et Informatique Appliquées Toulouse, Castanet-Tolosan, France
| | - Marie-Stéphane Trotard
- Plate-forme Bio-informatique Genotoul, INRA, UR875 Mathématiques et Informatique Appliquées Toulouse, Castanet-Tolosan, France
| | - Christine Gaspin
- Plate-forme Bio-informatique Genotoul, INRA, UR875 Mathématiques et Informatique Appliquées Toulouse, Castanet-Tolosan, France
| | - Christophe Klopp
- Plate-forme Bio-informatique Genotoul, INRA, UR875 Mathématiques et Informatique Appliquées Toulouse, Castanet-Tolosan, France and Plate-forme SIGENAE, INRA, GenPhyse, Castanet-Tolosan Cedex, France
|
32
|
Kanterakis A, Deelen P, van Dijk F, Byelas H, Dijkstra M, Swertz MA. Molgenis-impute: imputation pipeline in a box. BMC Res Notes 2015; 8:359. [PMID: 26286716 PMCID: PMC4541731 DOI: 10.1186/s13104-015-1309-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 07/07/2014] [Accepted: 07/30/2015] [Indexed: 12/12/2022] Open
Abstract
Background Genotype imputation is an important procedure in current genomic analysis such as genome-wide association studies, meta-analyses and fine mapping. Although high quality tools are available that perform the steps of this process, considerable effort and expertise is required to set up and run a best practice imputation pipeline, particularly for larger genotype datasets, where imputation has to scale out in parallel on computer clusters. Results Here we present MOLGENIS-impute, an ‘imputation in a box’ solution that seamlessly and transparently automates the set up and running of all the steps of the imputation process. These steps include genome build liftover (liftovering), genotype phasing with SHAPEIT2, quality control, sample and chromosomal chunking/merging, and imputation with IMPUTE2. MOLGENIS-impute builds on MOLGENIS-compute, a simple pipeline management platform for submission and monitoring of bioinformatics tasks in High Performance Computing (HPC) environments like local/cloud servers, clusters and grids. All the required tools, data and scripts are downloaded and installed in a single step. Researchers with diverse backgrounds and expertise have tested MOLGENIS-impute on different locations and imputed over 30,000 samples so far using the 1,000 Genomes Project and new Genome of the Netherlands data as the imputation reference. The tests have been performed on PBS/SGE clusters, cloud VMs and in a grid HPC environment. Conclusions MOLGENIS-impute gives priority to the ease of setting up, configuring and running an imputation. It has minimal dependencies and wraps the pipeline in a simple command line interface, without sacrificing flexibility to adapt or limiting the options of underlying imputation tools. It does not require knowledge of a workflow system or programming, and is targeted at researchers who just want to apply best practices in imputation via simple commands. 
It is built on the MOLGENIS compute workflow framework to enable customization with additional computational steps or it can be included in other bioinformatics pipelines. It is available as open source from: https://github.com/molgenis/molgenis-imputation.
Affiliation(s)
- Alexandros Kanterakis
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen and University of Groningen, Genetics, UMCG, PO Box 30 001, 9700 RB, Groningen, The Netherlands.
| | - Patrick Deelen
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen and University of Groningen, Genetics, UMCG, PO Box 30 001, 9700 RB, Groningen, The Netherlands.
| | - Freerk van Dijk
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen and University of Groningen, Genetics, UMCG, PO Box 30 001, 9700 RB, Groningen, The Netherlands.
| | - Heorhiy Byelas
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen and University of Groningen, Genetics, UMCG, PO Box 30 001, 9700 RB, Groningen, The Netherlands.
| | - Martijn Dijkstra
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen and University of Groningen, Genetics, UMCG, PO Box 30 001, 9700 RB, Groningen, The Netherlands.
| | - Morris A Swertz
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen and University of Groningen, Genetics, UMCG, PO Box 30 001, 9700 RB, Groningen, The Netherlands.
|
33
|
JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing. PLoS One 2015; 10:e0134273. [PMID: 26280450 PMCID: PMC4539224 DOI: 10.1371/journal.pone.0134273] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 02/15/2015] [Accepted: 07/07/2015] [Indexed: 12/04/2022] Open
Abstract
Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS.
|
34
|
MaPSeq, A Service-Oriented Architecture for Genomics Research within an Academic Biomedical Research Institution. Informatics 2015. [DOI: 10.3390/informatics2030020] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/17/2022] Open
|
35
|
Hazen TH, Daugherty SC, Shetty A, Mahurkar AA, White O, Kaper JB, Rasko DA. RNA-Seq analysis of isolate- and growth phase-specific differences in the global transcriptomes of enteropathogenic Escherichia coli prototype isolates. Front Microbiol 2015; 6:569. [PMID: 26124752 PMCID: PMC4464170 DOI: 10.3389/fmicb.2015.00569] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 03/25/2015] [Accepted: 05/24/2015] [Indexed: 11/13/2022] Open
Abstract
Enteropathogenic Escherichia coli (EPEC) are a leading cause of diarrheal illness among infants in developing countries. E. coli isolates classified as typical EPEC are identified by the presence of the locus of enterocyte effacement (LEE) and the bundle-forming pilus (BFP), and absence of the Shiga-toxin genes, while the atypical EPEC also encode LEE but do not encode BFP or Shiga-toxin. Comparative genomic analyses have demonstrated that EPEC isolates belong to diverse evolutionary lineages and possess lineage- and isolate-specific genomic content. To investigate whether this genomic diversity results in significant differences in global gene expression, we used an RNA sequencing (RNA-Seq) approach to characterize the global transcriptomes of the prototype typical EPEC isolates E2348/69, B171, C581-05, and the prototype atypical EPEC isolate E110019. The global transcriptomes were characterized during laboratory growth in two different media and three different growth phases, as well as during adherence of the EPEC isolates to human cells using in vitro tissue culture assays. Comparison of the global transcriptomes during these conditions was used to identify isolate- and growth phase-specific differences in EPEC gene expression. These analyses resulted in the identification of genes that encode proteins involved in survival and metabolism that were coordinately expressed with virulence factors. These findings demonstrate there are isolate- and growth phase-specific differences in the global transcriptomes of EPEC prototype isolates, and highlight the utility of comparative transcriptomics for identifying additional factors that are directly or indirectly involved in EPEC pathogenesis.
Affiliation(s)
- Tracy H Hazen
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA; Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Sean C Daugherty
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Amol Shetty
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Anup A Mahurkar
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Owen White
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
| | - James B Kaper
- Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD, USA
| | - David A Rasko
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA; Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD, USA
|
36
|
Draft Genome Sequence of Thauera sp. Strain SWB20, Isolated from a Singapore Wastewater Treatment Facility Using Gel Microdroplets. Genome Announcements 2015; 3:e00132-15. [PMID: 25792053 PMCID: PMC4395064 DOI: 10.1128/genomea.00132-15] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/27/2022]
Abstract
We report here the genome sequence of Thauera sp. strain SWB20, isolated from a Singaporean wastewater treatment facility using gel microdroplets (GMDs) and single-cell genomics (SCG). This approach provided a single clonal microcolony that was sufficient to obtain a 4.9-Mbp genome assembly of an ecologically relevant Thauera species.
|
37
|
Chancey ST, Agrawal S, Schroeder MR, Farley MM, Tettelin H, Stephens DS. Composite mobile genetic elements disseminating macrolide resistance in Streptococcus pneumoniae. Front Microbiol 2015; 6:26. [PMID: 25709602 PMCID: PMC4321634 DOI: 10.3389/fmicb.2015.00026] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Received: 10/07/2014] [Accepted: 01/08/2015] [Indexed: 01/17/2023] Open
Abstract
Macrolide resistance in Streptococcus pneumoniae emerged in the U.S. and globally during the early 1990s. The RNA methylase encoded by erm(B) and the macrolide efflux genes mef(E) and mel were identified as the resistance-determining factors. These genes are disseminated in the pneumococcus on mobile, often chimeric elements consisting of multiple smaller elements. To better understand the variety of elements encoding macrolide resistance and how they have evolved in the pre- and post-conjugate vaccine eras, the genomes of 121 invasive and ten carriage isolates from Atlanta from 1994 to 2011 were analyzed for mobile elements involved in the dissemination of macrolide resistance. The isolates were selected to provide broad coverage of the genetic variability of antibiotic-resistant pneumococci and included 100 invasive isolates resistant to macrolides. Tn916-like elements carrying mef(E) and mel on the Macrolide Genetic Assembly (Mega) and erm(B) on the erm(B) element and Tn917 were integrated into the pneumococcal chromosome backbone and into larger Tn5253-like composite elements. The results reported here include identification of novel insertion sites for Mega and characterization of the insertion sites of Tn916-like elements in the pneumococcal chromosome and in larger composite elements. The data indicate that integration of elements by conjugation was infrequent compared to recombination. Thus, it appears that conjugative mobile elements allow the pneumococcus to acquire DNA from distantly related bacteria, but once integrated into a pneumococcal genome, transformation and recombination are the primary mechanisms for transmission of novel DNA throughout the pneumococcal population.
Affiliation(s)
- Scott T Chancey
- Division of Infectious Diseases, Department of Medicine, Emory University School of Medicine, Atlanta, GA, USA; Laboratories of Microbial Pathogenesis, Department of Veterans Affairs Medical Center, Atlanta, GA, USA
| | - Sonia Agrawal
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Max R Schroeder
- Division of Infectious Diseases, Department of Medicine, Emory University School of Medicine, Atlanta, GA, USA; Laboratories of Microbial Pathogenesis, Department of Veterans Affairs Medical Center, Atlanta, GA, USA
| | - Monica M Farley
- Division of Infectious Diseases, Department of Medicine, Emory University School of Medicine, Atlanta, GA, USA; Laboratories of Microbial Pathogenesis, Department of Veterans Affairs Medical Center, Atlanta, GA, USA
| | - Hervé Tettelin
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA; Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD, USA
| | - David S Stephens
- Division of Infectious Diseases, Department of Medicine, Emory University School of Medicine, Atlanta, GA, USA; Laboratories of Microbial Pathogenesis, Department of Veterans Affairs Medical Center, Atlanta, GA, USA
|
38
|
Blocking yersiniabactin import attenuates extraintestinal pathogenic Escherichia coli in cystitis and pyelonephritis and represents a novel target to prevent urinary tract infection. Infect Immun 2015; 83:1443-50. [PMID: 25624354 DOI: 10.1128/iai.02904-14] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.6] [Indexed: 01/13/2023] Open
Abstract
The emergence and spread of extended-spectrum beta-lactamases and carbapenemases among common bacterial pathogens are threatening our ability to treat routine hospital- and community-acquired infections. With the pipeline for new antibiotics virtually empty, there is an urgent need to develop novel therapeutics. Bacteria require iron to establish infection, and specialized pathogen-associated iron acquisition systems like yersiniabactin, common among pathogenic species in the family Enterobacteriaceae, including multidrug-resistant Klebsiella pneumoniae and pathogenic Escherichia coli, represent potentially novel therapeutic targets. Although the yersiniabactin system was recently identified as a vaccine target for uropathogenic E. coli (UPEC)-mediated urinary tract infection (UTI), its contribution to UPEC pathogenesis is unknown. Using an E. coli mutant (strain 536ΔfyuA) unable to acquire yersiniabactin during infection, we established the yersiniabactin receptor as a UPEC virulence factor during cystitis and pyelonephritis, a fitness factor during bacteremia, and a surface-accessible target of the experimental FyuA vaccine. In addition, we determined through transcriptome sequencing (RNA-seq) analyses of RNA from E. coli causing cystitis in women that iron acquisition systems, including the yersiniabactin system, are highly expressed by bacteria during natural uncomplicated UTI. Given that yersiniabactin contributes to the virulence of several pathogenic species in the family Enterobacteriaceae, including UPEC, and is frequently associated with multidrug-resistant strains, it represents a promising novel target to combat antibiotic-resistant infections.
|
39
|
Abstract
We report the draft genome of Enterobacter cloacae strain S611, an endophytic bacterium isolated from surface-sterilized germinating wheat seeds. We present the assembly and annotation of its genome, which may provide insights into the metabolic pathways involved in adaptation.
|
40
|
Venco F, Vaskin Y, Ceol A, Muller H. SMITH: a LIMS for handling next-generation sequencing workflows. BMC Bioinformatics 2014; 15 Suppl 14:S3. [PMID: 25471934 PMCID: PMC4255740 DOI: 10.1186/1471-2105-15-s14-s3] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 11/24/2022] Open
Abstract
BACKGROUND Life-science laboratories make increasing use of Next Generation Sequencing (NGS) for studying bio-macromolecules and their interactions. Array-based methods for measuring gene expression or protein-DNA interactions are being replaced by RNA-Seq and ChIP-Seq. Sequencing is generally performed by specialized facilities that have to keep track of sequencing requests, trace samples, ensure quality and make data available according to predefined privileges. An integrated tool helps to troubleshoot problems, to maintain a high quality standard, and to reduce time and costs. Commercial and non-commercial tools called LIMS (Laboratory Information Management Systems) are available for this purpose. However, they often come at prohibitive cost and/or lack the flexibility and scalability needed to adjust seamlessly to the frequently changing protocols employed. In order to manage the flow of sequencing data produced at the Genomic Unit of the Italian Institute of Technology (IIT), we developed SMITH (Sequencing Machine Information Tracking and Handling). METHODS SMITH is a web application with a MySQL server at the backend. It was developed by wet-lab scientists of the Centre for Genomic Science and database experts from the Politecnico of Milan in the context of a Genomic Data Model Project. The database schema stores all the information of an NGS experiment, including the descriptions of all protocols and algorithms used in the process. Notably, an attribute-value table allows an unconstrained textual description to be associated with each sample and all the data produced afterwards. This method permits the creation of metadata that can be used to search the database for specific files as well as for statistical analyses. RESULTS SMITH runs automatically and limits direct human interaction mainly to administrative tasks. SMITH data-delivery procedures were standardized, making it easier for biologists and analysts to navigate the data. Automation also saves time. The workflows are available through an API provided by the workflow management system. The parameters and input data are passed to the workflow engine that performs de-multiplexing, quality control, alignments, etc. CONCLUSIONS SMITH standardizes, automates, and speeds up sequencing workflows. Annotation of data with key-value pairs facilitates meta-analysis.
|
41
|
Simonyan V, Mazumder R. High-Performance Integrated Virtual Environment (HIVE) Tools and Applications for Big Data Analysis. Genes (Basel) 2014; 5:957-81. [PMID: 25271953 PMCID: PMC4276921 DOI: 10.3390/genes5040957] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.6] [Received: 09/11/2014] [Revised: 09/22/2014] [Accepted: 09/22/2014] [Indexed: 12/30/2022] Open
Abstract
The High-performance Integrated Virtual Environment (HIVE) is a high-throughput cloud-based infrastructure developed for the storage and analysis of genomic and associated biological data. HIVE consists of a web-accessible interface for authorized users to deposit, retrieve, share, annotate, compute and visualize Next-generation Sequencing (NGS) data in a scalable and highly efficient fashion. The platform contains a distributed storage library and a distributed computational powerhouse linked seamlessly. Resources available through the interface include algorithms, tools and applications developed exclusively for the HIVE platform, as well as commonly used external tools adapted to operate within the parallel architecture of the system. HIVE is composed of a flexible infrastructure, which allows for simple implementation of new algorithms and tools. Currently, available HIVE tools include sequence alignment and nucleotide variation profiling tools, metagenomic analyzers, phylogenetic tree-building tools using NGS data, clone discovery algorithms, and recombination analysis algorithms. In addition to tools, HIVE also provides knowledgebases that can be used in conjunction with the tools for NGS sequence and metadata analysis.
Affiliation(s)
- Vahan Simonyan
- Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, MD 20993, USA.
| | - Raja Mazumder
- Department of Biochemistry and Molecular Medicine, George Washington University, Washington, DC 20037, USA.
|
42
|
Phan IQH, Stacy R, Myler PJ. Selecting targets from eukaryotic parasites for structural genomics and drug discovery. Methods Mol Biol 2014; 1140:53-9. [PMID: 24590708 DOI: 10.1007/978-1-4939-0354-2_4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 12/24/2022]
Abstract
The selection of targets is the first step for any structural genomics project. The application of structural genomics approaches to drug discovery also starts with the selection of targets. Here, three protocols are described that were developed to select targets from eukaryotic pathogens. These protocols could also be applied to other drug discovery projects.
Affiliation(s)
- Isabelle Q H Phan
- Seattle Structural Genomics Center for Infectious Disease, Seattle, WA, USA
|
43
|
Draft Genome Sequence of Pseudomonas putida Strain S610, a Seed-Borne Bacterium of Wheat. Genome Announcements 2013; 1:e01048-13. [PMID: 24371199 PMCID: PMC3873609 DOI: 10.1128/genomea.01048-13] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 12/05/2022]
Abstract
We report the genome sequence of a seed-borne bacterium, Pseudomonas putida strain S610. The size of the draft genome sequence is approximately 4.6 Mb, which is the smallest among all P. putida strains sequenced to date.
|
44
|
Bacterial endosymbiosis in a chordate host: long-term co-evolution and conservation of secondary metabolism. PLoS One 2013; 8:e80822. [PMID: 24324632 PMCID: PMC3851785 DOI: 10.1371/journal.pone.0080822] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Received: 07/17/2013] [Accepted: 10/16/2013] [Indexed: 11/19/2022] Open
Abstract
Intracellular symbiosis is known to be widespread in insects, but there are few described examples in other types of host. These symbionts carry out useful activities such as synthesizing nutrients and conferring resistance against adverse events such as parasitism. Such symbionts persist through host speciation events, being passed down through vertical transmission. Due to various evolutionary forces, symbionts go through a process of genome reduction, eventually resulting in tiny genomes where only those genes essential to immediate survival and those beneficial to the host remain. In the marine environment, invertebrates such as tunicates are known to harbor complex microbiomes implicated in the production of natural products that are toxic and probably serve a defensive function. Here, we show that the intracellular symbiont Candidatus Endolissoclinum faulkneri is a long-standing symbiont of the tunicate Lissoclinum patella, which has persisted through cryptic speciation of the host. In contrast to the known examples of insect symbionts, which tend to be either relatively recent or ancient relationships, the genome of Ca. E. faulkneri has a very low coding density but very few recognizable pseudogenes. The almost complete degradation of intergenic regions and stable gene inventory of extant strains of Ca. E. faulkneri show that further degradation and deletion is happening very slowly. This is a novel stage of genome reduction and provides insight into how tiny genomes are formed. The ptz pathway, which produces the defensive patellazoles, is shown to date to before the divergence of Ca. E. faulkneri strains, reinforcing its importance in this symbiotic relationship. Lastly, as in insects, we show that stable symbionts can be lost, as we describe an L. patella animal where Ca. E. faulkneri is displaced by a likely intracellular pathogen. Our results suggest that intracellular symbionts may be an important source of ecologically significant natural products in animals.
|
45
|
Sanderson LA, Ficklin SP, Cheng CH, Jung S, Feltus FA, Bett KE, Main D. Tripal v1.1: a standards-based toolkit for construction of online genetic and genomic databases. Database: The Journal of Biological Databases and Curation 2013; 2013:bat075. [PMID: 24163125 PMCID: PMC3808541 DOI: 10.1093/database/bat075] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.0] [Indexed: 12/18/2022]
Abstract
Tripal is an open-source, freely available toolkit for construction of online genomic and genetic databases. It aims to facilitate development of community-driven biological websites by integrating the GMOD Chado database schema with Drupal, a popular website creation and content management software. Tripal provides a suite of tools for interaction with a Chado database and display of content therein. The tools are designed to be generic to support the various ways in which data may be stored in Chado. Previous releases of Tripal have supported organisms, genomic libraries, biological stocks, stock collections and genomic features, their alignments and annotations. Also, Tripal and its extension modules provided loaders for commonly used file formats such as FASTA, GFF, OBO, GAF, BLAST XML, KEGG heir files and InterProScan XML. Default generic templates were provided for common views of biological data, which could be customized using an open Application Programming Interface to change the way data are displayed. Here, we report additional tools and functionality that are part of release v1.1 of Tripal. These include (i) a new bulk loader that allows a site curator to import data stored in a custom tab-delimited format; (ii) full support of every Chado table for Drupal Views (a powerful tool allowing site developers to construct novel displays and search pages); (iii) new modules including ‘Feature Map’, ‘Genetic’, ‘Publication’, ‘Project’, ‘Contact’ and the ‘Natural Diversity’ modules. Tutorials, mailing lists, download and set-up instructions, extension modules and other documentation can be found at the Tripal website located at http://tripal.info. Database URL: http://tripal.info/
Affiliation(s)
- Lacey-Anne Sanderson
- Department of Plant Sciences, University of Saskatchewan, Saskatoon, SK, Canada; Department of Horticulture, Washington State University, Pullman, WA, USA; and Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA
|
46
|
Mongodin EF, Casjens SR, Bruno JF, Xu Y, Drabek EF, Riley DR, Cantarel BL, Pagan PE, Hernandez YA, Vargas LC, Dunn JJ, Schutzer SE, Fraser CM, Qiu WG, Luft BJ. Inter- and intra-specific pan-genomes of Borrelia burgdorferi sensu lato: genome stability and adaptive radiation. BMC Genomics 2013; 14:693. [PMID: 24112474 PMCID: PMC3833655 DOI: 10.1186/1471-2164-14-693] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.4] [Received: 05/28/2013] [Accepted: 09/26/2013] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND Lyme disease is caused by spirochete bacteria from the Borrelia burgdorferi sensu lato (B. burgdorferi s.l.) species complex. To reconstruct the evolution of B. burgdorferi s.l. and identify the genomic basis of its human virulence, we compared the genomes of 23 B. burgdorferi s.l. isolates from Europe and the United States, including B. burgdorferi sensu stricto (B. burgdorferi s.s., 14 isolates), B. afzelii (2), B. garinii (2), B. "bavariensis" (1), B. spielmanii (1), B. valaisiana (1), B. bissettii (1), and B. "finlandensis" (1). RESULTS Robust B. burgdorferi s.s. and B. burgdorferi s.l. phylogenies were obtained using genome-wide single-nucleotide polymorphisms, despite recombination. Phylogeny-based pan-genome analysis showed that the rate of gene acquisition was higher between species than within species, suggesting adaptive speciation. Strong positive natural selection drives the sequence evolution of lipoproteins, including chromosomally-encoded genes 0102 and 0404, cp26-encoded ospC and b08, and lp54-encoded dbpA, a07, a22, a33, a53, a65. Computer simulations predicted rapid adaptive radiation of genomic groups as population size increases. CONCLUSIONS Intra- and inter-specific pan-genome sizes of B. burgdorferi s.l. expand linearly with phylogenetic diversity. Yet gene-acquisition rates in B. burgdorferi s.l. are among the lowest in bacterial pathogens, resulting in high genome stability and few lineage-specific genes. Genome adaptation of B. burgdorferi s.l. is driven predominantly by copy-number and sequence variations of lipoprotein genes. New genomic groups are likely to emerge if the current trend of B. burgdorferi s.l. population expansion continues.
Affiliation(s)
- Emmanuel F Mongodin
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA.
|
47
|
Genome Sequences of Two Klebsiella pneumoniae Isolates from Different Geographical Regions, Argentina (Strain JHCK1) and the United States (Strain VA360). Genome Announcements 2013; 1:e00168-13. [PMID: 23640195 PMCID: PMC3642250 DOI: 10.1128/genomea.00168-13] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/20/2022]
Abstract
We report the sequences of two Klebsiella pneumoniae clinical isolates, strains JHCK1 and VA360, from a newborn with meningitis in Buenos Aires, Argentina, and from a tertiary care medical center in Cleveland, OH, respectively. Both isolates contain one chromosome and at least five plasmids; isolate VA360 contains the Klebsiella pneumoniae carbapenemase (KPC) gene.
|
48
|
White JR, Maddox C, White O, Angiuoli SV, Fricke WF. CloVR-ITS: Automated internal transcribed spacer amplicon sequence analysis pipeline for the characterization of fungal microbiota. Microbiome 2013; 1:6. [PMID: 24451270 PMCID: PMC3869194 DOI: 10.1186/2049-2618-1-6] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Received: 07/19/2012] [Accepted: 11/21/2012] [Indexed: 05/16/2023]
Abstract
BACKGROUND Besides the development of comprehensive tools for high-throughput 16S ribosomal RNA amplicon sequence analysis, there exists a growing need for protocols emphasizing alternative phylogenetic markers such as those representing eukaryotic organisms. RESULTS Here we introduce CloVR-ITS, an automated pipeline for comparative analysis of internal transcribed spacer (ITS) pyrosequences amplified from metagenomic DNA isolates and representing fungal species. This pipeline performs a variety of steps similar to those commonly used for 16S rRNA amplicon sequence analysis, including preprocessing for quality, chimera detection, clustering of sequences into operational taxonomic units (OTUs), taxonomic assignment (at class, order, family, genus, and species levels) and statistical analysis of sample groups of interest based on user-provided information. Using ITS amplicon pyrosequencing data from a previous human gastric fluid study, we demonstrate the utility of CloVR-ITS for fungal microbiota analysis and provide runtime and cost examples, including analysis of extremely large datasets on the cloud. We show that the largest fractions of reads from the stomach fluid samples were assigned to Dothideomycetes, Saccharomycetes, Agaricomycetes and Sordariomycetes but that all samples were dominated by sequences that could not be taxonomically classified. Representatives of the Candida genus were identified in all samples, most notably C. quercitrusa, while sequence reads assigned to the Aspergillus genus were only identified in a subset of samples. CloVR-ITS is made available as a pre-installed, automated, and portable software pipeline for cloud-friendly execution as part of the CloVR virtual machine package (http://clovr.org). CONCLUSION The CloVR-ITS pipeline provides fungal microbiota analysis that can be complementary to bacterial 16S rRNA and total metagenome sequence analysis allowing for more comprehensive studies of environmental and host-associated microbial communities.
Affiliation(s)
- James Robert White
- Institute for Genome Sciences, University of Maryland School of Medicine, BioPark II - 801 West Baltimore Street, Baltimore, MD, 21201, USA
| | - Cynthia Maddox
- Institute for Genome Sciences, University of Maryland School of Medicine, BioPark II - 801 West Baltimore Street, Baltimore, MD, 21201, USA
| | - Owen White
- Institute for Genome Sciences, University of Maryland School of Medicine, BioPark II - 801 West Baltimore Street, Baltimore, MD, 21201, USA
| | - Samuel V Angiuoli
- Institute for Genome Sciences, University of Maryland School of Medicine, BioPark II - 801 West Baltimore Street, Baltimore, MD, 21201, USA
| | - W Florian Fricke
- Institute for Genome Sciences, University of Maryland School of Medicine, BioPark II - 801 West Baltimore Street, Baltimore, MD, 21201, USA
|
49
|
Mariette J, Escudié F, Allias N, Salin G, Noirot C, Thomas S, Klopp C. NG6: Integrated next generation sequencing storage and processing environment. BMC Genomics 2012; 13:462. [PMID: 22958229 PMCID: PMC3444930 DOI: 10.1186/1471-2164-13-462] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Received: 06/25/2012] [Accepted: 08/30/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Next generation sequencing platforms are now well established in sequencing centres and some laboratories. Upcoming smaller scale machines such as the 454 junior from Roche or the MiSeq from Illumina will increase the number of laboratories hosting a sequencer. In such a context, it is important to provide these teams with an easily manageable environment to store and process the produced reads. RESULTS We describe a user-friendly information system able to manage large sets of sequencing data. It includes, on one hand, a workflow environment already containing pipelines adapted to different input formats (sff, fasta, fastq and qseq), different sequencers (Roche 454, Illumina HiSeq) and various analyses (quality control, assembly, alignment, diversity studies, etc.) and, on the other hand, a secured web site giving access to the results. The connected user will be able to download raw and processed data and browse through the analysis result statistics. The provided workflows can easily be modified or extended and new ones can be added. Ergatis is used as a workflow building, running and monitoring system. The analyses can be run locally or in a cluster environment using Sun Grid Engine. CONCLUSIONS NG6 is a complete information system designed to answer the needs of a sequencing platform. It provides a user-friendly interface to process, store and download high-throughput sequencing data.
Affiliation(s)
- Jérôme Mariette
- Plate-forme bio-informatique Genotoul, INRA, Biométrie et Intelligence Artificielle, BP 52627, 31326, Castanet-Tolosan Cedex, France.
|
50
|
McLellan AS, Dubin RA, Jing Q, Broin PÓ, Moskowitz D, Suzuki M, Calder RB, Hargitai J, Golden A, Greally JM. The Wasp System: an open source environment for managing and analyzing genomic data. Genomics 2012; 100:345-51. [PMID: 22944616 DOI: 10.1016/j.ygeno.2012.08.005] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 03/17/2012] [Revised: 08/16/2012] [Accepted: 08/20/2012] [Indexed: 01/17/2023]
Abstract
The challenges associated with the management, analysis and interpretation of assays based on massively-parallel sequencing (MPS) are both individually complex and numerous. We describe what we believe to be the appropriate solution, one that represents a departure from traditional computational biology approaches. The Wasp System is an open source, distributed package written in Spring/J2EE that creates a foundation for development of an end-to-end solution for MPS-based experiments or clinical tests. Recognizing that one group will be unable to solve these challenges in isolation, we describe a nurtured open source development model that will allow the software to be collectively used, shared and developed. The ultimate goal is to emulate resources such as the Virtual Observatory of the astrophysics community, enabling computationally-inexpert scientists and clinicians to explore and interpret their MPS data. Here we describe the current implementation and development of the Wasp System and the roadmap for its community development.
Affiliation(s)
- Andrew S McLellan
- Center for Epigenomics and Division of Computational Genetics, Department of Genetics, Albert Einstein College of Medicine, 1301 Morris Park Avenue, Bronx, NY 10461, USA.
|