Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Orvis J, Crabtree J, Galens K, Gussman A, Inman JM, Lee E, Nampally S, Riley D, Sundaram JP, Felix V, Whitty B, Mahurkar A, Wortman J, White O, Angiuoli SV. Ergatis: a web interface and scalable software system for bioinformatics workflows. Bioinformatics 2010;26:1488-92. [PMID: 20413634 PMCID: PMC2881353 DOI: 10.1093/bioinformatics/btq167] [Citation(s) in RCA: 79] [Impact Index Per Article: 5.6] [Indexed: 11/12/2022]

Cited by Other Article(s)

Gabor CE, Hazen TH, Delaine-Elias BC, Rasko DA, Barry EM. Genomic, transcriptomic, and phenotypic differences among archetype Shigella flexneri strains of serotypes 2a, 3a, and 6. mSphere 2023;8:e0040823. [PMID: 37830809 PMCID: PMC10732043 DOI: 10.1128/msphere.00408-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Accepted: 08/30/2023] [Indexed: 10/14/2023]

Genomic diversity of non-diarrheagenic fecal Escherichia coli from children in sub-Saharan Africa and south Asia and their relatedness to diarrheagenic E. coli. Nat Commun 2023;14:1400. [PMID: 36918537 PMCID: PMC10011798 DOI: 10.1038/s41467-023-36337-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2022] [Accepted: 01/23/2023] [Indexed: 03/16/2023]

Iquebal MA, Jagannadham J, Jaiswal S, Prabha R, Rai A, Kumar D. Potential Use of Microbial Community Genomes in Various Dimensions of Agriculture Productivity and Its Management: A Review. Front Microbiol 2022;13:708335. [PMID: 35655999 PMCID: PMC9152772 DOI: 10.3389/fmicb.2022.708335] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/11/2021] [Accepted: 03/17/2022] [Indexed: 12/12/2022]

Contribution of Noncanonical Antigens to Virulence and Adaptive Immunity in Human Infection with Enterotoxigenic E. coli. Infect Immun 2021;89:IAI.00041-21. [PMID: 33558320 DOI: 10.1128/iai.00041-21] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/22/2021] [Accepted: 01/27/2021] [Indexed: 12/12/2022]

Abstract

Enterotoxigenic Escherichia coli (ETEC) contributes significantly to the substantial burden of infectious diarrhea among children living in low- and middle-income countries. In the absence of a vaccine for ETEC, children succumb to acute dehydration as well as nondiarrheal sequelae related to these infections, including malnutrition. The considerable diversity of ETEC genomes has complicated canonical vaccine development approaches defined by a subset of ETEC pathovar-specific antigens known as colonization factors (CFs). To identify additional conserved immunogens unique to this pathovar, we employed an "open-aperture" approach to capture all potential conserved ETEC surface antigens, in which we mined the genomic sequences of 89 ETEC isolates, bioinformatically selected potential surface-exposed pathovar-specific antigens conserved in more than 40% of the genomes (n = 118), and assembled the representative proteins onto microarrays, complemented with known or putative colonization factor subunit molecules (n = 52) and toxin subunits. These arrays were then used to interrogate samples from individuals with acute symptomatic ETEC infections. Surprisingly, in this approach, we found that immune responses were largely constrained to a small number of antigens, including individual colonization factor antigens and EtpA, an extracellular adhesin. In a Bangladeshi cohort of naturally infected children <2 years of age, both EtpA and a second antigen, EatA, elicited significant serologic responses that were associated with protection from symptomatic illness. In addition, children infected with ETEC isolates bearing either etpA or eatA genes were significantly more likely to develop symptomatic disease. These studies support a role for antigens not presently targeted by vaccines (noncanonical) in virulence and the development of adaptive immune responses during ETEC infections. These findings may inform vaccine design efforts to complement existing approaches.

Pal S, Mondal S, Das G, Khatua S, Ghosh Z. Big data in biology: The hope and present-day challenges in it. GENE REPORTS 2020. [DOI: 10.1016/j.genrep.2020.100869] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/23/2022]

Complete Genome Sequence of Francisella halioticida Type Strain DSM 23729 (FSC1005). Microbiol Resour Announc 2020;9:9/37/e00541-20. [PMID: 32912905 PMCID: PMC7484064 DOI: 10.1128/mra.00541-20] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/20/2022]

Hernandes RT, Hazen TH, dos Santos LF, Richter TKS, Michalski JM, Rasko DA. Comparative genomic analysis provides insight into the phylogeny and virulence of atypical enteropathogenic Escherichia coli strains from Brazil. PLoS Negl Trop Dis 2020;14:e0008373. [PMID: 32479541 PMCID: PMC7289442 DOI: 10.1371/journal.pntd.0008373] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/06/2019] [Revised: 06/11/2020] [Accepted: 05/07/2020] [Indexed: 12/21/2022]

Abstract

Background

Atypical enteropathogenic Escherichia coli (aEPEC) are one of the most frequent intestinal E. coli pathotypes isolated from diarrheal patients in Brazil. Isolates of aEPEC contain the locus of enterocyte effacement, but lack the genes of the bundle-forming pilus of typical EPEC, and the Shiga toxin of enterohemorrhagic E. coli (EHEC). The objective of this study was to evaluate the phylogeny and the gene content of Brazilian aEPEC genomes compared to a global aEPEC collection.

Methodology

Single nucleotide polymorphism (SNP)-based phylogenomic analysis was used to compare 106 sequenced Brazilian aEPEC with 221 aEPEC obtained from other geographic origins. Additionally, Large-Scale BLAST Score Ratio was used to determine the shared versus unique gene content of the aEPEC studied.

Principal Findings

Phylogenomic analysis demonstrated the 106 Brazilian aEPEC were present in phylogroups B1 (47.2%, 50/106), B2 (23.6%, 25/106), A (22.6%, 24/106), and E (6.6%, 7/106). Identification of EPEC and EHEC phylogenomic lineages demonstrated that 42.5% (45/106) of the Brazilian aEPEC were in four of the previously defined lineages: EPEC10 (17.9%, 19/106), EPEC9 (10.4%, 11/106), EHEC2 (7.5%, 8/106) and EPEC7 (6.6%, 7/106). Interestingly, an additional 28.3% (30/106) of the Brazilian aEPEC were identified in five novel lineages: EPEC11 (14.2%, 15/106), EPEC12 (4.7%, 5/106), EPEC13 (1.9%, 2/106), EPEC14 (5.7%, 6/106) and EPEC15 (1.9%, 2/106). We identified 246 genes that were more frequent among the aEPEC isolates from Brazil compared to the global aEPEC collection, including espG2, espT and espC (P<0.001). Moreover, the nleF gene was more frequently identified among Brazilian aEPEC isolates obtained from diarrheagenic patients when compared to healthy subjects (69.7% vs 41.2%, P<0.05).

Conclusion

The current study demonstrates significant genomic diversity among aEPEC from Brazil, with the identification of Brazilian aEPEC isolates to five novel EPEC lineages. The greater prevalence of some virulence genes among Brazilian aEPEC genomes could be important to the specific virulence strategies used by aEPEC in Brazil to cause diarrheal disease.

Atypical EPEC (aEPEC) is one of the most frequent diarrheagenic Escherichia coli pathotypes isolated from patients in Brazil and is associated with diarrheal outbreaks. This study is the first to sequence the genomes of a collection of aEPEC isolates from a South American country, Brazil, and compare their phylogenetic relationships and gene content with a global collection of aEPEC. This approach identified Brazilian aEPEC genomes in previously characterized EPEC/EHEC phylogenomic lineages and resulted in the identification of five novel EPEC phylogenomic lineages, designated EPEC11 to EPEC15. We also observed that virulence genes, such as espG2, espT and espC were more frequently identified among the Brazilian aEPEC genomes, demonstrating potential differences in the virulence repertoire of this pathogen in Brazil.

D'Mello A, Ahearn CP, Murphy TF, Tettelin H. ReVac: a reverse vaccinology computational pipeline for prioritization of prokaryotic protein vaccine candidates. BMC Genomics 2019;20:981. [PMID: 31842745 PMCID: PMC6916091 DOI: 10.1186/s12864-019-6195-y] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 07/19/2019] [Accepted: 10/16/2019] [Indexed: 12/24/2022]

Abstract

Background

Reverse vaccinology accelerates the discovery of potential vaccine candidates (PVCs) prior to experimental validation. Current programs typically use one bacterial proteome to identify PVCs through a filtering architecture using feature prediction programs or a machine learning approach. Filtering approaches may eliminate potential antigens based on limitations in the accuracy of prediction tools used. Machine learning approaches are heavily dependent on the selection of training datasets with experimentally validated antigens (positive control) and non-protective-antigens (negative control). The use of one or few bacterial proteomes does not assess PVC conservation among strains, an important feature of vaccine antigens.

Results

We present ReVac, which implements both a panoply of feature prediction programs without filtering out proteins, and scoring of candidates based on predictions made on curated positive and negative control PVCs datasets. ReVac surveys several genomes assessing protein conservation, as well as DNA and protein repeats, which may result in variable expression of PVCs. ReVac’s orthologous clustering of conserved genes, identifies core and dispensable genome components. This is useful for determining the degree of conservation of PVCs among the population of isolates for a given pathogen. Potential vaccine candidates are then prioritized based on conservation and overall feature-based scoring. We present the application of ReVac, applied to 69 Moraxella catarrhalis and 270 non-typeable Haemophilus influenzae genomes, prioritizing 64 and 29 proteins as PVCs, respectively.

Conclusion

ReVac’s use of a scoring scheme ranks PVCs for subsequent experimental testing. It employs a redundancy-based approach in its predictions of features using several prediction tools. The protein’s features are collated, and each protein is ranked based on the scoring scheme. Multi-genome analyses performed in ReVac allow for a comprehensive overview of PVCs from a pan-genome perspective, as an essential pre-requisite for any bacterial subunit vaccine design. ReVac prioritized PVCs of two human respiratory pathogens, identifying both novel and previously validated PVCs.

Rasko DA, Del Canto F, Luo Q, Fleckenstein JM, Vidal R, Hazen TH. Comparative genomic analysis and molecular examination of the diversity of enterotoxigenic Escherichia coli isolates from Chile. PLoS Negl Trop Dis 2019;13:e0007828. [PMID: 31747410 PMCID: PMC6901236 DOI: 10.1371/journal.pntd.0007828] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 06/19/2019] [Revised: 12/09/2019] [Accepted: 10/04/2019] [Indexed: 02/02/2023]

Xie G, Cheng Q, Daligault H, Davenport K, Gleasner C, Jacobs L, Kubicek-Sutherland J, LeCuyer T, Otieno V, Raballah E, Doggett N, Mukundan H, Perkins DJ, McMahon B. Draft Genome Sequences of Two Staphylococcus warneri Clinical Isolates, Strains SMA0023-04 (UGA3) and SMA0670-05 (UGA28), from Siaya County Referral Hospital, Siaya, Kenya. Microbiol Resour Announc 2019;8:e01595-18. [PMID: 30975813 PMCID: PMC6460036 DOI: 10.1128/mra.01595-18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2019] [Accepted: 03/15/2019] [Indexed: 11/20/2022]

Affiliation(s)

Gary Xie: Biosecurity and Public Health, Bioscience Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA
Qiuying Cheng: Center for Global Health, Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, New Mexico, USA
Hajnalka Daligault: Biosecurity and Public Health, Bioscience Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA
Karen Davenport: Biosecurity and Public Health, Bioscience Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA
Cheryl Gleasner: Biosecurity and Public Health, Bioscience Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA
Lindsey Jacobs: Theoretical Biology and Biophysics, Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA
Jessica Kubicek-Sutherland: Physical Chemistry and Applied Spectroscopy, Chemistry Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA
Tessa LeCuyer: Center for Global Health, Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, New Mexico, USA
Vincent Otieno: University of New Mexico Laboratories of Parasitic and Viral Diseases, Kisumu, Kenya
Evans Raballah: Department of Medical Laboratory Sciences, School of Public Health, Biomedical Sciences and Technology, Masinde Muliro University of Science and Technology, Kakamega, Kenya
Norman Doggett: Biosecurity and Public Health, Bioscience Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA
Harshini Mukundan: Physical Chemistry and Applied Spectroscopy, Chemistry Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA
Douglas J Perkins: Center for Global Health, Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, New Mexico, USA
Benjamin McMahon: Theoretical Biology and Biophysics, Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA

Genome and Functional Characterization of Colonization Factor Antigen I- and CS6-Encoding Heat-Stable Enterotoxin-Only Enterotoxigenic Escherichia coli Reveals Lineage and Geographic Variation. mSystems 2019;4:mSystems00329-18. [PMID: 30944874 PMCID: PMC6446980 DOI: 10.1128/msystems.00329-18] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 12/17/2018] [Accepted: 12/17/2018] [Indexed: 12/23/2022]

Abstract

Enterotoxigenic Escherichia coli (ETEC) is a significant cause of childhood diarrhea and is a leading cause of traveler’s diarrhea. ETEC strains encoding the heat-stable enterotoxin (ST) are more often associated with childhood diarrhea than ETEC strains that encode only the heat-labile enterotoxin (LT). Colonization factors (CFs) also have a demonstrated role in ETEC virulence, and two of the most prevalent CFs among ETEC that have caused diarrhea are colonization factor antigen I (CFA/I) and CS6. In the current report, we describe the genomes of 269 CS6- or CFA/I-encoding ST-only ETEC isolates that were associated with human diarrhea. While the CS6 and CFA/I ETEC were identified in at least 13 different ETEC genomic lineages, a majority (85%; 229/269) were identified in only six lineages. Complete genome sequencing of selected isolates demonstrated that a conserved plasmid contributed to the dissemination of CFA/I whereas at least five distinct plasmids were involved in the dissemination of ST and/or CS6. Additionally, there were differences in gene content between CFA/I and CS6 ETEC at the phylogroup and lineage levels and in association with their geographic location of isolation as well as lineage-related differences in ST production. Thus, we demonstrate that genomically diverse E. coli strains have acquired ST, as well as CFA/I or CS6, via one or more plasmids and that, in some cases, isolates of a particular lineage or geographic location have undergone additional modifications to their genome content. These findings will aid investigations of virulence and the development of improved diagnostics and vaccines against this important human diarrheal pathogen.

IMPORTANCE Comparative genomics and functional characterization were used to analyze a global collection of CFA/I and CS6 ST-only ETEC isolates associated with human diarrhea, demonstrating differences in the genomic content of CFA/I and CS6 isolates related to CF type, lineage, and geographic location of isolation and also lineage-related differences in ST production. Complete genome sequencing of selected CFA/I and CS6 isolates enabled descriptions of a highly conserved ST-positive (ST⁺) CFA/I plasmid and of at least five diverse ST and/or CS6 plasmids among the CS6 ETEC isolates. There is currently no approved vaccine for ST-only ETEC, or for any ETEC for that matter, and as such, the current report provides functional verification of ST and CF production and antimicrobial susceptibility testing and an in-depth genomic characterization of a collection of isolates that could serve as representatives of CFA/I- or CS6-encoding ST-only ETEC strains for future studies of ETEC pathogenesis, vaccine studies, and/or clinical trials.

Temporal Variability of Escherichia coli Diversity in the Gastrointestinal Tracts of Tanzanian Children with and without Exposure to Antibiotics. mSphere 2018;3:3/6/e00558-18. [PMID: 30404930 PMCID: PMC6222053 DOI: 10.1128/msphere.00558-18] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 11/20/2022]

Abstract

The stability of the Escherichia coli populations in the human gastrointestinal tract is not fully appreciated, and represents a significant knowledge gap regarding gastrointestinal community structure, as well as resistance to incoming pathogenic bacterial species and antibiotic treatment. The current study examines the genomic content of 240 Escherichia coli isolates from 30 children, aged 2 to 35 months old, in Tanzania. The E. coli strains were isolated from three time points spanning a six-month time period, with and without antibiotic treatment. The resulting isolates were sequenced, and the genomes compared. The findings in this study highlight the transient nature of E. coli strains in the gastrointestinal tract of these children, as during a six-month interval, no one individual contained phylogenomically related isolates at all three time points. While the majority of the isolates at any one time point were phylogenomically similar, most individuals did not contain phylogenomically similar isolates at more than two time points. Examination of global genome content, canonical E. coli virulence factors, multilocus sequence type, serotype, and antimicrobial resistance genes identified diversity even among phylogenomically similar strains. There was no apparent increase in the antimicrobial resistance gene content after antibiotic treatment. The examination of the E. coli from longitudinal samples from multiple children in Tanzania provides insight into the genomic diversity and population variability of resident E. coli within the rapidly changing environment of the gastrointestinal tract of these children.

IMPORTANCE This study increases the number of resident Escherichia coli genome sequences, and explores E. coli diversity through longitudinal sampling. We investigate the genomes of E. coli isolated from human gastrointestinal tracts as part of an antibiotic treatment program among rural Tanzanian children. Phylogenomics demonstrates that resident E. coli are diverse, even within a single host. Though the E. coli isolates of the gastrointestinal community tend to be phylogenomically similar at a given time, they differed across the interrogated time points, demonstrating the variability of the members of the E. coli community in these subjects. Exposure to antibiotic treatment did not have an apparent impact on the E. coli community or the presence of resistance and virulence genes within E. coli genomes. The findings of this study highlight the variable nature of specific bacterial members of the human gastrointestinal tract.

A comparative analysis of library prep approaches for sequencing low input translatome samples. BMC Genomics 2018;19:696. [PMID: 30241496 PMCID: PMC6151020 DOI: 10.1186/s12864-018-5066-2] [Citation(s) in RCA: 49] [Impact Index Per Article: 8.2] [Received: 11/10/2017] [Accepted: 09/11/2018] [Indexed: 02/06/2023]

Abstract

BACKGROUND

Cell type-specific ribosome-pulldown has become an increasingly popular method for analysis of gene expression. It allows for expression analysis from intact tissues and monitoring of protein synthesis in vivo. However, while its utility has been assessed, technical aspects related to sequencing of these samples, often starting with a smaller amount of RNA, have not been reported. In this study, we evaluated the performance of five library prep protocols for ribosome-associated mRNAs when only 250 pg-4 ng of total RNA are used.

RESULTS

We obtained total and RiboTag-IP RNA, in three biological replicates. We compared 5 methods of library preparation for Illumina Next Generation sequencing: NuGEN Ovation RNA-Seq system V2 Kit, TaKaRa SMARTer Stranded Total RNA-Seq Kit, TaKaRa SMART-Seq v4 Ultra Low Input RNA Kit, Illumina TruSeq RNA Library Prep Kit v2 and NEBNext® Ultra™ Directional RNA Library Prep Kit using slightly modified protocols each with 4 ng of total RNA. An additional set of samples was processed using the TruSeq kit with 70 ng, as a 'gold standard' control and the SMART-Seq v4 with 250 pg of total RNA. TruSeq-processed samples had the best metrics overall, with similar results for the 4 ng and 70 ng samples. The results of the SMART-Seq v4 processed samples were similar to TruSeq (Spearman correlation > 0.8) despite using lower amount of input RNA. All RiboTag-IP samples had an increase in the intronic reads compared with the corresponding whole tissue, suggesting that the IP captures some immature mRNAs. The SMARTer-processed samples had a higher representation of ribosomal and non-coding RNAs leading to lower representation of protein coding mRNA. The enrichment or depletion of IP samples compared to corresponding input RNA was similar across all kits except for SMARTer kit.

CONCLUSION

RiboTag-seq can be performed successfully with as little as 250 pg of total RNA when using the SMART-Seq v4 kit and 4 ng when using the modified protocols of other library preparation kits. The SMART-Seq v4 and TruSeq kits resulted in the highest quality libraries. RiboTag IP RNA contains some immature transcripts.

Ma ZS, Li L. Measuring metagenome diversity and similarity with Hill numbers. Mol Ecol Resour 2018;18:1339-1355. [PMID: 29985552 DOI: 10.1111/1755-0998.12923] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 10/13/2017] [Revised: 01/31/2018] [Accepted: 02/17/2018] [Indexed: 11/27/2022]

Abstract

The first step of any metagenome sequencing project is to get the inventory of OTU abundances (operational taxonomic units) and/or metagenomic gene abundances. The former is generated with 16S-rRNA-tagged amplicon sequencing technology, and the latter can be generated from either gene-targeted or whole-sample shotgun metagenomics technologies. With 16S-rRNA data sets, measuring community diversity with diversity indexes such as species richness and Shannon's index has been a de facto standard analysis; nevertheless, similarly comprehensive approaches to metagenomic gene abundances are still largely missing, despite that both OTU and gene abundances are DNA reads. Here, we adapt the Hill numbers, which were reintroduced to macrocommunity ecology recently and are now widely regarded as a most appropriate measure system for ecological diversity, for measuring metagenome alpha-, beta- and gamma-diversities, and similarity. Our proposal includes the following: (a) Metagenomic gene (MG) diversity measures the single-gene-level metagenome diversity; (b) Type-I metagenome functional gene cluster (MFGC) diversity measures the diversity of functional gene clusters but ignoring within-cluster gene abundance information; (c) Type-II MFGC diversity considers within-cluster gene abundances information and integrates gene-cluster-level metagenome diversity and functional gene redundancy information; and (d) Four classes of Hill-numbers-based similarity metrics, including local gene overlap, regional gene overlap, gene homogeneity measure and gene turnover complement, were introduced in terms of MG and MFGC, respectively. We demonstrate the proposal with the gut metagenomes from healthy and IBD (inflammatory bowel disease) cohorts. The Hill numbers offer a unified approach to cohesively and comprehensively measuring the ecological and metagenome diversities of microbiomes.
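As a rough illustration of the Hill numbers named in the abstract above, the sketch below computes the Hill number of order q from a list of abundances (OTU or gene counts) using the standard definition: D = (sum of p_i^q)^(1/(1-q)) for q != 1, and the exponential of the Shannon entropy in the limit q = 1. The function name `hill_number` and the example counts are illustrative assumptions; this is not the code, the data, or the MFGC extensions used by Ma and Li.

```python
import math


def hill_number(abundances, q):
    """Hill number (effective number of types) of order q.

    abundances: non-negative counts or relative abundances.
    q: diversity order (0 = richness, 1 = exp(Shannon), 2 = inverse Simpson).
    """
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must contain at least one positive value")
    # Convert to relative abundances, dropping absent types.
    props = [a / total for a in abundances if a > 0]
    if math.isclose(q, 1.0):
        # The general formula is undefined at q = 1; use its limit,
        # the exponential of the Shannon entropy.
        return math.exp(-sum(p * math.log(p) for p in props))
    return sum(p ** q for p in props) ** (1.0 / (1.0 - q))


if __name__ == "__main__":
    # Hypothetical gene counts for one sample (made-up numbers).
    counts = [50, 30, 15, 5]
    for order in (0, 1, 2):
        print(order, hill_number(counts, order))
```

When all types are equally abundant, every order returns the number of types; as abundances become more uneven, higher orders give smaller values because they weight dominant types more heavily.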

Responses of the Human Gut Escherichia coli Population to Pathogen and Antibiotic Disturbances. mSystems 2018;3:mSystems00047-18. [PMID: 30057943 PMCID: PMC6060285 DOI: 10.1128/msystems.00047-18] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 04/03/2018] [Accepted: 07/06/2018] [Indexed: 11/23/2022]

Abstract

Studies of Escherichia coli in the human gastrointestinal tract have focused on pathogens, such as diarrhea-causing enterotoxigenic E. coli (ETEC), while overlooking the resident, nonpathogenic E. coli community. Relatively few genomes of nonpathogenic E. coli strains are available for comparative genomic analysis, and the ecology of these strains is poorly understood. This study examined the diversity and dynamics of resident human gastrointestinal E. coli communities in the face of the ecological challenges presented by pathogen (ETEC) challenge, as well as of antibiotic treatment. Whole-genome sequences obtained from E. coli isolates from before, during, and after ETEC challenge were used in phylogenomic and comparative genomic analyses to examine the diversity of the resident E. coli communities, as well as the dynamics of the challenge strain, H10407, a well-studied ETEC strain (serotype O78:H11) that produces both heat-labile and heat-stable enterotoxins. ETEC failed to become the dominant E. coli clone in two of the six challenge subjects, each of whom exhibited limited or no clinical presentation of diarrhea. The E. coli communities of the remaining four subjects became ETEC dominant during the challenge but reverted to their original, subject-specific populations following antibiotic treatment, suggesting resiliency of the resident E. coli population following major ecological disruptions. This resiliency is likely due in part to the abundance of antibiotic-resistant ST131 E. coli strains in the resident populations. This report provides valuable insights into the potential interactions of members of the gastrointestinal microbiome and its responses to challenge by an external pathogen and by antibiotic exposure.

IMPORTANCE Research on human-associated E. coli tends to focus on pathogens, such as enterotoxigenic E. coli (ETEC) strains, which are a leading cause of diarrhea in developing countries. However, the severity of disease caused by these pathogens is thought to be influenced by the microbiome. The nonpathogenic E. coli community that resides in the human gastrointestinal tract may play a role in pathogen colonization and disease severity and may become a reservoir for virulence and antibiotic resistance genes. Our study used whole-genome sequencing of E. coli before, during, and after challenge with an archetype ETEC isolate, H10407, and antibiotic treatment to explore the diversity and resiliency of the resident E. coli population in response to the ecological disturbances caused by pathogen invasion and antibiotic treatment.

Nickerson KP, Senger S, Zhang Y, Lima R, Patel S, Ingano L, Flavahan WA, Kumar DKV, Fraser CM, Faherty CS, Sztein MB, Fiorentino M, Fasano A. Salmonella Typhi Colonization Provokes Extensive Transcriptional Changes Aimed at Evading Host Mucosal Immune Defense During Early Infection of Human Intestinal Tissue. EBioMedicine 2018;31:92-109. [PMID: 29735417 PMCID: PMC6013756 DOI: 10.1016/j.ebiom.2018.04.005] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 11/01/2017] [Revised: 04/02/2018] [Accepted: 04/05/2018] [Indexed: 12/29/2022]

Affiliation(s)

K P Nickerson Department of Pediatric Gastroenterology, Mucosal Immunology and Biology Research Center, Massachusetts General Hospital, Boston, MA, United States; Department of Pediatrics, Harvard Medical School, Harvard University, Boston, MA, United States
S Senger Department of Pediatric Gastroenterology, Mucosal Immunology and Biology Research Center, Massachusetts General Hospital, Boston, MA, United States; Department of Pediatrics, Harvard Medical School, Harvard University, Boston, MA, United States
Y Zhang Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, United States
R Lima Department of Pediatric Gastroenterology, Mucosal Immunology and Biology Research Center, Massachusetts General Hospital, Boston, MA, United States
S Patel Department of Pediatric Gastroenterology, Mucosal Immunology and Biology Research Center, Massachusetts General Hospital, Boston, MA, United States
L Ingano Department of Pediatric Gastroenterology, Mucosal Immunology and Biology Research Center, Massachusetts General Hospital, Boston, MA, United States
W A Flavahan Department of Pathology, Massachusetts General Hospital, Boston, MA, United States
D K V Kumar Department for the Neuroscience of Genetics and Aging, Massachusetts General Hospital, Boston, MA, United States
C M Fraser Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, United States
C S Faherty Department of Pediatric Gastroenterology, Mucosal Immunology and Biology Research Center, Massachusetts General Hospital, Boston, MA, United States; Department of Pediatrics, Harvard Medical School, Harvard University, Boston, MA, United States
M B Sztein Center for Vaccine Development, Department of Pediatrics, University of Maryland, Baltimore, MD, United States
M Fiorentino Department of Pediatric Gastroenterology, Mucosal Immunology and Biology Research Center, Massachusetts General Hospital, Boston, MA, United States; Department of Pediatrics, Harvard Medical School, Harvard University, Boston, MA, United States
A Fasano Department of Pediatric Gastroenterology, Mucosal Immunology and Biology Research Center, Massachusetts General Hospital, Boston, MA, United States; Department of Pediatrics, Harvard Medical School, Harvard University, Boston, MA, United States

Shuman J, Giles TX, Carroll L, Tabata K, Powers A, Suh SJ, Silo-Suh L. Transcriptome analysis of a Pseudomonas aeruginosa sn-glycerol-3-phosphate dehydrogenase mutant reveals a disruption in bioenergetics. MICROBIOLOGY-SGM 2018. [PMID: 29533746 DOI: 10.1099/mic.0.000646] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 01/09/2023]

Sintsova A, Smith S, Subashchandrabose S, Mobley HL. Role of Ethanolamine Utilization Genes in Host Colonization during Urinary Tract Infection. Infect Immun 2018;86:e00542-17. [PMID: 29229730 PMCID: PMC5820945 DOI: 10.1128/iai.00542-17] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 07/31/2017] [Accepted: 12/05/2017] [Indexed: 11/20/2022]

Phosphotyrosine-Mediated Regulation of Enterohemorrhagic Escherichia coli Virulence. mBio 2018;9:mBio.00097-18. [PMID: 29487233 PMCID: PMC5829826 DOI: 10.1128/mbio.00097-18] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/13/2022]

Abstract

Enteric pathogens with low infectious doses rely on the ability to orchestrate the expression of virulence and metabolism-associated genes in response to environmental cues for successful infection. Accordingly, the human pathogen enterohemorrhagic Escherichia coli (EHEC) employs a complex multifaceted regulatory network to link the expression of type III secretion system (T3SS) components to nutrient availability. While phosphorylation of histidine and aspartate residues on two-component system response regulators is recognized as an integral part of bacterial signaling, the involvement of phosphotyrosine-mediated control is minimally explored in Gram-negative pathogens. Our recent phosphotyrosine profiling study of E. coli identified 342 phosphorylated proteins, indicating that phosphotyrosine modifications in bacteria are more prevalent than previously anticipated. The present study demonstrates that tyrosine phosphorylation of a metabolite-responsive LacI/GalR family regulator, Cra, negatively affects T3SS expression under glycolytic conditions that are typical for the colonic lumen environment where production of the T3SS is unnecessary. Our data suggest that Cra phosphorylation affects T3SS expression by modulating the expression of ler, which encodes the major activator of EHEC virulence gene expression. Phosphorylation of the Cra Y47 residue diminishes DNA binding to fine-tune the expression of virulence-associated genes, including those of the locus of enterocyte effacement pathogenicity island that encode the T3SS, and thereby negatively affects the formation of attaching and effacing lesions. Our data indicate that tyrosine phosphorylation provides an additional mechanism to control the DNA binding of Cra and other LacI/GalR family regulators, including LacI and PurR. This study describes an initial effort to unravel the role of global phosphotyrosine signaling in the control of EHEC virulence potential.

Enterohemorrhagic Escherichia coli (EHEC) causes outbreaks of hemorrhagic colitis and the potentially fatal hemolytic-uremic syndrome. Successful host colonization by EHEC relies on the ability to coordinate the expression of virulence factors in response to environmental cues. A complex network that integrates environmental signals at multiple regulatory levels tightly controls virulence gene expression. We demonstrate that EHEC utilizes a previously uncharacterized phosphotyrosine signaling pathway through Cra to fine-tune the expression of virulence-associated genes to effectively control T3SS production. This study demonstrates that tyrosine phosphorylation negatively affects the DNA-binding capacity of Cra, which affects the expression of genes related to virulence and metabolism. We demonstrate for the first time that phosphotyrosine-mediated control affects global transcription in EHEC. Our data provide insight into a hitherto unexplored regulatory level of the global network controlling EHEC virulence gene expression.

Ahmed Z, Ucar D. I-ATAC: interactive pipeline for the management and pre-processing of ATAC-seq samples. PeerJ 2017;5:e4040. [PMID: 29181276 PMCID: PMC5702251 DOI: 10.7717/peerj.4040] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 04/21/2017] [Accepted: 10/25/2017] [Indexed: 11/20/2022]

Abstract

Assay for Transposase Accessible Chromatin (ATAC-seq) is an open chromatin profiling assay that is adapted to interrogate chromatin accessibility from small cell numbers. ATAC-seq surmounted a major technical barrier and enabled epigenome profiling of clinical samples. With this advancement in technology, we are now accumulating ATAC-seq data from clinical samples at an unprecedented rate. These epigenomic profiles hold the key to uncovering how transcriptional programs are established in diverse human cells and are disrupted by genetic or environmental factors. Thus, the barrier to deriving important clinical insights from clinical epigenomic samples is no longer one of data generation but of data analysis. Specifically, we are still missing easy-to-use software tools that will enable non-computational scientists to analyze their own ATAC-seq samples. To facilitate systematic pre-processing and management of ATAC-seq samples, we developed an interactive, cross-platform, user-friendly and customized desktop application: interactive-ATAC (I-ATAC). I-ATAC integrates command-line data processing tools (FASTQC, Trimmomatic, BWA, Picard, ATAC_BAM_shiftrt_gappedAlign.pl, Bedtools and Macs2) into an easy-to-use platform with a user interface to automatically pre-process ATAC-seq samples with parallelized and customizable pipelines. Its performance has been tested using public ATAC-seq datasets in GM12878 and CD4+T cells and a feature-based comparison is performed with some available interactive LIMS (Galaxy, SMITH, SeqBench, Wasp, NG6, openBIS). I-ATAC is designed to empower non-computational scientists to process their own datasets and to break the exclusivity of data analysis to computational scientists. Additionally, I-ATAC is capable of processing WGS and ChIP-seq samples, and can be customized by the user for one-independent or multiple-sequential operations.

Lloyd-Price J, Mahurkar A, Rahnavard G, Crabtree J, Orvis J, Hall AB, Brady A, Creasy HH, McCracken C, Giglio MG, McDonald D, Franzosa EA, Knight R, White O, Huttenhower C. Strains, functions and dynamics in the expanded Human Microbiome Project. Nature 2017;550:61-66. [PMID: 28953883 PMCID: PMC5831082 DOI: 10.1038/nature23889] [Citation(s) in RCA: 727] [Impact Index Per Article: 103.9] [Received: 11/30/2016] [Accepted: 08/08/2017] [Indexed: 12/29/2022]

Abstract

The characterization of baseline microbial and functional diversity in the human microbiome has enabled studies of microbiome-related disease, diversity, biogeography, and molecular function. The National Institutes of Health Human Microbiome Project has provided one of the broadest such characterizations so far. Here we introduce a second wave of data from the study, comprising 1,631 new metagenomes (2,355 total) targeting diverse body sites with multiple time points in 265 individuals. We applied updated profiling and assembly methods to provide new characterizations of microbiome personalization. Strain identification revealed subspecies clades specific to body sites; it also quantified species with phylogenetic diversity under-represented in isolate genomes. Body-wide functional profiling classified pathways into universal, human-enriched, and body site-enriched subsets. Finally, temporal analysis decomposed microbial variation into rapidly variable, moderately variable, and stable subsets. This study furthers our knowledge of baseline human microbial diversity and enables an understanding of personalized microbiome function and dynamics.

Updates from the Human Microbiome Project analyse the largest known body-wide metagenomic profile of human microbiome personalization.

The National Institutes of Health Human Microbiome Project, published in 2012, provided a broad overview of the baseline microbiome in healthy individuals using samples from 18 different body sites. In this second installment, the authors expand this dataset with new whole-metagenome sequences and additional time points to assess the diversity and spatiotemporal distributions of the microbiota at six of these body sites. Using a combination of strain profiling, species-level metagenomic functional profiling and longitudinal analyses, this study delivers deeper insights into human microbial communities and provides an important resource for understanding what constitutes a 'healthy' microbiota.

Affiliation(s)

Jason Lloyd-Price Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts 02115, USA; The Broad Institute, Cambridge, Massachusetts 02142, USA
Anup Mahurkar Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA
Gholamali Rahnavard Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts 02115, USA; The Broad Institute, Cambridge, Massachusetts 02142, USA
Jonathan Crabtree Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA
Joshua Orvis Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA
A Brantley Hall The Broad Institute, Cambridge, Massachusetts 02142, USA
Arthur Brady Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA
Heather H Creasy Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA
Carrie McCracken Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA
Michelle G Giglio Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA
Daniel McDonald Department of Pediatrics, University of California San Diego, La Jolla, California 92093, USA
Eric A Franzosa Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts 02115, USA; The Broad Institute, Cambridge, Massachusetts 02142, USA
Rob Knight Department of Pediatrics, University of California San Diego, La Jolla, California 92093, USA; Department of Computer Science & Engineering, University of California San Diego, La Jolla, California 92093, USA
Owen White Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA
Curtis Huttenhower Biostatistics Department, Harvard T. H. Chan School of Public Health, Boston, Massachusetts 02115, USA; The Broad Institute, Cambridge, Massachusetts 02142, USA

Transcriptional Variation of Diverse Enteropathogenic Escherichia coli Isolates under Virulence-Inducing Conditions. mSystems 2017;2:mSystems00024-17. [PMID: 28766584 PMCID: PMC5527300 DOI: 10.1128/msystems.00024-17] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 03/28/2017] [Accepted: 05/06/2017] [Indexed: 12/23/2022]

Abstract

Enteropathogenic Escherichia coli (EPEC) bacteria are a diverse group of pathogens that cause moderate to severe diarrhea in young children in developing countries. EPEC isolates can be further subclassified as typical EPEC (tEPEC) isolates that contain the bundle-forming pilus (BFP) or as atypical EPEC (aEPEC) isolates that do not contain BFP. Comparative genomics studies have recently highlighted the considerable genomic diversity among EPEC isolates. In the current study, we used RNA sequencing (RNA-Seq) to characterize the global transcriptomes of eight tEPEC isolates representing the identified genomic diversity, as well as one aEPEC isolate. The global transcriptomes were determined for the EPEC isolates under conditions of laboratory growth that are known to induce expression of virulence-associated genes. The findings demonstrate that unique genes of EPEC isolates from diverse phylogenomic lineages contribute to variation in their global transcriptomes. There were also phylogroup-specific differences in the global transcriptomes, including genes involved in iron acquisition, which had significant differential expression in the EPEC isolates belonging to phylogroup B2. Also, three EPEC isolates from the same phylogenomic lineage (EPEC8) had greater levels of similarity in their genomic content and exhibited greater similarities in their global transcriptomes than EPEC from other lineages; however, even among closely related isolates there were isolate-specific differences among their transcriptomes. These findings highlight the transcriptional variability that correlates with the previously unappreciated genomic diversity of EPEC.

IMPORTANCE Recent studies have demonstrated that there is considerable genomic diversity among EPEC isolates; however, it is unknown if this genomic diversity leads to differences in their global transcription. This study used RNA-Seq to compare the global transcriptomes of EPEC isolates from diverse phylogenomic lineages. We demonstrate that there are lineage- and isolate-specific differences in the transcriptomes of genomically diverse EPEC isolates during growth under in vitro virulence-inducing conditions. This study addressed biological variation among isolates of a single pathovar in an effort to demonstrate that while each of these isolates is considered an EPEC isolate, there is significant transcriptional diversity among members of this pathovar. Future studies should consider whether this previously undescribed transcriptional variation may play a significant role in isolate-specific variability of EPEC clinical presentations.

Comparative genomics and transcriptomics of Escherichia coli isolates carrying virulence factors of both enteropathogenic and enterotoxigenic E. coli. Sci Rep 2017;7:3513. [PMID: 28615618 PMCID: PMC5471185 DOI: 10.1038/s41598-017-03489-z] [Citation(s) in RCA: 39] [Impact Index Per Article: 5.6] [Received: 12/22/2016] [Accepted: 04/28/2017] [Indexed: 12/21/2022]

Agrawal S, Arze C, Adkins RS, Crabtree J, Riley D, Vangala M, Galens K, Fraser CM, Tettelin H, White O, Angiuoli SV, Mahurkar A, Fricke WF. CloVR-Comparative: automated, cloud-enabled comparative microbial genome sequence analysis pipeline. BMC Genomics 2017;18:332. [PMID: 28449639 PMCID: PMC5408420 DOI: 10.1186/s12864-017-3717-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 03/27/2017] [Accepted: 04/21/2017] [Indexed: 11/11/2022]

Crawl D, Singh A, Altintas I. Kepler WebView: A Lightweight, Portable Framework for Constructing Real-time Web Interfaces of Scientific Workflows. Procedia Comput Sci 2017;80:673-679. [PMID: 28232853 DOI: 10.1016/j.procs.2016.05.361] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 11/19/2022]

Gonzalez S, Clavijo B, Rivarola M, Moreno P, Fernandez P, Dopazo J, Paniego N. ATGC transcriptomics: a web-based application to integrate, explore and analyze de novo transcriptomic data. BMC Bioinformatics 2017;18:121. [PMID: 28222698 PMCID: PMC5320735 DOI: 10.1186/s12859-017-1494-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 10/21/2016] [Accepted: 01/21/2017] [Indexed: 12/21/2022]

Investigating the Relatedness of Enteroinvasive Escherichia coli to Other E. coli and Shigella Isolates by Using Comparative Genomics. Infect Immun 2016;84:2362-2371. [PMID: 27271741 DOI: 10.1128/iai.00350-16] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 04/22/2016] [Accepted: 05/31/2016] [Indexed: 12/17/2022]

Kania DA, Hazen TH, Hossain A, Nataro JP, Rasko DA. Genome diversity of Shigella boydii. Pathog Dis 2016;74:ftw027. [PMID: 27056949 DOI: 10.1093/femspd/ftw027] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Accepted: 04/04/2016] [Indexed: 11/13/2022]

Davidson RL, Weber RJM, Liu H, Sharma-Oates A, Viant MR. Galaxy-M: a Galaxy workflow for processing and analyzing direct infusion and liquid chromatography mass spectrometry-based metabolomics data. Gigascience 2016;5:10. [PMID: 26913198 PMCID: PMC4765054 DOI: 10.1186/s13742-016-0115-8] [Citation(s) in RCA: 71] [Impact Index Per Article: 8.9] [Received: 03/05/2015] [Accepted: 02/06/2016] [Indexed: 12/05/2022]

Abstract

BACKGROUND

Metabolomics is increasingly recognized as an invaluable tool in the biological, medical and environmental sciences yet lags behind the methodological maturity of other omics fields. To achieve its full potential, including the integration of multiple omics modalities, the accessibility, standardization and reproducibility of computational metabolomics tools must be improved significantly.

RESULTS

Here we present our end-to-end mass spectrometry metabolomics workflow in the widely used platform, Galaxy. Named Galaxy-M, our workflow has been developed for both direct infusion mass spectrometry (DIMS) and liquid chromatography mass spectrometry (LC-MS) metabolomics. The range of tools presented spans from processing of raw data, e.g. peak picking and alignment, through data cleansing, e.g. missing value imputation, to preparation for statistical analysis, e.g. normalization and scaling, and principal components analysis (PCA) with associated statistical evaluation. We demonstrate the ease of using these Galaxy workflows via the analysis of DIMS and LC-MS datasets, and provide PCA scores and associated statistics to help other users to ensure that they can accurately repeat the processing and analysis of these two datasets. Galaxy and data are all provided pre-installed in a virtual machine (VM) that can be downloaded from the GigaDB repository. Additionally, source code, executables and installation instructions are available from GitHub.

CONCLUSIONS

The Galaxy platform has enabled us to produce an easily accessible and reproducible computational metabolomics workflow. More tools could be added by the community to expand its functionality. We recommend that Galaxy-M workflow files are included within the supplementary information of publications, enabling metabolomics studies to achieve greater reproducibility.

Genomic diversity of EPEC associated with clinical presentations of differing severity. Nat Microbiol 2016;1:15014. [PMID: 27571975 DOI: 10.1038/nmicrobiol.2015.14] [Citation(s) in RCA: 57] [Impact Index Per Article: 7.1] [Received: 07/16/2015] [Accepted: 11/06/2015] [Indexed: 01/01/2023]

Mariette J, Escudié F, Bardou P, Nabihoudine I, Noirot C, Trotard MS, Gaspin C, Klopp C. Jflow: a workflow management system for web applications. Bioinformatics 2015;32:456-8. [PMID: 26454273 PMCID: PMC5859998 DOI: 10.1093/bioinformatics/btv589] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/30/2015] [Accepted: 10/07/2015] [Indexed: 11/14/2022]

Kanterakis A, Deelen P, van Dijk F, Byelas H, Dijkstra M, Swertz MA. Molgenis-impute: imputation pipeline in a box. BMC Res Notes 2015;8:359. [PMID: 26286716 PMCID: PMC4541731 DOI: 10.1186/s13104-015-1309-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 07/07/2014] [Accepted: 07/30/2015] [Indexed: 12/12/2022]

Abstract

Background

Genotype imputation is an important procedure in current genomic analyses such as genome-wide association studies, meta-analyses and fine mapping. Although high-quality tools are available that perform the steps of this process, considerable effort and expertise are required to set up and run a best-practice imputation pipeline, particularly for larger genotype datasets, where imputation has to scale out in parallel on computer clusters.

Results

Here we present MOLGENIS-impute, an ‘imputation in a box’ solution that seamlessly and transparently automates the setup and running of all the steps of the imputation process. These steps include genome build liftover (liftovering), genotype phasing with SHAPEIT2, quality control, sample and chromosomal chunking/merging, and imputation with IMPUTE2. MOLGENIS-impute builds on MOLGENIS-compute, a simple pipeline management platform for submission and monitoring of bioinformatics tasks in High Performance Computing (HPC) environments like local/cloud servers, clusters and grids. All the required tools, data and scripts are downloaded and installed in a single step. Researchers with diverse backgrounds and expertise have tested MOLGENIS-impute at different locations and imputed over 30,000 samples so far using the 1,000 Genomes Project and new Genome of the Netherlands data as the imputation reference. The tests have been performed on PBS/SGE clusters, cloud VMs and in a grid HPC environment.

Conclusions

MOLGENIS-impute gives priority to the ease of setting up, configuring and running an imputation. It has minimal dependencies and wraps the pipeline in a simple command line interface, without sacrificing flexibility to adapt or limiting the options of underlying imputation tools. It does not require knowledge of a workflow system or programming, and is targeted at researchers who just want to apply best practices in imputation via simple commands. It is built on the MOLGENIS compute workflow framework to enable customization with additional computational steps or it can be included in other bioinformatics pipelines. It is available as open source from: https://github.com/molgenis/molgenis-imputation.

JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing. PLoS One 2015;10:e0134273. [PMID: 26280450 PMCID: PMC4539224 DOI: 10.1371/journal.pone.0134273] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 02/15/2015] [Accepted: 07/07/2015] [Indexed: 12/04/2022]

MaPSeq, A Service-Oriented Architecture for Genomics Research within an Academic Biomedical Research Institution. INFORMATICS 2015. [DOI: 10.3390/informatics2030020] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/17/2022]

Hazen TH, Daugherty SC, Shetty A, Mahurkar AA, White O, Kaper JB, Rasko DA. RNA-Seq analysis of isolate- and growth phase-specific differences in the global transcriptomes of enteropathogenic Escherichia coli prototype isolates. Front Microbiol 2015;6:569. [PMID: 26124752 PMCID: PMC4464170 DOI: 10.3389/fmicb.2015.00569] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 03/25/2015] [Accepted: 05/24/2015] [Indexed: 11/13/2022]

Draft Genome Sequence of Thauera sp. Strain SWB20, Isolated from a Singapore Wastewater Treatment Facility Using Gel Microdroplets. GENOME ANNOUNCEMENTS 2015;3:3/2/e00132-15. [PMID: 25792053 PMCID: PMC4395064 DOI: 10.1128/genomea.00132-15] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/27/2022]

Chancey ST, Agrawal S, Schroeder MR, Farley MM, Tettelin H, Stephens DS. Composite mobile genetic elements disseminating macrolide resistance in Streptococcus pneumoniae. Front Microbiol 2015;6:26. [PMID: 25709602 PMCID: PMC4321634 DOI: 10.3389/fmicb.2015.00026] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Received: 10/07/2014] [Accepted: 01/08/2015] [Indexed: 01/17/2023]

Blocking yersiniabactin import attenuates extraintestinal pathogenic Escherichia coli in cystitis and pyelonephritis and represents a novel target to prevent urinary tract infection. Infect Immun 2015;83:1443-50. [PMID: 25624354 DOI: 10.1128/iai.02904-14] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.6] [Indexed: 01/13/2023]

Draft Genome Sequence of Enterobacter cloacae Strain S611. GENOME ANNOUNCEMENTS 2014;2:2/6/e00710-14. [PMID: 25502660 PMCID: PMC4263822 DOI: 10.1128/genomea.00710-14] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/22/2022]

Venco F, Vaskin Y, Ceol A, Muller H. SMITH: a LIMS for handling next-generation sequencing workflows. BMC Bioinformatics 2014;15 Suppl 14:S3. [PMID: 25471934 PMCID: PMC4255740 DOI: 10.1186/1471-2105-15-s14-s3] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 11/24/2022]

Abstract

Background

Life-science laboratories make increasing use of Next Generation Sequencing (NGS) for studying bio-macromolecules and their interactions. Array-based methods for measuring gene expression or protein-DNA interactions are being replaced by RNA-Seq and ChIP-Seq. Sequencing is generally performed by specialized facilities that have to keep track of sequencing requests, trace samples, ensure quality and make data available according to predefined privileges.

An integrated tool helps to troubleshoot problems, to maintain a high quality standard, and to reduce time and costs. Commercial and non-commercial tools called LIMS (Laboratory Information Management Systems) are available for this purpose. However, they often come at prohibitive cost and/or lack the flexibility and scalability needed to adjust seamlessly to the frequently changing protocols employed.

In order to manage the flow of sequencing data produced at the Genomic Unit of the Italian Institute of Technology (IIT), we developed SMITH (Sequencing Machine Information Tracking and Handling).

Methods

SMITH is a web application with a MySQL server at the backend. It was developed by wet-lab scientists of the Centre for Genomic Science and database experts from the Politecnico of Milan in the context of a Genomic Data Model Project. The database schema stores all the information of an NGS experiment, including the descriptions of all protocols and algorithms used in the process. Notably, an attribute-value table allows associating an unconstrained textual description with each sample and with all the data produced afterwards. This method permits the creation of metadata that can be used to search the database for specific files as well as for statistical analyses.
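
The attribute-value idea described above can be illustrated with a minimal sketch. This is not SMITH's actual schema: the table names (`sample`, `sample_attribute`), the column names, the helper functions, and the example attributes are all hypothetical, and Python's built-in `sqlite3` module stands in for the MySQL backend so that the example is self-contained.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical attribute-value layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sample (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        -- One row per (sample, key, value); keys are free text, not fixed columns.
        CREATE TABLE sample_attribute (
            sample_id  INTEGER NOT NULL REFERENCES sample(id),
            attr_key   TEXT NOT NULL,
            attr_value TEXT NOT NULL
        );
        """
    )
    return conn


def add_sample(conn, name, attributes):
    """Insert a sample and attach arbitrary key-value annotations to it."""
    cur = conn.execute("INSERT INTO sample (name) VALUES (?)", (name,))
    sample_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO sample_attribute (sample_id, attr_key, attr_value) "
        "VALUES (?, ?, ?)",
        [(sample_id, key, value) for key, value in attributes.items()],
    )
    return sample_id


def find_samples(conn, key, value):
    """Return the names of samples annotated with the given key-value pair."""
    rows = conn.execute(
        "SELECT s.name FROM sample s "
        "JOIN sample_attribute a ON a.sample_id = s.id "
        "WHERE a.attr_key = ? AND a.attr_value = ? "
        "ORDER BY s.name",
        (key, value),
    )
    return [row[0] for row in rows]


conn = build_demo_db()
add_sample(conn, "S1", {"assay": "ChIP-Seq", "antibody": "H3K4me3"})
add_sample(conn, "S2", {"assay": "RNA-Seq", "tissue": "liver"})
add_sample(conn, "S3", {"assay": "ChIP-Seq", "tissue": "liver"})
print(find_samples(conn, "assay", "ChIP-Seq"))  # ['S1', 'S3']
```

Because annotations are stored as rows rather than columns, a new kind of metadata needs no schema change, which fits the abstract's point about frequently changing protocols. The trade-off is that values are untyped text, so any validation has to happen in the application.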

Results

SMITH runs automatically and limits direct human interaction mainly to administrative tasks. SMITH data-delivery procedures were standardized, making it easier for biologists and analysts to navigate the data. Automation also helps save time. The workflows are available through an API provided by the workflow management system. The parameters and input data are passed to the workflow engine, which performs de-multiplexing, quality control, alignments, etc.

Conclusions

SMITH standardizes, automates, and speeds up sequencing workflows. Annotation of data with key-value pairs facilitates meta-analysis.

Simonyan V, Mazumder R. High-Performance Integrated Virtual Environment (HIVE) Tools and Applications for Big Data Analysis. Genes (Basel) 2014;5:957-81. [PMID: 25271953 PMCID: PMC4276921 DOI: 10.3390/genes5040957] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.6] [Received: 09/11/2014] [Revised: 09/22/2014] [Accepted: 09/22/2014] [Indexed: 12/30/2022]

Phan IQH, Stacy R, Myler PJ. Selecting targets from eukaryotic parasites for structural genomics and drug discovery. Methods Mol Biol 2014;1140:53-9. [PMID: 24590708 DOI: 10.1007/978-1-4939-0354-2_4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 12/24/2022]

Draft Genome Sequence of Pseudomonas putida Strain S610, a Seed-Borne Bacterium of Wheat. GENOME ANNOUNCEMENTS 2013;1:1/6/e01048-13. [PMID: 24371199 PMCID: PMC3873609 DOI: 10.1128/genomea.01048-13] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 12/05/2022]

Bacterial endosymbiosis in a chordate host: long-term co-evolution and conservation of secondary metabolism. PLoS One 2013;8:e80822. [PMID: 24324632 PMCID: PMC3851785 DOI: 10.1371/journal.pone.0080822] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Received: 07/17/2013] [Accepted: 10/16/2013] [Indexed: 11/19/2022]

Abstract

Intracellular symbiosis is known to be widespread in insects, but there are few described examples in other types of host. These symbionts carry out useful activities such as synthesizing nutrients and conferring resistance against adverse events such as parasitism. Such symbionts persist through host speciation events, being passed down through vertical transmission. Due to various evolutionary forces, symbionts go through a process of genome reduction, eventually resulting in tiny genomes where only those genes essential to immediate survival and those beneficial to the host remain. In the marine environment, invertebrates such as tunicates are known to harbor complex microbiomes implicated in the production of natural products that are toxic and probably serve a defensive function. Here, we show that the intracellular symbiont Candidatus Endolissoclinum faulkneri is a long-standing symbiont of the tunicate Lissoclinum patella that has persisted through cryptic speciation of the host. In contrast to the known examples of insect symbionts, which tend to be either relatively recent or ancient relationships, the genome of Ca. E. faulkneri has a very low coding density but very few recognizable pseudogenes. The almost complete degradation of intergenic regions and stable gene inventory of extant strains of Ca. E. faulkneri show that further degradation and deletion is happening very slowly. This is a novel stage of genome reduction and provides insight into how tiny genomes are formed. The ptz pathway, which produces the defensive patellazoles, is shown to date to before the divergence of Ca. E. faulkneri strains, reinforcing its importance in this symbiotic relationship. Lastly, as in insects, we show that stable symbionts can be lost, as we describe an L. patella animal where Ca. E. faulkneri is displaced by a likely intracellular pathogen. Our results suggest that intracellular symbionts may be an important source of ecologically significant natural products in animals.

Sanderson LA, Ficklin SP, Cheng CH, Jung S, Feltus FA, Bett KE, Main D. Tripal v1.1: a standards-based toolkit for construction of online genetic and genomic databases. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2013;2013:bat075. [PMID: 24163125 PMCID: PMC3808541 DOI: 10.1093/database/bat075] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.0] [Indexed: 12/18/2022]

Mongodin EF, Casjens SR, Bruno JF, Xu Y, Drabek EF, Riley DR, Cantarel BL, Pagan PE, Hernandez YA, Vargas LC, Dunn JJ, Schutzer SE, Fraser CM, Qiu WG, Luft BJ. Inter- and intra-specific pan-genomes of Borrelia burgdorferi sensu lato: genome stability and adaptive radiation. BMC Genomics 2013;14:693. [PMID: 24112474 PMCID: PMC3833655 DOI: 10.1186/1471-2164-14-693] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.4] [Received: 05/28/2013] [Accepted: 09/26/2013] [Indexed: 02/06/2023]

Genome Sequences of Two Klebsiella pneumoniae Isolates from Different Geographical Regions, Argentina (Strain JHCK1) and the United States (Strain VA360). GENOME ANNOUNCEMENTS 2013;1:1/2/e00168-13. [PMID: 23640195 PMCID: PMC3642250 DOI: 10.1128/genomea.00168-13] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/20/2022]

White JR, Maddox C, White O, Angiuoli SV, Fricke WF. CloVR-ITS: Automated internal transcribed spacer amplicon sequence analysis pipeline for the characterization of fungal microbiota. MICROBIOME 2013;1:6. [PMID: 24451270 PMCID: PMC3869194 DOI: 10.1186/2049-2618-1-6] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Received: 07/19/2012] [Accepted: 11/21/2012] [Indexed: 05/16/2023]

Abstract

BACKGROUND

Besides the development of comprehensive tools for high-throughput 16S ribosomal RNA amplicon sequence analysis, there exists a growing need for protocols emphasizing alternative phylogenetic markers such as those representing eukaryotic organisms.

RESULTS

Here we introduce CloVR-ITS, an automated pipeline for comparative analysis of internal transcribed spacer (ITS) pyrosequences amplified from metagenomic DNA isolates and representing fungal species. This pipeline performs a variety of steps similar to those commonly used for 16S rRNA amplicon sequence analysis, including preprocessing for quality, chimera detection, clustering of sequences into operational taxonomic units (OTUs), taxonomic assignment (at class, order, family, genus, and species levels) and statistical analysis of sample groups of interest based on user-provided information. Using ITS amplicon pyrosequencing data from a previous human gastric fluid study, we demonstrate the utility of CloVR-ITS for fungal microbiota analysis and provide runtime and cost examples, including analysis of extremely large datasets on the cloud. We show that the largest fractions of reads from the stomach fluid samples were assigned to Dothideomycetes, Saccharomycetes, Agaricomycetes and Sordariomycetes but that all samples were dominated by sequences that could not be taxonomically classified. Representatives of the Candida genus were identified in all samples, most notably C. quercitrusa, while sequence reads assigned to the Aspergillus genus were only identified in a subset of samples. CloVR-ITS is made available as a pre-installed, automated, and portable software pipeline for cloud-friendly execution as part of the CloVR virtual machine package (http://clovr.org).

CONCLUSION

The CloVR-ITS pipeline provides fungal microbiota analysis that can be complementary to bacterial 16S rRNA and total metagenome sequence analysis allowing for more comprehensive studies of environmental and host-associated microbial communities.

Mariette J, Escudié F, Allias N, Salin G, Noirot C, Thomas S, Klopp C. NG6: Integrated next generation sequencing storage and processing environment. BMC Genomics 2012;13:462. [PMID: 22958229 PMCID: PMC3444930 DOI: 10.1186/1471-2164-13-462] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Received: 06/25/2012] [Accepted: 08/30/2012] [Indexed: 11/10/2022]

McLellan AS, Dubin RA, Jing Q, Broin PÓ, Moskowitz D, Suzuki M, Calder RB, Hargitai J, Golden A, Greally JM. The Wasp System: an open source environment for managing and analyzing genomic data. Genomics 2012;100:345-51. [PMID: 22944616 DOI: 10.1016/j.ygeno.2012.08.005] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 03/17/2012] [Revised: 08/16/2012] [Accepted: 08/20/2012] [Indexed: 01/17/2023]