1. Jung YH, Wang HLV, Ali S, Corces VG, Kremsky I. Characterization of a strain-specific CD-1 reference genome reveals potential inter- and intra-strain functional variability. BMC Genomics 2023; 24:437. [PMID: 37537522 PMCID: PMC10401811 DOI: 10.1186/s12864-023-09523-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/22/2023] [Accepted: 07/19/2023] [Indexed: 08/05/2023] Open
Abstract
BACKGROUND CD-1 is an outbred mouse stock that is frequently used in toxicology, pharmacology, and fundamental biomedical research. Although inbred strains are typically better suited for such studies due to minimal genetic variability, outbred stocks confer practical advantages over inbred strains, such as improved breeding performance and low cost. Knowledge of the full genetic variability of CD-1 would make it more useful in toxicology, pharmacology, and fundamental biomedical research. RESULTS We performed deep genomic DNA sequencing of CD-1 mice and used the data to identify genome-wide SNPs, indels, and germline transposable elements relative to the mm10 reference genome. We used multiple genome-wide sequencing data types and previously published CD-1 SNPs to validate our called variants. We used the called variants to construct a strain-specific CD-1 reference genome, which we show can improve mappability and reduce experimental biases from genome-wide sequencing data derived from CD-1 mice. Based on previously published ChIP-seq and ATAC-seq data, we find evidence that genetic variation between CD-1 mice can lead to alterations in transcription factor binding. We also identified a number of variants in the coding region of genes which could have effects on translation of genes. CONCLUSIONS We have identified millions of previously unidentified CD-1 variants with the potential to confound studies involving CD-1. We used the identified variants to construct a CD-1-specific reference genome, which can improve accuracy and reduce bias when aligning genomics data derived from CD-1 mice.
Affiliation(s)
- Yoon Hee Jung
  - Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, USA
- Hsiao-Lin V Wang
  - Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, USA
- Samir Ali
  - Department of Basic Sciences, Loma Linda University School of Medicine, Loma Linda, CA, 92350, USA
- Victor G Corces
  - Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, USA
- Isaac Kremsky
  - Department of Basic Sciences, Loma Linda University School of Medicine, Loma Linda, CA, 92350, USA
  - Center for Genomics, Loma Linda University School of Medicine, Loma Linda, CA, USA

2.
Abstract
Experiments involving metagenomics data are becoming increasingly commonplace. Processing such data requires a unique set of considerations. Quality control of metagenomics data is critical to extracting pertinent insights. In this chapter, we outline some considerations in terms of study design and other confounding factors that can often only be realized at the point of data analysis. We also outline some basic principles of quality control in metagenomics, including overall reproducibility and some good practices to follow. The general quality control of sequencing data is then outlined, and we introduce ways to process this data by using bash scripts and developing pipelines in Snakemake (Python). A significant part of quality control in metagenomics is in analyzing the data to ensure you can spot relationships between variables and to identify when they might be confounded. This chapter provides a walkthrough of analyzing some microbiome data (in the R statistical language) and demonstrates a few ways to identify overall differences and similarities in microbiome data. The chapter concludes with remarks on considering taxonomic results in the context of the study and interrogating sequence alignments using the command line.
Affiliation(s)
- Abraham Gihawi
  - Bob Champion Research & Education Building, Norwich Medical School, University of East Anglia, Norwich, UK
- Ryan Cardenas
  - Bob Champion Research & Education Building, Norwich Medical School, University of East Anglia, Norwich, UK
- Rachel Hurst
  - Bob Champion Research & Education Building, Norwich Medical School, University of East Anglia, Norwich, UK
- Daniel S Brewer
  - Bob Champion Research & Education Building, Norwich Medical School, University of East Anglia, Norwich, UK
  - Earlham Institute, Norwich Research Park, Norwich, UK

3. Sänger PA, Wagner S, Liebler-Tenorio EM, Fuchs TM. Dissecting the invasion of Galleria mellonella by Yersinia enterocolitica reveals metabolic adaptations and a role of a phage lysis cassette in insect killing. PLoS Pathog 2022; 18:e1010991. [PMID: 36399504 PMCID: PMC9718411 DOI: 10.1371/journal.ppat.1010991] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/30/2022] [Revised: 12/02/2022] [Accepted: 11/08/2022] [Indexed: 11/19/2022] Open
Abstract
The human pathogen Yersinia enterocolitica strain W22703 is characterized by its toxicity towards invertebrates that requires the insecticidal toxin complex (Tc) proteins encoded by the pathogenicity island Tc-PAIYe. Molecular and pathophysiological details of insect larvae infection and killing by this pathogen, however, have not been dissected. Here, we applied oral infection of Galleria mellonella (Greater wax moth) larvae to study the colonisation, proliferation, tissue invasion, and killing activity of W22703. We demonstrated that this strain is strongly toxic towards the larvae, in which it proliferates by more than three orders of magnitude within six days post infection. Deletion mutants of the genes tcaA and tccC were atoxic for the insect. W22703 ΔtccC, in contrast to W22703 ΔtcaA, initially proliferated before being eliminated from the host, thus confirming TcaA as membrane-binding Tc subunit and TccC as cell toxin. Time course experiments revealed a Tc-dependent infection process starting with midgut colonisation that is followed by invasion of the hemolymph where the pathogen elicits morphological changes of hemocytes and strongly proliferates. The in vivo transcriptome of strain W22703 shows that the pathogen undergoes a drastic reprogramming of central cell functions and gains access to numerous carbohydrate and amino acid resources within the insect. Strikingly, a mutant lacking a phage-related holin/endolysin (HE) cassette, which is located within Tc-PAIYe, resembled the phenotypes of W22703 ΔtcaA, suggesting that this dual lysis cassette may be an example of a phage-related function that has been adapted for the release of a bacterial toxin.
Affiliation(s)
- Stefanie Wagner
  - Friedrich-Loeffler-Institut, Institut für Molekulare Pathogenese, Jena, Germany
- Thilo M. Fuchs
  - Friedrich-Loeffler-Institut, Institut für Molekulare Pathogenese, Jena, Germany

4. Gomez-Escribano JP, Algora Gallardo L, Bozhüyük KAJ, Kendrew SG, Huckle BD, Crowhurst NA, Bibb MJ, Collis AJ, Micklefield J, Herron PR, Wilkinson B. Genome editing reveals that pSCL4 is required for chromosome linearity in Streptomyces clavuligerus. Microb Genom 2021; 7:000669. [PMID: 34747689 PMCID: PMC8743545 DOI: 10.1099/mgen.0.000669] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/16/2020] [Accepted: 08/09/2021] [Indexed: 12/28/2022] Open
Abstract
Streptomyces clavuligerus is an industrially important actinomycete whose genetic manipulation is limited by low transformation and conjugation efficiencies, low levels of recombination of introduced DNA, and difficulty in obtaining consistent sporulation. We describe the construction and application of versatile vectors for Cas9-mediated genome editing of this strain. To design spacer sequences with confidence, we derived a highly accurate genome assembly for an isolate of the type strain (ATCC 27064). This yielded a chromosome assembly (6.75 Mb) plus assemblies for pSCL4 (1795 kb) and pSCL2 (149 kb). The strain also carries pSCL1 (12 kb), but its small size resulted in only partial sequence coverage. The previously described pSCL3 (444 kb) is not present in this isolate. Using our Cas9 vectors, we cured pSCL4 with high efficiency by targeting the plasmid's parB gene. Five of the resulting pSCL4-cured isolates were characterized and all showed impaired sporulation. Shotgun genome sequencing of each of these derivatives revealed large deletions at the ends of the chromosomes in all of them, and for two clones sufficient sequence data was obtained to show that the chromosome had circularized. Taken together, these data indicate that pSCL4 is essential for the structural stability of the linear chromosome.
Affiliation(s)
- Juan Pablo Gomez-Escribano
  - Department of Molecular Microbiology, John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
  - Present address: Department of Bioresources for Bioeconomy and Health Research, Leibniz Institute, DSMZ-German Collection of Microorganisms and Cell Cultures, Inhoffenstraße 7B, 38124 Braunschweig, Germany
- Lis Algora Gallardo
  - Strathclyde Institute of Pharmacy and Biomedical Sciences, University of Strathclyde, 161 Cathedral Street, Glasgow G4 0RE, UK
- Kenan A. J. Bozhüyük
  - Department of Molecular Microbiology, John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
  - Present address: Molecular Biotechnology, Department of Biosciences, Goethe University Frankfurt, 60438 Frankfurt am Main, Germany
- Steven G. Kendrew
  - Biotechnology and Environmental Shared Service, GlaxoSmithKline, Southdown View Way, Worthing BN14 8QH, UK
  - Engineered Biodesign Limited, Cambridge CB1 3SN, UK
- Benjamin D. Huckle
  - Biotechnology and Environmental Shared Service, GlaxoSmithKline, Southdown View Way, Worthing BN14 8QH, UK
- Nicola A. Crowhurst
  - Biotechnology and Environmental Shared Service, GlaxoSmithKline, Southdown View Way, Worthing BN14 8QH, UK
- Mervyn J. Bibb
  - Department of Molecular Microbiology, John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
- Andrew J. Collis
  - Biotechnology and Environmental Shared Service, GlaxoSmithKline, Southdown View Way, Worthing BN14 8QH, UK
- Jason Micklefield
  - Department of Chemistry, Manchester Institute for Biotechnology, University of Manchester, 131 Princess Street, Manchester M1 7DN, UK
- Paul R. Herron
  - Strathclyde Institute of Pharmacy and Biomedical Sciences, University of Strathclyde, 161 Cathedral Street, Glasgow G4 0RE, UK
- Barrie Wilkinson
  - Department of Molecular Microbiology, John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK

5. In Silico Prediction for ncRNAs in Prokaryotes. Methods Mol Biol 2021. [PMID: 34251633 DOI: 10.1007/978-1-0716-1534-8_18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
The identification and characterization of non-coding RNAs (ncRNAs) in prokaryotes is an important step in the study of the interaction of these molecules with mRNAs or target proteins in the post-transcriptional regulation process. Here, we describe one of the main in silico prediction methods in prokaryotes, using the TargetRNA2 tool to predict target mRNAs.

6. Fennell TG, Blackwell GA, Thomson NR, Dorman MJ. gbpA and chiA genes are not uniformly distributed amongst diverse Vibrio cholerae. Microb Genom 2021; 7:000594. [PMID: 34100695 PMCID: PMC8461464 DOI: 10.1099/mgen.0.000594] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2021] [Accepted: 04/26/2021] [Indexed: 11/18/2022] Open
Abstract
Members of the bacterial genus Vibrio utilize chitin both as a metabolic substrate and a signal to activate natural competence. Vibrio cholerae is a bacterial enteric pathogen, sub-lineages of which can cause pandemic cholera. However, the chitin metabolic pathway in V. cholerae has been dissected using only a limited number of laboratory strains of this species. Here, we survey the complement of key chitin metabolism genes amongst 195 diverse V. cholerae. We show that the gene encoding GbpA, known to be an important colonization and virulence factor in pandemic isolates, is not ubiquitous amongst V. cholerae. We also identify a putatively novel chitinase, and present experimental evidence in support of its functionality. Our data indicate that the chitin metabolic pathway within V. cholerae is more complex than previously thought, and emphasize the importance of considering genes and functions in the context of a species in its entirety, rather than simply relying on traditional reference strains.
Affiliation(s)
- Thea G. Fennell
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
  - Churchill College, Storey’s Way, Cambridge, CB3 0DS, UK
  - Present address: Sainsbury Laboratory, University of Cambridge, Bateman Street, Cambridge, UK
- Grace A. Blackwell
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
  - EMBL-EBI, Wellcome Genome Campus, Hinxton, CB10 1SD, UK
- Nicholas R. Thomson
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
  - London School of Hygiene and Tropical Medicine, Keppel St., Bloomsbury, London, WC1E 7HT, UK
- Matthew J. Dorman
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
  - Churchill College, Storey’s Way, Cambridge, CB3 0DS, UK

7. Reinscheid RK, Mafessoni F, Lüttjohann A, Jüngling K, Pape HC, Schulz S. Neandertal introgression and accumulation of hypomorphic mutations in the neuropeptide S (NPS) system promote attenuated functionality. Peptides 2021; 138:170506. [PMID: 33556445 DOI: 10.1016/j.peptides.2021.170506] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 10/15/2020] [Revised: 01/14/2021] [Accepted: 02/03/2021] [Indexed: 12/21/2022]
Abstract
The neuropeptide S (NPS) system plays an important role in fear and fear memory processing but has also been associated with allergic and inflammatory diseases. Genes for NPS and its receptor NPSR1 are found in all tetrapods. Compared to non-human primates, several non-synonymous single-nucleotide polymorphisms (SNPs) occur in both human genes that collectively result in functional attenuation, suggesting adaptive mechanisms in a human context. To investigate historic and geographic origins of these hypomorphic mutations and explore genetic signs of selection, we analyzed ancient genomes and worldwide genotype frequencies of four prototypic SNPs in the NPS system. Neandertal and Denisovan genomes contain exclusively ancestral alleles for NPSR1 while all derived alleles occur in ancient genomes of anatomically modern humans, indicating that they arose in modern Homo sapiens. Worldwide genotype frequencies for three hypomorphic NPSR1 SNPs show significant regional homogeneity but follow a gradient towards increasing derived allele frequencies that supports an out-of-Africa scenario. Increased density of high-frequency polymorphisms around the three NPSR1 loci suggests weak or possibly balancing selection. A hypomorphic mutation in the NPS precursor, however, was detected at high frequency in Eurasian Neandertal genomes and shows genetic signatures indicating that it was introgressed into the human gene pool, particularly in Southern Europe, by interbreeding with Neandertals. We discuss potential evolutionary scenarios including behavior and immune-based natural selection.
Affiliation(s)
- Rainer K Reinscheid
  - Institute of Pharmacology & Toxicology, Friedrich-Schiller-University, Jena, Germany
  - Institute of Physiology I, Westfälische-Wilhelms-University, Münster, Germany
- Annika Lüttjohann
  - Institute of Physiology I, Westfälische-Wilhelms-University, Münster, Germany
- Kay Jüngling
  - Institute of Physiology I, Westfälische-Wilhelms-University, Münster, Germany
- Hans-Christian Pape
  - Institute of Physiology I, Westfälische-Wilhelms-University, Münster, Germany
- Stefan Schulz
  - Institute of Pharmacology & Toxicology, Friedrich-Schiller-University, Jena, Germany

8. Prax N, Wagner S, Schardt J, Neuhaus K, Clavel T, Fuchs TM. A diet-specific microbiota drives Salmonella Typhimurium to adapt its in vivo response to plant-derived substrates. Anim Microbiome 2021; 3:24. [PMID: 33731218 PMCID: PMC7972205 DOI: 10.1186/s42523-021-00082-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/28/2020] [Accepted: 02/08/2021] [Indexed: 11/23/2022] Open
Abstract
Background Little is known about the complex interactions between the diet, the gut microbiota, and enteropathogens. Here, the impact of two specific diets on the composition of the mouse gut microbiota and on the transcriptional response of Salmonella Typhimurium (S. Typhimurium) was analyzed in an enteritis model. Results Mice were fed a fibre-rich, plant-based diet (PD) or a Westernized diet (WD) rich in animal fat and proteins and in simple sugars for two weeks, and then infected with an invasin-negative S. Typhimurium strain ST4/74 following streptomycin treatment. Seventy-two hours post infection, fecal pathogen loads were equal in both diet groups, suggesting that neither of the diets had negatively influenced the ability of this ST4/74 strain to colonize and proliferate in the gut at this time point. To define its diet-dependent gene expression pattern, S. Typhimurium was immunomagnetically isolated from the gut content, and its transcriptome was analyzed. A total of 66 genes were more strongly expressed in mice fed the plant-based diet. The majority of these genes were involved in metabolic functions degrading substrates of fruits and plants. Four of them are part of the gat gene cluster responsible for the uptake and metabolism of galactitol and D-tagatose. In line with this finding, 16S rRNA gene amplicon analysis revealed higher relative abundance of bacterial families able to degrade fiber and nutritive carbohydrates in PD-fed mice in comparison with those nourished with a WD. Competitive mouse infection experiments performed with strain ST4/74 and ST4/74 ΔSTM3254 lacking tagatose-1,6-biphosphate aldolase, which is essential for galactitol and tagatose utilization, did not reveal a growth advantage of strain ST4/74 in the gastrointestinal tract of mice fed the plant-based diet as compared to the deletion mutant. Conclusion A Westernized diet and a plant-based diet evoke distinct transcriptional responses of S. Typhimurium during infection that allow the pathogen to adapt its metabolic activities to the diet-derived nutrients. This study therefore provides new insights into the dynamic interplay between nutrient availability, indigenous gut microbiota, and proliferation of S. Typhimurium. Supplementary Information The online version contains supplementary material available at 10.1186/s42523-021-00082-8.
Affiliation(s)
- Nicoletta Prax
  - Lehrstuhl für Mikrobielle Ökologie, TUM School of Life Sciences, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
  - ZIEL - Institute for Food & Health, Technische Universität München, Weihenstephaner Berg 1, 85354, Freising, Germany
- Stefanie Wagner
  - Friedrich-Loeffler-Institut, Institut für Molekulare Pathogenese, Naumburger Str. 96a, 07743, Jena, Germany
- Jakob Schardt
  - Lehrstuhl für Mikrobielle Ökologie, TUM School of Life Sciences, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
  - ZIEL - Institute for Food & Health, Technische Universität München, Weihenstephaner Berg 1, 85354, Freising, Germany
- Klaus Neuhaus
  - ZIEL - Institute for Food & Health, Technische Universität München, Weihenstephaner Berg 1, 85354, Freising, Germany
  - Core Facility Microbiome, ZIEL - Institute für Food & Health, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
- Thomas Clavel
  - ZIEL - Institute for Food & Health, Technische Universität München, Weihenstephaner Berg 1, 85354, Freising, Germany
  - Arbeitsgruppe Funktionelle Mikrobiomforschung, Institut für Medizinische Mikrobiologie, Uniklinik der RWTH Aachen, Pauwelsstrasse 30, 52074, Aachen, Germany
- Thilo M Fuchs
  - Lehrstuhl für Mikrobielle Ökologie, TUM School of Life Sciences, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
  - ZIEL - Institute for Food & Health, Technische Universität München, Weihenstephaner Berg 1, 85354, Freising, Germany
  - Friedrich-Loeffler-Institut, Institut für Molekulare Pathogenese, Naumburger Str. 96a, 07743, Jena, Germany

9. Kwon M, Lee S, Berselli M, Chu C, Park PJ. BamSnap: a lightweight viewer for sequencing reads in BAM files. Bioinformatics 2021; 37:263-264. [PMID: 33416869 PMCID: PMC8055225 DOI: 10.1093/bioinformatics/btaa1101] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/08/2020] [Revised: 12/07/2020] [Accepted: 12/28/2020] [Indexed: 11/14/2022] Open
Abstract
SUMMARY Despite the improvement in variant detection algorithms, visual inspection of the read-level data remains an essential step for accurate identification of variants in genome analysis. We developed BamSnap, an efficient BAM file viewer utilizing a graphics library and BAM indexing. In contrast to existing viewers, BamSnap can generate high-quality snapshots rapidly, with customized tracks and layout. As an example, we produced read-level images at 1000 genomic loci for >2500 whole-genomes. AVAILABILITY BamSnap is freely available at https://github.com/parklab/bamsnap. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Minseok Kwon
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Soohyun Lee
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Michele Berselli
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Chong Chu
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Peter J Park
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA

10. Otto TD, Assefa SA, Böhme U, Sanders MJ, Kwiatkowski D, Berriman M, Newbold C. Evolutionary analysis of the most polymorphic gene family in falciparum malaria. Wellcome Open Res 2019; 4:193. [PMID: 32055709 PMCID: PMC7001760 DOI: 10.12688/wellcomeopenres.15590.1] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Accepted: 11/26/2019] [Indexed: 12/22/2022] Open
Abstract
The var gene family of the human malaria parasite Plasmodium falciparum encodes proteins that are crucial determinants of both pathogenesis and immune evasion and are highly polymorphic. Here we have assembled nearly complete var gene repertoires from 2398 field isolates and analysed a normalised set of 714 from across 12 countries. This therefore represents the first large scale attempt to catalogue the worldwide distribution of var gene sequences. We confirm the extreme polymorphism of this gene family but also demonstrate an unexpected level of sequence sharing both within and between continents. We show that this is likely due to both the remnants of selective sweeps as well as a worrying degree of recent gene flow across continents with implications for the spread of drug resistance. We also address the evolution of the var repertoire with respect to the ancestral genes within the Laverania and show that diversity generated by recombination is concentrated in a number of hotspots. An analysis of the subdomain structure indicates that some existing definitions may need to be revised. From the analysis of this data, we can now understand the way in which the family has evolved and how the diversity is continuously being generated. Finally, we demonstrate that because the genes are distributed across the genome, sequence sharing between genotypes acts as a useful population genetic marker.
Affiliation(s)
- Thomas D. Otto
  - Parasite Genetics, Wellcome Trust Sanger Institute, Hinxton, UK
  - Institute of Infection, Immunity & Inflammation, MVLS, University of Glasgow, Glasgow, UK
- Sammy A. Assefa
  - Parasite Genetics, Wellcome Trust Sanger Institute, Hinxton, UK
- Ulrike Böhme
  - Parasite Genetics, Wellcome Trust Sanger Institute, Hinxton, UK
- Dominic Kwiatkowski
  - Parasite Genetics, Wellcome Trust Sanger Institute, Hinxton, UK
  - The Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK
- Pf3k consortium
  - Parasite Genetics, Wellcome Trust Sanger Institute, Hinxton, UK
  - Institute of Infection, Immunity & Inflammation, MVLS, University of Glasgow, Glasgow, UK
  - The Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK
  - Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, UK
- Matt Berriman
  - Parasite Genetics, Wellcome Trust Sanger Institute, Hinxton, UK
- Chris Newbold
  - Parasite Genetics, Wellcome Trust Sanger Institute, Hinxton, UK
  - Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, UK

11. Bi JH, Tong YF, Qiu ZW, Yang XF, Minna J, Gazdar AF, Song K. ClickGene: an open cloud-based platform for big pan-cancer data genome-wide association study, visualization and exploration. BioData Min 2019; 12:12. [PMID: 31391866 PMCID: PMC6595587 DOI: 10.1186/s13040-019-0202-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 12/20/2018] [Accepted: 06/17/2019] [Indexed: 12/15/2022] Open
Abstract
A tremendous amount of whole-genome sequencing data has been provided by large consortium projects such as TCGA (The Cancer Genome Atlas) and COSMIC, creating opportunities for functional gene research and for uncovering cancer-associated mechanisms. While the existing web servers are valuable and widely used, many whole-genome analysis functions urgently needed by experimental biologists are still not adequately addressed. A cloud-based platform named CG (ClickGene) was therefore developed for do-it-yourself analysis of users' private in-house data or public genome data without any software installation or system configuration. The CG platform provides key interactive and customized functions including Bee-swarm plot, linear regression analyses, Mountain plot, Directional Manhattan plot, Deflection plot and Volcano plot. Using these tools, global profiles or individual gene distributions for expression and copy number variation (CNV) analyses can be generated with mouse clicks alone. The easy accessibility of such comprehensive pan-cancer genome analysis greatly facilitates data mining in wide research areas, such as the therapeutic discovery process. It therefore fills the gap between big cancer genomics data and the delivery of integrated knowledge to end-users, helping unleash the value of current data resources. More importantly, unlike other R-based web platforms, the CG platform was developed with Dubbo, a cloud distributed service governance framework for global transfer of 'big data' streams. CG runs on an independent cloud server, which ensures its steady global accessibility. More than 2 years of operation have shown that end-users can create advanced plots for hundreds of whole-genome datasets through CG within seconds, anytime and anywhere. CG is available at http://www.clickgenome.org/.
Affiliation(s)
- Jia-Hao Bi
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072 China
- Yi-Fan Tong
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072 China
- Zhe-Wei Qiu
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072 China
- Xing-Feng Yang
  - School of Computer Software, Tianjin University, Tianjin, 300072 China
- John Minna
  - Hamon Center for Therapeutic Oncology, University of Texas Southwestern Medical Center, Dallas, TX 75390 USA
  - Department of Pharmacology, University of Texas Southwestern Medical Center, Dallas, TX 75390 USA
  - Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390 USA
- Adi F Gazdar
  - Hamon Center for Therapeutic Oncology, University of Texas Southwestern Medical Center, Dallas, TX 75390 USA
  - Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX 75390 USA
- Kai Song
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072 China
  - Hamon Center for Therapeutic Oncology, University of Texas Southwestern Medical Center, Dallas, TX 75390 USA

12. Dorman MJ, Kane L, Domman D, Turnbull JD, Cormie C, Fazal MA, Goulding DA, Russell JE, Alexander S, Thomson NR. The history, genome and biology of NCTC 30: a non-pandemic Vibrio cholerae isolate from World War One. Proc Biol Sci 2019; 286:20182025. [PMID: 30966987 PMCID: PMC6501683 DOI: 10.1098/rspb.2018.2025] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 09/13/2018] [Accepted: 03/21/2019] [Indexed: 02/01/2023] Open
Abstract
The sixth global cholera pandemic lasted from 1899 to 1923. However, despite widespread fear of the disease and of its negative effects on troop morale, very few soldiers in the British Expeditionary Forces contracted cholera between 1914 and 1918. Here, we have revived and sequenced the genome of NCTC 30, a 102-year-old Vibrio cholerae isolate, which we believe is the oldest publicly available live V. cholerae strain in existence. NCTC 30 was isolated in 1916 from a British soldier convalescent in Egypt. We found that this strain does not encode cholera toxin, thought to be necessary to cause cholera, and is not part of V. cholerae lineages responsible for the pandemic disease. We also show that NCTC 30, which predates the introduction of penicillin-based antibiotics, harbours a functional β-lactamase antibiotic resistance gene. Our data corroborate and provide molecular explanations for previous phenotypic studies of NCTC 30 and provide a new high-quality genome sequence for historical, non-pandemic V. cholerae.
Affiliation(s)
- Matthew J. Dorman
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK
- Leanne Kane
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK
- Daryl Domman
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK
- Claire Cormie
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK
- David A. Goulding
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK
- Sarah Alexander
  - Public Health England, 61 Colindale Avenue, London NW9 5DF, UK
- Nicholas R. Thomson
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK
  - London School of Hygiene and Tropical Medicine, Keppel Street, Bloomsbury, London WC1E 7HT, UK
|
13
|
Dynamic Interactions Between the Genome and an Endogenous Retrovirus: Tirant in Drosophila simulans Wild-Type Strains. G3-GENES GENOMES GENETICS 2019; 9:855-865. [PMID: 30658967 PMCID: PMC6404621 DOI: 10.1534/g3.118.200789] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 12/17/2022]
Abstract
All genomes contain repeated sequences that are known as transposable elements (TEs). Among these are endogenous retroviruses (ERVs), which are sequences similar to retroviruses and are transmitted across generations from parent to progeny. These sequences are controlled in genomes through epigenetic mechanisms. At the center of the epigenetic control of TEs are small interfering RNAs of the piRNA class, which trigger heterochromatinization of TE sequences. The tirant ERV of Drosophila simulans displays intra-specific variability in copy numbers, insertion sites, and transcription levels, providing us with a well-suited model to study the dynamic relationship between a TE family and the host genome through epigenetic mechanisms. We show that tirant transcript amounts and piRNA amounts are positively correlated in ovaries under normal conditions, unlike what was previously described following divergent crosses. In addition, we describe tirant insertion polymorphism in the genomes of three D. simulans wild-type strains, which reveals a limited number of insertions that may be associated with gene transcript level changes through heterochromatin spreading and have phenotypic impacts. Taken together, our results contribute to the understanding of the equilibrium between the host genome and its TEs.
14
Carvalho Garcia A, Dos Santos VLP, Santos Cavalcanti TC, Collaço LM, Graf H. Bacterial Small RNAs in the Genus Herbaspirillum spp. Int J Mol Sci 2018; 20:ijms20010046. [PMID: 30583511 PMCID: PMC6337395 DOI: 10.3390/ijms20010046] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/05/2018] [Revised: 12/12/2018] [Accepted: 12/12/2018] [Indexed: 12/26/2022]
Abstract
The genus Herbaspirillum includes several strains isolated from different grasses. The identification of non-coding RNAs (ncRNAs) in the genus Herbaspirillum is an important step in studying the interaction of these molecules and the way they modulate physiological responses of different mechanisms, through RNA–RNA interaction or RNA–protein interaction. This interaction with their target occurs through the perfect pairing of short sequences (cis-encoded ncRNAs) or by the partial pairing of short sequences (trans-encoded ncRNAs). However, the chaperone Hfq can stabilize interactions in the trans-acting class. In addition, there are riboswitches, located at the 5' end of mRNA and less often at the 3' end, which respond to environmental signals, high temperatures, or small binder molecules. Recently, CRISPR (clustered regularly interspaced short palindromic repeats) loci have been described in prokaryotes; they consist of serial repeats interspersed with spacer DNA resulting from a previous exposure to exogenous plasmids or bacteriophages. We identified 285 ncRNAs in Herbaspirillum seropedicae (H. seropedicae) SmR1, expressed under different experimental conditions in RNA-seq data, classified them as cis-encoded or trans-encoded ncRNAs, and detected RNA riboswitch domains and CRISPR sequences. The results provide a better understanding of the participation of this type of RNA in the regulation of the metabolism of bacteria of the genus Herbaspirillum spp.
Affiliation(s)
- Amanda Carvalho Garcia
  - Department of Internal Medicine, Federal University of Paraná, Curitiba 80.060-240, Brazil
- Luiz Martins Collaço
  - Department of Pathology, Federal University of Paraná, PR, Curitiba 80.060-240, Brazil
- Hans Graf
  - Department of Internal Medicine, Federal University of Paraná, Curitiba 80.060-240, Brazil
15
Campino S, Marin-Menendez A, Kemp A, Cross N, Drought L, Otto TD, Benavente ED, Ravenhall M, Schwach F, Girling G, Manske M, Theron M, Gould K, Drury E, Clark TG, Kwiatkowski DP, Pance A, Rayner JC. A forward genetic screen reveals a primary role for Plasmodium falciparum Reticulocyte Binding Protein Homologue 2a and 2b in determining alternative erythrocyte invasion pathways. PLoS Pathog 2018; 14:e1007436. [PMID: 30496294 PMCID: PMC6289454 DOI: 10.1371/journal.ppat.1007436] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 07/31/2018] [Revised: 12/11/2018] [Accepted: 10/24/2018] [Indexed: 12/14/2022]
Abstract
Invasion of human erythrocytes is essential for Plasmodium falciparum parasite survival and pathogenesis, and is also a complex phenotype. While some later steps in invasion appear to be invariant and essential, the earlier steps of recognition are controlled by a series of redundant, and only partially understood, receptor-ligand interactions. Reverse genetic analysis of laboratory adapted strains has identified multiple genes that when deleted can alter invasion, but how the relative contributions of each gene translate to the phenotypes of clinical isolates is far from clear. We used a forward genetic approach to identify genes responsible for variable erythrocyte invasion by phenotyping the parents and progeny of previously generated experimental genetic crosses. Linkage analysis using whole genome sequencing data revealed a single major locus was responsible for the majority of phenotypic variation in two invasion pathways. This locus contained the PfRh2a and PfRh2b genes, members of one of the major invasion ligand gene families, but not widely thought to play such a prominent role in specifying invasion phenotypes. Variation in invasion pathways was linked to significant differences in PfRh2a and PfRh2b expression between parasite lines, and their role in specifying alternative invasion was confirmed by CRISPR-Cas9-mediated genome editing. Expansion of the analysis to a large set of clinical P. falciparum isolates revealed common deletions, suggesting that variation at this locus is a major cause of invasion phenotypic variation in the endemic setting. This work has implications for blood-stage vaccine development and will help inform the design and location of future large-scale studies of invasion in clinical isolates.
Plasmodium parasites cause more than 200 million cases of malaria each year. All the symptoms of malaria are caused after Plasmodium parasites invade human red blood cells. Once inside, they grow, multiply and break open the red blood cells to release new parasites. This cycle is repeated every 48 hours, rapidly amplifying the number of parasites and causing severe anemia and other complications. Plasmodium falciparum, the parasite species responsible for almost all malaria deaths, can use multiple different pathways to invade human red blood cells, but the relative importance of each is not well understood. We tested the invasion pathways used by a collection of closely related parasites and compared their genome sequences to identify the genes responsible. This analysis revealed that expression differences in two neighboring genes of the Reticulocyte Binding Homologue family are responsible for most of the variation in two invasion pathways. P. falciparum may use variation in these genes to avoid the immune system or adapt to specific blood groups, which has important implications for vaccine development against malaria.
Affiliation(s)
- Susana Campino
  - Malaria Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
  - Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom
  - * E-mail: (SC); (JCR)
- Alejandro Marin-Menendez
  - Malaria Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
- Alison Kemp
  - Malaria Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
- Nadia Cross
  - Malaria Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
- Laura Drought
  - Malaria Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
- Thomas D. Otto
  - Malaria Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
  - Centre of Immunobiology, Institute of Infection, Immunity & Inflammation, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow, United Kingdom
- Ernest Diez Benavente
  - Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Matt Ravenhall
  - Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Frank Schwach
  - Malaria Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
- Gareth Girling
  - Malaria Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
- Magnus Manske
  - Malaria Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
- Michel Theron
  - Malaria Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
- Kelda Gould
  - Malaria Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
- Eleanor Drury
  - Malaria Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
- Taane G. Clark
  - Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom
  - Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Dominic P. Kwiatkowski
  - Malaria Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
  - Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
- Alena Pance
  - Malaria Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
- Julian C. Rayner
  - Malaria Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
  - * E-mail: (SC); (JCR)
16
Sapountzis P, Zhukova M, Shik JZ, Schiott M, Boomsma JJ. Reconstructing the functions of endosymbiotic Mollicutes in fungus-growing ants. eLife 2018; 7:e39209. [PMID: 30454555 PMCID: PMC6245734 DOI: 10.7554/elife.39209] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 06/15/2018] [Accepted: 10/16/2018] [Indexed: 12/25/2022]
Abstract
Mollicutes, a widespread class of bacteria associated with animals and plants, were recently identified as abundant abdominal endosymbionts in healthy workers of attine fungus-farming leaf-cutting ants. We obtained draft genomes of the two most common strains harbored by Panamanian fungus-growing ants. Reconstructions of their functional significance showed that they are independently acquired symbionts, most likely to decompose excess arginine consistent with the farmed fungal cultivars providing this nitrogen-rich amino-acid in variable quantities. Across the attine lineages, the relative abundances of the two Mollicutes strains are associated with the substrate types that foraging workers offer to fungus gardens. One of the symbionts is specific to the leaf-cutting ants and has special genomic machinery to catabolize citrate/glucose into acetate, which appears to deliver direct metabolic energy to the ant workers. Unlike other Mollicutes associated with insect hosts, both attine ant strains have complete phage-defense systems, underlining that they are actively maintained as mutualistic symbionts.
Affiliation(s)
- Panagiotis Sapountzis
  - Centre for Social Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Mariya Zhukova
  - Centre for Social Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Jonathan Z Shik
  - Centre for Social Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Morten Schiott
  - Centre for Social Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Jacobus J Boomsma
  - Centre for Social Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark
17
Alkasir R, Ma Y, Liu F, Li J, Lv N, Xue Y, Hu Y, Zhu B. Characterization and Transcriptome Analysis of Acinetobacter baumannii Persister Cells. Microb Drug Resist 2018; 24:1466-1474. [PMID: 29902105 DOI: 10.1089/mdr.2017.0341] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Indexed: 11/12/2022]
Abstract
Acinetobacter baumannii is a nonfermenting Gram-negative bacillus. A. baumannii resistance is a significant obstacle to clinical infection treatment. The existence of persister cells (persisters) might represent the reason for therapy failure and relapse, and such cells may be the driving force behind rising resistance rates. In this study, A. baumannii ATCC 19606 was used as a target to explore the essential features of A. baumannii persisters. Antibiotic treatment of A. baumannii cultures at 50-fold the minimum inhibitory concentration resulted in a distinct plateau of surviving drug-tolerant persisters. The sensitive bacteria were lysed with ceftazidime, and the nonreplicating bacteria were isolated for transcriptome analysis using RNA sequencing. We analyzed the transcriptome of A. baumannii persisters and identified significantly differentially expressed genes, as well as their enriched pathways. The results showed that both the GP49 (HigB)/Cro (HigA) and DUF1044/RelB toxin/antitoxin systems were significantly increased during the persister incubation period. In addition, the activities of certain metabolic pathways (such as electron transport, adenosine triphosphate [ATP], and the citrate cycle) decreased sharply after antibiotic treatment and remained low during the persister period, while aromatic compound degradation genes were only upregulated in persisters. These results suggest the involvement of aromatic compound degradation genes in persister formation and maintenance. They further provide the first insight into the mechanism of persister formation in A. baumannii.
Affiliation(s)
- Rashad Alkasir
  - CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
- Yanan Ma
  - CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
- Fei Liu
  - CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
  - Beijing Key Laboratory of Microbial Drug Resistance and Resistome, Beijing, China
- Jing Li
  - CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
  - Beijing Key Laboratory of Microbial Drug Resistance and Resistome, Beijing, China
- Na Lv
  - CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
  - Beijing Key Laboratory of Microbial Drug Resistance and Resistome, Beijing, China
- Yong Xue
  - CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
- Yongfei Hu
  - CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
  - Beijing Key Laboratory of Microbial Drug Resistance and Resistome, Beijing, China
- Baoli Zhu
  - CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
  - Beijing Key Laboratory of Microbial Drug Resistance and Resistome, Beijing, China
  - Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou, China
18
Schardt J, Jones G, Müller-Herbst S, Schauer K, D'Orazio SEF, Fuchs TM. Comparison between Listeria sensu stricto and Listeria sensu lato strains identifies novel determinants involved in infection. Sci Rep 2017; 7:17821. [PMID: 29259308 PMCID: PMC5736727 DOI: 10.1038/s41598-017-17570-0] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 07/26/2017] [Accepted: 11/28/2017] [Indexed: 01/01/2023]
Abstract
The human pathogen L. monocytogenes and the animal pathogen L. ivanovii, together with four other species isolated from symptom-free animals, form the “Listeria sensu stricto” clade. The members of the second clade, “Listeria sensu lato”, are believed to be solely environmental bacteria without the ability to colonize mammalian hosts. To identify novel determinants that contribute to infection by L. monocytogenes, the causative agent of the foodborne disease listeriosis, we performed a genome comparison of the two clades and found 151 candidate genes that are conserved in the Listeria sensu stricto species. Two factors were investigated further in vitro and in vivo. A mutant lacking an ATP-binding cassette transporter exhibited defective adhesion and invasion of human Caco-2 cells. Using a mouse model of foodborne L. monocytogenes infection, a reduced number of the mutant strain compared to the parental strain was observed in the small intestine and the liver. Another mutant with a defective 1,2-propanediol degradation pathway showed reduced persistence in the stool of infected mice, suggesting a role of 1,2-propanediol as a carbon and energy source of listeriae during infection. These findings reveal the relevance of novel factors for the colonization process of L. monocytogenes.
Affiliation(s)
- Jakob Schardt
  - ZIEL-Institute for Food & Health, and Lehrstuhl für Mikrobielle Ökologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
- Grant Jones
  - Department of Microbiology, Immunology, & Molecular Genetics, University of Kentucky, Lexington, Kentucky, USA
- Stefanie Müller-Herbst
  - ZIEL-Institute for Food & Health, and Lehrstuhl für Mikrobielle Ökologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
- Kristina Schauer
  - Lehrstuhl für Hygiene und Technologie der Milch, Tiermedizinische Fakultät, Ludwig-Maximilians-Universität München, Schönleutner Str. 8, 85764, Oberschleißheim, Germany
- Sarah E F D'Orazio
  - Department of Microbiology, Immunology, & Molecular Genetics, University of Kentucky, Lexington, Kentucky, USA
- Thilo M Fuchs
  - ZIEL-Institute for Food & Health, and Lehrstuhl für Mikrobielle Ökologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
  - Friedrich-Loeffler-Institut, Institut für Molekulare Pathogenese, Naumburger Str. 96a, 07743, Jena, Germany
19
Hücker SM, Ardern Z, Goldberg T, Schafferhans A, Bernhofer M, Vestergaard G, Nelson CW, Schloter M, Rost B, Scherer S, Neuhaus K. Discovery of numerous novel small genes in the intergenic regions of the Escherichia coli O157:H7 Sakai genome. PLoS One 2017; 12:e0184119. [PMID: 28902868 PMCID: PMC5597208 DOI: 10.1371/journal.pone.0184119] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 05/16/2017] [Accepted: 08/20/2017] [Indexed: 12/29/2022]
Abstract
In the past, short protein-coding genes were often disregarded by genome annotation pipelines. Transcriptome sequencing (RNAseq) signals outside of annotated genes have usually been interpreted to indicate either ncRNA or pervasive transcription. Therefore, in addition to the transcriptome, the translatome (RIBOseq) of the enteric pathogen Escherichia coli O157:H7 strain Sakai was determined at two optimal growth conditions and a severe stress condition combining low temperature and high osmotic pressure. All intergenic open reading frames potentially encoding a protein of ≥ 30 amino acids were investigated with regard to coverage by transcription and translation signals and their translatability expressed by the ribosomal coverage value. This led to discovery of 465 unique, putative novel genes not yet annotated in this E. coli strain, which are evenly distributed over both DNA strands of the genome. For 255 of the novel genes, annotated homologs in other bacteria were found, and a machine-learning algorithm, trained on small protein-coding E. coli genes, predicted that 89% of these translated open reading frames represent bona fide genes. The remaining 210 putative novel genes without annotated homologs were compared to the 255 novel genes with homologs and to 250 short annotated genes of this E. coli strain. All three groups turned out to be similar with respect to their translatability distribution, fractions of differentially regulated genes, secondary structure composition, and the distribution of evolutionary constraint, suggesting that both novel groups represent legitimate genes. However, the machine-learning algorithm only recognized a small fraction of the 210 genes without annotated homologs. It is possible that these genes represent a novel group of genes, which have unusual features dissimilar to the genes of the machine-learning algorithm training set.
Affiliation(s)
- Sarah M. Hücker
  - Chair for Microbial Ecology, Technische Universität München, Freising, Germany
  - ZIEL - Institute for Food & Health, Technische Universität München, Freising, Germany
- Zachary Ardern
  - Chair for Microbial Ecology, Technische Universität München, Freising, Germany
  - ZIEL - Institute for Food & Health, Technische Universität München, Freising, Germany
- Tatyana Goldberg
  - Department of Informatics—Bioinformatics & TUM-IAS, Technische Universität München, Garching, Germany
- Andrea Schafferhans
  - Department of Informatics—Bioinformatics & TUM-IAS, Technische Universität München, Garching, Germany
- Michael Bernhofer
  - Department of Informatics—Bioinformatics & TUM-IAS, Technische Universität München, Garching, Germany
- Gisle Vestergaard
  - Research Unit Environmental Genomics, Helmholtz Zentrum München, Neuherberg, Germany
- Chase W. Nelson
  - Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, New York, United States of America
- Michael Schloter
  - Research Unit Environmental Genomics, Helmholtz Zentrum München, Neuherberg, Germany
- Burkhard Rost
  - Department of Informatics—Bioinformatics & TUM-IAS, Technische Universität München, Garching, Germany
- Siegfried Scherer
  - Chair for Microbial Ecology, Technische Universität München, Freising, Germany
  - ZIEL - Institute for Food & Health, Technische Universität München, Freising, Germany
- Klaus Neuhaus
  - Chair for Microbial Ecology, Technische Universität München, Freising, Germany
  - Core Facility Microbiome/NGS, ZIEL - Institute for Food & Health, Technische Universität München, Freising, Germany
20
Neuhaus K, Landstorfer R, Simon S, Schober S, Wright PR, Smith C, Backofen R, Wecko R, Keim DA, Scherer S. Differentiation of ncRNAs from small mRNAs in Escherichia coli O157:H7 EDL933 (EHEC) by combined RNAseq and RIBOseq - ryhB encodes the regulatory RNA RyhB and a peptide, RyhP. BMC Genomics 2017; 18:216. [PMID: 28245801 PMCID: PMC5331693 DOI: 10.1186/s12864-017-3586-9] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 01/16/2016] [Accepted: 02/13/2017] [Indexed: 12/14/2022]
Abstract
BACKGROUND While NGS allows rapid global detection of transcripts, it remains difficult to distinguish ncRNAs from short mRNAs. To detect potentially translated RNAs, we developed an improved protocol for bacterial ribosomal footprinting (RIBOseq). This allowed distinguishing ncRNA from mRNA in EHEC. A high ratio of ribosomal footprints per transcript (ribosomal coverage value, RCV) is expected to indicate a translated RNA, while a low RCV should point to a non-translated RNA. RESULTS Based on their low RCV, 150 novel non-translated EHEC transcripts were identified as putative ncRNAs, representing both antisense and intergenic transcripts, 74 of which had expressed homologs in E. coli MG1655. Bioinformatics analysis predicted statistically significant target regulons for 15 of the intergenic transcripts; experimental analysis revealed 4-fold or higher differential expression of 46 novel ncRNAs in different growth media. Out of 329 annotated EHEC ncRNAs, 52 showed an RCV similar to protein-coding genes; of those, 16 had RIBOseq patterns matching annotated genes in other Enterobacteriaceae, and 11 seem to possess a Shine-Dalgarno sequence, suggesting that such ncRNAs may encode small proteins instead of being solely non-coding. To support that the RIBOseq signals reflect translation, we tested the ribosomal-footprint-covered ORF of ryhB and found a phenotype for the encoded peptide in iron-limiting conditions. CONCLUSION Determination of the RCV is a useful approach for a rapid first-step differentiation between bacterial ncRNAs and small mRNAs. Further, many known ncRNAs may encode proteins as well.
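The RCV described in this abstract is a simple ratio, so a minimal sketch may help readers see how it separates the two classes. This is an illustration only, not the authors' pipeline: the function names, the assumption that both signals are already normalized to comparable units, the cutoff value, and the example numbers are all hypothetical.

```python
def ribosomal_coverage_value(footprint_signal: float, transcript_signal: float) -> float:
    """Ratio of ribosomal-footprint (RIBOseq) signal to transcript (RNAseq) signal.

    Assumption: both inputs are normalized read densities for the same transcript.
    """
    if transcript_signal <= 0:
        raise ValueError("transcript signal must be positive to compute an RCV")
    return footprint_signal / transcript_signal


def classify_transcript(rcv: float, threshold: float) -> str:
    """Label a transcript by its RCV; the threshold is a hypothetical, user-chosen cutoff."""
    return "putatively translated" if rcv >= threshold else "putatively non-translated"


# Made-up numbers for illustration only (not data from the study).
example_rcv = ribosomal_coverage_value(footprint_signal=12.0, transcript_signal=48.0)
example_call = classify_transcript(example_rcv, threshold=0.5)
```

In practice a cutoff would presumably be calibrated against annotated protein-coding genes and known ncRNAs rather than fixed in advance.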
Affiliation(s)
- Klaus Neuhaus
  - Lehrstuhl für Mikrobielle Ökologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, D-85354, Freising, Germany
  - Core Facility Microbiome/NGS, ZIEL Institute for Food & Health, Weihenstephaner Berg 3, D-85354, Freising, Germany
- Richard Landstorfer
  - Lehrstuhl für Mikrobielle Ökologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, D-85354, Freising, Germany
- Svenja Simon
  - Informatik und Informationswissenschaft, Universität Konstanz, D-78457, Konstanz, Germany
- Steffen Schober
  - Institut für Nachrichtentechnik, Universität Ulm, Albert-Einstein-Allee 43, D-89081, Ulm, Germany
- Patrick R Wright
  - Bioinformatics Group, Department of Computer Science and BIOSS Centre for Biological Signaling Studies, Cluster of Excellence, University of Freiburg, D-79110, Freiburg, Germany
- Cameron Smith
  - Bioinformatics Group, Department of Computer Science and BIOSS Centre for Biological Signaling Studies, Cluster of Excellence, University of Freiburg, D-79110, Freiburg, Germany
- Rolf Backofen
  - Bioinformatics Group, Department of Computer Science and BIOSS Centre for Biological Signaling Studies, Cluster of Excellence, University of Freiburg, D-79110, Freiburg, Germany
- Romy Wecko
  - Lehrstuhl für Mikrobielle Ökologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, D-85354, Freising, Germany
- Daniel A Keim
  - Informatik und Informationswissenschaft, Universität Konstanz, D-78457, Konstanz, Germany
- Siegfried Scherer
  - Lehrstuhl für Mikrobielle Ökologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, D-85354, Freising, Germany
21
Aserse AA, Woyke T, Kyrpides NC, Whitman WB, Lindström K. Draft genome sequence of type strain HBR26T and description of Rhizobium aethiopicum sp. nov. Stand Genomic Sci 2017; 12:14. [PMID: 28163823 PMCID: PMC5278577 DOI: 10.1186/s40793-017-0220-z] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 08/05/2016] [Accepted: 12/24/2016] [Indexed: 11/29/2022]
Abstract
Rhizobium aethiopicum sp. nov. is a newly proposed species within the genus Rhizobium. This species includes six rhizobial strains, which were isolated from root nodules of the legume plant Phaseolus vulgaris growing in soils of Ethiopia. The species fixes nitrogen effectively in symbiosis with the host plant P. vulgaris, and is composed of aerobic, Gram-negative staining, rod-shaped bacteria. The genome of type strain HBR26T of R. aethiopicum sp. nov. was one of the rhizobial genomes sequenced as a part of the DOE JGI 2014 Genomic Encyclopedia project designed for soil and plant-associated and newly described type strains. The genome sequence is arranged in 62 scaffolds and is 6,557,588 bp long, with a 61% G + C content, 6221 protein-coding genes and 86 RNA genes. The genome of HBR26T contains repABC genes (plasmid replication genes) homologous to the genes found in five different Rhizobium etli CFN42T plasmids, suggesting that HBR26T may have five additional replicons other than the chromosome. In the genome of HBR26T, the nodulation genes nodB, nodC, nodS, nodI, nodJ and nodD are located in the same module, and organized in a similar way as nod genes found in the genome of other known common bean-nodulating rhizobial species. The nodA gene is found in a different scaffold, but it is also very similar to nodA genes of other bean-nodulating rhizobial strains. Though HBR26T is distinct on the phylogenetic tree and based on ANI analysis (the highest value 90.2% ANI with CFN42T) from other bean-nodulating species, these nod genes and most nitrogen-fixing genes found in the genome of HBR26T share high identity with the corresponding genes of known bean-nodulating rhizobial species (96–100% identity). This suggests that symbiotic genes might be shared between bean-nodulating rhizobia through horizontal gene transfer. R. aethiopicum sp. nov. was grouped into the genus Rhizobium but was distinct from all recognized species of that genus by phylogenetic analyses of combined sequences of the housekeeping genes recA and glnII. The closest reference type strains for HBR26T were R. etli CFN42T (94% similarity of the combined recA and glnII sequences) and Rhizobium bangladeshense BLR175T (93%). Genomic ANI calculation based on protein-coding genes also revealed that the closest reference strains were R. bangladeshense BLR175T and R. etli CFN42T with ANI values 91.8 and 90.2%, respectively. Nevertheless, the ANI values between HBR26T and BLR175T or CFN42T are far lower than the cutoff value of ANI (≥ 96%) between strains in the same species, confirming that HBR26T belongs to a novel species. Thus, on the basis of phylogenetic, comparative genomic analyses and ANI results, we formally propose the creation of R. aethiopicum sp. nov. with strain HBR26T (=HAMBI 3550T=LMG 29711T) as the type strain. The genome assembly and annotation data are deposited in the DOE JGI portal and are also available at the European Nucleotide Archive under accession numbers FMAJ01000001-FMAJ01000062.
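The species-delineation argument in this abstract comes down to comparing each ANI value against the stated same-species cutoff. The sketch below restates that check using only the two ANI values and the 96% cutoff quoted in the abstract; the function and variable names are hypothetical, and it is not the authors' analysis code.

```python
# Same-species ANI cutoff (percent) as quoted in the abstract.
SPECIES_ANI_CUTOFF = 96.0


def same_species(ani_percent: float, cutoff: float = SPECIES_ANI_CUTOFF) -> bool:
    """Return True when an ANI value meets or exceeds the same-species cutoff."""
    return ani_percent >= cutoff


# ANI of HBR26T against its two closest reference type strains, as reported in the abstract.
ani_to_hbr26t = {
    "R. bangladeshense BLR175T": 91.8,
    "R. etli CFN42T": 90.2,
}

# HBR26T is treated as a candidate novel species only if no reference reaches the cutoff.
is_candidate_novel_species = not any(same_species(v) for v in ani_to_hbr26t.values())
```

Both reported values fall below the cutoff, which is the numerical basis the abstract gives for proposing a new species.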
Affiliation(s)
- Aregu Amsalu Aserse
- Department of Environmental Sciences, University of Helsinki, Viikinkaari 2a, Helsinki, Finland
| | - Tanja Woyke
- DOE Joint Genome Institute, Walnut Creek, USA
| | | | - William B Whitman
- Department of Microbiology, University of Georgia, Biological Sciences Building, Athens, USA
| | - Kristina Lindström
- Department of Environmental Sciences, University of Helsinki, Viikinkaari 2a, Helsinki, Finland
|
22
|
Kim J, Park C, Imlay JA, Park W. Lineage-specific SoxR-mediated Regulation of an Endoribonuclease Protects Non-enteric Bacteria from Redox-active Compounds. J Biol Chem 2017; 292:121-133. [PMID: 27895125 PMCID: PMC5217672 DOI: 10.1074/jbc.m116.757500] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 09/07/2016] [Revised: 11/17/2016] [Indexed: 11/06/2022] Open
Abstract
Bacteria use redox-sensitive transcription factors to coordinate responses to redox stress. The [2Fe-2S] cluster-containing transcription factor SoxR is particularly tuned to protect cells against redox-active compounds (RACs). In enteric bacteria, SoxR is paired with a second transcription factor, SoxS, that activates downstream effectors. However, SoxS is absent in non-enteric bacteria, raising questions as to how SoxR functions. Here, we first show that SoxR of Acinetobacter oleivorans displayed similar activation profiles in response to RACs as did its homolog from Escherichia coli but controlled a different set of target genes, including sinE, which encodes an endoribonuclease. Expression, gel mobility shift, and mutational analyses indicated that sinE is a direct target of SoxR. Redox potentials and permeability of RACs determined optimal sinE induction. Bioinformatics suggested that only a few γ- and β-proteobacteria might have SoxR-regulated sinE. Purified SinE, in the presence of Mg2+ ions, degrades rRNAs, thus inhibiting protein synthesis. Similarly, pretreatment of cells with RACs demonstrated a role for SinE in promoting persistence in the presence of antibiotics that inhibit protein synthesis. Our data improve our understanding of the physiology of soil microorganisms by suggesting that both non-enteric SoxR and its target SinE play protective roles in the presence of RACs and antibiotics.
Affiliation(s)
- Jisun Kim
- Laboratory of Molecular Environmental Microbiology, Department of Environmental Science and Ecological Engineering, Korea University, Seoul 02841, Korea
| | - Chulwoo Park
- Laboratory of Molecular Environmental Microbiology, Department of Environmental Science and Ecological Engineering, Korea University, Seoul 02841, Korea
| | - James A Imlay
- Department of Microbiology, University of Illinois, Urbana, Illinois 61801
| | - Woojun Park
- Laboratory of Molecular Environmental Microbiology, Department of Environmental Science and Ecological Engineering, Korea University, Seoul 02841, Korea
|
23
|
From next-generation resequencing reads to a high-quality variant data set. Heredity (Edinb) 2016; 118:111-124. [PMID: 27759079 DOI: 10.1038/hdy.2016.102] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Received: 04/24/2016] [Revised: 09/03/2016] [Accepted: 09/06/2016] [Indexed: 12/11/2022] Open
Abstract
Sequencing has revolutionized biology by permitting the analysis of genomic variation at an unprecedented resolution. High-throughput sequencing is fast and inexpensive, making it accessible for a wide range of research topics. However, the produced data contain subtle but complex types of errors, biases and uncertainties that impose several statistical and computational challenges to the reliable detection of variants. To tap the full potential of high-throughput sequencing, a thorough understanding of the data produced as well as the available methodologies is required. Here, I review several commonly used methods for generating and processing next-generation resequencing data, discuss the influence of errors and biases together with their resulting implications for downstream analyses and provide general guidelines and recommendations for producing high-quality single-nucleotide polymorphism data sets from raw reads by highlighting several sophisticated reference-based methods representing the current state of the art.
|
24
|
Sahlin K, Frånberg M, Arvestad L. Structural Variation Detection with Read Pair Information: An Improved Null Hypothesis Reduces Bias. J Comput Biol 2016; 24:581-589. [PMID: 27681236 DOI: 10.1089/cmb.2016.0124] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/12/2022] Open
Abstract
Reads from paired-end and mate-pair libraries are often utilized to find structural variation in genomes, and one common approach is to use their fragment length for detection. After aligning read pairs to the reference, read pair distances are analyzed for statistically significant deviations. However, previously proposed methods are based on a simplified model of observed fragment lengths that does not agree with data. We show how this model limits statistical analysis of identifying variants and propose a new model by adapting a model we have previously introduced for contig scaffolding, which agrees with data. From this model, we derive an improved null hypothesis that when applied in the variant caller CLEVER, reduces the number of false positives and corrects a bias that contributes to more deletion calls than insertion calls. We advise developers of variant callers with statistical fragment length-based methods to adapt the concepts in our proposed model and null hypothesis.
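For orientation, fragment-length-based detection can be sketched as a simple z-test on the mapped spans of read pairs covering a locus. This is an illustration of the simplified model that the abstract criticizes, not the improved null hypothesis proposed by the authors and not the method implemented in CLEVER. The function names, the threshold of 3.0 and the library parameters are hypothetical.

```python
import math
from statistics import mean


def fragment_length_z(spans, lib_mean, lib_sd):
    """z-score of the mean mapped span of read pairs against the library distribution.

    Simplified model: every observed span is treated as an independent draw
    from the library fragment-length distribution (mean lib_mean, sd lib_sd).
    """
    n = len(spans)
    if n == 0:
        raise ValueError("need at least one read pair spanning the locus")
    standard_error = lib_sd / math.sqrt(n)
    return (mean(spans) - lib_mean) / standard_error


def call_variant(z, threshold=3.0):
    """Pairs mapping farther apart than expected suggest a deletion in the sample;
    pairs mapping closer together suggest an insertion."""
    if z > threshold:
        return "deletion"
    if z < -threshold:
        return "insertion"
    return "none"


# Hypothetical library: mean fragment length 400 bp, standard deviation 50 bp.
z_stretched = fragment_length_z([500, 500, 500, 500], lib_mean=400, lib_sd=50)
z_normal = fragment_length_z([400, 410, 390, 400], lib_mean=400, lib_sd=50)
```

Under this naive model, `call_variant(z_stretched)` reports a deletion and `call_variant(z_normal)` reports no variant. The abstract's point is that real spanning pairs do not follow the plain library distribution, so a test of this form is biased.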
Affiliation(s)
- Kristoffer Sahlin
- Science for Life Laboratory, School of Computer Science and Communication, KTH Royal Institute of Technology, Stockholm, Sweden
| | - Mattias Frånberg
- Cardiovascular Unit, Department of Medicine, Karolinska Institutet, Stockholm, Sweden
| | - Lars Arvestad
- Swedish e-Science Research Centre (SeRC) and Department of Numerical Analysis and Computer Science, Stockholm University, Solna, Sweden
|
25
|
Suárez-Esquivel M, Ruiz-Villalobos N, Castillo-Zeledón A, Jiménez-Rojas C, Roop II RM, Comerci DJ, Barquero-Calvo E, Chacón-Díaz C, Caswell CC, Baker KS, Chaves-Olarte E, Thomson NR, Moreno E, Letesson JJ, De Bolle X, Guzmán-Verri C. Brucella abortus Strain 2308 Wisconsin Genome: Importance of the Definition of Reference Strains. Front Microbiol 2016; 7:1557. [PMID: 27746773 PMCID: PMC5041503 DOI: 10.3389/fmicb.2016.01557] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 07/24/2016] [Accepted: 09/16/2016] [Indexed: 12/25/2022] Open
Abstract
Brucellosis is a bacterial infectious disease affecting a wide range of mammals and a neglected zoonosis caused by species of the genetically homogeneous genus Brucella. As in most studies on bacterial diseases, research in brucellosis is carried out by using reference strains as canonical models to understand the mechanisms underlying host-pathogen interactions. We performed whole genome sequencing analysis of the reference strain B. abortus 2308 routinely used in our laboratory, including a manually curated annotation accessible as an editable version through a link at https://en.wikipedia.org/wiki/Brucella#Genomics. Comparison of this genome with two publicly available 2308 genomes showed significant differences, particularly indels related to insertional elements, suggesting variability related to the transposition of these elements within the same strain. Considering the outcome of high resolution genomic techniques in the bacteriology field, the conventional concept of strain definition needs to be revised.
Affiliation(s)
- Marcela Suárez-Esquivel
- Programa de Investigación en Enfermedades Tropicales, Escuela de Medicina Veterinaria, Universidad Nacional de Costa Rica Heredia, Costa Rica
| | - Nazareth Ruiz-Villalobos
- Programa de Investigación en Enfermedades Tropicales, Escuela de Medicina Veterinaria, Universidad Nacional de Costa Rica Heredia, Costa Rica
| | - Amanda Castillo-Zeledón
- Programa de Investigación en Enfermedades Tropicales, Escuela de Medicina Veterinaria, Universidad Nacional de Costa Rica Heredia, Costa Rica
| | - César Jiménez-Rojas
- Programa de Investigación en Enfermedades Tropicales, Escuela de Medicina Veterinaria, Universidad Nacional de Costa Rica Heredia, Costa Rica
| | - R Martin Roop II
- Department of Microbiology and Immunology, Brody School of Medicine, East Carolina University Greenville, NC, USA
| | - Diego J Comerci
- Instituto de Investigaciones Biotecnológicas "Dr. Rodolfo A. Ugalde", Instituto Tecnológico de Chascomús, Universidad Nacional de San Martín, Consejo Nacional de Investigaciones Científicas y Técnicas, Comisión Nacional de Energía Atómica, Grupo Pecuario, Centro Atómico Ezeiza Buenos Aires, Argentina
| | - Elías Barquero-Calvo
- Centro de Investigación en Enfermedades Tropicales, Facultad de Microbiología, Universidad de Costa Rica San José, Costa Rica
| | - Carlos Chacón-Díaz
- Programa de Investigación en Enfermedades Tropicales, Escuela de Medicina Veterinaria, Universidad Nacional de Costa Rica, Heredia, Costa Rica; Centro de Investigación en Enfermedades Tropicales, Facultad de Microbiología, Universidad de Costa Rica, San José, Costa Rica
| | - Clayton C Caswell
- Center for Molecular Medicine and Infectious Diseases, Department of Biomedical Sciences and Pathobiology, Virginia-Maryland College of Veterinary Medicine, Virginia Tech Blacksburg, VA, USA
| | - Kate S Baker
- Wellcome Trust Sanger Institute, Hinxton, UK; Department of Functional and Comparative Genomics, Institute of Integrative Biology, University of Liverpool, Liverpool, UK
| | - Esteban Chaves-Olarte
- Centro de Investigación en Enfermedades Tropicales, Facultad de Microbiología, Universidad de Costa Rica San José, Costa Rica
| | - Nicholas R Thomson
- Wellcome Trust Sanger Institute, Hinxton, UK; The London School of Hygiene and Tropical Medicine, London, UK
| | - Edgardo Moreno
- Programa de Investigación en Enfermedades Tropicales, Escuela de Medicina Veterinaria, Universidad Nacional de Costa Rica, Heredia, Costa Rica; Instituto Clodomiro Picado, Universidad de Costa Rica, San José, Costa Rica
| | - Jean J Letesson
- Unité de Recherche en Biologie des Microorganismes, Université de Namur Namur Belgium
| | - Xavier De Bolle
- Unité de Recherche en Biologie des Microorganismes, Université de Namur Namur Belgium
| | - Caterina Guzmán-Verri
- Programa de Investigación en Enfermedades Tropicales, Escuela de Medicina Veterinaria, Universidad Nacional de Costa Rica, Heredia, Costa Rica; Centro de Investigación en Enfermedades Tropicales, Facultad de Microbiología, Universidad de Costa Rica, San José, Costa Rica
|
26
|
Graf FE, Ludin P, Arquint C, Schmidt RS, Schaub N, Kunz Renggli C, Munday JC, Krezdorn J, Baker N, Horn D, Balmer O, Caccone A, de Koning HP, Mäser P. Comparative genomics of drug resistance in Trypanosoma brucei rhodesiense. Cell Mol Life Sci 2016; 73:3387-400. [PMID: 26973180 PMCID: PMC4967103 DOI: 10.1007/s00018-016-2173-6] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 06/30/2015] [Accepted: 03/01/2016] [Indexed: 12/02/2022]
Abstract
Trypanosoma brucei rhodesiense is one of the causative agents of human sleeping sickness, a fatal disease that is transmitted by tsetse flies and restricted to Sub-Saharan Africa. Here we investigate two independent lines of T. b. rhodesiense that have been selected with the drugs melarsoprol and pentamidine over the course of 2 years, until they exhibited stable cross-resistance to an unprecedented degree. We apply comparative genomics and transcriptomics to identify the underlying mutations. Only a few mutations became fixed during selection. Three genes were affected by mutations in both lines: the aminopurine transporter AT1, the aquaporin AQP2, and the RNA-binding protein UBP1. The melarsoprol-selected line carried a large deletion including the adenosine transporter gene AT1, whereas the pentamidine-selected line carried a heterozygous point mutation in AT1, G430R, which rendered the transporter non-functional. Both resistant lines had lost AQP2, and both lines carried the same point mutation, R131L, in the RNA-binding motif of UBP1. The finding that concomitant deletion of the known resistance genes AT1 and AQP2 in T. b. brucei failed to phenocopy the high levels of resistance of the T. b. rhodesiense mutants indicated a possible role of UBP1 in melarsoprol-pentamidine cross-resistance. However, homozygous in situ expression of UBP1-Leu(131) in T. b. brucei did not affect the sensitivity to melarsoprol or pentamidine.
Affiliation(s)
- Fabrice E Graf
- Swiss Tropical and Public Health Institute, Socinstrasse 57, 4051, Basel, Switzerland
- University of Basel, 4000, Basel, Switzerland
| | - Philipp Ludin
- Swiss Tropical and Public Health Institute, Socinstrasse 57, 4051, Basel, Switzerland
- University of Basel, 4000, Basel, Switzerland
| | - Christian Arquint
- Swiss Tropical and Public Health Institute, Socinstrasse 57, 4051, Basel, Switzerland
- University of Basel, 4000, Basel, Switzerland
| | - Remo S Schmidt
- Swiss Tropical and Public Health Institute, Socinstrasse 57, 4051, Basel, Switzerland
- University of Basel, 4000, Basel, Switzerland
| | - Nadia Schaub
- Swiss Tropical and Public Health Institute, Socinstrasse 57, 4051, Basel, Switzerland
- University of Basel, 4000, Basel, Switzerland
| | - Christina Kunz Renggli
- Swiss Tropical and Public Health Institute, Socinstrasse 57, 4051, Basel, Switzerland
- University of Basel, 4000, Basel, Switzerland
| | - Jane C Munday
- Institute of Infection, Immunity and Inflammation, University of Glasgow, Glasgow, G12 8TA, UK
| | - Jessica Krezdorn
- Institute of Infection, Immunity and Inflammation, University of Glasgow, Glasgow, G12 8TA, UK
| | - Nicola Baker
- Biological Chemistry and Drug Discovery, School of Life Sciences, University of Dundee, Dow Street, Dundee, DD1 5EH, UK
- The University of Kent, Canterbury, Kent, CT2 7NZ, UK
| | - David Horn
- Biological Chemistry and Drug Discovery, School of Life Sciences, University of Dundee, Dow Street, Dundee, DD1 5EH, UK
| | - Oliver Balmer
- Swiss Tropical and Public Health Institute, Socinstrasse 57, 4051, Basel, Switzerland
- University of Basel, 4000, Basel, Switzerland
| | - Adalgisa Caccone
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA
| | - Harry P de Koning
- Institute of Infection, Immunity and Inflammation, University of Glasgow, Glasgow, G12 8TA, UK
| | - Pascal Mäser
- Swiss Tropical and Public Health Institute, Socinstrasse 57, 4051, Basel, Switzerland.
- University of Basel, 4000, Basel, Switzerland.
|
27
|
Balsingh J, Radhakrishna S, Ulaganathan K. Draft Genome Sequence of Bacillus pumilus ku-bf1 Isolated from the Gut Contents of Wood Boring Mesomorphus sp. Front Microbiol 2016; 7:1037. [PMID: 27446065 PMCID: PMC4927586 DOI: 10.3389/fmicb.2016.01037] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/05/2016] [Accepted: 06/20/2016] [Indexed: 11/16/2022] Open
Affiliation(s)
- Jatoth Balsingh
- Center for Plant Molecular Biology, Osmania University Hyderabad, India
|
28
|
Neuhaus K, Landstorfer R, Fellner L, Simon S, Schafferhans A, Goldberg T, Marx H, Ozoline ON, Rost B, Kuster B, Keim DA, Scherer S. Translatomics combined with transcriptomics and proteomics reveals novel functional, recently evolved orphan genes in Escherichia coli O157:H7 (EHEC). BMC Genomics 2016; 17:133. [PMID: 26911138 PMCID: PMC4765031 DOI: 10.1186/s12864-016-2456-1] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 05/27/2015] [Accepted: 02/09/2016] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND Genomes of E. coli, including that of the human pathogen Escherichia coli O157:H7 (EHEC) EDL933, still harbor undetected protein-coding genes which, apparently, have escaped annotation due to their small size and non-essential function. To find such genes, global gene expression of EHEC EDL933 was examined, using strand-specific RNAseq (transcriptome), ribosomal footprinting (translatome) and mass spectrometry (proteome). RESULTS Using the above methods, 72 short, non-annotated protein-coding genes were detected. All of these showed signals in the ribosomal footprinting assay indicating mRNA translation. Seven were verified by mass spectrometry. Fifty-seven genes are annotated in other Enterobacteriaceae, mainly as hypothetical genes; the remaining 15 genes constitute novel discoveries. In addition, protein structure and function were predicted computationally and compared between EHEC-encoded proteins and 100-times randomly shuffled proteins. Based on this comparison, 61 of the 72 novel proteins exhibit predicted structural and functional features similar to those of annotated proteins. Many of the novel genes show differential transcription when grown under eleven diverse growth conditions, suggesting environmental regulation. Three genes were found to confer a phenotype in previous studies, e.g., decreased cattle colonization. CONCLUSIONS These findings demonstrate that ribosomal footprinting can be used to detect novel protein-coding genes, contributing to the growing body of evidence that hypothetical genes are not annotation artifacts and opening an additional way to study their functionality. All 72 genes are taxonomically restricted and, therefore, appear to have evolved relatively recently de novo. ELECTRONIC SUPPLEMENTARY MATERIAL The online version of this article (doi:10.1186/s12864-016-2456-1) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Klaus Neuhaus
- Lehrstuhl für Mikrobielle Ökologie, Zentralinstitut für Ernährungs- und Lebensmittelforschung, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany.
| | - Richard Landstorfer
- Lehrstuhl für Mikrobielle Ökologie, Zentralinstitut für Ernährungs- und Lebensmittelforschung, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany.
| | - Lea Fellner
- Lehrstuhl für Mikrobielle Ökologie, Zentralinstitut für Ernährungs- und Lebensmittelforschung, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany.
| | - Svenja Simon
- Lehrstuhl für Datenanalyse und Visualisierung, Fachbereich Informatik und Informationswissenschaft, Universität Konstanz, Box 78, 78457, Konstanz, Germany.
| | - Andrea Schafferhans
- Department of Informatics - Bioinformatics & TUM-IAS, Technische Universität München, Boltzmannstraße 3, 85748, Garching, Germany.
| | - Tatyana Goldberg
- Department of Informatics - Bioinformatics & TUM-IAS, Technische Universität München, Boltzmannstraße 3, 85748, Garching, Germany.
| | - Harald Marx
- Chair of Proteomics and Bioanalytics, Wissenschaftszentrum Weihenstephan, Technische Universität München, Emil-Erlenmeyer-Forum 5, 85354, Freising, Germany.
| | - Olga N Ozoline
- Institute of Cell Biophysics, Russian Academy of Sciences, Moscow Region, 142290, Pushchino, Russia.
| | - Burkhard Rost
- Department of Informatics - Bioinformatics & TUM-IAS, Technische Universität München, Boltzmannstraße 3, 85748, Garching, Germany.
| | - Bernhard Kuster
- Chair of Proteomics and Bioanalytics, Wissenschaftszentrum Weihenstephan, Technische Universität München, Emil-Erlenmeyer-Forum 5, 85354, Freising, Germany; Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), Technische Universität München, Gregor-Mendel-Str. 4, 85354, Freising, Germany.
| | - Daniel A Keim
- Lehrstuhl für Datenanalyse und Visualisierung, Fachbereich Informatik und Informationswissenschaft, Universität Konstanz, Box 78, 78457, Konstanz, Germany.
| | - Siegfried Scherer
- Lehrstuhl für Mikrobielle Ökologie, Zentralinstitut für Ernährungs- und Lebensmittelforschung, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany.
|
29
|
Thangam M, Gopal RK. CRCDA--Comprehensive resources for cancer NGS data analysis. Database: The Journal of Biological Databases and Curation 2015; 2015:bav092. [PMID: 26450948 PMCID: PMC4597977 DOI: 10.1093/database/bav092] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 02/27/2015] [Accepted: 08/31/2015] [Indexed: 12/24/2022]
Abstract
Next generation sequencing (NGS) innovations have set a landmark in life science and changed the direction of research in clinical oncology through their capacity to help diagnose and treat cancer. The aim of our portal, comprehensive resources for cancer NGS data analysis (CRCDA), is to provide a collection of NGS tools and pipelines, grouped into classes, together with cancer pathways, databases and literature information from PubMed. The literature data were restricted to the 18 most common cancer types worldwide, such as breast cancer and colon cancer. For convenience, NGS-cancer tools have been categorized into cancer genomics, cancer transcriptomics, cancer epigenomics, quality control and visualization. Pipelines for variant detection, quality control and data analysis are listed to provide out-of-the-box solutions for NGS data analysis, which may help researchers to overcome challenges in selecting and configuring individual tools for analysing exome, whole genome and transcriptome data. An extensive search page was developed that can be queried by (i) type of data [literature, gene data and sequence read archive (SRA) data] and (ii) type of cancer (selected based on global incidence and accessibility of data). For each category of analysis, a variety of tools are available, and the biggest challenge is finding and using the right tool for the right application. The objective of the work is to collect the tools in each category that are available at various places and to arrange the tools and other data in a simple and user-friendly manner, so that biologists and oncologists can find information more easily. To the best of our knowledge, we have collected and presented a comprehensive package of most of the resources available for NGS data analysis in cancer. Given these factors, we believe that this website will be a useful resource for the NGS research community working on cancer. 
Database URL: http://bioinfo.au-kbc.org.in/ngs/ngshome.html.
Affiliation(s)
- Manonanthini Thangam
- AU-KBC Research Centre, MIT Campus of Anna University, Chromepet, Chennai, India
| | - Ramesh Kumar Gopal
- AU-KBC Research Centre, MIT Campus of Anna University, Chromepet, Chennai, India
|
30
|
Finney RP, Chen QR, Nguyen CV, Hsu CH, Yan C, Hu Y, Abawi M, Bian X, Meerzaman DM. Alview: Portable Software for Viewing Sequence Reads in BAM Formatted Files. Cancer Inform 2015; 14:105-7. [PMID: 26417198 PMCID: PMC4573065 DOI: 10.4137/cin.s26470] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 03/24/2015] [Revised: 05/24/2015] [Accepted: 05/26/2015] [Indexed: 11/05/2022] Open
Abstract
The name Alview is a contraction of the term Alignment Viewer. Alview is a software tool, compiled to native architecture, for visualizing the alignment of sequencing data. Inputs are files of short-read sequences aligned to a reference genome in the SAM/BAM format and files containing reference genome data. Outputs are visualizations of these aligned short reads. Alview is written in portable C with optional graphical user interface (GUI) code written in C, C++, and Objective-C. The application can run in three different ways: as a web server, as a command line tool, or as a native GUI program. Alview is compatible with Microsoft Windows, Linux, and Apple OS X. It is available as a web demo at https://cgwb.nci.nih.gov/cgi-bin/alview. The source code and Windows/Mac/Linux executables are available via https://github.com/NCIP/alview.
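To make the input format concrete, the following minimal sketch parses one text-format SAM alignment record and derives the reference interval that a viewer would need to draw the read. It is not Alview's code (Alview is written in C), and the `Alignment` class and `parse_sam_line` function are hypothetical names. Only the mandatory SAM columns and the standard CIGAR operations are assumed.

```python
import re
from dataclasses import dataclass

# One CIGAR element: a length followed by an operation code.
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference sequence.
REFERENCE_CONSUMING = set("MDN=X")


@dataclass
class Alignment:
    read_name: str
    flag: int
    ref_name: str
    start: int      # 1-based leftmost mapped position (SAM POS column)
    mapq: int
    cigar: str
    sequence: str

    @property
    def is_reverse(self):
        """True when the read aligns to the reverse strand (flag bit 0x10)."""
        return bool(self.flag & 0x10)

    @property
    def end(self):
        """1-based inclusive end on the reference; meaningful for mapped reads only."""
        span = sum(int(length)
                   for length, op in CIGAR_ELEMENT.findall(self.cigar)
                   if op in REFERENCE_CONSUMING)
        return self.start + span - 1


def parse_sam_line(line):
    """Parse the mandatory columns of one SAM alignment line (not a header line)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line has at least 11 tab-separated fields")
    return Alignment(fields[0], int(fields[1]), fields[2], int(fields[3]),
                     int(fields[4]), fields[5], fields[9])


# Hypothetical record: 11-base read, reverse strand, with a 2-base deletion and a 1-base insertion.
record = parse_sam_line("r1\t16\tchr1\t100\t60\t5M2D3M1I2M\t*\t0\t0\tACGTACGTACG\t*")
```

Here the read covers reference positions 100 to 111, because the deletion consumes reference bases while the insertion does not. BAM files are the compressed binary form of the same records and would need a dedicated reader.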
Affiliation(s)
- Richard P Finney
- Computational Genomics Research Group, Center for Bioinformatics and Information Technology, National Cancer Institute, Bethesda, MD, USA
| | - Qing-Rong Chen
- Computational Genomics Research Group, Center for Bioinformatics and Information Technology, National Cancer Institute, Bethesda, MD, USA
| | - Cu V Nguyen
- Computational Genomics Research Group, Center for Bioinformatics and Information Technology, National Cancer Institute, Bethesda, MD, USA
| | - Chih Hao Hsu
- Computational Genomics Research Group, Center for Bioinformatics and Information Technology, National Cancer Institute, Bethesda, MD, USA
| | - Chunhua Yan
- Computational Genomics Research Group, Center for Bioinformatics and Information Technology, National Cancer Institute, Bethesda, MD, USA
| | - Ying Hu
- Computational Genomics Research Group, Center for Bioinformatics and Information Technology, National Cancer Institute, Bethesda, MD, USA
| | - Massih Abawi
- Computational Genomics Research Group, Center for Bioinformatics and Information Technology, National Cancer Institute, Bethesda, MD, USA
| | - Xiaopeng Bian
- Computational Genomics Research Group, Center for Bioinformatics and Information Technology, National Cancer Institute, Bethesda, MD, USA
| | - Daoud M Meerzaman
- Computational Genomics Research Group, Center for Bioinformatics and Information Technology, National Cancer Institute, Bethesda, MD, USA
|
31
|
Huang Z, Gallot A, Lao NT, Puechmaille SJ, Foley NM, Jebb D, Bekaert M, Teeling EC. A nonlethal sampling method to obtain, generate and assemble whole blood transcriptomes from small, wild mammals. Mol Ecol Resour 2015; 16:150-62. [PMID: 26186236 DOI: 10.1111/1755-0998.12447] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 04/21/2015] [Revised: 07/08/2015] [Accepted: 07/13/2015] [Indexed: 12/01/2022]
Abstract
The acquisition of tissue samples from wild populations is a constant challenge in conservation biology, especially for endangered species and protected species where nonlethal sampling is the only option. Whole blood has been suggested as a nonlethal sample type that contains a high percentage of bodywide and genomewide transcripts and therefore can be used to assess the transcriptional status of an individual, and to infer a high percentage of the genome. However, only limited quantities of blood can be nonlethally sampled from small species, and it is not known whether enough genetic material is contained in only a few drops of blood, which represent the upper limit of sample collection for some small species. In this study, we developed a nonlethal sampling method, the laboratory protocols and a bioinformatic pipeline to sequence and assemble the whole blood transcriptome, using Illumina RNA-Seq, from wild greater mouse-eared bats (Myotis myotis). For optimal results, both ribosomal and globin RNAs must be removed before library construction. DNase treatment is recommended but not required, enabling the use of smaller amounts of starting RNA. A large proportion of protein-coding genes (61%) in the genome were expressed in the blood transcriptome, comparable to brain (65%), kidney (63%) and liver (58%) transcriptomes, and up to 99% of the mitogenome (excluding D-loop) was recovered in the RNA-Seq data. In conclusion, this nonlethal blood sampling method provides an opportunity for a genomewide transcriptomic study of small, endangered or critically protected species, without sacrificing any individuals.
Affiliation(s)
- Zixia Huang
- UCD School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Aurore Gallot
- UCD School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland; Laboratoire de Biométrie et Biologie Évolutive, UMR 5558, Centre National de la Recherche Scientifique, Université Lyon 1, Lyon, France
| | - Nga T Lao
- UCD School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland; Molecular Biology Laboratory, National Institute for Cellular Biotechnology, Dublin City University, Glasnevin, Dublin 9, Ireland
| | - Sébastien J Puechmaille
- UCD School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland; Applied Zoology and Nature Conservation, Greifswald University, J.-S.-Bach-Str. 11/12, 17489, Greifswald, Germany
| | - Nicole M Foley
- UCD School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - David Jebb
- UCD School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Michaël Bekaert
- NSilico Lifescience Limited, Melbourn Building, CIT Campus, Bishopstown, Co., Cork, Ireland
| | - Emma C Teeling
- UCD School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
|
32
|
Woods NT, Jhuraney A, Monteiro ANA. Incorporating computational resources in a cancer research program. Hum Genet 2015; 134:467-78. [PMID: 25324189 PMCID: PMC4401625 DOI: 10.1007/s00439-014-1496-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/16/2014] [Accepted: 09/29/2014] [Indexed: 10/24/2022]
Abstract
Recent technological advances have transformed cancer genetics research. These advances have served as the basis for the generation of a number of richly annotated datasets relevant to the cancer geneticist. In addition, many of these technologies are now within reach of smaller laboratories to answer specific biological questions. Thus, one of the most pressing issues facing an experimental cancer biology research program in genetics is incorporating data from multiple sources to annotate, visualize, and analyze the system under study. Fortunately, there are several computational resources to aid in this process. However, a significant effort is required to adapt a molecular biology-based research program to take advantage of these datasets. Here, we discuss the lessons learned in our laboratory and share several recommendations to make this transition effective. This article is not meant to be a comprehensive evaluation of all the available resources, but rather to highlight those that we have incorporated into our laboratory and to explain how to choose the most appropriate ones for your research program.
Affiliation(s)
- Nicholas T Woods
- Cancer Epidemiology Program, H. Lee Moffitt Cancer Center and Research Institute, 12902 Magnolia Drive, Tampa, FL, 33612, USA
|
33
|
Bahassi EM, Stambrook PJ. Next-generation sequencing technologies: breaking the sound barrier of human genetics. Mutagenesis 2014; 29:303-10. [PMID: 25150023 PMCID: PMC7318892 DOI: 10.1093/mutage/geu031] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.9] [Indexed: 12/11/2022] Open
Abstract
Demand for new technologies that deliver fast, inexpensive and accurate genome information has never been greater. This challenge has catalysed the rapid development of advances in next-generation sequencing (NGS). The generation of large volumes of sequence data and the speed of data acquisition are the primary advantages over previous, more standard methods. In 2013, the Food and Drug Administration granted marketing authorisation for the first high-throughput NG sequencer, Illumina's MiSeqDx, which allowed the development and use of a large number of new genome-based tests. Here, we present a review of template preparation, nucleic acid sequencing and imaging, genome assembly and alignment approaches as well as recent advances in current and near-term commercially available NGS instruments. We also outline the broad range of applications for NGS technologies and provide guidelines for platform selection to best address biological questions of interest. DNA sequencing has revolutionised biological and medical research, and is poised to have a similar impact on the practice of medicine. This tool is but one of an increasing arsenal of developing tools that enhance our capabilities to identify, quantify and functionally characterise the components of biological networks that keep us healthy or make us sick. Despite advances in other 'omic' technologies, DNA sequencing and analysis, in many respects, have played the leading role to date. The new technologies provide a bridge between genotype and phenotype, both in man and model organisms, and have revolutionised how risk of developing a complex human disease may be assessed. The generation of large DNA sequence data sets is producing a wealth of medically relevant information on a large number of individuals and populations that will potentially form the basis of truly individualised medical care in the future.
Affiliation(s)
- El Mustapha Bahassi
- Department of Internal Medicine, Division of Hematology/Oncology, UC Brain Tumor Center, University of Cincinnati, 3125 Eden Avenue, Cincinnati, OH 45267-0508, USA, Department of Molecular Genetics, Biochemistry and Microbiology, University of Cincinnati, 3125 Eden Avenue, Cincinnati, OH 45267-0508, USA
- Peter J Stambrook
- Department of Molecular Genetics, Biochemistry and Microbiology, University of Cincinnati, 3125 Eden Avenue, Cincinnati, OH 45267-0508, USA
|
34
|
Landstorfer R, Simon S, Schober S, Keim D, Scherer S, Neuhaus K. Comparison of strand-specific transcriptomes of enterohemorrhagic Escherichia coli O157:H7 EDL933 (EHEC) under eleven different environmental conditions including radish sprouts and cattle feces. BMC Genomics 2014; 15:353. [PMID: 24885796 PMCID: PMC4048457 DOI: 10.1186/1471-2164-15-353] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.3] [Received: 08/14/2013] [Accepted: 03/31/2014] [Indexed: 12/26/2022] Open
Abstract
Background Multiple infection sources for enterohemorrhagic Escherichia coli O157:H7 (EHEC) are known, including animal products, fruit and vegetables. The ecology of this pathogen outside its human host is largely unknown and one third of its annotated genes are still hypothetical. To identify genetic determinants expressed under a variety of environmental factors, we applied strand-specific RNA-sequencing, comparing the SOLiD and Illumina systems. Results Transcriptomes of EHEC were sequenced under 11 different biotic and abiotic conditions: LB medium at pH4, pH7, pH9, or at 15°C; LB with nitrite or trimethoprim-sulfamethoxazole; LB-agar surface, M9 minimal medium, spinach leaf juice, surface of living radish sprouts, and cattle feces. Of 5379 annotated genes in strain EDL933 (genome and plasmid), a surprising minority of only 144 had null sequencing reads under all conditions. We therefore developed a statistical method to distinguish weakly transcribed genes from background transcription. We find that 96% of all genes and 91.5% of the hypothetical genes exhibit a significant transcriptional signal under at least one condition. Comparing SOLiD and Illumina systems, we find a high correlation between both approaches for fold-changes of the induced or repressed genes. The pathogenicity island LEE showed highest transcriptional activity in LB medium, minimal medium, and after treatment with antibiotics. Unique sets of genes, including many hypothetical genes, are highly up-regulated on radish sprouts, cattle feces, or in the presence of antibiotics. Furthermore, we observed induction of the shiga-toxin carrying phages by antibiotics and confirmed active biofilm related genes on radish sprouts, in cattle feces, and on agar plates. Conclusions Since only a minority of genes (2.7%) were not active under any condition tested (null reads), we suggest that the assumption of significant genome over-annotations is wrong. 
Environmental transcriptomics uncovered hitherto unknown gene functions and unique regulatory patterns in EHEC. For instance, the environmental function of azoR had been elusive, but this gene is highly active on radish sprouts. Thus, NGS-transcriptomics is an appropriate technique to propose new roles of hypothetical genes and to guide future research.
Affiliation(s)
- Klaus Neuhaus
- Lehrstuhl für Mikrobielle Ökologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, D-85350 Freising, Germany.
|
35
|
Karlsson OE, Belák S, Granberg F. The Effect of Preprocessing by Sequence-Independent, Single-Primer Amplification (SISPA) on Metagenomic Detection of Viruses. Biosecur Bioterror 2013; 11 Suppl 1:S227-34. [DOI: 10.1089/bsp.2013.0008] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Indexed: 01/21/2023]
|
36
|
Tsatsaronis JA, Hollands A, Cole JN, Maamary PG, Gillen CM, Ben Zakour NL, Kotb M, Nizet V, Beatson SA, Walker MJ, Sanderson-Smith ML. Streptococcal collagen-like protein A and general stress protein 24 are immunomodulating virulence factors of group A Streptococcus. FASEB J 2013; 27:2633-43. [PMID: 23531597 DOI: 10.1096/fj.12-226662] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 12/28/2022]
Abstract
In Western countries, invasive infections caused by M1T1 serotype group A Streptococcus (GAS) are epidemiologically linked to mutations in the control of virulence regulatory 2-component operon (covRS). In indigenous communities and developing countries, severe GAS disease is associated with genetically diverse non-M1T1 GAS serotypes. Hypervirulent M1T1 covRS mutant strains arise through selection by human polymorphonuclear cells for increased expression of GAS virulence factors such as the DNase Sda1, which promotes neutrophil resistance. The GAS bacteremia isolate NS88.2 (emm 98.1) is a covS mutant that exhibits a hypervirulent phenotype and neutrophil resistance yet lacks the phage-encoded Sda1. Here, we have employed a comprehensive systems biology (genomic, transcriptomic, and proteomic) approach to identify NS88.2 virulence determinants that enhance neutrophil resistance in the non-M1T1 GAS genetic background. Using this approach, we have identified streptococcal collagen-like protein A and general stress protein 24 proteins as NS88.2 determinants that contribute to survival in whole blood and neutrophil resistance in non-M1T1 GAS. This study has revealed new factors that contribute to GAS pathogenicity that may play important roles in resisting innate immune defenses and the development of human invasive infections.
Affiliation(s)
- James A Tsatsaronis
- Illawarra Health and Medical Research Institute, and School of Biological Sciences, University of Wollongong, Wollongong, NSW, 2522, Australia
|
37
|
Toffano-Nioche C, Nguyen AN, Kuchly C, Ott A, Gautheret D, Bouloc P, Jacq A. Transcriptomic profiling of the oyster pathogen Vibrio splendidus opens a window on the evolutionary dynamics of the small RNA repertoire in the Vibrio genus. RNA 2012; 18:2201-2219. [PMID: 23097430 PMCID: PMC3504672 DOI: 10.1261/rna.033324.112] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Received: 03/21/2012] [Accepted: 09/08/2012] [Indexed: 06/01/2023]
Abstract
Work in recent years has led to the recognition of the importance of small regulatory RNAs (sRNAs) in bacterial regulation networks. New high-throughput sequencing technologies are paving the way to the exploration of an expanding sRNA world in nonmodel bacteria. In the Vibrio genus, compared to the enterobacteriaceae, still a limited number of sRNAs have been characterized, mostly in Vibrio cholerae, where they have been shown to be important for virulence, as well as in Vibrio harveyi. In addition, genome-wide approaches in V. cholerae have led to the discovery of hundreds of potential new sRNAs. Vibrio splendidus is an oyster pathogen that has been recently associated with massive mortality episodes in the French oyster growing industry. Here, we report the first RNA-seq study in a Vibrio outside of the V. cholerae species. We have uncovered hundreds of candidate regulatory RNAs, be it cis-regulatory elements, antisense RNAs, and trans-encoded sRNAs. Conservation studies showed the majority of them to be specific to V. splendidus. However, several novel sRNAs, previously unidentified, are also present in V. cholerae. Finally, we identified 28 trans sRNAs that are conserved in all the Vibrio genus species for which a complete genome sequence is available, possibly forming a Vibrio "sRNA core."
Affiliation(s)
- Claire Toffano-Nioche
- Institut de Génétique et Microbiologie, CNRS/UMR 8621, IFR115, Université Paris-Sud, Bâtiment 400, 91405 Orsay Cedex, France
- An N. Nguyen
- Institut de Génétique et Microbiologie, CNRS/UMR 8621, IFR115, Université Paris-Sud, Bâtiment 400, 91405 Orsay Cedex, France
- Claire Kuchly
- Institut de Génétique et Microbiologie, CNRS/UMR 8621, IFR115, Université Paris-Sud, Bâtiment 400, 91405 Orsay Cedex, France
- Alban Ott
- Institut de Génétique et Microbiologie, CNRS/UMR 8621, IFR115, Université Paris-Sud, Bâtiment 400, 91405 Orsay Cedex, France
- Daniel Gautheret
- Institut de Génétique et Microbiologie, CNRS/UMR 8621, IFR115, Université Paris-Sud, Bâtiment 400, 91405 Orsay Cedex, France
- Philippe Bouloc
- Institut de Génétique et Microbiologie, CNRS/UMR 8621, IFR115, Université Paris-Sud, Bâtiment 400, 91405 Orsay Cedex, France
- Annick Jacq
- Institut de Génétique et Microbiologie, CNRS/UMR 8621, IFR115, Université Paris-Sud, Bâtiment 400, 91405 Orsay Cedex, France
|
38
|
Olson PD, Zarowiecki M, Kiss F, Brehm K. Cestode genomics - progress and prospects for advancing basic and applied aspects of flatworm biology. Parasite Immunol 2012; 34:130-50. [PMID: 21793855 DOI: 10.1111/j.1365-3024.2011.01319.x] [Citation(s) in RCA: 74] [Impact Index Per Article: 6.2] [Indexed: 12/16/2022]
Abstract
Characterization of the first tapeworm genome, Echinococcus multilocularis, is now nearly complete, and genome assemblies of E. granulosus, Taenia solium and Hymenolepis microstoma are in advanced draft versions. These initiatives herald the beginning of a genomic era in cestodology and underpin a diverse set of research agendas targeting both basic and applied aspects of tapeworm biology. We discuss the progress in the genomics of these species, provide insights into the presence and composition of immunologically relevant gene families, including the antigen B- and EG95/45W families, and discuss chemogenomic approaches toward the development of novel chemotherapeutics against cestode diseases. In addition, we discuss the evolution of tapeworm parasites and introduce the research programmes linked to genome initiatives that are aimed at understanding signalling systems involved in basic host-parasite interactions and morphogenesis.
Affiliation(s)
- P D Olson
- Department of Zoology, The Natural History Museum, London, UK
|
39
|
Thorvaldsdóttir H, Robinson JT, Mesirov JP. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform 2012; 14:178-92. [PMID: 22517427 PMCID: PMC3603213 DOI: 10.1093/bib/bbs017] [Citation(s) in RCA: 5857] [Impact Index Per Article: 488.1] [Indexed: 01/17/2023] Open
Abstract
Data visualization is an essential component of genomic data analysis. However, the size and diversity of the data sets produced by today’s sequencing and array-based profiling methods present major challenges to visualization tools. The Integrative Genomics Viewer (IGV) is a high-performance viewer that efficiently handles large heterogeneous data sets, while providing a smooth and intuitive user experience at all levels of genome resolution. A key characteristic of IGV is its focus on the integrative nature of genomic studies, with support for both array-based and next-generation sequencing data, and the integration of clinical and phenotypic data. Although IGV is often used to view genomic data from public sources, its primary emphasis is to support researchers who wish to visualize and explore their own data sets or those from colleagues. To that end, IGV supports flexible loading of local and remote data sets, and is optimized to provide high-performance data visualization and exploration on standard desktop systems. IGV is freely available for download from http://www.broadinstitute.org/igv, under a GNU LGPL open-source license.
|
40
|
Holroyd N, Sanchez-Flores A. Producing parasitic helminth reference and draft genomes at the Wellcome Trust Sanger Institute. Parasite Immunol 2012; 34:100-7. [DOI: 10.1111/j.1365-3024.2011.01311.x] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Indexed: 12/24/2022]
|
41
|
Gulledge AA, Roberts AD, Vora H, Patel K, Loraine AE. Mining Arabidopsis thaliana RNA-seq data with Integrated Genome Browser reveals stress-induced alternative splicing of the putative splicing regulator SR45a. Am J Bot 2012; 99:219-31. [PMID: 22291167 DOI: 10.3732/ajb.1100355] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Indexed: 05/24/2023]
Abstract
PREMISE OF THE STUDY High-throughput sequencing of cDNA libraries prepared from diverse samples (RNA-seq) can reveal genome-wide changes in alternative splicing. Using RNA-seq data to assess splicing at the level of individual genes requires the ability to visualize read alignments alongside genomic annotations. To meet this need, we added RNA-seq visualization capability to Integrated Genome Browser (IGB), a free desktop genome visualization tool. To illustrate this capability, we present an in-depth analysis of abiotic stresses and their effects on alternative splicing of SR45a (AT1G07350), a putative splicing regulator from Arabidopsis thaliana. METHODS cDNA libraries prepared from Arabidopsis plants that were subjected to heat and dehydration stresses were sequenced on an Illumina GAIIx sequencer, yielding more than 511 million high-quality 75-base, single-end sequence reads. Reads were aligned onto the reference genome and visualized in IGB. KEY RESULTS Using IGB, we confirmed exon-skipping alternative splicing in SR45a. Exon-skipped variant AT1G07350.1 encodes full-length SR45a protein with intact RS and RNA recognition motifs, while nonskipped variant AT1G07350.2 lacks the C-terminal RS region due to a frameshift in the alternative exon. Heat and drought stresses increased both transcript abundance and the proportion of exon-skipped transcripts encoding the full-length protein. We identified new splice sites and observed frequent intron retention flanking the alternative exon. CONCLUSIONS This study underlines the importance of visual inspection of RNA-seq alignments when investigating alternatively spliced genes. We showed that heat and dehydration stresses increase overall abundance of SR45a mRNA while also increasing production of transcripts encoding the full-length SR45a protein relative to other splice variants.
Affiliation(s)
- Alyssa A Gulledge
- Department of Bioinformatics and Genomics, North Carolina Research Campus, University of North Carolina at Charlotte, 600 Laureate Way, Kannapolis, North Carolina 28081, USA
|
42
|
Laczik M, Tukacs E, Uzonyi B, Domokos B, Doma Z, Kiss M, Horváth A, Batta Z, Maros-Szabó Z, Török Z. Geno viewer, a SAM/BAM viewer tool. Bioinformation 2012; 8:107-9. [PMID: 22359445 PMCID: PMC3282266 DOI: 10.6026/97320630008107] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/21/2011] [Accepted: 01/07/2012] [Indexed: 11/23/2022] Open
Abstract
The ever evolving Next Generation Sequencing technology is calling for new and innovative ways of data processing and visualization. Following a detailed survey of the current needs of researchers and service providers, the authors have developed GenoViewer: a highly user-friendly, easy-to-operate SAM/BAM viewer and aligner tool. GenoViewer enables fast and efficient NGS assembly browsing, analysis and read mapping. It is highly customized, making it suitable for a wide range of NGS related tasks. Due to its relatively simple architecture, it is easy to add specialised visualization functionalities, facilitating further customised data analysis. The software's source code is freely available; it is open for project and task-specific modifications. AVAILABILITY The database is available for free at http://www.genoviewer.com/
Affiliation(s)
- Miklós Laczik
- Astrid Research Inc., 4029 Debrecen, Csapó street 42., Hungary
- Bioinformatics Research Group, University of Debrecen, Faculty of Informatics, 4010 Debrecen, POB 12, Hungary
- Edit Tukacs
- Astrid Research Inc., 4029 Debrecen, Csapó street 42., Hungary
- Bioinformatics Research Group, University of Debrecen, Faculty of Informatics, 4010 Debrecen, POB 12, Hungary
- Béla Uzonyi
- Astrid Research Inc., 4029 Debrecen, Csapó street 42., Hungary
- Bálint Domokos
- Astrid Research Inc., 4029 Debrecen, Csapó street 42., Hungary
- Zsolt Doma
- Astrid Research Inc., 4029 Debrecen, Csapó street 42., Hungary
- Máté Kiss
- Astrid Research Inc., 4029 Debrecen, Csapó street 42., Hungary
- Attila Horváth
- Astrid Research Inc., 4029 Debrecen, Csapó street 42., Hungary
- Bioinformatics Research Group, University of Debrecen, Faculty of Informatics, 4010 Debrecen, POB 12, Hungary
- Zoltán Batta
- Astrid Research Inc., 4029 Debrecen, Csapó street 42., Hungary
- Bioinformatics Research Group, University of Debrecen, Faculty of Informatics, 4010 Debrecen, POB 12, Hungary
- Zsuzsanna Maros-Szabó
- Astrid Research Inc., 4029 Debrecen, Csapó street 42., Hungary
- Bioinformatics Research Group, University of Debrecen, Faculty of Informatics, 4010 Debrecen, POB 12, Hungary
- Zsolt Török
- Astrid Research Inc., 4029 Debrecen, Csapó street 42., Hungary
- Bioinformatics Research Group, University of Debrecen, Faculty of Informatics, 4010 Debrecen, POB 12, Hungary
|
43
|
Carver T, Harris SR, Otto TD, Berriman M, Parkhill J, McQuillan JA. BamView: visualizing and interpretation of next-generation sequencing read alignments. Brief Bioinform 2012; 14:203-12. [PMID: 22253280 PMCID: PMC3603209 DOI: 10.1093/bib/bbr073] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.7] [Indexed: 01/15/2023] Open
Abstract
So-called next-generation sequencing (NGS) has provided the ability to sequence on a massive scale at low cost, enabling biologists to perform powerful experiments and gain insight into biological processes. BamView has been developed to visualize and analyse sequence reads from NGS platforms, which have been aligned to a reference sequence. It is a desktop application for browsing the aligned or mapped reads [Ruffalo, M, LaFramboise, T, Koyutürk, M. Comparative analysis of algorithms for next-generation sequencing read alignment. Bioinformatics 2011;27:2790–6] at different levels of magnification, from nucleotide level, where the base qualities can be seen, to genome or chromosome level where overall coverage is shown. To enable in-depth investigation of NGS data, various views are provided that can be configured to highlight interesting aspects of the data. Multiple read alignment files can be overlaid to compare results from different experiments, and filters can be applied to facilitate the interpretation of the aligned reads. As well as being a standalone application it can be used as an integrated part of the Artemis genome browser, BamView allows the user to study NGS data in the context of the sequence and annotation of the reference genome. Single nucleotide polymorphism (SNP) density and candidate SNP sites can be highlighted and investigated, and read-pair information can be used to discover large structural insertions and deletions. The application will also calculate simple analyses of the read mapping, including reporting the read counts and reads per kilobase per million mapped reads (RPKM) for genes selected by the user. Availability: BamView and Artemis are freely available software. These can be downloaded from their home pages: http://bamview.sourceforge.net/; http://www.sanger.ac.uk/resources/software/artemis/. Requirements: Java 1.6 or higher.
Affiliation(s)
- Tim Carver
- Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.
|
44
|
Oyola SO, Otto TD, Gu Y, Maslen G, Manske M, Campino S, Turner DJ, Macinnis B, Kwiatkowski DP, Swerdlow HP, Quail MA. Optimizing Illumina next-generation sequencing library preparation for extremely AT-biased genomes. BMC Genomics 2012; 13:1. [PMID: 22214261 PMCID: PMC3312816 DOI: 10.1186/1471-2164-13-1] [Citation(s) in RCA: 268] [Impact Index Per Article: 22.3] [Received: 08/26/2011] [Accepted: 01/03/2012] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND Massively parallel sequencing technology is revolutionizing approaches to genomic and genetic research. Since its advent, the scale and efficiency of Next-Generation Sequencing (NGS) has rapidly improved. In spite of this success, sequencing genomes or genomic regions with extremely biased base composition is still a great challenge to the currently available NGS platforms. The genomes of some important pathogenic organisms like Plasmodium falciparum (high AT content) and Mycobacterium tuberculosis (high GC content) display extremes of base composition. The standard library preparation procedures that employ PCR amplification have been shown to cause uneven read coverage particularly across AT and GC rich regions, leading to problems in genome assembly and variation analyses. Alternative library-preparation approaches that omit PCR amplification require large quantities of starting material and hence are not suitable for small amounts of DNA/RNA such as those from clinical isolates. We have developed and optimized library-preparation procedures suitable for low quantity starting material and tolerant to extremely high AT content sequences. RESULTS We have used our optimized conditions in parallel with standard methods to prepare Illumina sequencing libraries from a non-clinical and a clinical isolate (containing ~53% host contamination). By analyzing and comparing the quality of sequence data generated, we show that our optimized conditions that involve a PCR additive (TMAC), produces amplified libraries with improved coverage of extremely AT-rich regions and reduced bias toward GC neutral templates. CONCLUSION We have developed a robust and optimized Next-Generation Sequencing library amplification method suitable for extremely AT-rich genomes. The new amplification conditions significantly reduce bias and retain the complexity of either extremes of base composition. 
This development will greatly benefit sequencing clinical samples that often require amplification due to low mass of DNA starting material.
Affiliation(s)
- Samuel O Oyola
- Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK.
|
45
|
Carver T, Harris SR, Berriman M, Parkhill J, McQuillan JA. Artemis: an integrated platform for visualization and analysis of high-throughput sequence-based experimental data. Bioinformatics 2011; 28:464-9. [PMID: 22199388 PMCID: PMC3278759 DOI: 10.1093/bioinformatics/btr703] [Citation(s) in RCA: 838] [Impact Index Per Article: 64.5] [Indexed: 02/07/2023] Open
Abstract
Motivation: High-throughput sequencing (HTS) technologies have made low-cost sequencing of large numbers of samples commonplace. An explosion in the type, not just number, of sequencing experiments has also taken place including genome re-sequencing, population-scale variation detection, whole transcriptome sequencing and genome-wide analysis of protein-bound nucleic acids. Results: We present Artemis as a tool for integrated visualization and computational analysis of different types of HTS datasets in the context of a reference genome and its corresponding annotation. Availability: Artemis is freely available (under a GPL licence) for download (for MacOSX, UNIX and Windows) at the Wellcome Trust Sanger Institute websites: http://www.sanger.ac.uk/resources/software/artemis/. Contact: artemis@sanger.ac.uk; tjc@sanger.ac.uk
Affiliation(s)
- Tim Carver
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
|
46
|
Pelleymounter LL, Moon I, Johnson JA, Laederach A, Halvorsen M, Eckloff B, Abo R, Rossetti S. A novel application of pattern recognition for accurate SNP and indel discovery from high-throughput data: targeted resequencing of the glucocorticoid receptor co-chaperone FKBP5 in a Caucasian population. Mol Genet Metab 2011; 104:457-69. [PMID: 21917492 PMCID: PMC3224211 DOI: 10.1016/j.ymgme.2011.08.019] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 06/24/2011] [Revised: 08/18/2011] [Accepted: 08/18/2011] [Indexed: 11/28/2022]
Abstract
The detection of single nucleotide polymorphisms (SNPs) and insertion/deletions (indels) with precision from high-throughput data remains a significant bioinformatics challenge. Accurate detection is necessary before next-generation sequencing can routinely be used in the clinic. In research, scientific advances are inhibited by gaps in data, exemplified by the underrepresented discovery of rare variants, variants in non-coding regions and indels. The continued presence of false positives and false negatives prevents full automation and requires additional manual verification steps. Our methodology presents applications of both pattern recognition and sensitivity analysis to eliminate false positives and aid in the detection of SNP/indel loci and genotypes from high-throughput data. We chose FK506-binding protein 51(FKBP5) (6p21.31) for our clinical target because of its role in modulating pharmacological responses to physiological and synthetic glucocorticoids and because of the complexity of the genomic region. We detected genetic variation across a 160 kb region encompassing FKBP5. 613 SNPs and 57 indels, including a 3.3 kb deletion were discovered. We validated our method using three independent data sets and, with Sanger sequencing and Affymetrix and Illumina microarrays, achieved 99% concordance. Furthermore we were able to detect 267 novel rare variants and assess linkage disequilibrium. Our results showed both a sensitivity and specificity of 98%, indicating near perfect classification between true and false variants. The process is scalable and amenable to automation, with the downstream filters taking only 1.5h to analyze 96 individuals simultaneously. We provide examples of how our level of precision uncovered the interactions of multiple loci, their predicted influences on mRNA stability, perturbations of the hsp90 binding site, and individual variation in FKBP5 expression. 
Finally we show how our discovery of rare variants may change current conceptions of evolution at this locus.
Affiliation(s)
- Linda L Pelleymounter
- Department of Pharmacology, Department of Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, MN 55905, USA.
|
47
|
Laing C, Villegas A, Taboada EN, Kropinski A, Thomas JE, Gannon VPJ. Identification of Salmonella enterica species- and subgroup-specific genomic regions using Panseq 2.0. Infect Genet Evol 2011; 11:2151-61. [PMID: 22001825 DOI: 10.1016/j.meegid.2011.09.021] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 05/01/2011] [Revised: 09/02/2011] [Accepted: 09/22/2011] [Indexed: 01/04/2023]
Abstract
The pan-genome of a taxonomic group consists of evolutionarily conserved core genes shared by all members and accessory genes that are present only in some members of the group. Group- and subgroup-specific core genes are thought to contribute to shared phenotypes such as virulence and niche specificity. In this study we analyzed 39 Salmonella enterica genomes (16 closed, 23 draft), a species that contains two human-specific serovars that cause typhoid fever, as well as a large number of zoonotic serovars that cause gastroenteritis in humans. Panseq 2.0 was used to define the pan-genome by adjusting the threshold at which group-specific "core" loci are defined. We found the pan-genome to be 9.03 Mbp in size, and that the core genome size decreased, while the number of SNPs/100 bp increased, as the number of strains used to define the core genome increased, suggesting substantial divergence among S. enterica subgroups. Subgroup-specific "core" genes, in contrast, had fewer SNPs/100 bp, likely reflecting their more recent acquisition. Phylogenetic trees were created from the concatenated and aligned pan-genome, the core genome, and multi-locus-sequence typing (MLST) loci. Branch support increased among the trees, and strains of the same serovar grouped closer together as the number of loci used to create the tree increased. Further, high levels of discrimination were achieved even amongst the most closely related strains of S. enterica Typhi, suggesting that the data generated by Panseq may also be of value in short-term epidemiological studies. Panseq provides an easy and fast way of performing pan-genomic analyses, which can include the identification of group-dominant as well as group-specific loci and is available as a web-server and a standalone version at http://lfz.corefacility.ca/panseq/.
Affiliation(s)
- Chad Laing
- Laboratory for Foodborne Zoonoses, Public Health Agency of Canada, Lethbridge, AB, Canada.
|
48
|
Laing R, Hunt M, Protasio AV, Saunders G, Mungall K, Laing S, Jackson F, Quail M, Beech R, Berriman M, Gilleard JS. Annotation of two large contiguous regions from the Haemonchus contortus genome using RNA-seq and comparative analysis with Caenorhabditis elegans. PLoS One 2011; 6:e23216. [PMID: 21858033 PMCID: PMC3156134 DOI: 10.1371/journal.pone.0023216] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Received: 05/24/2011] [Accepted: 07/12/2011] [Indexed: 11/30/2022] Open
Abstract
The genomes of numerous parasitic nematodes are currently being sequenced, but their complexity and size, together with high levels of intra-specific sequence variation and a lack of reference genomes, make their assembly and annotation a challenging task. Haemonchus contortus is an economically significant parasite of livestock that is widely used for basic research as well as for vaccine development and drug discovery. It is one of many medically and economically important parasites within the strongylid nematode group. This group of parasites has the closest phylogenetic relationship with the model organism Caenorhabditis elegans, making comparative analysis a potentially powerful tool for genome annotation and functional studies. To investigate this hypothesis, we sequenced two contiguous fragments from the H. contortus genome and undertook detailed annotation and comparative analysis with C. elegans. The adult H. contortus transcriptome was sequenced using an Illumina platform and RNA-seq was used to annotate a 409 kb overlapping BAC tiling path relating to the X chromosome and a 181 kb BAC insert relating to chromosome I. In total, 40 genes and 12 putative transposable elements were identified. Of the annotated genes, 97.5% had detectable homologues in C. elegans, of which 60% had putative orthologues, significantly higher than in previous EST-based analyses. Gene density appears to be lower in H. contortus than in C. elegans, with annotated H. contortus genes being on average two to three times larger than their putative C. elegans orthologues due to a greater intron number and size. Synteny appears high but gene order is generally poorly conserved, although areas of conserved microsynteny are apparent. C. elegans operons appear to be partially conserved in H. contortus. Our findings suggest that a combination of RNA-seq and comparative analysis with C. elegans is a powerful approach for the annotation and analysis of strongylid nematode genomes.
Affiliation(s)
- Roz Laing
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
- Faculty of Veterinary Medicine, University of Glasgow, Glasgow, Strathclyde, United Kingdom
- Martin Hunt
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
- Anna V. Protasio
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
- Gary Saunders
- Faculty of Veterinary Medicine, University of Glasgow, Glasgow, Strathclyde, United Kingdom
- Karen Mungall
- Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, Canada
- Steven Laing
- Faculty of Veterinary Medicine, University of Glasgow, Glasgow, Strathclyde, United Kingdom
- Frank Jackson
- Moredun Research Institute, Pentlands Science Park, Bush Loan, Penicuik, United Kingdom
- Michael Quail
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
- Robin Beech
- Institute of Parasitology, McGill University, Ste Anne de Bellevue, Quebec, Canada
- Matthew Berriman
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
- John S. Gilleard
- Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta, Canada
|
49
|
Chaudhuri RR, Yu L, Kanji A, Perkins TT, Gardner PP, Choudhary J, Maskell DJ, Grant AJ. Quantitative RNA-seq analysis of the Campylobacter jejuni transcriptome. MICROBIOLOGY-SGM 2011; 157:2922-2932. [PMID: 21816880 PMCID: PMC3353397 DOI: 10.1099/mic.0.050278-0] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Indexed: 12/28/2022]
Abstract
Campylobacter jejuni is the most common bacterial cause of foodborne disease in the developed world. Its general physiology and biochemistry, as well as the mechanisms enabling it to colonize and cause disease in various hosts, are not well understood, and new approaches are required to understand its basic biology. High-throughput sequencing technologies provide unprecedented opportunities for functional genomic research. Recent studies have shown that direct Illumina sequencing of cDNA (RNA-seq) is a useful technique for the quantitative and qualitative examination of transcriptomes. In this study we report RNA-seq analyses of the transcriptomes of C. jejuni (NCTC11168) and its rpoN mutant. This has allowed the identification of hitherto unknown transcriptional units, and further defines the regulon that is dependent on rpoN for expression. The analysis of the NCTC11168 transcriptome was supplemented by additional proteomic analysis using liquid chromatography-MS. The transcriptomic and proteomic datasets represent an important resource for the Campylobacter research community.
Affiliation(s)
- Roy R. Chaudhuri
- Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge CB3 0ES, UK
- Lu Yu
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Alpa Kanji
- Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge CB3 0ES, UK
- Timothy T. Perkins
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Paul P. Gardner
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Jyoti Choudhary
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Duncan J. Maskell
- Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge CB3 0ES, UK
- Andrew J. Grant
- Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge CB3 0ES, UK
|
50
|
Abstract
MOTIVATION Rapidly decreasing sequencing cost due to the emergence and improvement of massively parallel sequencing technologies has resulted in a dramatic increase in the quantity of data that needs to be analyzed. Therefore, software tools to process, visualize, analyze and integrate data produced on multiple platforms and using multiple methods are needed. RESULTS GenPlay is a fast, easy-to-use and stable tool for rapid analysis and data processing. It is written in Java and runs on all major operating systems. GenPlay recognizes a wide variety of common genomic data formats from microarray- or sequencing-based platforms and offers a library of operations (normalization, binning, smoothing) to process raw data into visualizable tracks. GenPlay displays tracks adapted to summarize gene structure, gene expression, repeat families, CpG islands, etc. as well as custom tracks to show the results of RNA-Seq, ChIP-Seq, TimEX-Seq and single nucleotide polymorphism (SNP) analysis. GenPlay can generate statistics (minimum, maximum, SD, correlation, etc.). The tools provided include a Gaussian filter, peak finders, signal saturation and island finders. The software also offers graphical features such as scatter plots and bar charts to depict signal distribution. The library of operations is continuously growing based on emerging needs. AVAILABILITY GenPlay is an open-source project available from http://www.genplay.net. The source code of the software is available at https://genplay.einstein.yu.edu/svn/GenPlay.
Affiliation(s)
- Julien Lajugie
- Department of Medicine, Albert Einstein College of Medicine, Bronx, NY 10461, USA
|