1
Brown JA, Feye KM, Ricke SC. Illumina MiSeq 16S rRNA Gene Library Preparation for Poultry Processing Microbiome Analyses. Methods Mol Biol 2025; 2852:273-288. [PMID: 39235750 DOI: 10.1007/978-1-0716-4100-2_18] [Indexed: 09/06/2024]
Abstract
The standardization of microbiome sequencing of poultry rinsates is essential for generating comparable microbial composition data among poultry processing facilities if this technology is to be adopted by the industry. Samples must first be acquired, DNA must be extracted, and libraries must be constructed. To proceed to library sequencing, the samples should meet quality control standards. Finally, data must be analyzed using bioinformatics pipelines. These data can subsequently be incorporated into more advanced computer algorithms for risk assessment. Ultimately, a uniform sequencing pipeline will enable both government regulatory agencies and the poultry industry to identify potential weaknesses in food safety. This chapter presents the different steps for monitoring the population dynamics of the microbiome in poultry processing using 16S rDNA sequencing.
Affiliation(s)
- Jessica A Brown
  - Meat Science and Animal Biologics Discovery, Department of Animal & Dairy Sciences, University of Wisconsin-Madison, Madison, WI, USA
- Kristina M Feye
  - Cell and Molecular Biology, University of Arkansas, Fayetteville, AR, USA
- Steven C Ricke
  - Meat Science and Animal Biologics Discovery, Department of Animal & Dairy Sciences, University of Wisconsin-Madison, Madison, WI, USA

2
Unitt A, Maiden M, Harrison O. Characterizing the diversity and commensal origins of penA mosaicism in the genus Neisseria. Microb Genom 2024; 10:001209. [PMID: 38381035 PMCID: PMC10926701 DOI: 10.1099/mgen.0.001209] [Received: 01/03/2024] [Accepted: 02/10/2024] [Indexed: 02/22/2024] [Open access]
Abstract
Mosaic penA alleles formed through horizontal gene transfer (HGT) have been instrumental to the rising incidence of ceftriaxone-resistant gonococcal infections. Although interspecies HGT of regions of the penA gene between Neisseria gonorrhoeae and commensal Neisseria species has been described, knowledge concerning which species are the most common contributors to mosaic penA alleles is limited, with most studies examining only a small number of alleles. Here, we investigated the origins of recombinant penA alleles through in silico analyses that incorporated 1700 penA alleles from 35 513 Neisseria isolates, comprising 15 different Neisseria species. We identified Neisseria subflava and Neisseria cinerea as the most common source of recombinant sequences in N. gonorrhoeae penA. This contrasted with Neisseria meningitidis penA, for which the primary source of recombinant DNA was other meningococci, followed by Neisseria lactamica. Additionally, we described the distribution of polymorphisms implicated in antimicrobial resistance in penA, and found that these are present across the genus. These results provide insight into resistance-related changes in the penA gene across human-associated Neisseria species, illustrating the importance of genomic surveillance of not only the pathogenic Neisseria, but also of the oral niche-associated commensals from which these pathogens are sourcing key genetic variation.
Affiliation(s)
- Anastasia Unitt
  - Department of Biology, University of Oxford, Oxford, OX1 3SY, UK
- Martin Maiden
  - Department of Biology, University of Oxford, Oxford, OX1 3SY, UK
- Odile Harrison
  - Department of Biology, University of Oxford, Oxford, OX1 3SY, UK
  - Infectious Disease Epidemiology Unit, Nuffield Department of Population Health, University of Oxford, OX3 7LF, UK

3
Jacques F, Bolivar P, Pietras K, Hammarlund EU. Roadmap to the study of gene and protein phylogeny and evolution-A practical guide. PLoS One 2023; 18:e0279597. [PMID: 36827278 PMCID: PMC9955684 DOI: 10.1371/journal.pone.0279597] [Received: 04/09/2022] [Accepted: 12/12/2022] [Indexed: 02/25/2023] [Open access]
Abstract
Developments in sequencing technologies and the sequencing of an ever-increasing number of genomes have revolutionised studies of biodiversity and organismal evolution. This accumulation of data has been paralleled by the creation of numerous public biological databases through which the scientific community can mine the sequences and annotations of genomes, transcriptomes, and proteomes of multiple species. However, to find the appropriate databases and bioinformatic tools for respective inquiries and aims can be challenging. Here, we present a compilation of DNA and protein databases, as well as bioinformatic tools for phylogenetic reconstruction and a wide range of studies on molecular evolution. We provide a protocol for information extraction from biological databases and simple phylogenetic reconstruction using probabilistic and distance methods, facilitating the study of biodiversity and evolution at the molecular level for the broad scientific community.
Affiliation(s)
- Florian Jacques
  - Lund University Cancer Centre, Department of Laboratory Medicine, Lund University, Lund, Sweden
  - Lund Stem Cell Center, Department of Laboratory Medicine, Lund University, Lund, Sweden
- Paulina Bolivar
  - Lund University Cancer Centre, Department of Laboratory Medicine, Lund University, Lund, Sweden
- Kristian Pietras
  - Lund University Cancer Centre, Department of Laboratory Medicine, Lund University, Lund, Sweden
- Emma U. Hammarlund
  - Lund University Cancer Centre, Department of Laboratory Medicine, Lund University, Lund, Sweden
  - Lund Stem Cell Center, Department of Laboratory Medicine, Lund University, Lund, Sweden

4
Lee BD, Neri U, Roux S, Wolf YI, Camargo AP, Krupovic M, Simmonds P, Kyrpides N, Gophna U, Dolja VV, Koonin EV. Mining metatranscriptomes reveals a vast world of viroid-like circular RNAs. Cell 2023; 186:646-661.e4. [PMID: 36696902 PMCID: PMC9911046 DOI: 10.1016/j.cell.2022.12.039] [Received: 07/19/2022] [Revised: 11/11/2022] [Accepted: 12/20/2022] [Indexed: 01/26/2023]
Abstract
Viroids and viroid-like covalently closed circular (ccc) RNAs are minimal replicators that typically encode no proteins and hijack cellular enzymes for replication. The extent and diversity of viroid-like agents are poorly understood. We developed a computational pipeline to identify viroid-like cccRNAs and applied it to 5,131 metatranscriptomes and 1,344 plant transcriptomes. The search yielded 11,378 viroid-like cccRNAs spanning 4,409 species-level clusters, a 5-fold increase compared to the previously identified viroid-like elements. Within this diverse collection, we discovered numerous putative viroids, satellite RNAs, retrozymes, and ribozy-like viruses. Diverse ribozyme combinations and unusual ribozymes within the cccRNAs were identified. Self-cleaving ribozymes were identified in ambiviruses, some mito-like viruses and capsid-encoding satellite virus-like cccRNAs. The broad presence of viroid-like cccRNAs in diverse transcriptomes and ecosystems implies that their host range is far broader than currently known, and matches to CRISPR spacers suggest that some cccRNAs replicate in prokaryotes.
Affiliation(s)
- Benjamin D Lee
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
  - Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, UK
- Uri Neri
  - The Shmunis School of Biomedicine and Cancer Research, Tel Aviv University, Tel Aviv 6997801, Israel
- Simon Roux
  - Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Yuri I Wolf
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Antonio Pedro Camargo
  - Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Mart Krupovic
  - Institut Pasteur, Université Paris Cité, CNRS UMR6047, Archaeal Virology Unit, 75015 Paris, France
- Peter Simmonds
  - Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, UK
- Nikos Kyrpides
  - Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Uri Gophna
  - The Shmunis School of Biomedicine and Cancer Research, Tel Aviv University, Tel Aviv 6997801, Israel
- Valerian V Dolja
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Eugene V Koonin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA

5
Barnett SE, Youngblut ND, Buckley DH. Bacterial community dynamics explain carbon mineralization and assimilation in soils of different land-use history. Environ Microbiol 2022; 24:5230-5247. [PMID: 35920035 DOI: 10.1111/1462-2920.16146] [Received: 02/16/2022] [Revised: 07/16/2022] [Accepted: 07/20/2022] [Indexed: 11/30/2022]
Abstract
Soil-dwelling microorganisms are key players in the terrestrial carbon cycle, driving both the degradation and stabilization of soil organic matter. Bacterial community structure and function vary with respect to land-use, yet the ecological drivers of this variation remain poorly described and difficult to predict. We conducted a multi-substrate DNA-stable isotope probing experiment across cropland, old-field, and forest habitats to link carbon mineralization dynamics with the dynamics of bacterial growth and carbon assimilation. We tracked the movement of ¹³C derived from five distinct carbon sources as it was assimilated into bacterial DNA over time. We show that carbon mineralization, community composition, and carbon assimilation dynamics all differed with respect to land-use. We also show that microbial community dynamics affect carbon assimilation dynamics and are associated with soil DNA content. Soil DNA yield is easy to measure and may be useful in predicting microbial community dynamics linked to soil carbon cycling.
Affiliation(s)
- Samuel E Barnett
  - School of Integrative Plant Science, Cornell University, Ithaca, NY, USA
- Nicholas D Youngblut
  - Department of Microbiome Science, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Daniel H Buckley
  - School of Integrative Plant Science, Cornell University, Ithaca, NY, USA

6
Soulé A, Reinharz V, Sarrazin-Gendron R, Denise A, Waldispühl J. Finding recurrent RNA structural networks with fast maximal common subgraphs of edge-colored graphs. PLoS Comput Biol 2021; 17:e1008990. [PMID: 34048427 PMCID: PMC8191989 DOI: 10.1371/journal.pcbi.1008990] [Received: 04/30/2020] [Revised: 06/10/2021] [Accepted: 04/22/2021] [Indexed: 11/25/2022] [Open access]
Abstract
RNA tertiary structure is crucial to its many non-coding molecular functions. RNA architecture is shaped by its secondary structure, composed of stems (stacked canonical base pairs) enclosing loops. While stems are precisely captured by free-energy models, loops composed of non-canonical base pairs are not. Nor are the distant interactions linking together those secondary structure elements (SSEs). Databases of conserved 3D geometries (a.k.a. modules) not captured by energetic models are leveraged for structure prediction and design, but the computational complexity has limited their study to local elements, i.e., loops. Representing the RNA structure as a graph has recently made it possible to extend this work to pairs of SSEs, uncovering a hierarchical organization of these 3D modules, at great computational cost. Systematically capturing recurrent patterns on a large scale is a main challenge in the study of RNA structures. In this paper, we present an efficient algorithm to compute maximal isomorphisms in edge-colored graphs. We extend this algorithm to a framework well suited to identifying RNA modules, and fast enough to considerably generalize previous approaches. To exhibit the versatility of our framework, we first reproduce results identifying all common modules spanning more than 2 SSEs, in a few hours instead of weeks. The efficiency of our new algorithm is demonstrated by computing the maximal modules between any pair of entire RNAs in the non-redundant corpus of known RNA 3D structures. We observe that the biggest modules our method uncovers compose large shared substructures spanning hundreds of nucleotides and base pairs between the ribosomes of Thermus thermophilus, Escherichia coli, and Pseudomonas aeruginosa.

Ribonucleic acids (RNAs) perform a broad range of essential molecular functions in cells, many of which rely on intricate folding properties of the molecule. Watson-Crick and wobble base pairs form early and stack onto each other to create stems connected by loops, which are themselves stabilized by more sophisticated base interaction patterns. These networks are essential to shaping RNA 3D structures but are unfortunately still poorly understood. Here, we undertake the task of building a catalog of base interaction networks occurring in multiple structures. However, a pairwise comparison of all RNA structures is computationally heavy. Therefore, we devise an algorithm leveraging intrinsic properties of RNA base interaction networks that enables us to quickly mine full databases of 3D structures. Compared to previous methods, our techniques bring the total running time of the analysis from months to hours while performing more general searches. The data collected through this work will benefit molecular evolution studies and serve in structure prediction tools.
Affiliation(s)
- Antoine Soulé
  - School of Computer Science, McGill University, Montréal, Canada
  - LiX, École Polytechnique, Paris, France
- Vladimir Reinharz
  - Department of Computer Science, Université du Québec à Montréal, Montréal, Canada
- Alain Denise
  - Laboratoire de recherche en informatique, Université Paris-Saclay - CNRS, Orsay, France
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay - CEA - CNRS, Gif-sur-Yvette, France
- Jérôme Waldispühl
  - School of Computer Science, McGill University, Montréal, Canada

7
Yavitt JB, Roco CA, Debenport SJ, Barnett SE, Shapleigh JP. Community Organization and Metagenomics of Bacterial Assemblages Across Local Scale pH Gradients in Northern Forest Soils. Microb Ecol 2021; 81:758-769. [PMID: 33001224 DOI: 10.1007/s00248-020-01613-7] [Received: 05/18/2020] [Accepted: 09/25/2020] [Indexed: 06/11/2023]
Abstract
Soil pH has been shown to predict bacterial diversity, but the mechanisms are still poorly understood. To investigate how bacteria distribute themselves as a function of soil pH, we assessed community composition, diversity, assembly, and gene abundance across local (ca. 1 km) scale gradients in soil pH from ~3.8 to 6.5 created by differences in soil parent material in three northern forests. Plant species were the same on all sites, with no evidence of agriculture in the past. Concentrations of extractable calcium, iron, and phosphorus also varied significantly across the pH gradients. Among taxa, Alphaproteobacteria and Acidobacteria were more common in soils with acidic pH values. Overall richness and diversity of OTUs peaked at intermediate pH values. Variations in OTU richness and diversity also had a quadratic fit with concentrations of extractable calcium and phosphorus. Community assembly was via homogeneous deterministic processes in soils with acidic pH values, whereas stochastic processes dominated in soils with near-neutral pH values. Although we expected selection via genes for acid tolerance response in acidic soils, genes for genetic information processing were more selective. Taxa in higher pH soils had differential abundance of transporter genes, suggesting adaptation to acquire metabolic substrates from soils. Soil bacterial communities in northern forest soils are incredibly diverse, and we still have much to learn about how soil pH and co-varying soil parameters directly drive gene selection in this critical component of ecosystem structure.
Affiliation(s)
- Joseph B Yavitt
  - Department of Natural Resources, Cornell University, 226 Mann Drive, Fernow Hall, Ithaca, NY, 14853, USA
- C Armanda Roco
  - Department of Microbiology, Cornell University, 123 Wing Drive, Wing Hall, Ithaca, NY, 14853, USA
- Spencer J Debenport
  - School of Integrative Plant Science, Cornell University, 306 Tower Road, Bradfield Hall, Ithaca, NY, 14853, USA
- Samuel E Barnett
  - School of Integrative Plant Science, Cornell University, 306 Tower Road, Bradfield Hall, Ithaca, NY, 14853, USA
- James P Shapleigh
  - Department of Microbiology, Cornell University, 123 Wing Drive, Wing Hall, Ithaca, NY, 14853, USA

8
Abstract
Standard workflows for analyzing microbiomes often include the creation and curation of phylogenetic trees. Here we present EMPress, an interactive web tool for visualizing trees in the context of microbiome, metabolome, and other community data scalable to trees with well over 500,000 nodes. EMPress provides novel functionality—including ordination integration and animations—alongside many standard tree visualization features and thus simplifies exploratory analyses of many forms of ‘omic data. IMPORTANCE Phylogenetic trees are integral data structures for the analysis of microbial communities. Recent work has also shown the utility of trees constructed from certain metabolomic data sets, further highlighting their importance in microbiome research. The ever-growing scale of modern microbiome surveys has led to numerous challenges in visualizing these data. In this paper we used five diverse data sets to showcase the versatility and scalability of EMPress, an interactive web visualization tool. EMPress addresses the growing need for exploratory analysis tools that can accommodate large, complex multi-omic data sets.

9
Jermiin LS, Catullo RA, Holland BR. A new phylogenetic protocol: dealing with model misspecification and confirmation bias in molecular phylogenetics. NAR Genom Bioinform 2020; 2:lqaa041. [PMID: 33575594 PMCID: PMC7671319 DOI: 10.1093/nargab/lqaa041] [Received: 01/17/2020] [Revised: 05/18/2020] [Accepted: 06/04/2020] [Indexed: 12/15/2022] [Open access]
Abstract
Molecular phylogenetics plays a key role in comparative genomics and has increasingly significant impacts on science, industry, government, public health and society. In this paper, we posit that the current phylogenetic protocol is missing two critical steps, and that their absence allows model misspecification and confirmation bias to unduly influence phylogenetic estimates. Based on the potential offered by well-established but under-used procedures, such as assessment of phylogenetic assumptions and tests of goodness of fit, we introduce a new phylogenetic protocol that will reduce confirmation bias and increase the accuracy of phylogenetic estimates.
Affiliation(s)
- Lars S Jermiin
  - CSIRO Land & Water, Canberra, ACT 2601, Australia
  - Research School of Biology, Australian National University, Canberra, ACT 2601, Australia
  - School of Biology & Environment Science, University College Dublin, Belfield, Dublin 4, Ireland
  - Earth Institute, University College Dublin, Belfield, Dublin 4, Ireland
- Renee A Catullo
  - CSIRO Land & Water, Canberra, ACT 2601, Australia
  - Research School of Biology, Australian National University, Canberra, ACT 2601, Australia
  - School of Science and Health & Hawkesbury Institute of the Environment, Western Sydney University, Penrith, NSW 2751, Australia
- Barbara R Holland
  - School of Natural Sciences, University of Tasmania, Hobart, TAS 7001, Australia

10
Zhu Y, Ong CS, Huttley GA. Machine Learning Techniques for Classifying the Mutagenic Origins of Point Mutations. Genetics 2020; 215:25-40. [PMID: 32193188 PMCID: PMC7198283 DOI: 10.1534/genetics.120.303093] [Received: 10/31/2019] [Accepted: 03/05/2020] [Indexed: 11/18/2022] [Open access]
Abstract
There is increasing interest in developing diagnostics that discriminate individual mutagenic mechanisms in a range of applications that include identifying population-specific mutagenesis and resolving distinct mutation signatures in cancer samples. Analyses for these applications assume that mutagenic mechanisms have a distinct relationship with neighboring bases that allows them to be distinguished. Direct support for this assumption is limited to a small number of simple cases, e.g., CpG hypermutability. We have evaluated whether the mechanistic origin of a point mutation can be resolved using only sequence context for a more complicated case. We contrasted single nucleotide variants originating from the multitude of mutagenic processes that normally operate in the mouse germline with those induced by the potent mutagen N-ethyl-N-nitrosourea (ENU). The considerable overlap in the mutation spectra of these two samples make this a challenging problem. Employing a new, robust log-linear modeling method, we demonstrate that neighboring bases contain information regarding point mutation direction that differs between the ENU-induced and spontaneous mutation variant classes. A logistic regression classifier exhibited strong performance at discriminating between the different mutation classes. Concordance between the feature set of the best classifier and information content analyses suggest our results can be generalized to other mutation classification problems. We conclude that machine learning can be used to build a practical classification tool to identify the mutation mechanism for individual genetic variants. Software implementing our approach is freely available under an open-source license.
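As a rough illustration of what classifying a point mutation "using only sequence context" can involve, the sketch below one-hot encodes the bases flanking a mutated position into a feature vector of the kind a logistic regression classifier could consume. The function name, the flank size, and the encoding itself are assumptions made for illustration; the abstract does not describe the authors' feature set or software in this detail.

```python
BASES = "ACGT"


def one_hot_context(flank5: str, flank3: str) -> list[int]:
    """One-hot encode the bases flanking a mutated position.

    Hypothetical helper: each flanking base becomes four 0/1 features
    (A, C, G, T), 5' flank first, then 3' flank.
    """
    features: list[int] = []
    for base in (flank5 + flank3).upper():
        if base not in BASES:
            raise ValueError(f"unexpected base: {base!r}")
        features.extend(1 if base == b else 0 for b in BASES)
    return features


# Two flanking bases on each side of the mutated position (an assumed window).
vec = one_hot_context("AC", "GT")
print(len(vec), vec)
```

Each flanking position contributes exactly one active feature, so the vector length is four times the number of flanking bases considered.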
Affiliation(s)
- Yicheng Zhu
  - Research School of Biology, The Australian National University, Canberra, Australian Capital Territory 2601, Australia
- Cheng Soon Ong
  - Data61, CSIRO, Black Mountain Campus, Canberra, Australian Capital Territory 2601, Australia
  - Research School of Computer Science, The Australian National University, Canberra, Australian Capital Territory 2601, Australia
- Gavin A Huttley
  - Research School of Biology, The Australian National University, Canberra, Australian Capital Territory 2601, Australia

11
Advances in monitoring soil microbial community dynamic and function. J Appl Genet 2020; 61:249-263. [PMID: 32062778 DOI: 10.1007/s13353-020-00549-5] [Received: 04/12/2019] [Revised: 01/17/2020] [Accepted: 02/06/2020] [Indexed: 12/22/2022]
Abstract
Microorganisms are vital to the overall ecosystem functioning, stability, and sustainability. Soil fertility and health depend on chemical composition and also on the qualitative and quantitative nature of microorganisms inhabiting it. Historically, denaturing gradient gel electrophoresis (DGGE) and temperature gradient gel electrophoresis (TGGE), single-strand conformation polymorphism, DNA amplification fingerprinting, amplified ribosomal DNA restriction analysis, terminal restriction fragment length polymorphism, length heterogeneity PCR, and ribosomal intergenic spacer analysis were used to assess soil microbial community structure (SMCS), abundance, and diversity. However, these methods had significant shortcomings and limitations for application in land reclamation monitoring. SMCS has been primarily determined by phospholipid fatty acid (PLFA) analysis. This method provides a direct measure of viable biomass in addition to a biochemical profile of the microbial community. PLFA has limitations such as overlap in the composition of microorganisms and the specificity of PLFAs signature. In recent years, high-throughput next-generation sequencing has dramatically increased the resolution and detectable spectrum of diverse microbial phylotypes from environmental samples and it plays a significant role in microbial ecology studies. Next-generation sequencings using 454, Illumina, SOLiD, and Ion Torrent platforms are rapid and flexible. The two methods, PLFA and next-generation sequencing, are useful in detecting changes in microbial community diversity and structure in different ecosystems. Single-molecule real-time (SMRT) and nanopore sequencing technologies represent third-generation sequencing (TGS) platforms that have been developed to address the shortcomings of second-generation sequencing (SGS). Enzymatic and soil respiration analyses are performed to further determine soil quality and microbial activities. 
Other valuable methods recently applied to microbial function and structure include NanoSIM, GeoChip, and DNA stable isotope probing (DNA-SIP) technologies. They are powerful metagenomics tools for analyzing microbial communities, including their structure, metabolic potential, diversity, and impact on ecosystem functions. This review is a critical analysis of current methods used in monitoring soil microbial community dynamics and functions.

12
Feye KM, Thompson DR, Rothrock MJ, Kogut MH, Ricke SC. Poultry processing and the application of microbiome mapping. Poult Sci 2020; 99:678-688. [PMID: 32029154 PMCID: PMC7587767 DOI: 10.1016/j.psj.2019.12.019] [Received: 09/18/2019] [Indexed: 01/28/2023] [Open access]
Abstract
Chicken is globally one of the most popular food animals. However, it is also one of the major reservoirs for foodborne pathogens, annually resulting in continued morbidity and mortality incidences worldwide. In an effort to reduce the threat of foodborne disease, the poultry industry has implemented a multifaceted antimicrobial program that incorporates not only chemical compounds, but also extensive amounts of water application and pathogen monitoring. Unfortunately, the pathogen detection methods currently used by the poultry industry lack speed, relying on microbiological plate methods and molecular detection systems that take time and lack precision. In many cases, the time to data acquisition can take 12 to 24 h. This is problematic if shorter-term answers are required which is becoming more likely as the public demand for chicken meat is only increasing, leading to new pressures to increase line speed. Therefore, new innovations in detection methods must occur to mitigate the risk of foodborne pathogens that could result from faster slaughter and processing speeds. Future technology will have 2 tracks: rapid methods that are meant to detect pathogens and indicator organisms within a few hours, and long-term methods that use microbiome mapping to evaluate sanitation and antimicrobial efficacy. Together, these methods will provide rapid, comprehensive data capable of being applied in both risk-assessment algorithms and used by management to safeguard the public.
Affiliation(s)
- K M Feye
  - Southern Plains Agricultural Research Center, USDA-ARS, Athens, TX 30605
- D R Thompson
  - Department of Computer Science and Engineering, University of Arkansas, Fayetteville, AR 72704
- M J Rothrock
  - US National Poultry Research Center, Egg Safety and Quality Research, USDA-ARS, Athens, GA 30605
- M H Kogut
  - Southern Plains Agricultural Research Center, USDA-ARS, Athens, TX 30605
- S C Ricke
  - Center for Food Safety, Department of Food Science, University of Arkansas, Fayetteville, AR 72704

13
Higuchi T, Yoshimura M, Oka S, Tanaka K, Naito T, Yuhara S, Warabi E, Mizuno S, Ono M, Takahashi S, Tohma S, Tsuchiya N, Furukawa H. Modulation of methotrexate-induced intestinal mucosal injury by dietary factors. Hum Exp Toxicol 2019; 39:500-513. [PMID: 31876189 DOI: 10.1177/0960327119896605] [Indexed: 01/22/2023]
Abstract
Methotrexate (MTX)-induced intestinal mucosal injury in animals has been studied to understand how MTX can cause gastrointestinal disorders, but the pathogenesis of gastrointestinal disorders is still uncertain. We have attempted to reveal how dietary factors influence intestinal toxicity due to MTX. Mice were fed normal chow (NC) or a high-fat high-sucrose diet (HFHSD) before oral administration of MTX. While MTX significantly decreased the survival rates of mice fed HFHSD, the intestinal epithelial injury was detected. MTX excretion in the feces of mice fed HFHSD was reduced. Change of diets between NC and HFHSD influences the survival. The survival rates of the mice fed a high-sucrose diet or control diet were higher than those fed HFHSD. Higher survival rates were observed in mice fed a high-fat high-sucrose diet modified (HFHSD-M) in which casein was replaced by soybean-derived proteins. The survival rates of mice treated with vancomycin were lower than those administered neomycin. Microbiome and metabolome analyses on feces suggest a similarity of the intestinal environments of mice fed NC and HFHSD-M. HFHSD may modify MTX-induced toxicity in intestinal epithelia on account of an altered MTX distribution as a result of change in the intestinal environment.
Collapse
Affiliation(s)
- T Higuchi
- Molecular and Genetic Epidemiology Laboratory, Faculty of Medicine, University of Tsukuba, Tsukuba, Japan.,Both the authors contributed equally to this work
| | - M Yoshimura
- Molecular and Genetic Epidemiology Laboratory, Faculty of Medicine, University of Tsukuba, Tsukuba, Japan.,Both the authors contributed equally to this work
| | - S Oka
- Molecular and Genetic Epidemiology Laboratory, Faculty of Medicine, University of Tsukuba, Tsukuba, Japan.,Clinical Research Center for Allergy and Rheumatology, National Hospital Organization Sagamihara National Hospital, Sagamihara, Japan.,Department of Rheumatology, National Hospital Organization Tokyo National Hospital, Kiyose, Japan
| | - K Tanaka
- Business Department, Miraca Research Institute G.K., Sagamihara, Japan
| | - T Naito
- Business Department, Miraca Research Institute G.K., Sagamihara, Japan
| | - S Yuhara
- Research Department, Miraca Research Institute G.K., Hachioji, Japan
| | - E Warabi
- Department of Anatomy and Embryology, Faculty of Medicine, University of Tsukuba, Tsukuba, Japan
| | - S Mizuno
- Laboratory Animal Resource Center, University of Tsukuba, Tsukuba, Japan
| | - M Ono
- Department of Clinical Laboratory, National Hospital Organization Mito Medical Center, Ibaraki, Japan
| | - S Takahashi
- Department of Anatomy and Embryology, Faculty of Medicine, University of Tsukuba, Tsukuba, Japan
| | - S Tohma
- Clinical Research Center for Allergy and Rheumatology, National Hospital Organization Sagamihara National Hospital, Sagamihara, Japan.,Department of Rheumatology, National Hospital Organization Tokyo National Hospital, Kiyose, Japan
| | - N Tsuchiya
- Molecular and Genetic Epidemiology Laboratory, Faculty of Medicine, University of Tsukuba, Tsukuba, Japan
| | - H Furukawa
- Molecular and Genetic Epidemiology Laboratory, Faculty of Medicine, University of Tsukuba, Tsukuba, Japan.,Clinical Research Center for Allergy and Rheumatology, National Hospital Organization Sagamihara National Hospital, Sagamihara, Japan.,Department of Rheumatology, National Hospital Organization Tokyo National Hospital, Kiyose, Japan
| |
|
14
|
Barnett SE, Youngblut ND, Buckley DH. Soil characteristics and land-use drive bacterial community assembly patterns. FEMS Microbiol Ecol 2019; 96:5675623. [DOI: 10.1093/femsec/fiz194] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 06/17/2019] [Accepted: 12/12/2019] [Indexed: 11/13/2022] Open
Abstract
Land-use and soil characteristics drive variation in soil community composition, but the influences of these factors on dispersal and community assembly at regional scale remain poorly characterized. Land-use remains a consistent driver of soil community composition even when exhibiting patchy spatial distribution at regional scale. In addition, disturbed and early successional soils often exhibit stochastic community assembly patterns. These observations suggest local community composition is influenced by dispersal and assembly from regional species pools. We examined bacterial community assembly within agricultural cropland, old-field, and forested sites across 10 landscapes in the region around Ithaca, New York (USA). We found that the Sloan neutral model explained assembly well at regional scale (R² = 0.763), but that both soil pH and land-use imposed selection that shaped community composition. We show that homogeneous selection was a dominant assembly process with respect to both soil pH and land-use regime, but that these two factors interacted in their effects on bacterial community assembly. We conclude that bacterial community assembly at a regional scale is driven by dispersal from regional species pools and local selection on the basis of soil pH and other soil characteristics that vary with land-use.
Affiliation(s)
- Samuel E Barnett
- School of Integrative Plant Science, Cornell University, 306 Tower Road, Bradfield Hall, Ithaca, NY 14853, USA
| | - Nicholas D Youngblut
- School of Integrative Plant Science, Cornell University, 306 Tower Road, Bradfield Hall, Ithaca, NY 14853, USA
- Department of Microbiome Science, Max Planck Institute for Developmental Biology, Tübingen, Germany
| | - Daniel H Buckley
- School of Integrative Plant Science, Cornell University, 306 Tower Road, Bradfield Hall, Ithaca, NY 14853, USA
| |
|
15
|
Naser-Khdour S, Minh BQ, Zhang W, Stone EA, Lanfear R. The Prevalence and Impact of Model Violations in Phylogenetic Analysis. Genome Biol Evol 2019; 11:3341-3352. [PMID: 31536115 PMCID: PMC6893154 DOI: 10.1093/gbe/evz193] [Citation(s) in RCA: 59] [Impact Index Per Article: 11.8] [Accepted: 09/03/2019] [Indexed: 12/24/2022] Open
Abstract
In phylogenetic inference, we commonly use models of substitution which assume that sequence evolution is stationary, reversible, and homogeneous (SRH). Although the use of such models is often criticized, the extent of SRH violations and their effects on phylogenetic inference of tree topologies and edge lengths are not well understood. Here, we introduce and apply the maximal matched-pairs tests of homogeneity to assess the scale and impact of SRH model violations on 3,572 partitions from 35 published phylogenetic data sets. We show that roughly one-quarter of all the partitions we analyzed (23.5%) reject the SRH assumptions, and that for 25% of data sets, tree topologies inferred from all partitions differ significantly from topologies inferred using the subset of partitions that do not reject the SRH assumptions. This proportion increases when comparing trees inferred using the subset of partitions that rejects the SRH assumptions, to those inferred from partitions that do not reject the SRH assumptions. These results suggest that the extent and effects of model violation in phylogenetics may be substantial. They highlight the importance of testing for model violations and possibly excluding partitions that violate models prior to tree reconstruction. Our results also suggest that further effort in developing models that do not require SRH assumptions could lead to large improvements in the accuracy of phylogenomic inference. The scripts necessary to perform the analysis are available in https://github.com/roblanf/SRHtests, and the new tests we describe are available as a new option in IQ-TREE (http://www.iqtree.org).
Affiliation(s)
- Suha Naser-Khdour
- Department of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
| | - Bui Quang Minh
- Department of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
- Research School of Computer Science, Australian National University, Canberra, Australian Capital Territory, Australia
| | - Wenqi Zhang
- Department of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
| | - Eric A Stone
- Department of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
| | - Robert Lanfear
- Department of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
| |
|
16
|
Lathifah AN, Guo Y, Sakagami N, Suda W, Higuchi M, Nishizawa T, Prijambada ID, Ohta H. Comparative Characterization of Bacterial Communities in Moss-Covered and Unvegetated Volcanic Deposits of Mount Merapi, Indonesia. Microbes Environ 2019; 34:268-277. [PMID: 31327812 PMCID: PMC6759343 DOI: 10.1264/jsme2.me19041] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 03/14/2019] [Accepted: 05/15/2019] [Indexed: 01/30/2023] Open
Abstract
Microbial colonization, followed by succession, on newly exposed volcanic substrates represents the beginning of the development of an early ecosystem. During early succession, colonization by mosses or plants significantly alters the pioneer microbial community composition through the photosynthetic carbon input. To provide further insights into this process, we investigated the three-year-old volcanic deposits of Mount Merapi, Indonesia. Samples were collected from unvegetated (BRD) and moss-covered (BRUD) sites. Forest site soil (FRS) near the volcanic deposit-covered area was also collected for reference. An analysis of BRD and BRUD revealed high culturable cell densities (1.7–8.5×10⁵ CFU g⁻¹) despite their low total C (<0.01%). FRS possessed high CFU (3×10⁶ g⁻¹); however, its relative value per unit of total C (2.6%) was lower than that of the deposit samples. Based on the tag pyrosequencing of 16S rRNA genes, the BRD bacterial community was characterized by a higher number of betaproteobacterial families (or genus), represented by chemolithotrophic Methylophilaceae, Leptothrix, and Sulfuricellaceae. In contrast, BRUD was predominated by different betaproteobacterial families, such as Oxalobacteraceae, Comamonadaceae, and Rhodocyclaceae. Some bacterial (Oxalobacteraceae) sequences were phylogenetically related to those of known moss-associated bacteria. Within the FRS community, Proteobacteria was the most abundant phylum, followed by Acidobacteria, whereas Burkholderiaceae was the most dominant bacterial family within FRS. These results suggest that an inter-family succession of Betaproteobacteria occurred in response to colonization by mosses, followed by plants.
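The comparison of CFU "per unit of total C" can be made concrete with a short calculation. The sketch below is an assumption about how such a normalization might be done (CFU per gram of sample divided by the carbon mass fraction); the abstract does not state the authors' exact procedure, and the function name is hypothetical. Because total C in the deposits is reported only as below 0.01%, using 0.01% yields a lower bound for the deposit samples.

```python
def cfu_per_gram_carbon(cfu_per_gram_sample, total_c_percent):
    """Convert CFU per gram of sample into CFU per gram of total carbon."""
    carbon_fraction = total_c_percent / 100.0
    return cfu_per_gram_sample / carbon_fraction


# Forest soil (FRS): 3x10^6 CFU per g of sample, total C 2.6%
frs = cfu_per_gram_carbon(3e6, 2.6)

# Deposits (BRD/BRUD): 1.7-8.5x10^5 CFU per g of sample, total C below 0.01%.
# Treating 0.01% as an upper bound on carbon gives a lower bound on CFU per g C.
deposit_low = cfu_per_gram_carbon(1.7e5, 0.01)
deposit_high = cfu_per_gram_carbon(8.5e5, 0.01)

print(f"FRS: {frs:.2e} CFU per g C")
print(f"Deposits: at least {deposit_low:.2e} to {deposit_high:.2e} CFU per g C")
```

Under these assumptions the forest soil comes out at roughly 1.2×10⁸ CFU per gram of carbon, while the deposits reach at least 1.7×10⁹, which is consistent with the abstract's statement that the forest soil value per unit of total C was lower.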
Affiliation(s)
- Annisa N. Lathifah
- United Graduate School of Agricultural Science, Tokyo University of Agriculture and Technology, 3–5–8 Saiwai-cho, Fuchu-shi, Tokyo 183–8509, Japan
- Ibaraki University College of Agriculture, 3–21–1 Chuo, Ami-machi, Ibaraki 300–0393, Japan
| | - Yong Guo
- Ibaraki University College of Agriculture, 3–21–1 Chuo, Ami-machi, Ibaraki 300–0393, Japan
| | - Nobuo Sakagami
- United Graduate School of Agricultural Science, Tokyo University of Agriculture and Technology, 3–5–8 Saiwai-cho, Fuchu-shi, Tokyo 183–8509, Japan
- Ibaraki University College of Agriculture, 3–21–1 Chuo, Ami-machi, Ibaraki 300–0393, Japan
| | - Wataru Suda
- Department of Computational Biology, Graduate School of Frontier Science, The University of Tokyo, Kashiwa, Japan
| | - Masanobu Higuchi
- Department of Botany, National Museum of Nature and Science, 4–1–1, Amakubo, Ibaraki, Japan
| | - Tomoyasu Nishizawa
- United Graduate School of Agricultural Science, Tokyo University of Agriculture and Technology, 3–5–8 Saiwai-cho, Fuchu-shi, Tokyo 183–8509, Japan
- Ibaraki University College of Agriculture, 3–21–1 Chuo, Ami-machi, Ibaraki 300–0393, Japan
| | - Irfan D. Prijambada
- Graduate School of Biotechnology, University of Gadjah Mada, Yogyakarta, Indonesia
| | - Hiroyuki Ohta
- United Graduate School of Agricultural Science, Tokyo University of Agriculture and Technology, 3–5–8 Saiwai-cho, Fuchu-shi, Tokyo 183–8509, Japan
- Ibaraki University College of Agriculture, 3–21–1 Chuo, Ami-machi, Ibaraki 300–0393, Japan
| |
|
17
|
Reinharz V, Soulé A, Westhof E, Waldispühl J, Denise A. Mining for recurrent long-range interactions in RNA structures reveals embedded hierarchies in network families. Nucleic Acids Res 2019; 46:3841-3851. [PMID: 29608773 PMCID: PMC5934684 DOI: 10.1093/nar/gky197] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 10/30/2017] [Accepted: 03/22/2018] [Indexed: 11/14/2022] Open
Abstract
The wealth of the combinatorics of nucleotide base pairs enables RNA molecules to assemble into sophisticated interaction networks, which are used to create complex 3D substructures. These interaction networks are essential to shape the 3D architecture of the molecule, and also to provide the key elements to carry molecular functions such as protein or ligand binding. They are made of organised sets of long-range tertiary interactions which connect distinct secondary structure elements in 3D structures. Here, we present a de novo data-driven approach to extract automatically from large data sets of full RNA 3D structures the recurrent interaction networks (RINs). Our methodology enables us for the first time to detect the interaction networks connecting distinct components of the RNA structure, highlighting their diversity and conservation through non-related functional RNAs. We use a graphical model to perform pairwise comparisons of all RNA structures available and to extract RINs and modules. Our analysis yields a complete catalog of RNA 3D structures available in the Protein Data Bank and reveals the intricate hierarchical organization of the RNA interaction networks and modules. We assembled our results in an online database (http://carnaval.lri.fr) which will be regularly updated. Within the site, a tool allows users with a novel RNA structure to detect automatically whether the novel structure contains previously observed RINs.
Affiliation(s)
- Vladimir Reinharz
- Department of Computer Science, Ben-Gurion University of the Negev, P.O.B. 653 Beer-Sheva, 84105, Israel.,School of Computer Science, McGill University, 3480 University, Montreal, Quebec H3A 0E9, Canada
| | - Antoine Soulé
- School of Computer Science, McGill University, 3480 University, Montreal, Quebec H3A 0E9, Canada.,LIX, École Polytechnique, CNRS, Inria, Palaiseau 91120, France
| | - Eric Westhof
- ARN, Université de Strasbourg, IBMC-CNRS, 15 rue René Descartes, Strasbourg Cedex 67084, France
| | - Jérôme Waldispühl
- School of Computer Science, McGill University, 3480 University, Montreal, Quebec H3A 0E9, Canada
| | - Alain Denise
- LRI, Université Paris-Sud, CNRS, Université Paris-Saclay, Bâtiment 650, Orsay cedex 91405, France.,I2BC, Université Paris-Sud, CNRS, CEA, Université Paris-Saclay, Bâtiment 400, Orsay cedex 91405, France
| |
|
18
|
Kumar A, Vyas P, Malla MA, Dubey A. Taxonomic and Functional Annotation of Termite Degraded Butea monosperma (Lam.) Kuntze (Flame of the Forest). Open Microbiol J 2019. [DOI: 10.2174/1874285801913010154] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/22/2022] Open
Abstract
Background:
Butea monosperma is an economically and medicinally important plant that grows all over India; however, the plant is highly susceptible to termite attack. The present study unravelled the bacterial community composition and functional attributes of termite-degraded Butea.
Methods:
Total genomic DNA from termite-degraded Butea monosperma samples was extracted and sequenced on the Illumina MiSeq platform. The raw, unassembled reads obtained from high-throughput sequencing were used for taxonomic and functional profiling with different online and stand-alone software tools. Moreover, to ascertain the effect of different geographical locations and environmental factors, a comparative analysis was performed using four other publicly available metagenomes.
Results:
The higher abundance of Actinobacteria (21.27%), Proteobacteria (14.18%), Firmicutes (10.46%), and Bacteroidetes (4.11%) was found at the phylum level. The genus level was dominated by Bacillus (4.33%), Gemmatimonas (3.13%), Mycobacterium (1.82%), Acidimicrobium (1.69%), Thermoleophilum (1.23%), Nocardioides (1.44%), Terrimonas and Acidithermus (1.09%) and Clostridium (1.05%). Functional annotation of the termite degraded B. monosperma metagenome revealed a high abundance of ammonia oxidizers, sulfate reducers, dehalogenators, nitrate reducers, sulfide oxidizers, xylan degraders, nitrogen fixers and chitin degraders.
Conclusion:
The present study highlights the significance of the inherent microbiome of degraded Butea in shaping the microbial communities for effective degradation of biomass and different environmental toxicants. The unknown bacterial communities present in the sample can serve as enzyme sources for lignocellulose degradation for biofuel production.
|
19
|
Beleva Guthrie V, Masica DL, Fraser A, Federico J, Fan Y, Camps M, Karchin R. Network Analysis of Protein Adaptation: Modeling the Functional Impact of Multiple Mutations. Mol Biol Evol 2019. [PMID: 29522102 PMCID: PMC5967520 DOI: 10.1093/molbev/msy036] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/18/2022] Open
Abstract
The evolution of new biochemical activities frequently involves complex dependencies between mutations and rapid evolutionary radiation. Mutation co-occurrence and covariation have previously been used to identify compensating mutations that are the result of physical contacts and preserve protein function and fold. Here, we model pairwise functional dependencies and higher order interactions that enable evolution of new protein functions. We use a network model to find complex dependencies between mutations resulting from evolutionary trade-offs and pleiotropic effects. We present a method to construct these networks and to identify functionally interacting mutations in both extant and reconstructed ancestral sequences (Network Analysis of Protein Adaptation, NAPA). The time ordering of mutations can be incorporated into the networks through phylogenetic reconstruction. We apply NAPA to three distantly homologous β-lactamase protein clusters (TEM, CTX-M-3, and OXA-51), each of which has experienced recent evolutionary radiation under substantially different selective pressures. By analyzing the network properties of each protein cluster, we identify key adaptive mutations, positive pairwise interactions, different adaptive solutions to the same selective pressure, and complex evolutionary trajectories likely to increase protein fitness. We also present evidence that incorporating information from phylogenetic reconstruction and ancestral sequence inference can reduce the number of spurious links in the network while preserving overall network community structure. The analysis does not require structural or biochemical data. In contrast to function-preserving mutation dependencies, which are frequently from structural contacts, gain-of-function mutation dependencies are most commonly between residues distal in protein structure.
Affiliation(s)
- Violeta Beleva Guthrie
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD
| | - David L Masica
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD
| | - Andrew Fraser
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD
| | - Joseph Federico
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD
| | - Yunfan Fan
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD
| | - Manel Camps
- Department of Environmental Toxicology, University of California Santa Cruz, Santa Cruz, CA
| | - Rachel Karchin
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD.,Department of Oncology, Johns Hopkins University Medicine, Baltimore, MD
| |
|
20
|
Thiel BC, Beckmann IK, Kerpedjiev P, Hofacker IL. 3D based on 2D: Calculating helix angles and stacking patterns using forgi 2.0, an RNA Python library centered on secondary structure elements. F1000Res 2019; 8:ISCB Comm J-287. [PMID: 31069053 PMCID: PMC6480952 DOI: 10.12688/f1000research.18458.2] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Accepted: 04/04/2019] [Indexed: 01/01/2023] Open
Abstract
We present forgi, a Python library to analyze the tertiary structure of RNA secondary structure elements. Our representation of an RNA molecule is centered on secondary structure elements (stems, bulges and loops). By fitting a cylinder to the helix axis, these elements are carried over into a coarse-grained 3D structure representation. Integration with Biopython allows for handling of all-atom 3D information. forgi can deal with a variety of file formats including dotbracket strings, PDB and MMCIF files. We can handle modified residues, missing residues, cofold and multifold structures as well as nucleotide numbers starting at arbitrary positions. We apply this library to the study of stacking helices in junctions and pseudoknots and investigate how far stacking helices in solved experimental structures can divert from coaxial geometries.
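To make the notion of secondary structure elements concrete, the following plain-Python sketch shows what a dot-bracket string encodes: matching parentheses are base pairs, and runs of directly stacked pairs form stems. This is an independent illustration and does not use or reproduce forgi's API; the function names are hypothetical, and pseudoknot bracket types and multi-strand (cofold) inputs are deliberately not handled.

```python
def pair_table(dotbracket):
    """Map each paired position (0-based) to its partner in a dot-bracket string."""
    stack = []
    pairs = {}
    for pos, symbol in enumerate(dotbracket):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            partner = stack.pop()
            pairs[pos] = partner
            pairs[partner] = pos
        elif symbol != ".":
            raise ValueError(f"unsupported symbol {symbol!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return pairs


def stems(dotbracket):
    """Group directly stacked base pairs into stems, each a list of (5', 3') index pairs."""
    pairs = pair_table(dotbracket)
    result = []
    current = []
    for i in sorted(pairs):
        j = pairs[i]
        if i > j:
            continue  # visit each pair once, from its 5' side
        if current and current[-1] == (i - 1, j + 1):
            current.append((i, j))  # this pair stacks directly on the previous one
        else:
            if current:
                result.append(current)
            current = [(i, j)]
    if current:
        result.append(current)
    return result
```

For example, `stems("((..((...))..))")` yields two stems separated by interior-loop nucleotides; unpaired positions between and inside stems correspond to the bulges and loops mentioned in the abstract.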
Affiliation(s)
- Bernhard C. Thiel
- Department of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Vienna, 1090, Austria
| | - Irene K. Beckmann
- Department of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Vienna, 1090, Austria
| | - Peter Kerpedjiev
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, 02115, USA
| | - Ivo L. Hofacker
- Department of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Vienna, 1090, Austria
- Research Group Bioinformatics and Computational Biology, Faculty of Computer Science, University of Vienna, Vienna, 1090, Austria
| |
|
21
|
Thiel BC, Beckmann IK, Kerpedjiev P, Hofacker IL. 3D based on 2D: Calculating helix angles and stacking patterns using forgi 2.0, an RNA Python library centered on secondary structure elements. F1000Res 2019; 8:ISCB Comm J-287. [PMID: 31069053 PMCID: PMC6480952 DOI: 10.12688/f1000research.18458.1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Accepted: 03/06/2019] [Indexed: 10/12/2023] Open
Abstract
We present forgi, a Python library to analyze the tertiary structure of RNA secondary structure elements. Our representation of an RNA molecule is centered on secondary structure elements (stems, bulges and loops). By fitting a cylinder to the helix axis, these elements are carried over into a coarse-grained 3D structure representation. Integration with Biopython allows for handling of all-atom 3D information. forgi can deal with a variety of file formats including dotbracket strings, PDB and MMCIF files. We can handle modified residues, missing residues, cofold and multifold structures as well as nucleotide numbers starting at arbitrary positions. We apply this library to the study of stacking helices in junctions and pseudoknots and investigate how far stacking helices in solved experimental structures can divert from coaxial geometries.
Affiliation(s)
- Bernhard C. Thiel
- Department of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Vienna, 1090, Austria
| | - Irene K. Beckmann
- Department of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Vienna, 1090, Austria
| | - Peter Kerpedjiev
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, 02115, USA
| | - Ivo L. Hofacker
- Department of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Vienna, 1090, Austria
- Research Group Bioinformatics and Computational Biology, Faculty of Computer Science, University of Vienna, Vienna, 1090, Austria
| |
|
22
|
Berube PM, Rasmussen A, Braakman R, Stepanauskas R, Chisholm SW. Emergence of trait variability through the lens of nitrogen assimilation in Prochlorococcus. eLife 2019; 8:41043. [PMID: 30706847 PMCID: PMC6370341 DOI: 10.7554/elife.41043] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 08/13/2018] [Accepted: 01/31/2019] [Indexed: 12/12/2022] Open
Abstract
Intraspecific trait variability has important consequences for the function and stability of marine ecosystems. Here we examine variation in the ability to use nitrate across hundreds of Prochlorococcus genomes to better understand the modes of evolution influencing intraspecific allocation of ecologically important functions. Nitrate assimilation genes are absent in basal lineages but occur at an intermediate frequency that is randomly distributed within recently emerged clades. The distribution of nitrate assimilation genes within clades appears largely governed by vertical inheritance, gene loss, and homologous recombination. By mapping this process onto a model of Prochlorococcus’ macroevolution, we propose that niche-constructing adaptive radiations and subsequent niche partitioning set the stage for loss of nitrate assimilation genes from basal lineages as they specialized to lower light levels. Retention of these genes in recently emerged lineages has likely been facilitated by selection as they sequentially partitioned into niches where nitrate assimilation conferred a fitness benefit.
Affiliation(s)
- Paul M Berube
- Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, United States
| | - Anna Rasmussen
- Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, United States
| | - Rogier Braakman
- Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, United States
| | | | - Sallie W Chisholm
- Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, United States.,Department of Biology, Massachusetts Institute of Technology, Cambridge, United States
| |
|
23
|
Feye KM, Ricke SC. Establishment of a Standardized 16S rDNA Library Preparation to Enable Analysis of Microbiome in Poultry Processing Using Illumina MiSeq Platform. Methods Mol Biol 2019; 1918:213-227. [PMID: 30580412 DOI: 10.1007/978-1-4939-9000-9_18] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 06/09/2023]
Abstract
The standardization of the microbiome sequencing of poultry rinsates is essential for generating comparable microbial composition data among poultry processing facilities if this technology is to be adopted by the industry. Samples must first be acquired, DNA must be extracted, and libraries must be constructed. In order to proceed to library sequencing, the samples should meet quality control standards. Finally, data must be analyzed using computer bioinformatics pipelines. This data can subsequently be incorporated into more advanced computer algorithms for risk assessment. Ultimately, a uniform sequencing pipeline will enable both the government regulatory agencies and the poultry industry to identify potential weaknesses in food safety. This chapter presents the different steps for monitoring the population dynamics of the microbiome in poultry processing using 16S rDNA sequencing.
Affiliation(s)
- Kristina M Feye
- Department of Food Science, Center for Food Safety, University of Arkansas, Fayetteville, AR, USA
| | - Steven C Ricke
- Department of Food Science, Center for Food Safety, University of Arkansas, Fayetteville, AR, USA.
| |
|
24
|
Banos S, Lentendu G, Kopf A, Wubet T, Glöckner FO, Reich M. A comprehensive fungi-specific 18S rRNA gene sequence primer toolkit suited for diverse research issues and sequencing platforms. BMC Microbiol 2018; 18:190. [PMID: 30458701 PMCID: PMC6247509 DOI: 10.1186/s12866-018-1331-4] [Citation(s) in RCA: 44] [Impact Index Per Article: 7.3] [Received: 07/17/2018] [Accepted: 10/30/2018] [Indexed: 11/10/2022] Open
Abstract
Background:
Several fungi-specific primers target the 18S rRNA gene sequence, one of the prominent markers for fungal classification. The design of most primers goes back to the last decades. Since then, the number of sequences in public databases increased leading to the discovery of new fungal groups and changes in fungal taxonomy. However, no reevaluation of primers was carried out and relevant information on most primers is missing. With this study, we aimed to develop an 18S rRNA gene sequence primer toolkit allowing an easy selection of the best primer pair appropriate for different sequencing platforms, research aims (biodiversity assessment versus isolate classification) and target groups.
Results:
We performed an intensive literature research, reshuffled existing primers into new pairs, designed new Illumina-primers, and annealing blocking oligonucleotides. A final number of 439 primer pairs were subjected to in silico PCRs. Best primer pairs were selected and experimentally tested. The most promising primer pair with a small amplicon size, nu-SSU-1333-5'/nu-SSU-1647-3' (FF390/FR-1), was successful in describing fungal communities by Illumina sequencing. Results were confirmed by a simultaneous metagenomics and eukaryote-specific primer approach. Co-amplification occurred in all sample types but was effectively reduced by blocking oligonucleotides.
Conclusions:
The compiled data revealed the presence of an enormous diversity of fungal 18S rRNA gene primer pairs in terms of fungal coverage, phylum spectrum and co-amplification. Therefore, the primer pair has to be carefully selected to fulfill the requirements of the individual research projects. The presented primer toolkit offers comprehensive lists of 164 primers, 439 primer combinations, 4 blocking oligonucleotides, and top primer pairs holding all relevant information including primer's characteristics and performance to facilitate primer pair selection.
Affiliation(s)
- Stefanos Banos
- Molecular Ecology, Institute of Ecology, FB02, University of Bremen, Leobener Str. 2, 28359, Bremen, Germany
| | - Guillaume Lentendu
- Department of Soil Ecology, Helmholtz Centre for Environmental Research GmbH - UFZ, Halle-Saale, Germany.,Department of Ecology, University of Kaiserslautern, Kaiserslautern, Germany
| | - Anna Kopf
- Microbial Genomics and Bioinformatics Research Group, Max Planck Institute for Marine Microbiology, Bremen, Germany
| | - Tesfaye Wubet
- Department of Soil Ecology, Helmholtz Centre for Environmental Research GmbH - UFZ, Halle-Saale, Germany.,Present address: Department of Community Ecology, Helmholtz Centre for Environmental Research GmbH - UFZ, Halle-Saale, Germany.,German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
| | - Frank Oliver Glöckner
- Microbial Genomics and Bioinformatics Research Group, Max Planck Institute for Marine Microbiology, Bremen, Germany.,Department of Life Sciences and Chemistry, Jacobs University Bremen gGmbH, Bremen, Germany
| | - Marlis Reich
- Molecular Ecology, Institute of Ecology, FB02, University of Bremen, Leobener Str. 2, 28359, Bremen, Germany.
| |
|
25
|
Ying H, Cooke I, Sprungala S, Wang W, Hayward DC, Tang Y, Huttley G, Ball EE, Forêt S, Miller DJ. Comparative genomics reveals the distinct evolutionary trajectories of the robust and complex coral lineages. Genome Biol 2018; 19:175. [PMID: 30384840 PMCID: PMC6214176 DOI: 10.1186/s13059-018-1552-8] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Received: 06/23/2018] [Accepted: 09/28/2018] [Indexed: 01/01/2023] Open
Abstract
Background:
Despite the biological and economic significance of scleractinian reef-building corals, the lack of large molecular datasets for a representative range of species limits understanding of many aspects of their biology. Within the Scleractinia, based on molecular evidence, it is generally recognised that there are two major clades, Complexa and Robusta, but the genomic bases of significant differences between them remain unclear.
Results:
Draft genome assemblies and annotations were generated for three coral species: Galaxea fascicularis (Complexa), Fungia sp., and Goniastrea aspera (Robusta). Whilst phylogenetic analyses strongly support a deep split between Complexa and Robusta, synteny analyses reveal a high level of gene order conservation between all corals, but not between corals and sea anemones or between sea anemones. HOX-related gene clusters are, however, well preserved across all of these combinations. Differences between species are apparent in the distribution and numbers of protein domains and an apparent correlation between number of HSP20 proteins and stress tolerance. Uniquely amongst animals, a complete histidine biosynthesis pathway is present in robust corals but not in complex corals or sea anemones. This pathway appears to be ancestral, and its retention in the robust coral lineage has important implications for coral nutrition and symbiosis.
Conclusions:
The availability of three new coral genomes enabled recognition of a de novo histidine biosynthesis pathway in robust corals which is only the second identified biosynthetic difference between corals. These datasets provide a platform for understanding many aspects of coral biology, particularly the interactions of corals with their endosymbionts.
Affiliation(s)
- Hua Ying
- Division of Ecology and Evolution, Research School of Biology, Australian National University, Acton, ACT 2601 Australia
| | - Ira Cooke
- Comparative Genomics Centre, Department of Molecular and Cell Biology, James Cook University, Townsville, QLD 4811 Australia
| | - Susanne Sprungala
- Comparative Genomics Centre, Department of Molecular and Cell Biology, James Cook University, Townsville, QLD 4811 Australia
| | - Weiwen Wang
- Division of Ecology and Evolution, Research School of Biology, Australian National University, Acton, ACT 2601 Australia
| | - David C. Hayward
- Division of Ecology and Evolution, Research School of Biology, Australian National University, Acton, ACT 2601 Australia
| | - Yurong Tang
- Division of Ecology and Evolution, Research School of Biology, Australian National University, Acton, ACT 2601 Australia
- Computational Biology and Bioinformatics Unit, Research School of Biology, Australian National University, Acton, ACT 2601 Australia
| | - Gavin Huttley
- Division of Ecology and Evolution, Research School of Biology, Australian National University, Acton, ACT 2601 Australia
- Computational Biology and Bioinformatics Unit, Research School of Biology, Australian National University, Acton, ACT 2601 Australia
| | - Eldon E. Ball
- Division of Ecology and Evolution, Research School of Biology, Australian National University, Acton, ACT 2601 Australia
- ARC Centre of Excellence for Coral Reef Studies, James Cook University, Townsville, QLD 4811 Australia
| | - Sylvain Forêt
- Division of Ecology and Evolution, Research School of Biology, Australian National University, Acton, ACT 2601 Australia
- ARC Centre of Excellence for Coral Reef Studies, James Cook University, Townsville, QLD 4811 Australia
| | - David J. Miller
- Comparative Genomics Centre, Department of Molecular and Cell Biology, James Cook University, Townsville, QLD 4811 Australia
- ARC Centre of Excellence for Coral Reef Studies, James Cook University, Townsville, QLD 4811 Australia
| |
|
26
|
Aun E, Brauer A, Kisand V, Tenson T, Remm M. A k-mer-based method for the identification of phenotype-associated genomic biomarkers and predicting phenotypes of sequenced bacteria. PLoS Comput Biol 2018; 14:e1006434. [PMID: 30346947 PMCID: PMC6211763 DOI: 10.1371/journal.pcbi.1006434] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/09/2018] [Revised: 11/01/2018] [Accepted: 08/15/2018] [Indexed: 11/18/2022] Open
Abstract
We have developed an easy-to-use and memory-efficient method called PhenotypeSeeker that (a) identifies phenotype-specific k-mers, (b) generates a k-mer-based statistical model for predicting a given phenotype and (c) predicts the phenotype from the sequencing data of a given bacterial isolate. The method was validated on 167 Klebsiella pneumoniae isolates (virulence), 200 Pseudomonas aeruginosa isolates (ciprofloxacin resistance) and 459 Clostridium difficile isolates (azithromycin resistance). The phenotype prediction models trained from these datasets obtained the F1-measure of 0.88 on the K. pneumoniae test set, 0.88 on the P. aeruginosa test set and 0.97 on the C. difficile test set. The F1-measures were the same for assembled sequences and raw sequencing data; however, building the model from assembled genomes is significantly faster. On these datasets, the model building on a mid-range Linux server takes approximately 3 to 5 hours per phenotype if assembled genomes are used and 10 hours per phenotype if raw sequencing data are used. The phenotype prediction from assembled genomes takes less than one second per isolate. Thus, PhenotypeSeeker should be well-suited for predicting phenotypes from large sequencing datasets. PhenotypeSeeker is implemented in Python programming language, is open-source software and is available at GitHub (https://github.com/bioinfo-ut/PhenotypeSeeker/).
Affiliation(s)
- Erki Aun
- Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
| | - Age Brauer
- Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
| | - Veljo Kisand
- Institute of Technology, University of Tartu, Tartu, Estonia
| | - Tanel Tenson
- Institute of Technology, University of Tartu, Tartu, Estonia
| | - Maido Remm
- Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
| |
|
27
|
Kunzmann P, Hamacher K. Biotite: a unifying open source computational biology framework in Python. BMC Bioinformatics 2018; 19:346. [PMID: 30285630 PMCID: PMC6167853 DOI: 10.1186/s12859-018-2367-z] [Citation(s) in RCA: 49] [Impact Index Per Article: 8.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2018] [Accepted: 09/10/2018] [Indexed: 12/18/2022] Open
Abstract
BACKGROUND As molecular biology is creating an increasing amount of sequence and structure data, the multitude of software to analyze this data is also rising. Most of the programs are made for a specific task, hence the user often needs to combine multiple programs in order to reach a goal. This can make the data processing cumbersome, inflexible and even inefficient due to an overhead of read/write operations. Therefore, it is crucial to have a comprehensive, accessible and efficient computational biology framework in a scripting language to overcome these limitations. RESULTS We have developed the Python package Biotite: a general computational biology framework, that represents sequence and structure data based on NumPy ndarrays. Furthermore the package contains seamless interfaces to biological databases and external software. The source code is freely accessible at https://github.com/biotite-dev/biotite. CONCLUSIONS Biotite is unifying in two ways: At first it bundles popular tasks in sequence analysis and structural bioinformatics in a consistently structured package. Secondly it addresses two groups of users: novice programmers get an easy access to Biotite due to its simplicity and the comprehensive documentation. On the other hand, advanced users can profit from its high performance and extensibility. They can implement their algorithms upon Biotite, so they can skip writing code for general functionality (like file parsers) and can focus on what makes their software unique.
Affiliation(s)
- Patrick Kunzmann
- Department of Computational Biology and Simulation, TU Darmstadt, Schnittspahnstraße 2, Darmstadt, 64287, Germany.
| | - Kay Hamacher
- Department of Computational Biology and Simulation, TU Darmstadt, Schnittspahnstraße 2, Darmstadt, 64287, Germany
| |
|
28
|
Sutherland TD, Sriskantha A, Rapson TD, Kaehler BD, Huttley GA. Did aculeate silk evolve as an antifouling material? PLoS One 2018; 13:e0203948. [PMID: 30240428 PMCID: PMC6150510 DOI: 10.1371/journal.pone.0203948] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2018] [Accepted: 08/30/2018] [Indexed: 01/23/2023] Open
Abstract
Many of the challenges we currently face as an advanced society have been solved in unique ways by biological systems. One such challenge is developing strategies to avoid microbial infection. Social aculeates (wasps, bees and ants) mitigate the risk of infection to their colonies using a wide range of adaptations and mechanisms. These adaptations and mechanisms are reliant on intricate social structures and are energetically costly for the colony. It seems likely that these species must have had alternative and simpler mechanisms in place to ensure the maintenance of hygienic domicile conditions prior to the evolution of these complex behaviours. Features of the aculeate coiled-coil silk proteins are reminiscent of those of naturally occurring α-helical antimicrobial peptides (AMPs). In this study, we demonstrate that peptides derived from the aculeate silk proteins have antimicrobial activity. We reconstruct the predicted ancestral silk sequences of an aculeate ancestor that pre-dates the evolution of sociality and demonstrate that these ancestral sequences also contained peptides with antimicrobial properties. It is possible that the silks evolved as an antifouling material and facilitated the evolution of sociality. These materials serve as model materials for consideration in future biomaterial development.
Affiliation(s)
- Tara D. Sutherland
- CSIRO (The Commonwealth Scientific and Industrial Research Organisation), Health and Biosecurity, Canberra, Australian Capital Territory, Australia
| | - Alagacone Sriskantha
- CSIRO (The Commonwealth Scientific and Industrial Research Organisation), Health and Biosecurity, Canberra, Australian Capital Territory, Australia
| | - Trevor D. Rapson
- CSIRO (The Commonwealth Scientific and Industrial Research Organisation), Health and Biosecurity, Canberra, Australian Capital Territory, Australia
| | - Benjamin D. Kaehler
- Research School of Biology, Australian National University, Australian Capital Territory, Australia
| | - Gavin A. Huttley
- Research School of Biology, Australian National University, Australian Capital Territory, Australia
| |
|
29
|
Kaehler BD, Yap VB, Huttley GA. Standard Codon Substitution Models Overestimate Purifying Selection for Nonstationary Data. Genome Biol Evol 2018; 9:134-149. [PMID: 28175284 PMCID: PMC5381540 DOI: 10.1093/gbe/evw308] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 01/02/2017] [Indexed: 01/28/2023] Open
Abstract
Estimation of natural selection on protein-coding sequences is a key comparative genomics approach for de novo prediction of lineage-specific adaptations. Selective pressure is measured on a per-gene basis by comparing the rate of nonsynonymous substitutions to the rate of synonymous substitutions. All published codon substitution models have been time-reversible and thus assume that sequence composition does not change over time. We previously demonstrated that if time-reversible DNA substitution models are applied in the presence of changing sequence composition, the number of substitutions is systematically biased towards overestimation. We extend these findings to the case of codon substitution models and further demonstrate that the ratio of nonsynonymous to synonymous rates of substitution tends to be underestimated over three data sets of mammals, vertebrates, and insects. Our basis for comparison is a nonstationary codon substitution model that allows sequence composition to change. Goodness-of-fit results demonstrate that our new model tends to fit the data better. Direct measurement of nonstationarity shows that bias in estimates of natural selection and genetic distance increases with the degree of violation of the stationarity assumption. Additionally, inferences drawn under time-reversible models are systematically affected by compositional divergence. As genomic sequences accumulate at an accelerating rate, the importance of accurate de novo estimation of natural selection increases. Our results establish that our new model provides a more robust perspective on this fundamental quantity.
Affiliation(s)
- Benjamin D Kaehler
- Research School of Biology, College of Medicine, Biology, and Environment, Australian National University, Canberra, ACT, Australia
| | - Von Bing Yap
- Department of Statistics and Applied Probability, National University of Singapore, Singapore, Singapore
| | - Gavin A Huttley
- Research School of Biology, College of Medicine, Biology, and Environment, Australian National University, Canberra, ACT, Australia
| |
|
30
|
Koo H, Hakim JA, Morrow CD, Andersen DT, Bej AK. Microbial Community Composition and Predicted Functional Attributes of Antarctic Lithobionts Using Targeted Next-Generation Sequencing and Bioinformatics Tools. J Microbiol Methods 2018. [DOI: 10.1016/bs.mim.2018.06.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022]
|
31
|
Bruneau M, Mottet T, Moulin S, Kerbiriou M, Chouly F, Chretien S, Guyeux C. A clustering package for nucleotide sequences using Laplacian Eigenmaps and Gaussian Mixture Model. Comput Biol Med 2017; 93:66-74. [PMID: 29288886 DOI: 10.1016/j.compbiomed.2017.12.003] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/03/2017] [Revised: 12/08/2017] [Accepted: 12/09/2017] [Indexed: 11/25/2022]
Abstract
In this article, a new Python package for nucleotide sequences clustering is proposed. This package, freely available on-line, implements a Laplacian eigenmap embedding and a Gaussian Mixture Model for DNA clustering. It takes nucleotide sequences as input, and produces the optimal number of clusters along with a relevant visualization. Despite the fact that we did not optimise the computational speed, our method still performs reasonably well in practice. Our focus was mainly on data analytics and accuracy and as a result, our approach outperforms the state of the art, even in the case of divergent sequences. Furthermore, an a priori knowledge on the number of clusters is not required here. For the sake of illustration, this method is applied on a set of 100 DNA sequences taken from the mitochondrially encoded NADH dehydrogenase 3 (ND3) gene, extracted from a collection of Platyhelminthes and Nematoda species. The resulting clusters are tightly consistent with the phylogenetic tree computed using a maximum likelihood approach on gene alignment. They are coherent too with the NCBI taxonomy. Further test results based on synthesized data are then provided, showing that the proposed approach is better able to recover the clusters than the most widely used software, namely Cd-hit-est and BLASTClust.
Affiliation(s)
- Marine Bruneau
- Laboratoire de Mathématiques de Besançon, UMR 6623 CNRS, France; Université de Bourgogne Franche-Comté, 16 route de Gray, 25030 Besançon, France
| | - Thierry Mottet
- Computer Science Department, FEMTO-ST Institute, UMR 6174 CNRS, France; Université de Bourgogne Franche-Comté, 16 route de Gray, 25030 Besançon, France
| | - Serge Moulin
- Computer Science Department, FEMTO-ST Institute, UMR 6174 CNRS, France; Université de Bourgogne Franche-Comté, 16 route de Gray, 25030 Besançon, France.
| | - Maël Kerbiriou
- Laboratoire de Mathématiques de Besançon, UMR 6623 CNRS, France; Université de Bourgogne Franche-Comté, 16 route de Gray, 25030 Besançon, France
| | - Franz Chouly
- Laboratoire de Mathématiques de Besançon, UMR 6623 CNRS, France; Université de Bourgogne Franche-Comté, 16 route de Gray, 25030 Besançon, France
| | - Stéphane Chretien
- National Physical Laboratory, Hampton Road, Teddington, United Kingdom
| | - Christophe Guyeux
- Computer Science Department, FEMTO-ST Institute, UMR 6174 CNRS, France; Université de Bourgogne Franche-Comté, 16 route de Gray, 25030 Besançon, France
| |
|
32
|
Robeson MS, Khanipov K, Golovko G, Wisely SM, White MD, Bodenchuck M, Smyser TJ, Fofanov Y, Fierer N, Piaggio AJ. Assessing the utility of metabarcoding for diet analyses of the omnivorous wild pig (Sus scrofa). Ecol Evol 2017; 8:185-196. [PMID: 29321862 PMCID: PMC5756863 DOI: 10.1002/ece3.3638] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2017] [Revised: 10/11/2017] [Accepted: 10/20/2017] [Indexed: 01/20/2023] Open
Abstract
Wild pigs (Sus scrofa) are an invasive species descended from both domestic swine and Eurasian wild boar that was introduced to North America during the early 1500s. Wild pigs have since become the most abundant free‐ranging exotic ungulate in the United States. Large and ever‐increasing populations of wild pigs negatively impact agriculture, sport hunting, and native ecosystems with costs estimated to exceed $1.5 billion/year within the United States. Wild pigs are recognized as generalist feeders, able to exploit a broad array of locally available food resources, yet their feeding behaviors remain poorly understood as partially digested material is often unidentifiable through traditional stomach content analyses. To overcome the limitation of stomach content analyses, we developed a DNA sequencing‐based protocol to describe the plant and animal diet composition of wild pigs. Additionally, we developed and evaluated blocking primers to reduce the amplification and sequencing of host DNA, thus providing greater returns of sequences from diet items. We demonstrate that the use of blocking primers produces significantly more sequencing reads per sample from diet items, which increases the robustness of ascertaining animal diet composition with molecular tools. Further, we show that the overall plant and animal diet composition is significantly different between the three areas sampled, demonstrating this approach is suitable for describing differences in diet composition among the locations.
Affiliation(s)
- Michael S Robeson
- Fish and Wildlife Conservation Biology Colorado State University Fort Collins CO USA.,USDA, Wildlife Services National Wildlife Research Center Wildlife Genetics Lab Fort Collins CO USA.,Present address: Department of Biomedical Informatics College of Medicine University of Arkansas for Medical Sciences Little Rock AR USA
| | - Kamil Khanipov
- Department of Pharmacology The University of Texas Medical Branch Galveston TX USA
| | - George Golovko
- Department of Pharmacology The University of Texas Medical Branch Galveston TX USA
| | - Samantha M Wisely
- Department of Wildlife Ecology and Conservation University of Florida Gainesville FL USA
| | | | | | - Timothy J Smyser
- USDA, Wildlife Services National Wildlife Research Center Wildlife Genetics Lab Fort Collins CO USA
| | - Yuriy Fofanov
- Department of Pharmacology The University of Texas Medical Branch Galveston TX USA
| | - Noah Fierer
- Department of Ecology and Evolutionary Biology Cooperative Institute for Research in Environmental Sciences University of Colorado Boulder CO USA
| | - Antoinette J Piaggio
- USDA, Wildlife Services National Wildlife Research Center Wildlife Genetics Lab Fort Collins CO USA
| |
|
33
|
Czech L, Huerta-Cepas J, Stamatakis A. A Critical Review on the Use of Support Values in Tree Viewers and Bioinformatics Toolkits. Mol Biol Evol 2017; 34:1535-1542. [PMID: 28369572 PMCID: PMC5435079 DOI: 10.1093/molbev/msx055] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open
Abstract
Phylogenetic trees are routinely visualized to present and interpret the evolutionary relationships of species. Most empirical evolutionary data studies contain a visualization of the inferred tree with branch support values. Ambiguous semantics in tree file formats can lead to erroneous tree visualizations and therefore to incorrect interpretations of phylogenetic analyses. Here, we discuss problems that arise when displaying branch values on trees after rerooting. Branch values are typically stored as node labels in the widely-used Newick tree format. However, such values are attributes of branches. Storing them as node labels can therefore yield errors when rerooting trees. This depends on the mostly implicit semantics that tools deploy to interpret node labels. We reviewed ten tree viewers and ten bioinformatics toolkits that can display and reroot trees. We found that 14 out of 20 of these tools do not permit users to select the semantics of node labels. Thus, unaware users might obtain incorrect results when rooting trees. We illustrate such incorrect mappings for several test cases and real examples taken from the literature. This review has already led to improvements in eight tools. We suggest tools should provide options that explicitly force users to define the semantics of node labels.
Affiliation(s)
- Lucas Czech
- Scientific Computing Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
| | - Jaime Huerta-Cepas
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Alexandros Stamatakis
- Scientific Computing Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany.,Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
| |
|
34
|
Kaehler BD. Full reconstruction of non-stationary strand-symmetric models on rooted phylogenies. J Theor Biol 2017; 420:144-151. [PMID: 28286217 DOI: 10.1016/j.jtbi.2017.03.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2016] [Revised: 03/06/2017] [Accepted: 03/08/2017] [Indexed: 10/20/2022]
Abstract
Understanding the evolutionary relationship among species is of fundamental importance to the biological sciences. The location of the root in any phylogenetic tree is critical as it gives an order to evolutionary events. None of the popular models of nucleotide evolution currently used in likelihood or Bayesian methods are able to infer the location of the root without exogenous information. It is known that the most general Markov models of nucleotide substitution also cannot identify the location of the root or be fitted to multiple sequence alignments with fewer than three sequences. We prove that the location of the root and the full model can be identified and statistically consistently estimated for a non-stationary, strand-symmetric substitution model given a multiple sequence alignment with two or more sequences. We also generalise earlier work to provide a practical means of overcoming the computationally intractable problem of labelling hidden states in a phylogenetic model.
Affiliation(s)
- Benjamin D Kaehler
- Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia.
| |
|
35
|
Liang C, Tseng HC, Chen HM, Wang WC, Chiu CM, Chang JY, Lu KY, Weng SL, Chang TH, Chang CH, Weng CT, Wang HM, Huang HD. Diversity and enterotype in gut bacterial community of adults in Taiwan. BMC Genomics 2017; 18:932. [PMID: 28198673 PMCID: PMC5310273 DOI: 10.1186/s12864-016-3261-6] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/10/2023] Open
Abstract
Background Gastrointestinal microbiota, particularly gut microbiota, is associated with human health. The biodiversity of gut microbiota is affected by ethnicities and environmental factors such as dietary habits or medicine intake, and three enterotypes of the human gut microbiome were announced in 2011. These enterotypes are not significantly correlated with gender, age, or body weight but are influenced by long-term dietary habits. However, to date, only two enterotypes (predominantly consisting of Bacteroides and Prevotella) have shown these characteristics in previous research; the third enterotype remains ambiguous. Understanding the enterotypes can improve the knowledge of the relationship between microbiota and human health. Results We obtained 181 human fecal samples from adults in Taiwan. Microbiota compositions were analyzed using next-generation sequencing (NGS) technology, which is a culture-independent method of constructing microbial community profiles by sequencing 16S ribosomal DNA (rDNA). In these samples, 17,675,898 sequencing reads were sequenced, and on average, 215 operational taxonomic units (OTUs) were identified for each sample. In this study, the major bacteria in the enterotypes identified from the fecal samples were Bacteroides, Prevotella, and Enterobacteriaceae, and their correlation with dietary habits was confirmed. A microbial interaction network in the gut was observed on the basis of the amount of short-chain fatty acids, pH value of the intestine, and composition of the bacterial community (enterotypes). Finally, a decision tree was derived to provide a predictive model for the three enterotypes. The accuracies of this model in training and independent testing sets were 97.2 and 84.0%, respectively. Conclusions We used NGS technology to characterize the microbiota and constructed a predictive model. 
The most significant finding was that Enterobacteriaceae, the predominant subtype, could be a new subtype of enterotypes in the Asian population.
Affiliation(s)
- Chao Liang
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, HsinChu, Taiwan
| | | | - Hui-Mei Chen
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, HsinChu, Taiwan
| | | | | | | | - Kuan-Yi Lu
- Health GeneTech Corporation, Taoyuan, Taiwan
| | - Shun-Long Weng
- Department of Obstetrics and Gynecology, Hsinchu Mackay Memorial Hospital, Hsinchu, Taiwan.,Mackay Medicine, Nursing and Management College, Taipei, Taiwan.,Department of Medicine, Mackay Medical College, New Taipei City, Taiwan
| | - Tzu-Hao Chang
- Graduate Institute of Biomedical Informatics, Taipei Medical University, Taipei, Taiwan
| | - Chao-Hsiang Chang
- School of Pharmacy, College of Pharmacy, Taipei Medical University, Taipei, Taiwan
| | | | | | - Hsien-Da Huang
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, HsinChu, Taiwan.,Department of Biological Science and Technology, National Chiao Tung University, HsinChu, Taiwan.,Department of Biomedical Science and Environmental Biology, Kaohsiung Medical University, Kaohsiung, Taiwan.
| |
|
36
|
Statistical Methods for Identifying Sequence Motifs Affecting Point Mutations. Genetics 2016; 205:843-856. [PMID: 27974498 DOI: 10.1534/genetics.116.195677] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2016] [Accepted: 12/01/2016] [Indexed: 11/18/2022] Open
Abstract
Mutation processes differ between types of point mutation, genomic locations, cells, and biological species. For some point mutations, specific neighboring bases are known to be mechanistically influential. Beyond these cases, numerous questions remain unresolved, including: what are the sequence motifs that affect point mutations? How large are the motifs? Are they strand symmetric? And, do they vary between samples? We present new log-linear models that allow explicit examination of these questions, along with sequence logo style visualization to enable identifying specific motifs. We demonstrate the performance of these methods by analyzing mutation processes in human germline and malignant melanoma. We recapitulate the known CpG effect, and identify novel motifs, including a highly significant motif associated with A→G mutations. We show that major effects of neighbors on germline mutation lie within [Formula: see text] of the mutating base. Models are also presented for contrasting the entire mutation spectra (the distribution of the different point mutations). We show the spectra vary significantly between autosomes and X-chromosome, with a difference in T→C transition dominating. Analyses of malignant melanoma confirmed reported characteristic features of this cancer, including statistically significant strand asymmetry, and markedly different neighboring influences. The methods we present are made freely available as a Python library https://bitbucket.org/pycogent3/mutationmotif.
|
37
|
Martirosyan V, Unc A, Miller G, Doniger T, Wachtel C, Steinberger Y. Desert Perennial Shrubs Shape the Microbial-Community Miscellany in Laimosphere and Phyllosphere Space. MICROBIAL ECOLOGY 2016; 72:659-668. [PMID: 27450478 DOI: 10.1007/s00248-016-0822-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/25/2016] [Accepted: 07/14/2016] [Indexed: 06/06/2023]
Abstract
Microbial function, composition, and distribution play a fundamental role in ecosystem ecology. The interaction between desert plants and their associated microbes is expected to greatly affect their response to changes in this harsh environment. Using comparative analyses, we studied the impact of three desert shrubs, Atriplex halimus (A), Artemisia herba-alba (AHA), and Hammada scoparia (HS), on soil- and leaf-associated microbial communities. DNA extracted from the leaf surface and soil samples collected beneath the shrubs were used to study associated microbial diversity using a sequencing survey of variable regions of bacterial 16S rRNA and fungal ribosomal internal transcribed spacer (ITS1). We found that the composition of bacterial and fungal orders is plant-type-specific, indicating that each plant type provides a suitable and unique microenvironment. The different adaptive ecophysiological properties of the three plant species and the differential effect on their associated microbial composition point to the role of adaptation in the shaping of microbial diversity. Overall, our findings suggest a link between plant ecophysiological adaptation as a "temporary host" and the biotic-community parameters in extreme xeric environments.
Affiliation(s)
- Varsik Martirosyan
- The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, 5290002, Israel
- Life Sciences International Postgraduate Educational Center, Acharyan 31 Str., Yerevan, 0040, Armenia
| | - Adrian Unc
- Boreal Ecosystems Research Initiative, Memorial University of Newfoundland, Corner Brook, Newfoundland and Labrador, A2H 6P9, Canada
| | - Gad Miller
- The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, 5290002, Israel
| | - Tirza Doniger
- The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, 5290002, Israel
| | - Chaim Wachtel
- The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, 5290002, Israel
| | - Yosef Steinberger
- The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, 5290002, Israel.
| |
|
38
|
Adaptive radiation by waves of gene transfer leads to fine-scale resource partitioning in marine microbes. Nat Commun 2016; 7:12860. [PMID: 27653556 PMCID: PMC5036157 DOI: 10.1038/ncomms12860] [Citation(s) in RCA: 89] [Impact Index Per Article: 11.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/25/2016] [Accepted: 08/09/2016] [Indexed: 11/17/2022] Open
Abstract
Adaptive radiations are important drivers of niche filling, since they rapidly adapt a single clade of organisms to ecological opportunities. Although thought to be common for animals and plants, adaptive radiations have remained difficult to document for microbes in the wild. Here we describe a recent adaptive radiation leading to fine-scale ecophysiological differentiation in the degradation of an algal glycan in a clade of closely related marine bacteria. Horizontal gene transfer is the primary driver in the diversification of the pathway leading to several ecophysiologically differentiated Vibrionaceae populations adapted to different physical forms of alginate. Pathway architecture is predictive of function and ecology, underscoring that horizontal gene transfer without extensive regulatory changes can rapidly assemble fully functional pathways in microbes. Adaptive radiations are well-known for animals and plants, but not for microbes. Here, Hehemann et al. show that there has been a recent adaptive radiation of bacteria in the Vibrionaceae to use different forms of alginate and that this radiation has been mediated by horizontal gene transfer.
|
39
|
Machado JP, Philip S, Maldonado E, O'Brien SJ, Johnson WE, Antunes A. Positive Selection Linked with Generation of Novel Mammalian Dentition Patterns. Genome Biol Evol 2016; 8:2748-59. [PMID: 27613398 PMCID: PMC5630915 DOI: 10.1093/gbe/evw200] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 12/26/2022] Open
Abstract
A diverse group of genes are involved in the tooth development of mammals. Several studies, focused mainly on mice and rats, have provided a detailed depiction of the processes coordinating tooth formation and shape. Here we surveyed 236 tooth-associated genes in 39 mammalian genomes and tested for signatures of selection to assess patterns of molecular adaptation in genes regulating mammalian dentition. Of the 236 genes, 31 (∼13.1%) showed strong signatures of positive selection that may be responsible for the phenotypic diversity observed in mammalian dentition. Mammalian-specific tooth-associated genes had accelerated mutation rates compared with older genes found across all vertebrates. More recently evolved genes had fewer interactions (either genetic or physical), were associated with fewer Gene Ontology terms and had faster evolutionary rates compared with older genes. The introns of these positively selected genes also exhibited accelerated evolutionary rates, which may reflect additional adaptive pressure in the intronic regions that are associated with regulatory processes that influence tooth-gene networks. The positively selected genes were mainly involved in processes like mineralization and structural organization of tooth specific tissues such as enamel and dentin. Of the 236 analyzed genes, 12 mammalian-specific genes (younger genes) provided insights on diversification of mammalian teeth as they have higher evolutionary rates and exhibit different expression profiles compared with older genes. Our results suggest that the evolution and development of mammalian dentition occurred in part through positive selection acting on genes that previously had other functions.
Affiliation(s)
- João Paulo Machado
- CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Porto, Portugal; Abel Salazar Biomedical Sciences Institute (ICBAS), University of Porto, Porto, Portugal
| | - Siby Philip
- CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Porto, Portugal; Department of Biology, Faculty of Sciences, University of Porto, Porto, Portugal
| | - Emanuel Maldonado
- CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Porto, Portugal
| | - Stephen J O'Brien
- Theodosius Dobzhansky Center for Genome Bioinformatics, St. Petersburg State University, St. Petersburg, Russia; Oceanographic Center, Nova Southeastern University, Ft Lauderdale
| | - Warren E Johnson
- Smithsonian Conservation Biology Institute, National Zoological Park, Front Royal, Virginia, USA
| | - Agostinho Antunes
- CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Porto, Portugal; Abel Salazar Biomedical Sciences Institute (ICBAS), University of Porto, Porto, Portugal; Department of Biology, Faculty of Sciences, University of Porto, Porto, Portugal
| |
|
40
|
Abstract
The Environment for Tree Exploration (ETE) is a computational framework that simplifies the reconstruction, analysis, and visualization of phylogenetic trees and multiple sequence alignments. Here, we present ETE v3, featuring numerous improvements in the underlying library of methods, and providing a novel set of standalone tools to perform common tasks in comparative genomics and phylogenetics. The new features include (i) building gene-based and supermatrix-based phylogenies using a single command, (ii) testing and visualizing evolutionary models, (iii) calculating distances between trees of different size or including duplications, and (iv) providing seamless integration with the NCBI taxonomy database. ETE is freely available at http://etetoolkit.org.
Affiliation(s)
- Jaime Huerta-Cepas
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - François Serra
- Centro Nacional de Análisis Genómico (CNAG-CRG), Center for Genomic Regulation, Universitat Pompeu Fabra (UPF), 08028 Barcelona, Spain
| | - Peer Bork
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany; Molecular Medicine Partnership Unit (MMPU), University Hospital Heidelberg and European Molecular Biology Laboratory, 69117 Heidelberg, Germany; Max Delbrück Centre for Molecular Medicine, 13125 Berlin, Germany
| |
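One of the features listed in the ETE v3 abstract is calculating distances between trees. As a rough illustration of what such a distance measures, the following is a minimal pure-Python sketch of a clade-based Robinson-Foulds distance between two rooted trees. It does not use ETE or reproduce its API: the nested-tuple tree representation and the function names `clades` and `rf_distance` are assumptions made for this example, and unlike ETE the sketch only handles trees that share the same leaf set.

```python
def clades(tree):
    """Return (leaves, clade_set) for a tree given as nested tuples of leaf names.

    Hypothetical representation for this sketch: a leaf is a string,
    an internal node is a tuple of child subtrees.
    """
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    # Every internal node contributes the set of leaves beneath it.
    found.add(leaves)
    return leaves, found


def rf_distance(tree_a, tree_b):
    """Count the clades present in exactly one of the two rooted trees."""
    leaves_a, clades_a = clades(tree_a)
    leaves_b, clades_b = clades(tree_b)
    if leaves_a != leaves_b:
        raise ValueError("this sketch requires identical leaf sets")
    return len(clades_a ^ clades_b)


t1 = ((("A", "B"), "C"), "D")
t2 = ((("A", "C"), "B"), "D")
print(rf_distance(t1, t2))
```

Here the two trees differ only in which pair of leaves is grouped first, so each tree holds exactly one clade the other lacks.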
|
42
|
Daouda T, Perreault C, Lemieux S. pyGeno: A Python package for precision medicine and proteogenomics. F1000Res 2016; 5:381. [PMID: 27785359 PMCID: PMC5022704 DOI: 10.12688/f1000research.8251.1] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Accepted: 05/06/2016] [Indexed: 01/26/2024] Open
Abstract
pyGeno is a Python package mainly intended for precision medicine applications that revolve around genomics and proteomics. It integrates reference sequences and annotations from Ensembl, genomic polymorphisms from the dbSNP database and data from next-gen sequencing into an easy to use, memory-efficient and fast framework, therefore allowing the user to easily explore subject-specific genomes and proteomes. Compared to a standalone program, pyGeno gives the user access to the complete expressivity of Python, a general programming language. Its range of application therefore encompasses both short scripts and large scale genome-wide studies.
Affiliation(s)
- Tariq Daouda
- Institute for Research in Immunology and Cancer, Université de Montréal, Montreal, Canada
- Department of Biochemistry, Faculty of Medicine, Université de Montréal, Montreal, Canada
| | - Claude Perreault
- Institute for Research in Immunology and Cancer, Université de Montréal, Montreal, Canada
- Division of Hematology, Hôpital Maisonneuve-Rosemont, Montreal, Canada
- Department of Medicine, Faculty of Medicine, Université de Montréal, Montreal, Canada
| | - Sébastien Lemieux
- Institute for Research in Immunology and Cancer, Université de Montréal, Montreal, Canada
- Department of Computer Science and Operations Research, Faculty of Arts and Sciences, Université de Montréal, Montreal, Canada
| |
|
43
|
Vargas WA, Sanz-Martín JM, Rech GE, Armijos-Jaramillo VD, Rivera LP, Echeverria MM, Díaz-Mínguez JM, Thon MR, Sukno SA. A Fungal Effector With Host Nuclear Localization and DNA-Binding Properties Is Required for Maize Anthracnose Development. Mol Plant Microbe Interact 2016; 29:83-95. [PMID: 26554735 DOI: 10.1094/mpmi-09-15-0209-r] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.1] [Indexed: 06/05/2023]
Abstract
Plant pathogens have the capacity to manipulate the host immune system through the secretion of effectors. We identified 27 putative effector proteins encoded in the genome of the maize anthracnose pathogen Colletotrichum graminicola that are likely to target the host's nucleus, as they simultaneously contain sequence signatures for secretion and nuclear localization. We functionally characterized one protein, identified as CgEP1. This protein is synthesized during the early stages of disease development and is necessary for anthracnose development in maize leaves, stems, and roots. Genetic, molecular, and biochemical studies confirmed that this effector targets the host's nucleus and defines a novel class of double-stranded DNA-binding protein. We show that CgEP1 arose from a gene duplication in an ancestor of a lineage of monocot-infecting Colletotrichum spp. and has undergone an intense evolution process, with evidence for episodes of positive selection. We detected CgEP1 homologs in several species of a grass-infecting lineage of Colletotrichum spp., suggesting that its function may be conserved across a large number of anthracnose pathogens. Our results demonstrate that effectors targeted to the host nucleus may be key elements for disease development and aid in the understanding of the genetic basis of anthracnose development in maize plants.
Affiliation(s)
- Walter A Vargas
- Instituto Hispano-Luso de Investigaciones Agrarias (CIALE), Departamento de Microbiología y Genética, Universidad de Salamanca, 37185 Villamayor, Spain
| | - José M Sanz-Martín
- Instituto Hispano-Luso de Investigaciones Agrarias (CIALE), Departamento de Microbiología y Genética, Universidad de Salamanca, 37185 Villamayor, Spain
| | - Gabriel E Rech
- Instituto Hispano-Luso de Investigaciones Agrarias (CIALE), Departamento de Microbiología y Genética, Universidad de Salamanca, 37185 Villamayor, Spain
| | - Vinicio D Armijos-Jaramillo
- Instituto Hispano-Luso de Investigaciones Agrarias (CIALE), Departamento de Microbiología y Genética, Universidad de Salamanca, 37185 Villamayor, Spain
| | - Lina P Rivera
- Instituto Hispano-Luso de Investigaciones Agrarias (CIALE), Departamento de Microbiología y Genética, Universidad de Salamanca, 37185 Villamayor, Spain
| | - María Mercedes Echeverria
- Facultad de Ciencias Agrarias, Universidad Nacional de Mar del Plata - C.C. 276 (7620) Balcarce, Buenos Aires, Argentina
| | - José M Díaz-Mínguez
- Instituto Hispano-Luso de Investigaciones Agrarias (CIALE), Departamento de Microbiología y Genética, Universidad de Salamanca, 37185 Villamayor, Spain
| | - Michael R Thon
- Instituto Hispano-Luso de Investigaciones Agrarias (CIALE), Departamento de Microbiología y Genética, Universidad de Salamanca, 37185 Villamayor, Spain
| | - Serenella A Sukno
- Instituto Hispano-Luso de Investigaciones Agrarias (CIALE), Departamento de Microbiología y Genética, Universidad de Salamanca, 37185 Villamayor, Spain
| |
|
44
|
Pepe-Ranney C, Koechli C, Potrafka R, Andam C, Eggleston E, Garcia-Pichel F, Buckley DH. Non-cyanobacterial diazotrophs mediate dinitrogen fixation in biological soil crusts during early crust formation. ISME J 2016; 10:287-98. [PMID: 26114889 PMCID: PMC4737922 DOI: 10.1038/ismej.2015.106] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.8] [Received: 12/04/2014] [Revised: 05/19/2015] [Accepted: 05/25/2015] [Indexed: 11/08/2022]
Abstract
Biological soil crusts (BSCs) are key components of ecosystem productivity in arid lands and they cover a substantial fraction of the terrestrial surface. In particular, BSC N2-fixation contributes significantly to the nitrogen (N) budget of arid land ecosystems. In mature crusts, N2-fixation is largely attributed to heterocystous cyanobacteria; however, early successional crusts possess few N2-fixing cyanobacteria and this suggests that microorganisms other than cyanobacteria mediate N2-fixation during the critical early stages of BSC development. DNA stable isotope probing with (15)N2 revealed that Clostridiaceae and Proteobacteria are the most common microorganisms that assimilate (15)N2 in early successional crusts. The Clostridiaceae identified are divergent from previously characterized isolates, though N2-fixation has previously been observed in this family. The Proteobacteria identified share >98.5% small subunit rRNA gene sequence identity with isolates from genera known to possess diazotrophs (for example, Pseudomonas, Klebsiella, Shigella and Ideonella). The low abundance of these heterotrophic diazotrophs in BSCs may explain why they have not been characterized previously. Diazotrophs have a critical role in BSC formation and characterization of these organisms represents a crucial step towards understanding how anthropogenic change will affect the formation and ecological function of BSCs in arid ecosystems.
Affiliation(s)
- Charles Pepe-Ranney
- Department of Crop and Soil Sciences, School of Integrative Plant Sciences, Cornell University, Ithaca, NY, USA
| | - Chantal Koechli
- Department of Microbiology, Cornell University, Ithaca, NY, USA
| | - Ruth Potrafka
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
| | - Cheryl Andam
- Department of Microbiology, Cornell University, Ithaca, NY, USA
| | - Erin Eggleston
- Department of Microbiology, Cornell University, Ithaca, NY, USA
| | | | - Daniel H Buckley
- Department of Crop and Soil Sciences, School of Integrative Plant Sciences, Cornell University, Ithaca, NY, USA
| |
|
45
|
Li J, Wei Z, Hakonarson H. Application of computational methods in genetic study of inflammatory bowel disease. World J Gastroenterol 2016; 22:949-960. [PMID: 26811639 PMCID: PMC4716047 DOI: 10.3748/wjg.v22.i3.949] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/02/2015] [Revised: 11/04/2015] [Accepted: 11/24/2015] [Indexed: 02/06/2023] Open
Abstract
Genetic factors play an important role in the etiology of inflammatory bowel disease (IBD). The launch of genome-wide association studies (GWAS) represents a landmark in the genetic study of human complex disease. Concurrently, computational methods have undergone rapid development during the past few years, which led to the identification of numerous disease susceptibility loci. IBD is one of the successful examples of GWAS and related analyses. A total of 163 genetic loci and multiple signaling pathways have been identified to be associated with IBD. Pleiotropic effects were found for many of these loci, and risk prediction models were built based on a broad spectrum of genetic variants. Important gene-gene and gene-environment interactions and key contributions of the gut microbiome are being discovered. Here we will review the different types of analyses that have been applied to IBD genetic study, discuss the computational methods for each type of analysis, and summarize the discoveries made in IBD research with the application of these methods.
|
46
|
Weiss SJ, Mansell TJ, Mortazavi P, Knight R, Gill RT. Parallel Mapping of Antibiotic Resistance Alleles in Escherichia coli. PLoS One 2016; 11:e0146916. [PMID: 26771672 PMCID: PMC4714920 DOI: 10.1371/journal.pone.0146916] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 05/04/2015] [Accepted: 12/23/2015] [Indexed: 12/26/2022] Open
Abstract
Chemical genomics expands our understanding of microbial tolerance to inhibitory chemicals, but its scope is often limited by the throughput of genome-scale library construction and genotype-phenotype mapping. Here we report a method for rapid, parallel, and deep characterization of the response to antibiotics in Escherichia coli using a barcoded genome-scale library, next-generation sequencing, and streamlined bioinformatics software. The method provides quantitative growth data (over 200,000 measurements) and identifies contributing antimicrobial resistance and susceptibility alleles. Using multivariate analysis, we also find that subtle differences in the population responses resonate across multiple levels of functional hierarchy. Finally, we use machine learning to identify a unique allelic and proteomic fingerprint for each antibiotic. The method can be broadly applied to tolerance for any chemical from toxic metabolites to next-generation biofuels and antibiotics.
Affiliation(s)
- Sophie J. Weiss
- Department of Chemical and Biological Engineering, University of Colorado Boulder, 3415 Colorado Avenue, Boulder, Colorado, 80303, United States of America
| | - Thomas J. Mansell
- Department of Chemical and Biological Engineering, University of Colorado Boulder, 3415 Colorado Avenue, Boulder, Colorado, 80303, United States of America
| | - Pooneh Mortazavi
- Department of Computer Science, University of Colorado Boulder, 1111 Engineering Drive ECOT 717, Boulder, CO 80303, United States of America
| | - Rob Knight
- Department of Pediatrics, University of California San Diego School of Medicine, 9500 Gilman Drive, MC 0602, La Jolla, CA 92093, United States of America
- Department of Computer Science & Engineering, University of California San Diego, 9500 Gilman Drive, MC 0404, La Jolla, CA 92093, United States of America
| | - Ryan T. Gill
- Department of Chemical and Biological Engineering, University of Colorado Boulder, 3415 Colorado Avenue, Boulder, Colorado, 80303, United States of America
| |
|
47
|
Kanterakis A, Kuiper J, Potamias G, Swertz MA. PyPedia: using the wiki paradigm as crowd sourcing environment for bioinformatics protocols. Source Code Biol Med 2015; 10:14. [PMID: 26587054 PMCID: PMC4652372 DOI: 10.1186/s13029-015-0042-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/23/2015] [Accepted: 10/20/2015] [Indexed: 11/10/2022]
Abstract
Background: Today researchers can choose from many bioinformatics protocols for all types of life sciences research, computational environments and coding languages. Although the majority of these are open source, few of them possess all virtues to maximize reuse and promote reproducible science. Wikipedia has proven a great tool to disseminate information and enhance collaboration between users with varying expertise and background to author qualitative content via crowdsourcing. However, it remains an open question whether the wiki paradigm can be applied to bioinformatics protocols. Results: We piloted PyPedia, a wiki where each article is both implementation and documentation of a bioinformatics computational protocol in the Python language. Hyperlinks within the wiki can be used to compose complex workflows and induce reuse. A RESTful API enables code execution outside the wiki. Initial content of PyPedia contains articles for population statistics, bioinformatics format conversions and genotype imputation. Use of the easy-to-learn wiki syntax effectively lowers the barriers to bring expert programmers and less computer-savvy researchers on the same page. Conclusions: PyPedia demonstrates how a wiki can provide a collaborative development, sharing and even execution environment for biologists and bioinformaticians that complements existing resources, useful for local and multi-center research teams. Availability: PyPedia is available online at: http://www.pypedia.com. The source code and installation instructions are available at: https://github.com/kantale/PyPedia_server. The PyPedia python library is available at: https://github.com/kantale/pypedia. PyPedia is open-source, available under the BSD 2-Clause License.
Affiliation(s)
- Alexandros Kanterakis
- University of Groningen, University Medical Center Groningen, Genomics Coordination Center, Postbus 30 001, Groningen, 9700 RB The Netherlands; Institute of Computer Science, Foundation for Research and Technology Hellas (FORTH), Nikolaou Plastira 100, Heraklion, 71110 Greece
| | - Joël Kuiper
- University of Groningen, University Medical Center Groningen, Genomics Coordination Center, Postbus 30 001, Groningen, 9700 RB The Netherlands
| | - George Potamias
- Institute of Computer Science, Foundation for Research and Technology Hellas (FORTH), Nikolaou Plastira 100, Heraklion, 71110 Greece
| | - Morris A Swertz
- University of Groningen, University Medical Center Groningen, Genomics Coordination Center, Postbus 30 001, Groningen, 9700 RB The Netherlands
| |
|
48
|
Aflitos SA, Severing E, Sanchez-Perez G, Peters S, de Jong H, de Ridder D. Cnidaria: fast, reference-free clustering of raw and assembled genome and transcriptome NGS data. BMC Bioinformatics 2015; 16:352. [PMID: 26525298 PMCID: PMC4630969 DOI: 10.1186/s12859-015-0806-7] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 07/11/2015] [Accepted: 10/29/2015] [Indexed: 12/05/2022] Open
Abstract
Background: Identification of biological specimens is a requirement for a range of applications. Reference-free methods analyse unprocessed sequencing data without relying on prior knowledge, but generally do not scale to arbitrarily large genomes and arbitrarily large phylogenetic distances. Results: We present Cnidaria, a practical tool for clustering genomic and transcriptomic data with no limitation on genome size or phylogenetic distances. We successfully clustered 169 genomic and transcriptomic datasets from 4 kingdoms simultaneously, achieving 100% identification accuracy at supra-species level and 78% accuracy at the species level. Conclusion: Cnidaria allows for fast, resource-efficient comparison and identification of both raw and assembled genome and transcriptome data. This can help answer both fundamental (e.g. in phylogeny, ecological diversity analysis) and practical questions (e.g. sequencing quality control, primer design).
Affiliation(s)
- Saulo Alves Aflitos
- Applied Bioinformatics, Plant Research International, Wageningen, The Netherlands; Bioinformatics Group, Department of Plant Sciences, Wageningen University, Wageningen, The Netherlands.
| | - Edouard Severing
- Laboratory of Genetics, Wageningen University, Wageningen, The Netherlands.
| | - Gabino Sanchez-Perez
- Applied Bioinformatics, Plant Research International, Wageningen, The Netherlands; Bioinformatics Group, Department of Plant Sciences, Wageningen University, Wageningen, The Netherlands.
| | - Sander Peters
- Applied Bioinformatics, Plant Research International, Wageningen, The Netherlands.
| | - Hans de Jong
- Laboratory of Genetics, Wageningen University, Wageningen, The Netherlands.
| | - Dick de Ridder
- Bioinformatics Group, Department of Plant Sciences, Wageningen University, Wageningen, The Netherlands.
| |
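The Cnidaria abstract describes reference-free comparison of sequence datasets but does not spell out the algorithm. One common way to compare sequences without a reference is to measure how many k-mers (substrings of length k) two datasets share. The sketch below illustrates that general idea with a Jaccard distance over k-mer sets. It is an assumption-laden toy, not Cnidaria's implementation: the function names `kmers` and `jaccard_distance`, the default `k=3`, and the example strings are all hypothetical.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(seq_a, seq_b, k=3):
    """1 minus (shared k-mers / all k-mers); 0.0 means identical k-mer sets."""
    set_a, set_b = kmers(seq_a, k), kmers(seq_b, k)
    union = set_a | set_b
    if not union:
        # Both sequences are shorter than k, so there is nothing to compare.
        return 0.0
    return 1 - len(set_a & set_b) / len(union)


# Hypothetical toy sequences; real datasets would be read sets or assemblies.
samples = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "TTTTGGGG"}
for name_a in samples:
    for name_b in samples:
        if name_a < name_b:
            print(name_a, name_b, round(jaccard_distance(samples[name_a], samples[name_b]), 3))
```

A pairwise distance matrix built this way could then be passed to any standard clustering method. Real tools work on far larger k and use dedicated k-mer counters rather than Python sets.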
|
49
|
Re-evaluating the phylogeny of allopolyploid Gossypium L. Mol Phylogenet Evol 2015; 92:45-52. [DOI: 10.1016/j.ympev.2015.05.023] [Citation(s) in RCA: 65] [Impact Index Per Article: 7.2] [Received: 03/16/2015] [Revised: 05/19/2015] [Accepted: 05/29/2015] [Indexed: 01/06/2023]
|
50
|
Farrell D, Shaughnessy RG, Britton L, MacHugh DE, Markey B, Gordon SV. The Identification of Circulating MiRNA in Bovine Serum and Their Potential as Novel Biomarkers of Early Mycobacterium avium subsp paratuberculosis Infection. PLoS One 2015; 10:e0134310. [PMID: 26218736 PMCID: PMC4517789 DOI: 10.1371/journal.pone.0134310] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.8] [Received: 04/20/2015] [Accepted: 07/07/2015] [Indexed: 01/09/2023] Open
Abstract
Mycobacterium avium subspecies paratuberculosis (MAP) is the aetiological agent of Johne's disease (JD), a chronic enteritis in ruminants that causes substantial economic losses to agriculture worldwide. Current diagnostic assays are hampered by low sensitivity and specificity that seriously complicate disease control; a new generation of diagnostic and prognostic assays is therefore urgently needed. Circulating microRNAs (miRNAs) have been shown to have significant potential as novel biomarkers for a range of human diseases, but their potential application in the veterinary sphere has been less well characterised. The aim of this study was therefore to apply RNA-sequencing approaches to serum from an experimental JD infection model as a route to identify novel diagnostic and prognostic miRNA biomarkers. Sera from experimental MAP-challenged calves (n = 6) and age-matched controls (n = 6) were used. We identified a subset of known miRNAs from bovine serum across all samples, with approximately 90 being at potentially functional abundance levels. The majority of known bovine miRNAs displayed multiple isomiRs that differed from the canonical sequences. Thirty novel miRNAs were identified after filtering and were found within sera from all animals tested. No significant differential miRNA expression was detected when comparing sera from MAP-challenged animals to their age-matched controls at six months post-infection. However, comparing sera from pre-infection bleeds to six months post-infection across all 12 animals did identify increased miR-205 (2-fold) and decreased miR-432 (2-fold) within both challenged and control groups, which suggests changes in circulating miRNA profiles due to ageing or development (P<0.00001). In conclusion, our study has identified a range of novel miRNA in bovine serum, and shown the utility of small RNA sequencing approaches to explore the potential of miRNA as novel biomarkers for infectious disease in cattle.
Affiliation(s)
- Damien Farrell
- UCD School of Veterinary Medicine, University College Dublin, Dublin, Ireland
| | | | - Louise Britton
- UCD School of Veterinary Medicine, University College Dublin, Dublin, Ireland
| | - David E. MacHugh
- UCD School of Agriculture and Food Science, University College Dublin, Dublin, Ireland
- UCD Conway Institute, University College Dublin, Dublin, Ireland
| | - Bryan Markey
- UCD School of Veterinary Medicine, University College Dublin, Dublin, Ireland
| | - Stephen V. Gordon
- UCD School of Veterinary Medicine, University College Dublin, Dublin, Ireland
- UCD School of Medicine, University College Dublin, Dublin, Ireland
- UCD School of Biomolecular and Biomedical Science, University College Dublin, Dublin, Ireland
- UCD Conway Institute, University College Dublin, Dublin, Ireland
| |
|