1
Welsh BL, Eisenhofer R. The prevalence of controls in phyllosphere microbiome research: a methodological review. THE NEW PHYTOLOGIST 2024; 242:23-29. [PMID: 38339825 DOI: 10.1111/nph.19573] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Accepted: 01/19/2024] [Indexed: 02/12/2024]
Abstract
DNA contamination can critically confound microbiome studies. Here, we take a systematic approach to review the current literature and investigate the prevalence of contamination controls in phyllosphere microbiome research over the past decade. By utilising systematic review principles for this review, we were able to conduct a thorough investigation, screening 450 articles from three databases for eligibility and extracting data in a controlled and methodical manner. Worryingly, we observed a surprisingly low usage of both positive and negative contamination controls in phyllosphere research. As a result, we propose a set of minimum standards to combat the effects of contamination in future phyllosphere research.
Affiliation(s)
- Brady L Welsh
  - School of Biological Sciences, The University of Adelaide, North Terrace Campus, Adelaide, SA, 5005, Australia
- Raphael Eisenhofer
  - School of Biological Sciences, The University of Adelaide, North Terrace Campus, Adelaide, SA, 5005, Australia
  - Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, 1353, Denmark
2
Orel N, Fadeev E, Herndl GJ, Turk V, Tinta T. Recovering high-quality bacterial genomes from cross-contaminated cultures: a case study of marine Vibrio campbellii. BMC Genomics 2024; 25:146. [PMID: 38321410 PMCID: PMC10845552 DOI: 10.1186/s12864-024-10062-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2023] [Accepted: 01/29/2024] [Indexed: 02/08/2024] Open
Abstract
BACKGROUND Environmental monitoring of bacterial pathogens is critical for disease control in coastal marine ecosystems to maintain animal welfare and ecosystem function and to prevent significant economic losses. This requires accurate taxonomic identification of environmental bacterial pathogens, which often cannot be achieved by commonly used genetic markers (e.g., 16S rRNA gene), and an understanding of their pathogenic potential based on the information encoded in their genomes. The decreasing costs of whole genome sequencing (WGS), combined with newly developed bioinformatics tools, now make it possible to unravel the full potential of environmental pathogens, beyond traditional microbiological approaches. However, obtaining a high-quality bacterial genome requires initial cultivation in an axenic culture, which is a bottleneck in environmental microbiology due to cross-contamination in the laboratory or isolation of non-axenic strains. RESULTS We applied WGS to determine the pathogenic potential of two Vibrio isolates from coastal seawater. During the analysis, we identified cross-contamination of one of the isolates and decided to use this dataset to evaluate the possibility of bioinformatic contaminant removal and recovery of bacterial genomes from a contaminated culture. Despite the contamination, using an appropriate bioinformatics workflow, we were able to obtain high-quality and highly identical genomes (Average Nucleotide Identity value 99.98%) of one of the Vibrio isolates from both the axenic and the contaminated culture. Using the assembled genome, we were able to determine that this isolate belongs to a sub-lineage of Vibrio campbellii associated with several diseases in marine organisms. We also found that the genome of the isolate contains a novel Vibrio plasmid associated with bacterial defense mechanisms and horizontal gene transfer, which may offer a competitive advantage to this putative pathogen.
CONCLUSIONS Our study shows that, using state-of-the-art bioinformatics tools and a sufficient sequencing effort, it is possible to obtain high quality genomes of the bacteria of interest and perform in-depth genomic analyses even in the case of a contaminated culture. With the new isolate and its complete genome, we are providing new insights into the genomic characteristics and functional potential of this sub-lineage of V. campbellii. The approach described here also highlights the possibility of recovering complete bacterial genomes in the case of non-axenic cultures or obligatory co-cultures.
Affiliation(s)
- Neža Orel
  - Marine Biology Station Piran, National Institute of Biology, Piran, Slovenia
- Eduard Fadeev
  - Department of Functional and Evolutionary Ecology, Bio-Oceanography and Marine Biology Unit, University of Vienna, Vienna, Austria
- Gerhard J Herndl
  - Department of Functional and Evolutionary Ecology, Bio-Oceanography and Marine Biology Unit, University of Vienna, Vienna, Austria
  - NIOZ, Department of Marine Microbiology and Biogeochemistry, Royal Netherlands Institute for Sea Research, Den Burg, The Netherlands
- Valentina Turk
  - Marine Biology Station Piran, National Institute of Biology, Piran, Slovenia
- Tinkara Tinta
  - Marine Biology Station Piran, National Institute of Biology, Piran, Slovenia
3
Alvarez RV, Landsman D. GTax: improving de novo transcriptome assembly by removing foreign RNA contamination. Genome Biol 2024; 25:12. [PMID: 38191464 PMCID: PMC10773103 DOI: 10.1186/s13059-023-03141-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2022] [Accepted: 12/08/2023] [Indexed: 01/10/2024] Open
Abstract
The cost and complexity of generating a complete reference genome means that many organisms lack an annotated reference. An alternative is to use a de novo reference transcriptome. This technology is cost-effective but is susceptible to off-target RNA contamination. In this manuscript, we present GTax, a taxonomy-structured database of genomic sequences that can be used with BLAST to detect and remove foreign contamination in RNA sequencing samples before assembly. In addition, we use a de novo transcriptome assembly of Solanum lycopersicum (tomato) to demonstrate that removing foreign contamination in sequencing samples reduces the number of assembled chimeric transcripts.
Affiliation(s)
- Roberto Vera Alvarez
  - Computational Biology Branch, National Center for Biotechnology Information, Intramural Research Program, National Library of Medicine, NIH, Bethesda, MD, USA
- David Landsman
  - Computational Biology Branch, National Center for Biotechnology Information, Intramural Research Program, National Library of Medicine, NIH, Bethesda, MD, USA
4
Mee L, Barribeau SM. Influence of social lifestyles on host-microbe symbioses in the bees. Ecol Evol 2023; 13:e10679. [PMID: 37928198 PMCID: PMC10620586 DOI: 10.1002/ece3.10679] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Revised: 10/16/2023] [Accepted: 10/17/2023] [Indexed: 11/07/2023] Open
Abstract
Microbiomes are increasingly recognised as critical for the health of an organism. In eusocial insect societies, frequent social interactions allow for high-fidelity transmission of microbes across generations, leading to closer host-microbe coevolution. The microbial communities of bees with other social lifestyles are less studied, and few comparisons have been made between taxa that vary in social structure. To address this gap, we leveraged a cloud-computing resource and publicly available transcriptomic data to conduct a survey of microbial diversity in bee samples from a variety of social lifestyles and taxa. We consistently recover the core microbes of well-studied corbiculate bees, supporting this method's ability to accurately characterise microbial communities. We find that the bacterial communities of bees are influenced by host location, phylogeny and social lifestyle, although no clear effect was found for fungal or viral microbial communities. Bee genera with more complex societies tend to harbour more diverse microbes, with Wolbachia detected more commonly in solitary tribes. We present a description of the microbiota of Euglossine bees and find that they do not share the "corbiculate core" microbiome. Notably, we find that bacteria with known anti-pathogenic properties are present across social bee genera, suggesting that symbioses that enhance host immunity are important with higher sociality. Our approach provides an inexpensive means of exploring microbiomes of a given taxa and identifying avenues for further research. These findings contribute to our understanding of the relationships between bees and their associated microbial communities, highlighting the importance of considering microbiome dynamics in investigations of bee health.
Affiliation(s)
- Lauren Mee
  - Institute of Infection, Veterinary and Ecological Sciences, Department of Evolution, Ecology and Behaviour, University of Liverpool, Liverpool, UK
- Seth M. Barribeau
  - Institute of Infection, Veterinary and Ecological Sciences, Department of Evolution, Ecology and Behaviour, University of Liverpool, Liverpool, UK
5
Rollin J, Rong W, Massart S. Cont-ID: detection of sample cross-contamination in viral metagenomic data. BMC Biol 2023; 21:217. [PMID: 37833740 PMCID: PMC10576407 DOI: 10.1186/s12915-023-01708-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2022] [Accepted: 09/20/2023] [Indexed: 10/15/2023] Open
Abstract
BACKGROUND High-throughput sequencing (HTS) technologies completed by the bioinformatic analysis of the generated data are becoming an important detection technique for virus diagnostics. They have the potential to replace or complement the current PCR-based methods thanks to their improved inclusivity and analytical sensitivity, as well as their overall good repeatability and reproducibility. Cross-contamination is a well-known phenomenon in molecular diagnostics and corresponds to the exchange of genetic material between samples. Cross-contamination management was a key drawback during the development of PCR-based detection and is now adequately monitored in routine diagnostics. HTS technologies are facing similar difficulties due to their very high analytical sensitivity. As a single viral read could be detected in millions of sequencing reads, it is mandatory to fix a detection threshold that will be informed by estimated cross-contamination. Cross-contamination monitoring should therefore be a priority when detecting viruses by HTS technologies. RESULTS We present Cont-ID, a bioinformatic tool designed to check for cross-contamination by analysing the relative abundance of virus sequencing reads identified in sequence metagenomic datasets and their duplication between samples. It can be applied when the samples in a sequencing batch have been processed in parallel in the laboratory and with at least one specific external control called Alien control. Using 273 real datasets, including 68 virus species from different hosts (fruit tree, plant, human) and several library preparation protocols (Ribodepleted total RNA, small RNA and double-stranded RNA), we demonstrated that Cont-ID classifies with high accuracy (91%) viral species detection into (true) infection or (cross) contamination. 
This classification raises confidence in the detection and facilitates the downstream interpretation and confirmation of the results by prioritising the virus detections that should be confirmed. CONCLUSIONS Cross-contamination between samples when detecting viruses using HTS (Illumina technology) can be monitored and highlighted by Cont-ID (provided an alien control is present). Cont-ID is based on a flexible methodology relying on the output of bioinformatics analyses of the sequencing reads and considering the contamination pattern specific to each batch of samples. The Cont-ID method is adaptable so that each laboratory can optimise it before its validation and routine use.
Affiliation(s)
- Johan Rollin
  - Plant Pathology Laboratory, Gembloux Agro-Bio Tech, University of Liège, 5030, Gembloux, Belgium
  - DNAVision, 6041, Gosselies, Belgium
- Wei Rong
  - Plant Pathology Laboratory, Gembloux Agro-Bio Tech, University of Liège, 5030, Gembloux, Belgium
- Sébastien Massart
  - Plant Pathology Laboratory, Gembloux Agro-Bio Tech, University of Liège, 5030, Gembloux, Belgium
6
Pavia MJ, Chede A, Wu Z, Cadillo-Quiroz H, Zhu Q. BinaRena: a dedicated interactive platform for human-guided exploration and binning of metagenomes. MICROBIOME 2023; 11:186. [PMID: 37596696 PMCID: PMC10439608 DOI: 10.1186/s40168-023-01625-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2022] [Accepted: 07/16/2023] [Indexed: 08/20/2023]
Abstract
BACKGROUND Exploring metagenomic contigs and "binning" them into metagenome-assembled genomes (MAGs) are essential for the delineation of functional and evolutionary guilds within microbial communities. Despite the advances in automated binning algorithms, their capabilities in recovering MAGs with accuracy and biological relevance are so far limited. Researchers often find that human involvement is necessary to achieve representative binning results. This manual process however is expertise demanding and labor intensive, and it deserves to be supported by software infrastructure. RESULTS We present BinaRena, a comprehensive and versatile graphic interface dedicated to aiding human operators to explore metagenome assemblies via customizable visualization and to associate contigs with bins. Contigs are rendered as an interactive scatter plot based on various data types, including sequence metrics, coverage profiles, taxonomic assignments, and functional annotations. Various contig-level operations are permitted, such as selection, masking, highlighting, focusing, and searching. Binning plans can be conveniently edited, inspected, and compared visually or using metrics including silhouette coefficient and adjusted Rand index. Completeness and contamination of user-selected contigs can be calculated in real time. In demonstration of BinaRena's usability, we show that it facilitated biological pattern discovery, hypothesis generation, and bin refinement in a complex tropical peatland metagenome. It enabled isolation of pathogenic genomes within closely related populations from the gut microbiota of diarrheal human subjects. It significantly improved overall binning quality after curating results of automated binners using a simulated marine dataset. 
CONCLUSIONS BinaRena is an installation-free, dependency-free, client-end web application that operates directly in any modern web browser, facilitating ease of deployment and accessibility for researchers of all skill levels. The program is hosted at https://github.com/qiyunlab/binarena, together with documentation, tutorials, example data, and a live demo. It effectively supports human researchers in intuitive interpretation and fine tuning of metagenomic data.
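The abstract mentions comparing binning plans with the adjusted Rand index. As a rough illustration of that metric only (this is not BinaRena's code; the function name `adjusted_rand_index` and the toy bin labels are assumptions made for this sketch), two plans that label the same contigs could be compared like this:

```python
from collections import Counter
from math import comb


def adjusted_rand_index(plan_a, plan_b):
    """Adjusted Rand index between two binning plans.

    Each plan is a sequence of bin labels, one per contig,
    with both plans listing the contigs in the same order.
    """
    if len(plan_a) != len(plan_b):
        raise ValueError("both plans must label the same contigs")

    total_pairs = comb(len(plan_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two contigs: nothing to disagree about

    # Pairs of contigs placed together in both plans (contingency table cells).
    sum_cells = sum(comb(c, 2) for c in Counter(zip(plan_a, plan_b)).values())
    # Pairs placed together within each plan separately.
    sum_a = sum(comb(c, 2) for c in Counter(plan_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(plan_b).values())

    expected = sum_a * sum_b / total_pairs  # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. one bin holds every contig in both plans
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical plans for four contigs; bin names are arbitrary labels.
same_grouping = adjusted_rand_index(["b1", "b1", "b2", "b2"], ["x", "x", "y", "y"])
crossed_grouping = adjusted_rand_index(["b1", "b1", "b2", "b2"], ["x", "y", "x", "y"])
print(same_grouping, crossed_grouping)
```

A value near 1 means the two plans group contigs the same way regardless of how the bins are named, while values near or below 0 mean agreement is no better than chance.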
Affiliation(s)
- Michael J Pavia
  - School of Life Sciences, Arizona State University, Tempe, AZ, USA
  - Biodesign Center for Fundamental and Applied Microbiomics, Arizona State University, Tempe, AZ, USA
  - Biodesign Swette Center for Environmental Biotechnology, Arizona State University, Tempe, AZ, USA
- Abhinav Chede
  - Biodesign Center for Fundamental and Applied Microbiomics, Arizona State University, Tempe, AZ, USA
- Zijun Wu
  - Biodesign Center for Fundamental and Applied Microbiomics, Arizona State University, Tempe, AZ, USA
  - Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Hinsby Cadillo-Quiroz
  - School of Life Sciences, Arizona State University, Tempe, AZ, USA
  - Biodesign Center for Fundamental and Applied Microbiomics, Arizona State University, Tempe, AZ, USA
  - Biodesign Swette Center for Environmental Biotechnology, Arizona State University, Tempe, AZ, USA
- Qiyun Zhu
  - School of Life Sciences, Arizona State University, Tempe, AZ, USA
  - Biodesign Center for Fundamental and Applied Microbiomics, Arizona State University, Tempe, AZ, USA
7
Xie J, Tan B, Zhang Y. A Large-Scale Study into Protist-Animal Interactions Based on Public Genomic Data Using DNA Barcodes. Animals (Basel) 2023; 13:2243. [PMID: 37508021 PMCID: PMC10376638 DOI: 10.3390/ani13142243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Revised: 07/06/2023] [Accepted: 07/06/2023] [Indexed: 07/30/2023] Open
Abstract
With the birth of next-generation sequencing (NGS) technology, genomic data in public databases have increased exponentially. Unfortunately, exogenous contamination or intracellular parasite sequences in assemblies could confuse genomic analysis. Meanwhile, they can provide a valuable resource for studies of host-microbe interactions. Here, we used a strategy based on DNA barcodes to scan for protistan contamination in the GenBank WGS/TSA database. The results showed a total of 13,952 metazoan/animal assemblies in GenBank, where 17,036 contigs were found to be protistan contaminants in 1507 assemblies (10.8%), with even higher contamination rates in taxa of Cnidaria (150/281), Crustacea (237/480), and Mollusca (107/410). Taxonomic analysis of the protists derived from these contigs showed variations in abundance and evenness of protistan contamination across different metazoan taxa, reflecting host preferences of Apicomplexa, Ciliophora, Oomycota and Symbiodiniaceae for mammals and birds, Crustacea, insects, and Cnidaria, respectively. Finally, mitochondrial proteins COX1 and CYTB were predicted from these contigs, and the phylogenetic analysis corroborated the protistan origin and heterogeneous distribution of the contaminated contigs. Overall, in this study, we conducted a large-scale scan of protistan contaminants in genomic resources, and the protistan sequences detected will help uncover the protist diversity and relationships of these picoeukaryotes with Metazoa.
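The abstract gives the overall contamination rate as a percentage (10.8%) but reports the three taxon-level rates only as raw fractions. The short calculation below converts all four to percentages so they can be compared directly; the counts are copied from the abstract, while the dictionary layout and the helper name `contamination_rate` are choices made for this sketch.

```python
# (contaminated assemblies, total assemblies), as quoted in the abstract.
counts = {
    "All Metazoa": (1507, 13952),
    "Cnidaria": (150, 281),
    "Crustacea": (237, 480),
    "Mollusca": (107, 410),
}


def contamination_rate(contaminated: int, total: int) -> float:
    """Percentage of assemblies containing at least one protistan contig."""
    return 100.0 * contaminated / total


rates = {taxon: contamination_rate(c, t) for taxon, (c, t) in counts.items()}

for taxon, rate in rates.items():
    print(f"{taxon}: {rate:.1f}%")
```

On these figures, roughly half of the Cnidaria and Crustacea assemblies and about a quarter of the Mollusca assemblies carry protistan contigs, against the 10.8% overall rate.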
Affiliation(s)
- Jiazheng Xie
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
- Bowen Tan
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
- Yi Zhang
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
8
Ferreira SCM, Jarquín-Díaz VH, Heitlinger E. Amplicon sequencing allows differential quantification of closely related parasite species: an example from rodent Coccidia (Eimeria). Parasit Vectors 2023; 16:204. [PMID: 37330545 PMCID: PMC10276917 DOI: 10.1186/s13071-023-05800-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Accepted: 05/03/2023] [Indexed: 06/19/2023] Open
Abstract
BACKGROUND Quantifying infection intensity is a common goal in parasitological studies. We have previously shown that the amount of parasite DNA in faecal samples can be a biologically meaningful measure of infection intensity, even if it does not agree well with complementary counts of transmission stages (oocysts in the case of Coccidia). Parasite DNA can be quantified at relatively high throughput using quantitative polymerase chain reaction (qPCR), but amplification needs a high specificity and does not simultaneously distinguish between parasite species. Counting of amplified sequence variants (ASVs) from high-throughput marker gene sequencing using a relatively universal primer pair has the potential to distinguish between closely related co-infecting taxa and to uncover the community diversity, thus being both more specific and more open-ended. METHODS We here compare qPCR to the sequencing-based amplification using standard PCR and a microfluidics-based PCR to quantify the unicellular parasite Eimeria in experimentally infected mice. We use multiple amplicons to differentially quantify Eimeria spp. in a natural house mouse population. RESULTS We show that sequencing-based quantification has high accuracy. Using a combination of phylogenetic analysis and the co-occurrence network, we distinguish three Eimeria species in naturally infected mice based on multiple marker regions and genes. We investigate geographical and host-related effects on Eimeria spp. community composition and find, as expected, prevalence to be largely explained by sampling locality (farm). Controlling for this effect, the novel approach allowed us to find body condition of mice to be negatively associated with Eimeria spp. abundance. CONCLUSIONS We conclude that amplicon sequencing provides the underused potential for species distinction and simultaneous quantification of parasites in faecal material. 
The method allowed us to detect a negative effect of Eimeria infection on the body condition of mice in the natural environment.
Affiliation(s)
- Susana C. M. Ferreira
  - Division of Computational Systems Biology, Center for Microbiology and Ecological Systems Science, University of Vienna, Djerassipl. 1, 1030 Vienna, Austria
  - Institute for Biology, Department of Molecular Parasitology, Humboldt-Universität zu Berlin (HU), Philippstr. 13, Haus 14, 10115 Berlin, Germany
- Víctor Hugo Jarquín-Díaz
  - Institute for Biology, Department of Molecular Parasitology, Humboldt-Universität zu Berlin (HU), Philippstr. 13, Haus 14, 10115 Berlin, Germany
  - Leibniz-Institut für Zoo- und Wildtierforschung (IZW) im Forschungsverbund Berlin e.V., Alfred-Kowalke-Straße 17, 10315 Berlin, Germany
  - Experimental and Clinical Research Center, a cooperation between the Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association and the Charité - Universitätsmedizin Berlin, Berlin, Germany
  - Experimental and Clinical Research Center, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Lindenberger Weg 80, 13125 Berlin, Germany
  - Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
- Emanuel Heitlinger
  - Institute for Biology, Department of Molecular Parasitology, Humboldt-Universität zu Berlin (HU), Philippstr. 13, Haus 14, 10115 Berlin, Germany
  - Leibniz-Institut für Zoo- und Wildtierforschung (IZW) im Forschungsverbund Berlin e.V., Alfred-Kowalke-Straße 17, 10315 Berlin, Germany
9
Bornemann TLV, Esser SP, Stach TL, Burg T, Probst AJ. uBin: A manual refining tool for genomes from metagenomes. Environ Microbiol 2023; 25:1077-1083. [PMID: 36764661 DOI: 10.1111/1462-2920.16351] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2023] [Accepted: 02/03/2023] [Indexed: 02/12/2023]
Abstract
Resolving bacterial and archaeal genomes from metagenomes has revolutionized our understanding of Earth's biomes, yet producing high-quality genomes from assembled fragments has been an ever-standing problem. While automated binning software and their combination produce prokaryotic bins in high throughput, their manual refinement has been slow, sometimes difficult or missing entirely, facilitating error propagation in public databases. Here, we present uBin, a GUI-based, standalone bin refiner that runs on all major operating platforms and was additionally designed for educational purposes. When applied to the public CAMI dataset, refinement of bins using GC content, coverage and taxonomy was able to improve 78.9% of bins by decreasing their contamination. We also applied the bin refiner as a standalone binner to public metagenomes from the International Space Station and demonstrate the recovery of near-complete genomes, whose replication indices indicate the active proliferation of microbes in Earth's lower orbit. uBin is an easy-to-install software for bin refinement, binning of simple metagenomes and communication of metagenomic results to other scientists and in classrooms. The software and its helper scripts are open source and available under https://github.com/ProbstLab/uBin.
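To make the idea of refining a bin by GC content and coverage concrete, here is a minimal sketch. It is not uBin's implementation (uBin is an interactive GUI in which a person makes these calls, and it also uses taxonomy, which is omitted here); the function names, the tolerance parameters `gc_tol` and `cov_fold`, and the toy contigs are all assumptions made for illustration.

```python
from statistics import median


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return gc / acgt if acgt else 0.0


def flag_outliers(contigs, gc_tol=0.10, cov_fold=2.0):
    """Return names of contigs that deviate from the bin's median GC or coverage.

    contigs: dict mapping contig name -> (sequence, mean coverage).
    gc_tol and cov_fold are arbitrary illustrative thresholds.
    """
    gcs = {name: gc_content(seq) for name, (seq, _) in contigs.items()}
    med_gc = median(gcs.values())
    med_cov = median(cov for _, cov in contigs.values())

    flagged = []
    for name, (_, cov) in contigs.items():
        gc_off = abs(gcs[name] - med_gc) > gc_tol
        cov_off = cov > med_cov * cov_fold or cov < med_cov / cov_fold
        if gc_off or cov_off:
            flagged.append(name)
    return flagged


# Hypothetical bin: c3 has unusual GC content, c4 has unusual coverage.
toy_bin = {
    "c1": ("ATGCATGC", 20.0),
    "c2": ("ATGCATGG", 22.0),
    "c3": ("GGGGCCCC", 21.0),
    "c4": ("ATGCATGC", 90.0),
}
print(flag_outliers(toy_bin))
```

Flagged contigs are only candidates for removal; in a manual workflow a curator would inspect them before deciding.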
Affiliation(s)
- Till L V Bornemann
  - Environmental Metagenomics, Research Center One Health Ruhr of the University Alliance Ruhr, Faculty of Chemistry, University of Duisburg-Essen, Germany
- Sarah P Esser
  - Environmental Metagenomics, Research Center One Health Ruhr of the University Alliance Ruhr, Faculty of Chemistry, University of Duisburg-Essen, Germany
- Tom L Stach
  - Environmental Metagenomics, Research Center One Health Ruhr of the University Alliance Ruhr, Faculty of Chemistry, University of Duisburg-Essen, Germany
- Tim Burg
  - Independent Researcher, Im Acker 59, Koblenz, Germany
- Alexander J Probst
  - Environmental Metagenomics, Research Center One Health Ruhr of the University Alliance Ruhr, Faculty of Chemistry, University of Duisburg-Essen, Germany
  - Centre of Water and Environmental Research (ZWU), University of Duisburg-Essen, Germany
10
Jones A, Zhang D, Massey SE, Deigin Y, Nemzer LR, Quay SC. Discovery of a novel merbecovirus DNA clone contaminating agricultural rice sequencing datasets from Wuhan, China. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.02.12.528210. [PMID: 36865340 PMCID: PMC9979991 DOI: 10.1101/2023.02.12.528210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/22/2023]
Abstract
HKU4-related coronaviruses are a group of betacoronaviruses belonging to the same merbecovirus subgenus as Middle Eastern Respiratory Syndrome coronavirus (MERS-CoV), which causes severe respiratory illness in humans with a mortality rate of over 30%. The high genetic similarity between HKU4-related coronaviruses and MERS-CoV makes them an attractive subject of research for modeling potential zoonotic spillover scenarios. In this study, we identify a novel coronavirus contaminating agricultural rice RNA sequencing datasets from Wuhan, China. The datasets were generated by the Huazhong Agricultural University in early 2020. We were able to assemble the complete viral genome sequence, which revealed that it is a novel HKU4-related merbecovirus. The assembled genome is 98.38% identical to the closest known full genome sequence, Tylonycteris pachypus bat isolate BtTp-GX2012. Using in silico modeling, we identified that the novel HKU4-related coronavirus spike protein likely binds to human dipeptidyl peptidase 4 (DPP4), the receptor used by MERS-CoV. We further identified that the novel HKU4-related coronavirus genome has been inserted into a bacterial artificial chromosome in a format consistent with previously published coronavirus infectious clones. Additionally, we have found a near complete read coverage of the spike gene of the MERS-CoV reference strain HCoV-EMC/2012, and identify the likely presence of a HKU4-related-MERS chimera in the datasets. Our findings contribute to the knowledge of HKU4-related coronaviruses and document the use of a previously unpublished HKU4 reverse genetics system in apparent MERS-CoV related gain-of-function research. Our study also emphasizes the importance of improved biosafety protocols in sequencing centers and coronavirus research facilities.
11
Brewer MS, Cole TJ. Killer Knots: Molecular Evolution of Inhibitor Cystine Knot Toxins in Wandering Spiders (Araneae: Ctenidae). Toxins (Basel) 2023; 15:toxins15020112. [PMID: 36828426 PMCID: PMC9958548 DOI: 10.3390/toxins15020112] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/22/2022] [Revised: 10/26/2022] [Accepted: 11/05/2022] [Indexed: 01/31/2023] Open
Abstract
Venom expressed by the nearly 50,000 species of spiders on Earth largely remains an untapped reservoir of a diverse array of biomolecules with potential for pharmacological and agricultural applications. A large fraction of the noxious components of spider venoms are a functionally diverse family of structurally related polypeptides with an inhibitor cystine knot (ICK) motif. The cysteine-rich nature of these toxins makes structural elucidation difficult, and most studies have focused on venom components from the small handful of medically relevant spider species such as the highly aggressive Brazilian wandering spider Phoneutria nigriventer. To alleviate difficulties associated with the study of ICK toxins in spiders, we devised a comprehensive approach to explore the evolutionary patterns that have shaped ICK functional diversification using venom gland transcriptomes and proteomes from phylogenetically distinct lineages of wandering spiders and their close relatives. We identified 626 unique ICK toxins belonging to seven topological elaborations. Phylogenetic tests of episodic diversification revealed distinct regions between cysteine residues that demonstrated differential evidence of positive or negative selection, which may have structural implications towards the specificity and efficacy of these toxins. Increased taxon sampling and whole genome sequencing will provide invaluable insights to further understand the evolutionary processes that have given rise to this diverse class of toxins.
12
Kryukov K, Imanishi T, Nakagawa S. Nanopore Sequencing Data Analysis of 16S rRNA Genes Using the GenomeSync-GSTK System. Methods Mol Biol 2023; 2632:215-226. [PMID: 36781731 DOI: 10.1007/978-1-0716-2996-3_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/27/2023]
Abstract
With the development of nanopore sequencing technology, long reads of DNA sequences can now be determined rapidly from various samples. This protocol introduces the GenomeSync-GSTK system for bacterial species identification in a given sample using nanopore sequencing data of 16S rRNA genes as an example. GenomeSync is a collection of genome sequences designed to provide easy access to genomic data of the species as demanded. GSTK (genome search toolkit) is a set of scripts for managing local homology searches using genomes obtained from the GenomeSync database. Based on this protocol, nanopore sequencing data analyses of metagenomes and amplicons could be efficiently performed. We also noted reanalysis in conjunction with future developments in nanopore sequencing technology and the accumulation of genome sequencing data.
Collapse
Affiliation(s)
- Kirill Kryukov
- Department of Informatics, National Institute of Genetics, Shizuoka, Japan
| | - Tadashi Imanishi
- Department of Molecular Life Science, Tokai University School of Medicine, Kanagawa, Japan
| | - So Nakagawa
- Department of Molecular Life Science, Tokai University School of Medicine, Kanagawa, Japan.
| |
|
13
|
Xia L, Cai F, Chen S, Cai Y, Zhou K, Yan J, Li P. Phylogenetic Analysis and Genetic Structure of Schlegel's Japanese Gecko (Gekko japonicus) from China Based on Mitochondrial DNA Sequences. Genes (Basel) 2022; 14:18. [PMID: 36672759 PMCID: PMC9858143 DOI: 10.3390/genes14010018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2022] [Revised: 12/07/2022] [Accepted: 12/17/2022] [Indexed: 12/24/2022]
Abstract
Gekko japonicus, i.e., Schlegel's Japanese Gecko, is an important species which is widely distributed in East Asia. However, the population genetics of this species in China remain poorly understood. To address this issue, we used sequences from a fragment of the mitochondrial protein-coding gene cytochrome c oxidase I to estimate genetic diversity, genetic structure, and historical demography of G. japonicus populations from China. Phylogenetic analysis indicated that G. japonicus had a close relationship with Gekko wenxianensis. A total of 14 haplotypes were obtained, of which haplotype 1 was the most common and widely distributed. The genetic diversity of G. japonicus was comparatively low across different geographic populations. The populations of G. japonicus were divided into four groups which exhibited low levels of genetic differentiation and showed no clear pattern of population structuring. In addition, G. japonicus appears to have undergone population expansion. Overall, these results demonstrate that G. japonicus populations in China have low genetic diversity, which is attributed to founder and bottleneck events among populations. Our results provide meaningful information on the population genetics of G. japonicus and some insight into the origin of its populations.
Affiliation(s)
- Longjie Xia
- Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing 210023, China
| | - Fengna Cai
- Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing 210023, China
| | - Shasha Chen
- Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing 210023, China
| | - Yao Cai
- School of Food Science, Nanjing Xiaozhuang University, Nanjing 211171, China
| | - Kaiya Zhou
- Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing 210023, China
| | - Jie Yan
- Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing 210023, China
| | - Peng Li
- Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing 210023, China
| |
|
14
|
Badry A, Rüdel H, Göckener B, Nika MC, Alygizakis N, Gkotsis G, Thomaidis NS, Treu G, Dekker RWRJ, Movalli P, Walker LA, Potter ED, Cincinelli A, Martellini T, Duke G, Slobodnik J, Koschorreck J. Making use of apex predator sample collections: an integrated workflow for quality assured sample processing, analysis and digital sample freezing of archived samples. CHEMOSPHERE 2022; 309:136603. [PMID: 36174727 DOI: 10.1016/j.chemosphere.2022.136603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2022] [Revised: 09/21/2022] [Accepted: 09/22/2022] [Indexed: 06/16/2023]
Abstract
Using monitoring data from apex predators for chemicals risk assessment can provide important information on bioaccumulating as well as biomagnifying chemicals in food webs. A survey of European institutions involved in chemical risk assessment, asking about their experiences with apex predator data, revealed great interest in using such data. However, the respondents indicated that constraints were related to expected high costs, lack of standardisation and harmonised quality criteria for exposure assessment, data access, and regulatory acceptance/application. During the Life APEX project, we demonstrated that European sample collections (i.e. environmental specimen banks (ESBs), research collections (RCs), natural history museums (NHMs)) archive a large variety of biological samples that can be readily used for chemical analysis once appropriate quality assurance/control (QA/QC) measures have been developed and implemented. We therefore issued a second survey on sampling, processing and archiving procedures in European sample collections to derive key QA/QC criteria for chemical analysis. The survey revealed great differences in QA/QC measures between ESBs, NHMs and RCs. Whereas basic information such as sampling location, date and biometric data were mostly available across institutions, protocols to accompany the sampling strategy with respect to chemical analysis were only available for ESBs. For RCs, the applied QA/QC measures vary with the respective research question, whereas NHMs are generally less aware of e.g. chemical cross-contamination issues. Based on the survey we derived key indicators for assessing the quality of biota samples that can be easily implemented in online databases. Furthermore, we provide a QA/QC workflow not only for sampling and processing but also for the chemical analysis of biota samples. We focussed on comprehensive analytical techniques such as non-target screening and provided insights into subsequent storage of high-resolution chromatograms in online databases (i.e. digital sample freezing platform) to ultimately support chemicals risk assessment.
Affiliation(s)
- Alexander Badry
- German Environment Agency (Umweltbundesamt), 06813, Dessau-Roßlau, Germany.
| | - Heinz Rüdel
- Fraunhofer Institute for Molecular Biology and Applied Ecology (Fraunhofer IME), 57392, Schmallenberg, Germany
| | - Bernd Göckener
- Fraunhofer Institute for Molecular Biology and Applied Ecology (Fraunhofer IME), 57392, Schmallenberg, Germany
| | - Maria-Christina Nika
- Laboratory of Analytical Chemistry, Department of Chemistry, National and Kapodistrian University of Athens, Panepistimiopolis Zografou, 15771, Athens, Greece
| | - Nikiforos Alygizakis
- Laboratory of Analytical Chemistry, Department of Chemistry, National and Kapodistrian University of Athens, Panepistimiopolis Zografou, 15771, Athens, Greece; Environmental Institute, Okružná 784/42, 97241, Koš, Slovak Republic
| | - Georgios Gkotsis
- Laboratory of Analytical Chemistry, Department of Chemistry, National and Kapodistrian University of Athens, Panepistimiopolis Zografou, 15771, Athens, Greece
| | - Nikolaos S Thomaidis
- Laboratory of Analytical Chemistry, Department of Chemistry, National and Kapodistrian University of Athens, Panepistimiopolis Zografou, 15771, Athens, Greece
| | - Gabriele Treu
- German Environment Agency (Umweltbundesamt), 06813, Dessau-Roßlau, Germany
| | - Rene W R J Dekker
- Naturalis Biodiversity Center, Darwinweg 2, 2333, CR, Leiden, the Netherlands
| | - Paola Movalli
- Naturalis Biodiversity Center, Darwinweg 2, 2333, CR, Leiden, the Netherlands
| | - Lee A Walker
- UK Centre for Ecology & Hydrology, Lancaster Environment Centre, Lancaster, LA1 4PQ, United Kingdom
| | - Elaine D Potter
- UK Centre for Ecology & Hydrology, Lancaster Environment Centre, Lancaster, LA1 4PQ, United Kingdom
| | - Alessandra Cincinelli
- Department of Chemistry "Ugo Schiff", University of Florence, 50019, Sesto Fiorentino, Italy
| | - Tania Martellini
- Department of Chemistry "Ugo Schiff", University of Florence, 50019, Sesto Fiorentino, Italy
| | - Guy Duke
- UK Centre for Ecology & Hydrology, MacLean Bldg, Benson Ln, Crowmarsh Gifford, Wallingford, OX10 8BB, United Kingdom
| | | | - Jan Koschorreck
- German Environment Agency (Umweltbundesamt), 06813, Dessau-Roßlau, Germany
| |
|
15
|
Forensic Analysis of Novel SARS2r-CoV Identified in Game Animal Datasets in China Shows Evolutionary Relationship to Pangolin GX CoV Clade and Apparent Genetic Experimentation. Appl Microbiol 2022. [DOI: 10.3390/applmicrobiol2040068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Abstract
Pangolins are the only animals other than bats proposed to have been infected with SARS-CoV-2 related coronaviruses (SARS2r-CoVs) prior to the COVID-19 pandemic. Here, we examine the novel SARS2r-CoV we previously identified in game animal metatranscriptomic datasets sequenced by the Nanjing Agricultural University in 2022, and find that sections of the partial genome phylogenetically group with Guangxi pangolin CoVs (GX PCoVs), while the full RdRp sequence groups with bat-SL-CoVZC45. While the novel SARS2r-CoV is found in 6 pangolin datasets, it is also found in 10 additional NGS datasets from 5 separate mammalian species and is likely related to contamination by a laboratory researched virus. Absence of bat mitochondrial sequences from the datasets, the fragmentary nature of the virus sequence and the presence of a partial sequence of a cloning vector attached to a SARS2r-CoV read suggests that it has been cloned. We find that NGS datasets containing the novel SARS2r-CoV are contaminated with significant Homo sapiens genetic material, and numerous viruses not associated with the host animals sampled. We further identify the dominant human haplogroup of the contaminating H. sapiens genetic material to be F1c1a1, which is of East Asian provenance. The association of this novel SARS2r-CoV with both bat CoV and the GX PCoV clades is an important step towards identifying the origin of the GX PCoVs.
|
16
|
Van Camp PJ, Porollo A. SEQ2MGS: an effective tool for generating realistic artificial metagenomes from the existing sequencing data. NAR Genom Bioinform 2022; 4:lqac050. [PMID: 35899079 PMCID: PMC9310082 DOI: 10.1093/nargab/lqac050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2021] [Revised: 05/02/2022] [Accepted: 06/23/2022] [Indexed: 11/18/2022]
Abstract
Assessment of bioinformatics tools for metagenomics analysis of whole genome sequencing data requires realistic benchmark sets. We developed an effective and simple generator of artificial metagenomes from real sequencing experiments. The tool (SEQ2MGS) analyzes the input FASTQ files, precomputes genomic content, and blends shotgun reads from different sequenced isolates, or spikes isolate(s) into a real metagenome, in desired proportions. SEQ2MGS eliminates the need to simulate sequencing platform variations, read distributions, presence of plasmids, viruses, and contamination. The tool is especially useful for quick generation of multiple complex samples that include new or understudied organisms, even without assembled genomes. For illustration, we first demonstrated the ease of SEQ2MGS use for the simulation of altered Schaedler flora (ASF) in comparison with the de novo metagenomics generators Grinder and CAMISIM. Next, we emulated the emergence of a pathogen in the human gut microbiome and observed that Kraken, Centrifuge, and MetaPhlAn, while correctly identifying Klebsiella pneumoniae, produced inconsistent results for the rest of the real metagenome. Finally, using the MG-RAST platform, we affirmed that SEQ2MGS properly transfers genomic information from an isolate into the simulated metagenome: the antimicrobial resistance genes anticipated to appear, relative to the original metagenome, were correctly identified.
Affiliation(s)
- Pieter-Jan Van Camp
- Department of Biomedical Informatics, University of Cincinnati, Cincinnati, OH, USA
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
| | - Aleksey Porollo
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Department of Pediatrics, University of Cincinnati, Cincinnati, OH, USA
| |
|
17
|
Penaud B, Laurent B, Milhes M, Noüs C, Ehrenmann F, Dutech C. SNP4OrphanSpecies: A bioinformatics pipeline to isolate molecular markers for studying genetic diversity of orphan species. Biodivers Data J 2022; 10:e85587. [PMID: 36761595 PMCID: PMC9848450 DOI: 10.3897/bdj.10.e85587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2022] [Accepted: 06/23/2022] [Indexed: 11/12/2022]
Abstract
Background: For several decades, an increase in disease or pest emergences due to anthropogenic introduction or environmental changes has been recorded. This increase leads to serious threats to the genetic and species diversity of numerous ecosystems. Many of these events involve species with poor or no genomic resources (called here "orphan species"). This lack of resources seriously limits our understanding of the origin of emergent populations and their ability to adapt to new environments, as well as our ability to predict future consequences for biodiversity. Analyses of genetic diversity are an efficient method to obtain this information rapidly, but require available polymorphic genetic markers. New information: We developed a generic bioinformatics pipeline to rapidly isolate such markers, with the goal that the pipeline be applied in studies of invasive taxa from different taxonomic groups, with a special focus on forest fungal pathogens and insect pests. This pipeline is based on: 1) an automated de novo genome assembly obtained from shotgun whole genome sequencing using paired-end Illumina technology; 2) the isolation of single-copy genes conserved in species related to the studied emergent organisms; 3) primer development for multiplexed short sequences obtained from these conserved genes. Previous studies have shown that intronic regions of these conserved genes generally contain several single nucleotide polymorphisms within species. The pipeline's functionality was evaluated with sequenced genomes of five invasive or expanding pathogen and pest species in Europe (Armillaria ostoyae (Romagn.) Herink 1973, Bursaphelenchus xylophilus Steiner & Buhrer 1934, Sphaeropsis sapinea (Fr.) Dicko & B. Sutton 1980, Erysiphe alphitoides (Griffon & Maubl.) U. Braun & S. Takam. 2000, Thaumetopoea pityocampa Denis & Schiffermüller, 1775). We successfully isolated several pools of one hundred short gene regions for each assembled genome, which can be amplified in multiplex. The bioinformatics pipeline is user-friendly and requires few computational resources. This method for genetic marker identification is easy to set up and run, and will be useful for the many laboratories that study biological invasions but have limited resources and expertise in bioinformatics.
Affiliation(s)
- Benjamin Penaud
- BIOGECO, INRAE, Univ. Bordeaux, 33610 Cestas, France
| | - Benoit Laurent
- BIOGECO, INRAE, Univ. Bordeaux, 33610 Cestas, France
| | - Marine Milhes
- INRAE, US 1426, GeT-PlaGe, Genotoul, Castanet-Tolosan, France
| | - Camille Noüs
- Laboratoire Cogitamus, Bordeaux, France
| | - François Ehrenmann
- BIOGECO, INRAE, Univ. Bordeaux, 33610 Cestas, France
| | - Cyril Dutech
- BIOGECO, INRAE, Univ. Bordeaux, 33610 Cestas, France
| |
|
18
|
Owen CL, Marshall DC, Wade EJ, Meister R, Goemans G, Kunte K, Moulds M, Hill K, Villet M, Pham TH, Kortyna M, Lemmon EM, Lemmon AR, Simon C. Detecting and removing sample contamination in phylogenomic data: an example and its implications for Cicadidae phylogeny (Insecta: Hemiptera). Syst Biol 2022; 71:1504-1523. [PMID: 35708660 DOI: 10.1093/sysbio/syac043] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/08/2021] [Revised: 05/23/2022] [Accepted: 06/07/2022] [Indexed: 11/13/2022]
Abstract
Contamination of a genetic sample with DNA from one or more non-target species is a continuing concern of molecular phylogenetic studies, both Sanger sequencing studies and Next-Generation Sequencing (NGS) studies. We developed an automated pipeline for identifying and excluding likely cross-contaminated loci based on detection of bimodal distributions of patristic distances across gene trees. When the contamination occurs between samples within a dataset, comparisons between a contaminated sample and its contaminant taxon will yield bimodal distributions with one peak close to zero patristic distance. This new method does not rely on a priori knowledge of taxon relatedness nor does it determine the cause(s) of the contamination. Exclusion of putatively contaminated loci from a dataset generated for the insect family Cicadidae showed that these sequences were affecting some topological patterns and branch supports, although the effects were sometimes subtle, with some contamination-influenced relationships exhibiting strong bootstrap support. Long tip branches and outlier values for one anchored phylogenomic pipeline statistic (AvgNHomologs) were correlated with the presence of contamination. While the AHE markers used here, which target hemipteroid taxa, proved effective in resolving deep and shallow level Cicadidae relationships in aggregate, individual markers contained inadequate phylogenetic signal, in part probably due to short length. The cleaned dataset, consisting of 429 loci, from 90 genera representing 44 of 56 current Cicadidae tribes, supported three of the four sampled Cicadidae subfamilies in concatenated-matrix maximum likelihood (ML) and multispecies coalescent-based species tree analyses, with the fourth subfamily weakly supported in the ML trees. No well-supported patterns from previous family-level Sanger sequencing studies of Cicadidae phylogeny were contradicted. One taxon (Aragualna plenalinea) did not fall with its current subfamily in the genetic tree, and this genus and its tribe Aragualnini are reclassified to Tibicininae following morphological re-examination. Only subtle differences were observed in trees after removal of loci for which divergent base frequencies were detected. Greater success may be achieved by increased taxon sampling and developing a probe set targeting a more recent common ancestor and longer loci. Searches for contamination are an essential step in phylogenomic analyses of all kinds and our pipeline is an effective solution.
Affiliation(s)
- Christopher L Owen
- Systematic Entomology Laboratory, USDA-ARS, c/o National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
| | - David C Marshall
- Dept. of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA
| | - Elizabeth J Wade
- Dept. of Natural Science and Mathematics, Curry College, Milton, MA 02186, USA
| | - Russ Meister
- Dept. of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA
| | - Geert Goemans
- Dept. of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA
| | - Krushnamegh Kunte
- National Centre for Biological Sciences, Tata Institute of Fundamental Research, GKVK Campus, Bellary Road, Bangalore 560 065, India
| | - Max Moulds
- Australian Museum Research Institute, 1 William Street, Sydney N.S.W, Australia. 2010
| | - Kathy Hill
- Dept. of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA
| | - M Villet
- Dept. of Biology, Rhodes University, Grahamstown 6140, South Africa
| | - Thai-Hong Pham
- Mientrung Institute for Scientific Research, Vietnam Academy of Science and Technology, Hue, Vietnam; Vietnam National Museum of Nature and Graduate School of Science and Technology, Vietnam Academy of Science and Technology, Hanoi, Vietnam
| | - Michelle Kortyna
- Department of Biological Science, Florida State University, 319 Stadium Drive, Tallahassee, USA
| | - Emily Moriarty Lemmon
- Department of Biological Science, Florida State University, 319 Stadium Drive, Tallahassee, FL 32306, USA
| | - Alan R Lemmon
- Department of Scientific Computing, Florida State University 400 Dirac Science Library, Tallahassee, FL 32306, USA
| | - Chris Simon
- Dept. of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA
| |
|
19
|
Adeleke IA, Kavalappara SR, McGregor C, Srinivasan R, Bag S. Persistent, and Asymptomatic Viral Infections and Whitefly-Transmitted Viruses Impacting Cantaloupe and Watermelon in Georgia, USA. Viruses 2022; 14:1310. [PMID: 35746780 PMCID: PMC9227350 DOI: 10.3390/v14061310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2022] [Revised: 06/13/2022] [Accepted: 06/13/2022] [Indexed: 11/17/2022]
Abstract
Cucurbits in Southeastern USA have experienced a drastic decline in production over the years due to the effect of economically important viruses, mainly those transmitted by the sweet potato whitefly (Bemisia tabaci Gennadius). In cucurbits, these viruses can be found as a single or mixed infection, thereby causing significant yield loss. During the spring of 2021, surveys were conducted to evaluate the incidence and distribution of viruses infecting cantaloupe (n = 80) and watermelon (n = 245) in Georgia. Symptomatic foliar tissues were collected from six counties and sRNA libraries were constructed from seven symptomatic samples. High throughput sequencing (HTS) analysis revealed the presence of three different new RNA viruses in Georgia: cucumis melo endornavirus (CmEV), cucumis melo amalgavirus (CmAV1), and cucumis melo cryptic virus (CmCV). Reverse transcription-polymerase chain reaction (RT-PCR) analysis revealed the presence of CmEV and CmAV1 in 25% and 43% of the total samples tested, respectively. CmCV was not detected using RT-PCR. Watermelon crinkle leaf-associated virus 1 (WCLaV-1), recently reported in GA, was detected in 28% of the samples tested. Furthermore, RT-PCR and PCR analysis of 43 symptomatic leaf tissues collected from the fall-grown watermelon in 2019 revealed the presence of cucurbit chlorotic yellows virus (CCYV), cucurbit yellow stunting disorder virus (CYSDV), and cucurbit leaf crumple virus (CuLCrV) at 73%, 2%, and 81%, respectively. This finding broadens our knowledge of the prevalence of viruses in melons in the fall and spring, as well as the geographical expansion of the WCLaV-1 in GA, USA.
Affiliation(s)
| | | | - Cecilia McGregor
- Department of Horticulture, University of Georgia, Athens, GA 30602, USA
| | | | - Sudeep Bag
- Department of Plant Pathology, University of Georgia, Tifton, GA 31793, USA
| |
|
20
|
Has taxonomic vandalism gone too far? A case study, the rise of the pay-to-publish model and the pitfalls of Morchella systematics. Mycol Prog 2022. [DOI: 10.1007/s11557-021-01755-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 01/22/2023]
|
21
|
Sze C, Pressler M, Lee JR, Chughtai B. The gut, vaginal, and urine microbiome in overactive bladder: a systematic review. Int Urogynecol J 2022; 33:1157-1164. [PMID: 35237854 DOI: 10.1007/s00192-022-05127-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 10/19/2021] [Accepted: 02/06/2022] [Indexed: 10/19/2022]
Abstract
INTRODUCTION AND HYPOTHESIS The objective was to systematically review the current literature on the association of gut, vaginal, and urinary dysbiosis in female patients with overactive bladder (OAB). METHODS We performed a comprehensive literature search following the Preferred Reporting Items for Systematic Review and Meta-analysis (PRISMA) protocols for systematic reviews. In the EMBASE, CINAHL, and Medline databases, a search was conducted using key words such as "microbiome," "microbiota," "microflora," "overactive bladder," "urge," "gut," "vaginal." Articles were screened using the online tool www.covidence.org. Two independent reviewers screened studies at each stage and resolved conflicts together. We excluded papers that discussed pediatric patients and animal studies. In total, 13 articles met these criteria, which included 6 abstracts. RESULTS After identifying 817 unique references, 13 articles met the criteria for data extraction. Articles were published from 2017 to 2021. No study reported the same microbiota abundance, even in healthy individuals. Overall, there was a loss of bacterial diversity in OAB patients compared with controls. Additionally, the bacterial composition of the controls and OAB patients was not significantly different, especially if the urine was collected midstream. Overall, the composition of the microbiome is dependent on the specimen collection methodology and the metagenomic sequencing technique utilized. The OAB urine microbiome is more predisposed to alteration from gut or vaginal influences than that of controls. CONCLUSIONS Current evidence suggests a potential relationship among the gut, vaginal, and urinary microbiomes in OAB patients, but studies are very limited.
Affiliation(s)
- Christina Sze
- Department of Urology, New York Presbyterian Hospital, New York, NY, USA
| | | | - John Richard Lee
- Division of Nephrology and Hypertension, Department of Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Bilal Chughtai
- Department of Urology, Weill Cornell Medicine/New York Presbyterian Hospital, 525 E. 68th Street, New York, NY, 10021, USA.
| |
|
22
|
Cornet L, Baurain D. Contamination detection in genomic data: more is not enough. Genome Biol 2022; 23:60. [PMID: 35189924 PMCID: PMC8862208 DOI: 10.1186/s13059-022-02619-9] [Citation(s) in RCA: 31] [Impact Index Per Article: 15.5] [Received: 10/14/2021] [Accepted: 01/18/2022] [Indexed: 12/20/2022]
Abstract
The decreasing cost of sequencing and concomitant augmentation of publicly available genomes have created an acute need for automated software to assess genomic contamination. During the last 6 years, 18 programs have been published, each with its own strengths and weaknesses. Deciding which tools to use becomes more and more difficult without an understanding of the underlying algorithms. We review these programs, benchmarking six of them, and present their main operating principles. This article is intended to guide researchers in the selection of appropriate tools for specific applications. Finally, we present future challenges in the developing field of contamination detection.
Affiliation(s)
- Luc Cornet
- BCCM/IHEM, Mycology and Aerobiology, Sciensano, Bruxelles, Belgium
| | - Denis Baurain
- InBioS-PhytoSYSTEMS, Eukaryotic Phylogenomics, University of Liège, Liège, Belgium.
| |
|
23
|
Jung E, Romero R, Yoon BH, Theis KR, Gudicha DW, Tarca AL, Diaz-Primera R, Winters AD, Gomez-Lopez N, Yeo L, Hsu CD. Bacteria in the amniotic fluid without inflammation: early colonization vs. contamination. J Perinat Med 2021; 49:1103-1121. [PMID: 34229367 PMCID: PMC8570988 DOI: 10.1515/jpm-2021-0191] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/19/2021] [Accepted: 05/19/2021] [Indexed: 02/07/2023]
Abstract
OBJECTIVES Intra-amniotic infection, defined by the presence of microorganisms in the amniotic cavity, is often accompanied by intra-amniotic inflammation. Occasionally, laboratories report the growth of bacteria or the presence of microbial nucleic acids in amniotic fluid in the absence of intra-amniotic inflammation. This study was conducted to determine the clinical significance of the presence of bacteria in amniotic fluid samples in the absence of intra-amniotic inflammation. METHODS A retrospective cross-sectional study included 360 patients with preterm labor and intact membranes who underwent transabdominal amniocentesis for evaluation of the microbial state of the amniotic cavity as well as intra-amniotic inflammation. Cultivation techniques were used to isolate microorganisms, and broad-range polymerase chain reaction coupled with electrospray ionization mass spectrometry (PCR/ESI-MS) was utilized to detect the nucleic acids of bacteria, viruses, and fungi. RESULTS Patients whose amniotic fluid samples evinced microorganisms but did not indicate inflammation had a similar perinatal outcome to those without microorganisms or inflammation [amniocentesis-to-delivery interval (p=0.31), spontaneous preterm birth before 34 weeks (p=0.83), acute placental inflammatory lesions (p=1), and composite neonatal morbidity (p=0.8)]. CONCLUSIONS The isolation of microorganisms from a sample of amniotic fluid in the absence of intra-amniotic inflammation is indicative of a benign condition, which most likely represents contamination of the specimen during the collection procedure or laboratory processing rather than early colonization or infection.
Affiliation(s)
- Eunjung Jung
- Perinatology Research Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, U.S. Department of Health and Human Services, Bethesda, Maryland, and Detroit, Michigan, USA; Department of Obstetrics and Gynecology, Wayne State University School of Medicine, Detroit, Michigan, USA
| | - Roberto Romero
- Perinatology Research Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, U.S. Department of Health and Human Services, Bethesda, Maryland, and Detroit, Michigan, USA; Department of Obstetrics and Gynecology, University of Michigan Health System, Ann Arbor, Michigan, USA; Department of Epidemiology and Biostatistics, College of Human Medicine, Michigan State University, East Lansing, Michigan, USA; Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, USA; Detroit Medical Center, Detroit, Michigan, USA; Department of Obstetrics and Gynecology, Florida International University, Miami, Florida, USA
| | - Bo Hyun Yoon
- BioMedical Research Institute, Seoul National University Hospital, Seoul, Republic of Korea
| | - Kevin R. Theis
- Perinatology Research Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, U.S. Department of Health and Human Services, Bethesda, Maryland, and Detroit, Michigan, USA; Department of Biochemistry, Microbiology and Immunology, Wayne State University School of Medicine, Detroit, Michigan, USA
| | - Dereje W. Gudicha
- Perinatology Research Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, U.S. Department of Health and Human Services, Bethesda, Maryland, and Detroit, Michigan, USA; Department of Obstetrics and Gynecology, Wayne State University School of Medicine, Detroit, Michigan, USA
| | - Adi L. Tarca
- Perinatology Research Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, U.S. Department of Health and Human Services, Bethesda, Maryland, and Detroit, Michigan, USA; Department of Obstetrics and Gynecology, Wayne State University School of Medicine, Detroit, Michigan, USA; Department of Computer Science, College of Engineering, Wayne State University, Detroit, Michigan, USA
| | - Ramiro Diaz-Primera
- Perinatology Research Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, U.S. Department of Health and Human Services, Bethesda, Maryland, and Detroit, Michigan, USA; Department of Obstetrics and Gynecology, Wayne State University School of Medicine, Detroit, Michigan, USA
| | - Andrew D. Winters
- Perinatology Research Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, U.S. Department of Health and Human Services, Bethesda, Maryland, and Detroit, Michigan, USA; Department of Biochemistry, Microbiology and Immunology, Wayne State University School of Medicine, Detroit, Michigan, USA
| | - Nardhy Gomez-Lopez
- Perinatology Research Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, U.S. Department of Health and Human Services, Bethesda, Maryland, and Detroit, Michigan, USA; Department of Obstetrics and Gynecology, Wayne State University School of Medicine, Detroit, Michigan, USA; Department of Biochemistry, Microbiology and Immunology, Wayne State University School of Medicine, Detroit, Michigan, USA
| | - Lami Yeo
- Perinatology Research Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, U.S. Department of Health and Human Services, Bethesda, Maryland, and Detroit, Michigan, USA; Department of Obstetrics and Gynecology, Wayne State University School of Medicine, Detroit, Michigan, USA
| | - Chaur-Dong Hsu
- Perinatology Research Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, U.S. Department of Health and Human Services, Bethesda, Maryland, and Detroit, Michigan, USA; Department of Obstetrics and Gynecology, Wayne State University School of Medicine, Detroit, Michigan, USA; Department of Physiology, Wayne State University School of Medicine, Detroit, Michigan, USA
|
24
|
Lucena-Perez M, Kleinman-Ruiz D, Marmesat E, Saveljev AP, Schmidt K, Godoy JA. Bottleneck-associated changes in the genomic landscape of genetic diversity in wild lynx populations. Evol Appl 2021; 14:2664-2679. [PMID: 34815746 PMCID: PMC8591332 DOI: 10.1111/eva.13302] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/14/2021] [Revised: 08/17/2021] [Accepted: 09/08/2021] [Indexed: 01/06/2023] Open
Abstract
Demographic bottlenecks generally reduce genetic diversity through more intense genetic drift, but their net effect may vary along the genome due to the random nature of genetic drift and to local effects of recombination, mutation, and selection. Here, we analyzed the changes in genetic diversity following a bottleneck by comparing whole-genome diversity patterns in populations with and without severe recent documented declines of Iberian (Lynx pardinus, n = 31) and Eurasian lynx (Lynx lynx, n = 29). As expected, overall genomic diversity correlated negatively with bottleneck intensity and/or duration. Correlations of genetic diversity with divergence, chromosome size, gene or functional site content, GC content, or recombination were observed in nonbottlenecked populations, but were weaker in bottlenecked populations. Also, functional features under intense purifying selection and the X chromosome showed an increase in the observed density of variants, even resulting in higher θW diversity than in nonbottlenecked populations. Increased diversity seems to be related to both a higher mutational input in those regions creating a large collection of low-frequency variants, a few of which increase in frequency during the bottleneck to the point they become detectable with our limited sample, and the reduced efficacy of purifying selection, which affects not only protein structure and function but also the regulation of gene expression. The results of this study alert to the possible reduction of fitness and adaptive potential associated with the genomic erosion in regulatory elements. Further, the detection of a gain of diversity in ultra-conserved elements can be used as a sensitive and easy-to-apply signature of genetic erosion in wild populations.
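The diversity measure named in this abstract, Watterson's θ, depends only on the number of segregating sites and the sample size, which is why a larger count of detectable variants raises it regardless of their frequencies. The following is a minimal sketch of the textbook estimator, not the authors' pipeline; the function name and the optional per-site normalisation are illustrative assumptions.

```python
def watterson_theta(num_segregating_sites, num_chromosomes, num_sites=None):
    """Watterson's estimator: theta_W = S / a_n, with a_n = sum_{i=1}^{n-1} 1/i.

    num_chromosomes is the number of sampled sequences (2 x individuals for
    diploids). If num_sites is given, the estimate is returned per site.
    """
    if num_chromosomes < 2:
        raise ValueError("need at least two sampled chromosomes")
    a_n = sum(1.0 / i for i in range(1, num_chromosomes))
    theta = num_segregating_sites / a_n
    return theta / num_sites if num_sites else theta


# Hypothetical toy numbers: 11 segregating sites among 4 sampled chromosomes.
theta = watterson_theta(11, 4)
print(round(theta, 3))
```

Because the estimator ignores allele frequencies, it can be compared with frequency-weighted measures such as nucleotide diversity to see whether a gain in diversity comes mostly from rare variants.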
Affiliation(s)
- Maria Lucena-Perez
- Departamento de Ecología Integrativa Estación Biológica de Doñana (CSIC) Sevilla Spain
| | - Daniel Kleinman-Ruiz
- Departamento de Ecología Integrativa Estación Biológica de Doñana (CSIC) Sevilla Spain
- Departamento de Genética Facultad de Biología Universidad Complutense Madrid Spain
| | - Elena Marmesat
- Departamento de Ecología Integrativa Estación Biológica de Doñana (CSIC) Sevilla Spain
| | - Alexander P Saveljev
- Department of Animal Ecology Russian Research Institute of Game Management and Fur Farming Kirov Russia
| | - Krzysztof Schmidt
- Mammal Research Institute Polish Academy of Sciences Białowieża Poland
| | - José A Godoy
- Departamento de Ecología Integrativa Estación Biológica de Doñana (CSIC) Sevilla Spain
|
25
|
Wang Y, Yuan H, Huang J, Li C. Inline index helped in cleaning up data contamination generated during library preparation and the subsequent steps. Mol Biol Rep 2021; 49:385-392. [PMID: 34716505 DOI: 10.1007/s11033-021-06884-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/23/2021] [Accepted: 09/23/2021] [Indexed: 11/24/2022]
Abstract
BACKGROUND: High-throughput sequencing involves library preparation and amplification steps, which may induce contamination across samples or between samples and the environment.
METHODS: We tested the effect of applying an inline-index strategy, in which DNA indices of 6 bp were added to both ends of the inserts at the ligation step of library prep, for resolving the data contamination problem.
RESULTS: Our results showed that the contamination ranged from 0.29 to 1.25% in one experiment and from 0.83 to 27.01% in the other. We also found that contamination could be environmental or from reagents besides cross-contamination between samples.
CONCLUSIONS: The inline-index method is a useful experimental design to clean up the data and address the contamination problem which has been plaguing high-throughput sequencing data in many applications.
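The inline-index idea described above can be illustrated with a small sketch: if every insert carries a 6 bp index at each end, reads whose index pair differs from the pair assigned to the library can be counted as contamination. This is a simplified illustration under stated assumptions, not the authors' code: the function names and sequences are hypothetical, indices are compared by exact match, and the 3' index is read in the same orientation as the 5' index.

```python
from collections import Counter


def assign_read(read, index_len=6):
    """Return the (5' index, 3' index) pair read off both ends of an insert."""
    return read[:index_len], read[-index_len:]


def contamination_rate(reads, expected_pair, index_len=6):
    """Fraction of reads whose inline index pair differs from the expected pair."""
    if not reads:
        raise ValueError("no reads supplied")
    counts = Counter(assign_read(r, index_len) for r in reads)
    mismatched = sum(n for pair, n in counts.items() if pair != expected_pair)
    return mismatched / len(reads)


# Hypothetical library expected to carry the index pair (ACGTAC, GATTCA).
reads = [
    "ACGTAC" + "TTTTGGGGCCCC" + "GATTCA",
    "ACGTAC" + "AAAACCCCGGGG" + "GATTCA",
    "ACGTAC" + "TTGGCCAATTGG" + "GATTCA",
    "TTAGGC" + "CCCCAAAATTTT" + "CGATGT",  # carries another sample's indices
]
rate = contamination_rate(reads, ("ACGTAC", "GATTCA"))
print(f"{rate:.2%}")
```

Reads with an unexpected pair can then be discarded before downstream analysis, which is the clean-up step the abstract refers to.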
Affiliation(s)
- Ying Wang
- Shanghai Universities Key Laboratory of Marine Animal Taxonomy and Evolution, Shanghai Ocean University, Shanghai, 201306, China; Shanghai Collaborative Innovation for Aquatic Animal Genetics and Breeding, Shanghai Ocean University, Shanghai, 201306, China
| | - Hao Yuan
- Shanghai Universities Key Laboratory of Marine Animal Taxonomy and Evolution, Shanghai Ocean University, Shanghai, 201306, China; Shanghai Collaborative Innovation for Aquatic Animal Genetics and Breeding, Shanghai Ocean University, Shanghai, 201306, China
| | - Junman Huang
- Shanghai Universities Key Laboratory of Marine Animal Taxonomy and Evolution, Shanghai Ocean University, Shanghai, 201306, China; Shanghai Collaborative Innovation for Aquatic Animal Genetics and Breeding, Shanghai Ocean University, Shanghai, 201306, China
| | - Chenhong Li
- Shanghai Universities Key Laboratory of Marine Animal Taxonomy and Evolution, Shanghai Ocean University, Shanghai, 201306, China; Shanghai Collaborative Innovation for Aquatic Animal Genetics and Breeding, Shanghai Ocean University, Shanghai, 201306, China
|
26
|
Simion P, Narayan J, Houtain A, Derzelle A, Baudry L, Nicolas E, Arora R, Cariou M, Cruaud C, Gaudray FR, Gilbert C, Guiglielmoni N, Hespeels B, Kozlowski DKL, Labadie K, Limasset A, Llirós M, Marbouty M, Terwagne M, Virgo J, Cordaux R, Danchin EGJ, Hallet B, Koszul R, Lenormand T, Flot JF, Van Doninck K. Chromosome-level genome assembly reveals homologous chromosomes and recombination in asexual rotifer Adineta vaga. SCIENCE ADVANCES 2021; 7:eabg4216. [PMID: 34613768 PMCID: PMC8494291 DOI: 10.1126/sciadv.abg4216] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Indexed: 05/08/2023]
Abstract
Bdelloid rotifers are notorious as a speciose ancient clade comprising only asexual lineages. Thanks to their ability to repair highly fragmented DNA, most bdelloid species also withstand complete desiccation and ionizing radiation. Producing a well-assembled reference genome is a critical step to developing an understanding of the effects of long-term asexuality and DNA breakage on genome evolution. To this end, we present the first high-quality chromosome-level genome assemblies for the bdelloid Adineta vaga, composed of six pairs of homologous (diploid) chromosomes with a footprint of paleotetraploidy. The observed large-scale losses of heterozygosity are signatures of recombination between homologous chromosomes, either during mitotic DNA double-strand break repair or when resolving programmed DNA breaks during a modified meiosis. Dynamic subtelomeric regions harbor more structural diversity (e.g., chromosome rearrangements, transposable elements, and haplotypic divergence). Our results trigger the reappraisal of potential meiotic processes in bdelloid rotifers and help unravel the factors underlying their long-term asexual evolutionary success.
Affiliation(s)
- Paul Simion
- Research Unit in Environmental and Evolutionary Biology, Université de Namur, Namur 5000, Belgium
- Corresponding author. (K.V.D.); (J.-F.F.); (P.S.)
| | - Jitendra Narayan
- Research Unit in Environmental and Evolutionary Biology, Université de Namur, Namur 5000, Belgium
| | - Antoine Houtain
- Research Unit in Environmental and Evolutionary Biology, Université de Namur, Namur 5000, Belgium
| | - Alessandro Derzelle
- Research Unit in Environmental and Evolutionary Biology, Université de Namur, Namur 5000, Belgium
| | - Lyam Baudry
- Institut Pasteur, Unité Régulation Spatiale des Génomes, UMR 3525, CNRS, Paris F-75015, France
- Collège Doctoral, Sorbonne Université, F-75005 Paris, France
| | - Emilien Nicolas
- Research Unit in Environmental and Evolutionary Biology, Université de Namur, Namur 5000, Belgium
- Molecular Biology and Evolution, Université libre de Bruxelles (ULB), Brussels 1050, Belgium
| | - Rohan Arora
- Research Unit in Environmental and Evolutionary Biology, Université de Namur, Namur 5000, Belgium
- Molecular Biology and Evolution, Université libre de Bruxelles (ULB), Brussels 1050, Belgium
| | - Marie Cariou
- Research Unit in Environmental and Evolutionary Biology, Université de Namur, Namur 5000, Belgium
- CIRI, Centre International de Recherche en Infectiologie, Univ Lyon, Inserm, U1111, Université Claude Bernard Lyon 1, CNRS, UMR5308, ENS de Lyon, F-69007 Lyon, France
| | - Corinne Cruaud
- Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057 Evry, France
| | | | - Clément Gilbert
- Évolution, Génomes, Comportement et Écologie, Université Paris-Saclay, CNRS, IRD, UMR, 91198 Gif-sur-Yvette, France
| | - Nadège Guiglielmoni
- Evolutionary Biology and Ecology, Université libre de Bruxelles (ULB), Brussels 1050, Belgium
| | - Boris Hespeels
- Research Unit in Environmental and Evolutionary Biology, Université de Namur, Namur 5000, Belgium
| | - Djampa K. L. Kozlowski
- INRAE, Université Côte-d’Azur, CNRS, Institut Sophia Agrobiotech, Sophia Antipolis 06903, France
| | - Karine Labadie
- Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057 Evry, France
| | - Antoine Limasset
- Université de Lille, CNRS, UMR 9189 - CRIStAL, 59655 Villeneuve-d’Ascq, France
| | - Marc Llirós
- Research Unit in Environmental and Evolutionary Biology, Université de Namur, Namur 5000, Belgium
- Institut d’Investigació Biomédica de Girona, Malalties Digestives i Microbiota, 17190 Salt, Spain
| | - Martial Marbouty
- Institut Pasteur, Unité Régulation Spatiale des Génomes, UMR 3525, CNRS, Paris F-75015, France
| | - Matthieu Terwagne
- Research Unit in Environmental and Evolutionary Biology, Université de Namur, Namur 5000, Belgium
| | - Julie Virgo
- Research Unit in Environmental and Evolutionary Biology, Université de Namur, Namur 5000, Belgium
| | - Richard Cordaux
- Ecologie et Biologie des interactions, Université de Poitiers, UMR CNRS 7267, 5 rue Albert Turpain, 86073 Poitiers, France
| | - Etienne G. J. Danchin
- INRAE, Université Côte-d’Azur, CNRS, Institut Sophia Agrobiotech, Sophia Antipolis 06903, France
| | - Bernard Hallet
- LIBST, Université Catholique de Louvain (UCLouvain), Croix du Sud 4/5, Louvain-la-Neuve 1348, Belgium
| | - Romain Koszul
- Institut Pasteur, Unité Régulation Spatiale des Génomes, UMR 3525, CNRS, Paris F-75015, France
| | - Thomas Lenormand
- CEFE, Univ Montpellier, CNRS, Univ Paul Valéry Montpellier 3, EPHE, IRD, Montpellier, France
| | - Jean-Francois Flot
- Evolutionary Biology and Ecology, Université libre de Bruxelles (ULB), Brussels 1050, Belgium
- Interuniversity Institute of Bioinformatics in Brussels - (IB), Brussels 1050, Belgium
- Corresponding author. (K.V.D.); (J.-F.F.); (P.S.)
| | - Karine Van Doninck
- Research Unit in Environmental and Evolutionary Biology, Université de Namur, Namur 5000, Belgium
- Molecular Biology and Evolution, Université libre de Bruxelles (ULB), Brussels 1050, Belgium
- Corresponding author. (K.V.D.); (J.-F.F.); (P.S.)
|
27
|
Hammoud C, Mulero S, Van Bocxlaer B, Boissier J, Verschuren D, Albrecht C, Huyse T. Simultaneous genotyping of snails and infecting trematode parasites using high-throughput amplicon sequencing. Mol Ecol Resour 2021; 22:567-586. [PMID: 34435445 DOI: 10.1111/1755-0998.13492] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/25/2020] [Revised: 07/19/2021] [Accepted: 08/18/2021] [Indexed: 01/04/2023]
Abstract
Several methodological issues currently hamper the study of entire trematode communities within populations of their intermediate snail hosts. Here we develop a new workflow using high-throughput amplicon sequencing to simultaneously genotype snail hosts and their infecting trematode parasites. We designed primers to amplify four snail and five trematode markers in a single multiplex PCR. While also applicable to other genera, we focused on medically and economically important snail genera within the superorder Hygrophila and targeted a broad taxonomic range of parasites within the class Trematoda. We tested the workflow using 417 Biomphalaria glabrata specimens experimentally infected with Schistosoma rodhaini, two strains of Schistosoma mansoni and combinations thereof. We evaluated the reliability of infection diagnostics, the robustness of the workflow, its specificity related to host and parasite identification, and the sensitivity to detect co-infections, immature infections and changes of parasite biomass during the infection process. Finally, we investigated its applicability in wild-caught snails of other genera naturally infected with a diverse range of trematodes. After stringent quality control the workflow allows the identification of snails to species level, and of trematodes to taxonomic levels ranging from family to strain. It is sensitive to detect immature infections and changes in parasite biomass described in previous experimental studies. Co-infections were successfully identified, opening the possibility to examine parasite-parasite interactions such as interspecific competition. Together, these results demonstrate that our workflow provides a powerful tool to analyse the processes shaping trematode communities within natural snail populations.
Affiliation(s)
- Cyril Hammoud
- Limnology Unit, Department of Biology, Ghent University, Gent, Belgium; Department of Biology, Royal Museum for Central Africa, Tervuren, Belgium
| | - Stephen Mulero
- IHPE, Univ. Montpellier, CNRS, Univ. Perpignan Via Domitia, IFREMER, Perpignan, France
| | - Bert Van Bocxlaer
- Limnology Unit, Department of Biology, Ghent University, Gent, Belgium; Univ. Lille, UMR 8198 Evo-Eco-Paleo, CNRS, Lille, France
| | - Jérôme Boissier
- IHPE, Univ. Montpellier, CNRS, Univ. Perpignan Via Domitia, IFREMER, Perpignan, France
| | - Dirk Verschuren
- Limnology Unit, Department of Biology, Ghent University, Gent, Belgium
| | - Christian Albrecht
- Systematics & Biodiversity Lab, Department of Animal Ecology & Systematics, Justus Liebig University, Giessen, Germany
| | - Tine Huyse
- Department of Biology, Royal Museum for Central Africa, Tervuren, Belgium; Laboratory of Biodiversity and Evolutionary Genomics, University of Leuven, Leuven, Belgium
|
28
|
Rachtman E, Bafna V, Mirarab S. CONSULT: accurate contamination removal using locality-sensitive hashing. NAR Genom Bioinform 2021; 3:lqab071. [PMID: 34377979 PMCID: PMC8340999 DOI: 10.1093/nargab/lqab071] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 03/10/2021] [Revised: 06/30/2021] [Accepted: 07/19/2021] [Indexed: 12/27/2022] Open
Abstract
A fundamental question appears in many bioinformatics applications: Does a sequencing read belong to a large dataset of genomes from some broad taxonomic group, even when the closest match in the set is evolutionarily divergent from the query? For example, low-coverage genome sequencing (skimming) projects either assemble the organelle genome or compute genomic distances directly from unassembled reads. Using unassembled reads needs contamination detection because samples often include reads from unintended groups of species. Similarly, assembling the organelle genome needs distinguishing organelle and nuclear reads. While k-mer-based methods have shown promise in read-matching, prior studies have shown that existing methods are insufficiently sensitive for contamination detection. Here, we introduce a new read-matching tool called CONSULT that tests whether k-mers from a query fall within a user-specified distance of the reference dataset using locality-sensitive hashing. Taking advantage of large memory machines available nowadays, CONSULT libraries accommodate tens of thousands of microbial species. Our results show that CONSULT has higher true-positive and lower false-positive rates of contamination detection than leading methods such as Kraken-II and improves distance calculation from genome skims. We also demonstrate that CONSULT can distinguish organelle reads from nuclear reads, leading to dramatic improvements in skim-based mitochondrial assemblies.
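The general technique named in this abstract, testing whether a query k-mer lies within a given Hamming distance of any reference k-mer by means of locality-sensitive hashing, can be sketched with bit-sampling LSH. The code below is a toy illustration and not the CONSULT implementation: the class name, parameters, and sequences are assumptions made for the example. Each table keys k-mers by a random subset of positions, so similar k-mers are likely to collide in at least one table; candidates are then verified exactly.

```python
import random
from collections import defaultdict


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def kmers(seq, k):
    """All overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


class KmerLSH:
    """Toy bit-sampling LSH index over reference k-mers (illustrative only)."""

    def __init__(self, k, num_tables=8, positions_per_table=6, seed=0):
        rng = random.Random(seed)
        self.k = k
        # Each table hashes a k-mer by the characters at a random position subset.
        self.masks = [sorted(rng.sample(range(k), positions_per_table))
                      for _ in range(num_tables)]
        self.tables = [defaultdict(set) for _ in self.masks]

    def _key(self, kmer, mask):
        return "".join(kmer[i] for i in mask)

    def add_reference(self, seq):
        for km in kmers(seq, self.k):
            for mask, table in zip(self.masks, self.tables):
                table[self._key(km, mask)].add(km)

    def has_match(self, kmer, max_distance):
        """True if a colliding reference k-mer is within max_distance mismatches."""
        for mask, table in zip(self.masks, self.tables):
            for candidate in table.get(self._key(kmer, mask), ()):
                if hamming(kmer, candidate) <= max_distance:
                    return True
        return False

    def read_matches(self, read, max_distance, min_hits=1):
        hits = sum(self.has_match(km, max_distance) for km in kmers(read, self.k))
        return hits >= min_hits


reference = "ACCAACCACACCAACACCACAACC"      # hypothetical reference sequence
index = KmerLSH(k=12)
index.add_reference(reference)
print(index.read_matches(reference[3:20], max_distance=2))          # exact substring
print(index.read_matches("GTTGGTGTTGGTGTGGTTGT", max_distance=2))   # unrelated read
```

In this sketch the exact verification step means a reported match is never a false positive with respect to the distance threshold, while near matches can be missed if they collide in no table; adding tables or sampling fewer positions trades memory and time for sensitivity.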
Affiliation(s)
- Eleonora Rachtman
- Bioinformatics and Systems Biology Graduate Program, UC San Diego, CA 92093, USA
| | - Vineet Bafna
- Department of Computer Science and Engineering, UC San Diego, CA 92093, USA
| | - Siavash Mirarab
- Department of Electrical and Computer Engineering, UC San Diego, CA 92093, USA
|
29
|
Riquier S, Bessiere C, Guibert B, Bouge AL, Boureux A, Ruffle F, Audoux J, Gilbert N, Xue H, Gautheret D, Commes T. Kmerator Suite: design of specific k-mer signatures and automatic metadata discovery in large RNA-seq datasets. NAR Genom Bioinform 2021; 3:lqab058. [PMID: 34179780 PMCID: PMC8221386 DOI: 10.1093/nargab/lqab058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2020] [Revised: 05/10/2021] [Accepted: 06/17/2021] [Indexed: 11/12/2022] Open
Abstract
The huge body of publicly available RNA-sequencing (RNA-seq) libraries is a treasure trove of functional information that allows the expression of known or novel transcripts in tissues to be quantified. However, transcript quantification commonly relies on alignment methods that require substantial computational resources and processing time and do not scale easily to large datasets. K-mer decomposition constitutes a new way to process RNA-seq data for the identification of transcriptional signatures, as k-mers can be used to accurately quantify gene expression in a less resource-consuming way. We present the Kmerator Suite, a set of three tools designed to extract specific k-mer signatures, quantify these k-mers in RNA-seq datasets and quickly visualize large dataset characteristics. The core tool, Kmerator, produces specific k-mers for 97% of human genes, enabling the measurement of gene expression with high accuracy in simulated datasets. KmerExploR, a direct application of Kmerator, uses a set of predictor gene-specific k-mers to infer metadata including library protocol, sample features or contaminations from RNA-seq datasets. KmerExploR results are visualized through a user-friendly interface. Moreover, we demonstrate that the Kmerator Suite can be used for advanced queries targeting known or new biomarkers such as mutations, gene fusions or long non-coding RNAs for human health applications.
Affiliation(s)
- Sébastien Riquier
- IRMB, University of Montpellier, INSERM, 80 rue Augustin Fliche, 34295, Montpellier, France
| | - Chloé Bessiere
- IRMB, University of Montpellier, INSERM, 80 rue Augustin Fliche, 34295, Montpellier, France
| | - Benoit Guibert
- IRMB, University of Montpellier, INSERM, 80 rue Augustin Fliche, 34295, Montpellier, France
| | | | - Anthony Boureux
- IRMB, University of Montpellier, INSERM, 80 rue Augustin Fliche, 34295, Montpellier, France
| | - Florence Ruffle
- IRMB, University of Montpellier, INSERM, 80 rue Augustin Fliche, 34295, Montpellier, France
| | | | - Nicolas Gilbert
- IRMB, University of Montpellier, INSERM, 80 rue Augustin Fliche, 34295, Montpellier, France
| | - Haoliang Xue
- Institute for Integrative Biology of the Cell, CEA, CNRS, Université Paris-Saclay, 91198, Gif sur Yvette, France
| | - Daniel Gautheret
- Institute for Integrative Biology of the Cell, CEA, CNRS, Université Paris-Saclay, 91198, Gif sur Yvette, France
| | - Thérèse Commes
- IRMB, University of Montpellier, INSERM, 80 rue Augustin Fliche, 34295, Montpellier, France
|
30
|
Ellis EA, Storer CG, Kawahara AY. De novo genome assemblies of butterflies. Gigascience 2021; 10:6291117. [PMID: 34076242 PMCID: PMC8170690 DOI: 10.1093/gigascience/giab041] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 02/18/2020] [Revised: 07/22/2020] [Accepted: 05/05/2021] [Indexed: 11/14/2022] Open
Abstract
BACKGROUND: The availability of thousands of genomes has enabled new advancements in biology. However, many genomes have not been investigated for their quality. Here we examine quality trends in a taxonomically diverse and well-known group, butterflies (Papilionoidea), and provide draft, de novo assemblies for all available butterfly genomes. Owing to massive genome sequencing investment and taxonomic curation, this is an excellent group to explore genome quality.
FINDINGS: We provide de novo assemblies for all 822 available butterfly genomes and interpret their quality in terms of completeness and continuity. We identify the 50 highest quality genomes across butterflies and conclude that the ringlet, Aphantopus hyperantus, has the highest quality genome. Our post-processing of draft genome assemblies identified 118 butterfly genomes that should not be reused owing to contamination or extremely low quality. However, many draft genomes are of high utility, especially because permissibility of low-quality genomes is dependent on the objective of the study. Our assemblies will serve as a key resource for papilionid genomics, especially for researchers without computational resources.
CONCLUSIONS: Quality metrics and assemblies are typically presented with annotated genome accessions but rarely with de novo genomes. We recommend that studies presenting genome sequences provide the assembly and some metrics of quality because quality will significantly affect downstream results. Transparency in quality metrics is needed to improve the field of genome science and encourage data reuse.
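The abstract evaluates assemblies by completeness and continuity without naming specific metrics. One widely used continuity statistic is N50, the contig length at which the cumulative length of contigs sorted from longest to shortest first reaches half of the total assembly length. The sketch below computes it under that standard definition; whether the authors used N50 is an assumption, and the contig lengths are made up for illustration.

```python
def n50(contig_lengths):
    """Length L such that contigs of length >= L cover at least half the assembly."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contig lengths supplied")


# Hypothetical assembly with six contigs (lengths in kb).
lengths = [80, 70, 50, 40, 30, 20]
print(n50(lengths))
```

A more fragmented assembly of the same total length yields a smaller N50, which is why the statistic is often reported alongside a completeness measure.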
Affiliation(s)
- Emily A Ellis
- McGuire Center for Lepidoptera and Biodiversity, Florida Museum of Natural History, University of Florida, 3215 Hull Road, Gainesville, FL 32611-2710, USA
| | - Caroline G Storer
- McGuire Center for Lepidoptera and Biodiversity, Florida Museum of Natural History, University of Florida, 3215 Hull Road, Gainesville, FL 32611-2710, USA
| | - Akito Y Kawahara
- McGuire Center for Lepidoptera and Biodiversity, Florida Museum of Natural History, University of Florida, 3215 Hull Road, Gainesville, FL 32611-2710, USA
|
31
|
Reynes L, Thibaut T, Mauger S, Blanfuné A, Holon F, Cruaud C, Couloux A, Valero M, Aurelle D. Genomic signatures of clonality in the deep water kelp Laminaria rodriguezii. Mol Ecol 2021; 30:1806-1822. [PMID: 33629449 DOI: 10.1111/mec.15860] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 08/03/2020] [Revised: 02/18/2021] [Accepted: 02/19/2021] [Indexed: 12/17/2022]
Abstract
The development of population genomic approaches in non-model species allows for renewed studies of the impact of reproductive systems and genetic drift on population diversity. Here, we investigate the genomic signatures of partial clonality in the deep water kelp Laminaria rodriguezii, known to reproduce by both sexual and asexual means. We compared these results with those for Laminaria digitata, a closely related species that differs in several traits, in particular its reproductive mode (no clonal reproduction). We analysed genome-wide variation with dd-RAD sequencing using 4,077 SNPs in L. rodriguezii and 7,364 SNPs in L. digitata. As predicted for partially clonal populations, we show that the distribution of F_IS within populations of L. rodriguezii is shifted toward negative values, with a high number of loci showing heterozygote excess. This finding is the opposite of what we observed within sexual populations of L. digitata, characterized by a generalized deficit in heterozygotes. Furthermore, we observed distinct distributions of F_IS among populations of L. rodriguezii, which is congruent with the predictions of theoretical models for different levels of clonality and genetic drift. These findings highlight that the empirical distribution of F_IS is a promising feature for the genomic study of asexuality in natural populations. Our results also show that the populations of L. rodriguezii analysed here are genetically differentiated and probably isolated. Our study provides a conceptual framework to investigate partial clonality on the basis of RAD-sequencing SNPs. These results could be obtained without any reference genome, and are therefore of interest for various non-model species.
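The link between heterozygote excess and negative values of the inbreeding coefficient F_IS follows from its basic definition, F_IS = 1 - H_o / H_e, where H_o is the observed and H_e the expected (Hardy-Weinberg) heterozygosity at a locus. The sketch below applies that definition to genotype counts at a single biallelic SNP. It is an illustration only, not the estimator used in the study: it omits small-sample corrections, and the function name and counts are hypothetical.

```python
def f_is(n_hom_ref, n_het, n_hom_alt):
    """Per-locus F_IS = 1 - H_o / H_e for a biallelic SNP.

    Returns None for a monomorphic locus, where H_e is zero.
    """
    n = n_hom_ref + n_het + n_hom_alt
    if n == 0:
        raise ValueError("no genotypes supplied")
    h_obs = n_het / n
    p = (2 * n_hom_ref + n_het) / (2 * n)   # reference allele frequency
    h_exp = 2 * p * (1 - p)
    if h_exp == 0:
        return None
    return 1 - h_obs / h_exp


# Hypothetical loci in samples of 20 individuals.
print(f_is(2, 16, 2))   # heterozygote excess, negative F_IS
print(f_is(8, 4, 8))    # heterozygote deficit, positive F_IS
```

Computing this value for every SNP in a population gives the empirical F_IS distribution whose shift toward negative values the abstract describes.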
Affiliation(s)
- Lauric Reynes
- CNRS, IRD, MIO, Aix Marseille Université, Université de Toulon, Marseille, France
| | - Thierry Thibaut
- CNRS, IRD, MIO, Aix Marseille Université, Université de Toulon, Marseille, France
| | - Stéphane Mauger
- IRL 3614, Evolutionary Biology and Ecology of Algae, CNRS, UC, UACH, Sorbonne Université, Roscoff, France
| | - Aurélie Blanfuné
- CNRS, IRD, MIO, Aix Marseille Université, Université de Toulon, Marseille, France
| | | | - Corinne Cruaud
- Genoscope, Institut de Biologie François-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, Evry, France
| | - Arnaud Couloux
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, Evry, France
| | - Myriam Valero
- IRL 3614, Evolutionary Biology and Ecology of Algae, CNRS, UC, UACH, Sorbonne Université, Roscoff, France
| | - Didier Aurelle
- CNRS, IRD, MIO, Aix Marseille Université, Université de Toulon, Marseille, France
- Institut de Systématique Évolution Biodiversité (ISYEB, UMR 7205), Muséum National d'Histoire Naturelle, CNRS, EPHE, Sorbonne Université, Paris, France
|
32
|
Shen J, McFarland AG, Young VB, Hayden MK, Hartmann EM. Toward Accurate and Robust Environmental Surveillance Using Metagenomics. Front Genet 2021; 12:600111. [PMID: 33747038 PMCID: PMC7973286 DOI: 10.3389/fgene.2021.600111] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 08/28/2020] [Accepted: 01/21/2021] [Indexed: 01/23/2023] Open
Abstract
Environmental surveillance is a critical tool for combatting public health threats represented by the global COVID-19 pandemic and the continuous increase of antibiotic resistance in pathogens. With their power to detect entire microbial communities, metagenomics-based methods stand out in addressing the need. However, several hurdles remain to be overcome in order to generate actionable interpretations from metagenomic sequencing data for infection prevention. Conceptually and technically, we focus on viability assessment, taxonomic resolution, and quantitative metagenomics, and discuss their current advancements, necessary precautions and directions for further development. We highlight the importance of building solid conceptual frameworks and identifying rational limits to facilitate the application of techniques. We also propose the usage of internal standards as a promising approach to overcome analytical bottlenecks introduced by low biomass samples and the inherent lack of quantitation in metagenomics. Taken together, we hope this perspective will contribute to bringing accurate and consistent metagenomics-based environmental surveillance to the ground.
Affiliation(s)
- Jiaxian Shen
- Department of Civil and Environmental Engineering, Northwestern University, Evanston, IL, United States
| | - Alexander G. McFarland
- Department of Civil and Environmental Engineering, Northwestern University, Evanston, IL, United States
| | - Vincent B. Young
- Division of Infectious Diseases, Department of Internal Medicine, The University of Michigan Medical School, Ann Arbor, MI, United States
| | - Mary K. Hayden
- Division of Infectious Diseases, Department of Internal Medicine, Rush University Medical Center, Chicago, IL, United States
| | - Erica M. Hartmann
- Department of Civil and Environmental Engineering, Northwestern University, Evanston, IL, United States
|
33
|
Nachtigall PG, Grazziotin FG, Junqueira-de-Azevedo ILM. MITGARD: an automated pipeline for mitochondrial genome assembly in eukaryotic species using RNA-seq data. Brief Bioinform 2021; 22:6123950. [PMID: 33515000 DOI: 10.1093/bib/bbaa429] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/12/2020] [Revised: 11/27/2020] [Accepted: 12/22/2020] [Indexed: 12/19/2022] Open
Abstract
MOTIVATION Over the past decade, the field of next-generation sequencing (NGS) has seen dramatic advances in methods and a decrease in costs. Consequently, a large expansion of data has been generated by NGS, most of which have originated from RNA-sequencing (RNA-seq) experiments. Because mitochondrial genes are expressed in most eukaryotic cells, mitochondrial mRNA sequences are usually co-sequenced within the target transcriptome, generating data that are commonly underused or discarded. Here, we present MITGARD, an automated pipeline that reliably recovers the mitochondrial genome from RNA-seq data from various sources. The pipeline identifies mitochondrial sequence reads based on a phylogenetically related reference, assembles them into contigs, and extracts a complete mtDNA for the target species. RESULTS We demonstrate that MITGARD can reconstruct the mitochondrial genomes of several species throughout the tree of life. We noticed that MITGARD can recover the mitogenomes in different sequencing schemes and even in a scenario of low-sequencing depth. Moreover, we showed that the use of references from congeneric species diverging up to 30 million years ago (MYA) from the target species is sufficient to recover the entire mitogenome, whereas the use of species diverging between 30 and 60 MYA allows the recovery of most mitochondrial genes. Additionally, we provide a case study with original data in which we estimate a phylogenetic tree of snakes from the genus Bothrops, further demonstrating that MITGARD is suitable for use on biodiversity projects. MITGARD is then a valuable tool to obtain high-quality information for studies focusing on the phylogenetic and evolutionary aspects of eukaryotes and provides data for easily identifying a sample using barcoding, and to check for cross-contamination using third-party tools.
Affiliation(s)
- Pedro G Nachtigall
- Laboratório Especial de Toxinologia Aplicada, CeTICS, Instituto Butantan, São Paulo, SP, 05503-900, Brazil
- Felipe G Grazziotin
- Laboratório de Coleções Zoológicas, Instituto Butantan, São Paulo, SP, 05503-900, Brazil
|
34
|
Galise TR, Esposito S, D'Agostino N. Guidelines for Setting Up a mRNA Sequencing Experiment and Best Practices for Bioinformatic Data Analysis. Methods Mol Biol 2021; 2264:137-162. [PMID: 33263908 DOI: 10.1007/978-1-0716-1201-9_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
RNA-sequencing, commonly referred to as RNA-seq, is the most recently developed method for the analysis of transcriptomes. It uses high-throughput next-generation sequencing technologies and has revolutionized our understanding of the complexity and dynamics of whole transcriptomes. In this chapter, we recall the key developments in transcriptome analysis and dissect the different steps of the general workflow that can be run by users to design and perform a mRNA-seq experiment as well as to process mRNA-seq data obtained by the Illumina technology. The chapter proposes guidelines for completing a mRNA-seq study properly and makes available recommendations for best practices based on recent literature and on the latest developments in technology and algorithms. We also remark the large number of choices available (especially for bioinformatic data analysis) in front of which the scientist may be in trouble. In the last part of the chapter we discuss the new frontiers of single-cell RNA-seq and isoform sequencing by long read technology.
Affiliation(s)
- Teresa Rosa Galise
- Department of Agricultural Sciences, University of Naples Federico II, Portici, Italy
- Salvatore Esposito
- CREA Research Centre for Vegetable and Ornamental Crops, Pontecagnano Faiano, Italy
- Nunzio D'Agostino
- Department of Agricultural Sciences, University of Naples Federico II, Portici, Italy
|
35
|
Sepulveda AJ, Hutchins PR, Forstchen M, Mckeefry MN, Swigris AM. The Elephant in the Lab (and Field): Contamination in Aquatic Environmental DNA Studies. Front Ecol Evol 2020. [DOI: 10.3389/fevo.2020.609973] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Indexed: 12/21/2022] Open
Abstract
The rapid evolution of environmental (e)DNA methods has resulted in knowledge gaps in smaller, yet critical details like proper use of negative controls to detect contamination. Detecting contamination is vital for confident use of eDNA results in decision-making. We conducted two literature reviews to summarize (a) the types of quality assurance measures taken to detect contamination of eDNA samples from aquatic environments, (b) the occurrence, frequency and attribution (i.e., putative sources) of unexpected amplification in these quality assurance samples, and (c) how results were interpreted when contamination occurred. In the first literature review, we reviewed 156 papers and found that 91% of targeted and 73% of metabarcoding eDNA studies reported inclusion of negative controls within their workflows. However, a large percentage of targeted (49%) and metabarcoding (80%) studies only reported negative controls for laboratory procedures, so results were potentially blind to field contamination. Many of the 156 studies did not provide critical methodological information and amplification results of negative controls. In our second literature review, we reviewed 695 papers and found that 30 targeted and 32 metabarcoding eDNA studies reported amplification of negative controls. This amplification occurred at similar proportions for field and lab workflow steps in targeted and metabarcoding studies. These studies most frequently used amplified negative controls to delimit a detection threshold above which is considered significant or provided rationale for why the unexpected amplifications did not affect results. In summary, we found that there has been minimal convergence over time on negative control implementation, methods, and interpretation, which suggests that increased rigor in these smaller, yet critical details remains an outstanding need. 
We conclude our review by highlighting several studies that have developed especially effective quality assurance, control and mitigation methods.
|
36
|
Suzuki Y, Baidaliuk A, Miesen P, Frangeul L, Crist AB, Merkling SH, Fontaine A, Lequime S, Moltini-Conclois I, Blanc H, van Rij RP, Lambrechts L, Saleh MC. Non-retroviral Endogenous Viral Element Limits Cognate Virus Replication in Aedes aegypti Ovaries. Curr Biol 2020; 30:3495-3506.e6. [PMID: 32679098 PMCID: PMC7522710 DOI: 10.1016/j.cub.2020.06.057] [Citation(s) in RCA: 81] [Impact Index Per Article: 20.3] [Received: 04/02/2020] [Revised: 06/06/2020] [Accepted: 06/16/2020] [Indexed: 12/27/2022]
Abstract
Endogenous viral elements (EVEs) are viral sequences integrated in host genomes. A large number of non-retroviral EVEs was recently detected in Aedes mosquito genomes, leading to the hypothesis that mosquito EVEs may control exogenous infections by closely related viruses. Here, we experimentally investigated the role of an EVE naturally found in Aedes aegypti populations and derived from the widespread insect-specific virus, cell-fusing agent virus (CFAV). Using CRISPR-Cas9 genome editing, we created an Ae. aegypti line lacking the CFAV EVE. Absence of the EVE resulted in increased CFAV replication in ovaries, possibly modulating vertical transmission of the virus. Viral replication was controlled by targeting of viral RNA by EVE-derived P-element-induced wimpy testis-interacting RNAs (piRNAs). Our results provide evidence that antiviral piRNAs are produced in the presence of a naturally occurring EVE and its cognate virus, demonstrating a functional link between non-retroviral EVEs and antiviral immunity in a natural insect-virus interaction.
- Aedes aegypti harbors EVEs with high sequence identity to a contemporary RNA virus
- EVE-derived piRNAs target genomic viral RNA in infected mosquitoes
- Ablation of EVE results in increased viral replication in Aedes aegypti ovaries
- piRNA pathway fulfills antiviral function in presence of EVE and cognate virus
Affiliation(s)
- Yasutsugu Suzuki
- Viruses and RNA Interference Unit, Institut Pasteur, UMR3569, CNRS, Paris, France
- Artem Baidaliuk
- Insect-Virus Interactions Unit, Institut Pasteur, UMR2000, CNRS, Paris, France; Collège Doctoral, Sorbonne Université, 75005 Paris, France
- Pascal Miesen
- Viruses and RNA Interference Unit, Institut Pasteur, UMR3569, CNRS, Paris, France; Department of Medical Microbiology, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, the Netherlands
- Lionel Frangeul
- Viruses and RNA Interference Unit, Institut Pasteur, UMR3569, CNRS, Paris, France
- Anna B Crist
- Insect-Virus Interactions Unit, Institut Pasteur, UMR2000, CNRS, Paris, France
- Sarah H Merkling
- Insect-Virus Interactions Unit, Institut Pasteur, UMR2000, CNRS, Paris, France
- Albin Fontaine
- Insect-Virus Interactions Unit, Institut Pasteur, UMR2000, CNRS, Paris, France
- Sebastian Lequime
- Laboratory of Clinical and Epidemiological Virology, Rega Institute, Department of Microbiology and Immunology, KU Leuven, Leuven, Belgium
- Hervé Blanc
- Viruses and RNA Interference Unit, Institut Pasteur, UMR3569, CNRS, Paris, France
- Ronald P van Rij
- Department of Medical Microbiology, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, the Netherlands
- Louis Lambrechts
- Insect-Virus Interactions Unit, Institut Pasteur, UMR2000, CNRS, Paris, France
- Maria-Carla Saleh
- Viruses and RNA Interference Unit, Institut Pasteur, UMR3569, CNRS, Paris, France
|
37
|
Martin G, Cardi C, Sarah G, Ricci S, Jenny C, Fondi E, Perrier X, Glaszmann JC, D'Hont A, Yahiaoui N. Genome ancestry mosaics reveal multiple and cryptic contributors to cultivated banana. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2020; 102:1008-1025. [PMID: 31930580 PMCID: PMC7317953 DOI: 10.1111/tpj.14683] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 07/04/2019] [Revised: 12/18/2019] [Accepted: 01/02/2020] [Indexed: 05/24/2023]
Abstract
Hybridizations between closely related species commonly occur in the domestication process of many crops. Banana cultivars are derived from such hybridizations between species and subspecies of the Musa genus that have diverged in various tropical Southeast Asian regions and archipelagos. Among the diploid and triploid hybrids generated, those with seedless parthenocarpic fruits were selected by humans and thereafter dispersed through vegetative propagation. Musa acuminata subspecies contribute to most of these cultivars. We analyzed sequence data from 14 M. acuminata wild accessions and 10 M. acuminata-based cultivars, including diploids and one triploid, to characterize the ancestral origins along their chromosomes. We used multivariate analysis and single nucleotide polymorphism clustering and identified five ancestral groups as contributors to these cultivars. Four of these corresponded to known M. acuminata subspecies. A fifth group, found only in cultivars, was defined based on the 'Pisang Madu' cultivar and represented two uncharacterized genetic pools. Diverse ancestral contributions along cultivar chromosomes were found, resulting in mosaics with at least three and up to five ancestries. The commercially important triploid Cavendish banana cultivar had contributions from at least one of the uncharacterized genetic pools and three known M. acuminata subspecies. Our results highlighted that cultivated banana origins are more complex than expected - involving multiple hybridization steps - and also that major wild banana ancestors have yet to be identified. This study revealed the extent to which admixture has framed the evolution and domestication of a crop plant.
Affiliation(s)
- Guillaume Martin
- CIRAD, UMR AGAP, F-34398, Montpellier, France
- AGAP, Univ. Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
- Céline Cardi
- CIRAD, UMR AGAP, F-34398, Montpellier, France
- AGAP, Univ. Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
- Gautier Sarah
- AGAP, Univ. Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
- Sébastien Ricci
- AGAP, Univ. Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
- CARBAP, Rue Dinde, No. 110, Bonanjo, BP 832, Douala, Cameroon
- CIRAD, UMR AGAP, F-97130, Capesterre Belle Eau, France
- Christophe Jenny
- CIRAD, UMR AGAP, F-34398, Montpellier, France
- AGAP, Univ. Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
- Emmanuel Fondi
- CARBAP, Rue Dinde, No. 110, Bonanjo, BP 832, Douala, Cameroon
- Xavier Perrier
- CIRAD, UMR AGAP, F-34398, Montpellier, France
- AGAP, Univ. Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
- Jean-Christophe Glaszmann
- CIRAD, UMR AGAP, F-34398, Montpellier, France
- AGAP, Univ. Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
- Angélique D'Hont
- CIRAD, UMR AGAP, F-34398, Montpellier, France
- AGAP, Univ. Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
- Nabila Yahiaoui
- CIRAD, UMR AGAP, F-34398, Montpellier, France
- AGAP, Univ. Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
|
38
|
Allio R, Schomaker-Bastos A, Romiguier J, Prosdocimi F, Nabholz B, Delsuc F. MitoFinder: Efficient automated large-scale extraction of mitogenomic data in target enrichment phylogenomics. Mol Ecol Resour 2020; 20:892-905. [PMID: 32243090 PMCID: PMC7497042 DOI: 10.1111/1755-0998.13160] [Citation(s) in RCA: 712] [Impact Index Per Article: 178.0] [Received: 06/18/2019] [Revised: 02/21/2020] [Accepted: 03/12/2020] [Indexed: 11/27/2022]
Abstract
Thanks to the development of high-throughput sequencing technologies, target enrichment sequencing of nuclear ultraconserved DNA elements (UCEs) now allows routine inference of phylogenetic relationships from thousands of genomic markers. Recently, it has been shown that mitochondrial DNA (mtDNA) is frequently sequenced alongside the targeted loci in such capture experiments. Despite its broad evolutionary interest, mtDNA is rarely assembled and used in conjunction with nuclear markers in capture-based studies. Here, we developed MitoFinder, a user-friendly bioinformatic pipeline, to efficiently assemble and annotate mitogenomic data from hundreds of UCE libraries. As a case study, we used ants (Formicidae) for which 501 UCE libraries have been sequenced whereas only 29 mitogenomes are available. We compared the efficiency of four different assemblers (IDBA-UD, MEGAHIT, MetaSPAdes, and Trinity) for assembling both UCE and mtDNA loci. Using MitoFinder, we show that metagenomic assemblers, in particular MetaSPAdes, are well suited to assemble both UCEs and mtDNA. Mitogenomic signal was successfully extracted from all 501 UCE libraries, allowing us to confirm species identification using CO1 barcoding. Moreover, our automated procedure retrieved 296 cases in which the mitochondrial genome was assembled in a single contig, thus increasing the number of available ant mitogenomes by an order of magnitude. By utilizing the power of metagenomic assemblers, MitoFinder provides an efficient tool to extract complementary mitogenomic data from UCE libraries, allowing testing for potential mitonuclear discordance. Our approach is potentially applicable to other sequence capture methods, transcriptomic data and whole genome shotgun sequencing in diverse taxa. The MitoFinder software is available from GitHub (https://github.com/RemiAllio/MitoFinder).
Affiliation(s)
- Rémi Allio
- Institut des Sciences de l'Évolution de Montpellier (ISEM), CNRS, EPHE, IRD, Université de Montpellier, Montpellier, France
- Alex Schomaker-Bastos
- Laboratório Multidisciplinar para Análise de Dados (LAMPADA), Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
- Jonathan Romiguier
- Institut des Sciences de l'Évolution de Montpellier (ISEM), CNRS, EPHE, IRD, Université de Montpellier, Montpellier, France
- Francisco Prosdocimi
- Laboratório Multidisciplinar para Análise de Dados (LAMPADA), Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
- Benoit Nabholz
- Institut des Sciences de l'Évolution de Montpellier (ISEM), CNRS, EPHE, IRD, Université de Montpellier, Montpellier, France
- Frédéric Delsuc
- Institut des Sciences de l'Évolution de Montpellier (ISEM), CNRS, EPHE, IRD, Université de Montpellier, Montpellier, France
|
39
|
Rousselle M, Simion P, Tilak MK, Figuet E, Nabholz B, Galtier N. Is adaptation limited by mutation? A timescale-dependent effect of genetic diversity on the adaptive substitution rate in animals. PLoS Genet 2020; 16:e1008668. [PMID: 32251427 PMCID: PMC7162527 DOI: 10.1371/journal.pgen.1008668] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 12/18/2019] [Revised: 04/16/2020] [Accepted: 02/14/2020] [Indexed: 12/16/2022] Open
Abstract
Whether adaptation is limited by the beneficial mutation supply is a long-standing question of evolutionary genetics, which is more generally related to the determination of the adaptive substitution rate and its relationship with species effective population size (Ne) and genetic diversity. Empirical evidence reported so far is equivocal, with some but not all studies supporting a higher adaptive substitution rate in large-Ne than in small-Ne species. We gathered coding sequence polymorphism data and estimated the adaptive amino-acid substitution rate ωa, in 50 species from ten distant groups of animals with markedly different population mutation rate θ. We reveal the existence of a complex, timescale dependent relationship between species adaptive substitution rate and genetic diversity. We find a positive relationship between ωa and θ among closely related species, indicating that adaptation is indeed limited by the mutation supply, but this was only true in relatively low-θ taxa. In contrast, we uncover no significant correlation between ωa and θ at a larger taxonomic scale, suggesting that the proportion of beneficial mutations scales negatively with species' long-term Ne.
Affiliation(s)
- Paul Simion
- ISEM, Univ. Montpellier, CNRS, EPHE, IRD, Montpellier, France
- LEGE, Department of Biology, University of Namur, Namur, Belgium
- Marie-Ka Tilak
- ISEM, Univ. Montpellier, CNRS, EPHE, IRD, Montpellier, France
- Emeric Figuet
- ISEM, Univ. Montpellier, CNRS, EPHE, IRD, Montpellier, France
- Benoit Nabholz
- ISEM, Univ. Montpellier, CNRS, EPHE, IRD, Montpellier, France
- Nicolas Galtier
- ISEM, Univ. Montpellier, CNRS, EPHE, IRD, Montpellier, France
|
40
|
Goig GA, Blanco S, Garcia-Basteiro AL, Comas I. Contaminant DNA in bacterial sequencing experiments is a major source of false genetic variability. BMC Biol 2020; 18:24. [PMID: 32122347 PMCID: PMC7053099 DOI: 10.1186/s12915-020-0748-z] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 08/19/2019] [Accepted: 02/11/2020] [Indexed: 12/16/2022] Open
Abstract
Background Contaminant DNA is a well-known confounding factor in molecular biology and in genomic repositories. Strikingly, analysis workflows for whole-genome sequencing (WGS) data commonly do not account for errors potentially introduced by contamination, which could lead to the wrong assessment of allele frequency both in basic and clinical research. Results We used a taxonomic filter to remove contaminant reads from more than 4000 bacterial samples from 20 different studies and performed a comprehensive evaluation of the extent and impact of contaminant DNA in WGS. We found that contamination is pervasive and can introduce large biases in variant analysis. We showed that these biases can result in hundreds of false positive and negative SNPs, even for samples with slight contamination. Studies investigating complex biological traits from sequencing data can be completely biased if contamination is neglected during the bioinformatic analysis, and we demonstrate that removing contaminant reads with a taxonomic classifier permits more accurate variant calling. We used both real and simulated data to evaluate and implement reliable, contamination-aware analysis pipelines. Conclusion As sequencing technologies consolidate as precision tools that are increasingly adopted in the research and clinical context, our results urge for the implementation of contamination-aware analysis pipelines. Taxonomic classifiers are a powerful tool to implement such pipelines.
Affiliation(s)
- Galo A Goig
- Institute of Biomedicine of Valencia, IBV-CSIC, St. Jaume Roig 11, 46010, Valencia, Spain
- Silvia Blanco
- Centro de Investigaçao em Saúde de Manhiça (CISM), Bairro Cambeve, Rua 12, Distrito da Manhiça, 1929, Maputo, Mozambique
- Alberto L Garcia-Basteiro
- Centro de Investigaçao em Saúde de Manhiça (CISM), Bairro Cambeve, Rua 12, Distrito da Manhiça, 1929, Maputo, Mozambique; ISGlobal, Hospital Clínic - Universitat de Barcelona, Barcelona, Spain
- Iñaki Comas
- Institute of Biomedicine of Valencia, IBV-CSIC, St. Jaume Roig 11, 46010, Valencia, Spain; CIBER in Epidemiology and Public Health, Madrid, Spain
|
41
|
Prevalence and Implications of Contamination in Public Genomic Resources: A Case Study of 43 Reference Arthropod Assemblies. G3-GENES GENOMES GENETICS 2020; 10:721-730. [PMID: 31862787 PMCID: PMC7003083 DOI: 10.1534/g3.119.400758] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 02/02/2023]
Abstract
Thanks to huge advances in sequencing technologies, genomic resources are increasingly being generated and shared by the scientific community. The quality of such public resources is therefore of critical importance. Errors due to contamination are particularly worrying; they are widespread, propagate across databases, and can compromise downstream analyses, especially the detection of horizontally-transferred sequences. However, we still lack consistent and comprehensive assessments of contamination prevalence in public genomic data. Here we applied a standardized procedure for foreign sequence annotation to 43 published arthropod genomes from the widely used Ensembl Metazoa database. This method combines information on sequence similarity and synteny to identify contaminant and putative horizontally-transferred sequences in any genome assembly, provided that an adequate reference database is available. We uncovered considerable heterogeneity in quality among arthropod assemblies, some being devoid of contaminant sequences, whereas others included hundreds of contaminant genes. Contaminants far outnumbered horizontally-transferred genes and were a major confounder of their detection, quantification and analysis. We strongly recommend that automated standardized decontamination procedures be systematically embedded into the submission process to genomic databases.
|
42
|
Selitsky SR, Marron D, Hollern D, Mose LE, Hoadley KA, Jones C, Parker JS, Dittmer DP, Perou CM. Virus expression detection reveals RNA-sequencing contamination in TCGA. BMC Genomics 2020; 21:79. [PMID: 31992194 PMCID: PMC6986043 DOI: 10.1186/s12864-020-6483-6] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 03/13/2019] [Accepted: 01/10/2020] [Indexed: 02/06/2023] Open
Abstract
Background Contamination of reagents and cross contamination across samples is a long-recognized issue in molecular biology laboratories. While often innocuous, contamination can lead to inaccurate results. Cantalupo et al., for example, found HeLa-derived human papillomavirus 18 (H-HPV18) in several of The Cancer Genome Atlas (TCGA) RNA-sequencing samples. This work motivated us to assess a greater number of samples and determine the origin of possible contaminations using viral sequences. To detect viruses with high specificity, we developed the publicly available workflow, VirDetect, that detects virus and laboratory vector sequences in RNA-seq samples. We applied VirDetect to 9143 RNA-seq samples sequenced at one TCGA sequencing center (28/33 cancer types) over 5 years. Results We confirmed that H-HPV18 was present in many samples and determined that viral transcripts from H-HPV18 significantly co-occurred with those from xenotropic mouse leukemia virus-related virus (XMRV). Using laboratory metadata and viral transcription, we determined that the likely contaminant was a pool of cell lines known as the “common reference”, which was sequenced alongside TCGA RNA-seq samples as a control to monitor quality across technology transitions (i.e. microarray to GAII to HiSeq), and to link RNA-seq to previous generation microarrays that standardly used the “common reference”. One of the cell lines in the pool was a laboratory isolate of MCF-7, which we discovered was infected with XMRV; another constituent of the pool was likely HeLa cells. Conclusions Altogether, this indicates a multi-step contamination process. First, MCF-7 was infected with an XMRV. Second, this infected cell line was added to a pool of cell lines, which contained HeLa. Finally, RNA from this pool of cell lines contaminated several TCGA tumor samples most-likely during library construction. Thus, these human tumors with H-HPV or XMRV reads were likely not infected with H-HPV 18 or XMRV.
Affiliation(s)
- Sara R Selitsky
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA; Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA
- David Marron
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA
- Daniel Hollern
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA; Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA
- Lisle E Mose
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA
- Katherine A Hoadley
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA; Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA
- Corbin Jones
- Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Joel S Parker
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA; Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA
- Dirk P Dittmer
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA; Department of Microbiology and Immunology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Charles M Perou
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA; Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA
|
43
|
De Simone G, Pasquadibisceglie A, Proietto R, Polticelli F, Aime S, J M Op den Camp H, Ascenzi P. Contaminations in (meta)genome data: An open issue for the scientific community. IUBMB Life 2019; 72:698-705. [PMID: 31869003 DOI: 10.1002/iub.2216] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 11/26/2019] [Accepted: 11/30/2019] [Indexed: 12/13/2022]
Abstract
In recent years, the high throughput and the low cost of next-generation sequencing (NGS) technologies have led to an increase of the amount of (meta)genomic data, revolutionizing genomic research studies. However, the quality of sequencing data could be affected by experimental errors derived from defective methods and protocols. This represents a serious problem for the scientific community with a negative impact on the correctness of studies that involve genomic sequence analysis. As a countermeasure, several alignment and taxonomic classification tools have been developed to uncover and correct errors. In this critical review some of these integrated software tools and pipelines used to detect contaminations in reference genome databases and sequenced samples are reported. In particular, case studies of bacterial contaminations, contaminations of human origin, mitochondrial contaminations of ancient DNA, and cross contaminations are examined.
Affiliation(s)
- Silvio Aime
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Torino, Italy
- Huub J M Op den Camp
- Department of Microbiology, IWWR, Radboud University, Heyendaalseweg 135, Nijmegen, AJ, The Netherlands
- Paolo Ascenzi
- Interdepartmental Laboratory for Electron Microscopy, Roma Tre University, Roma, Italy
|
44
|
Xing RR, Wang N, Hu RR, Zhang JK, Han JX, Chen Y. Application of next generation sequencing for species identification in meat and poultry products: A DNA metabarcoding approach. Food Control 2019. [DOI: 10.1016/j.foodcont.2019.02.034] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Indexed: 10/27/2022]
|
45
|
Yan Z, Ye G, Werren JH. Evolutionary Rate Correlation between Mitochondrial-Encoded and Mitochondria-Associated Nuclear-Encoded Proteins in Insects. Mol Biol Evol 2019; 36:1022-1036. [PMID: 30785203 DOI: 10.1093/molbev/msz036] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Indexed: 02/06/2023] Open
Abstract
The mitochondrion is a pivotal organelle for energy production, and includes components encoded by both the mitochondrial and nuclear genomes. Functional and evolutionary interactions are expected between the nuclear- and mitochondrial-encoded components. The topic is of broad interest in biology, with implications to genetics, evolution, and medicine. Here, we compare the evolutionary rates of mitochondrial proteins and ribosomal RNAs to rates of mitochondria-associated nuclear-encoded proteins, across the major orders of holometabolous insects. There are significant evolutionary rate correlations (ERCs) between mitochondrial-encoded and mitochondria-associated nuclear-encoded proteins, which are likely driven by different rates of mitochondrial sequence evolution and correlated changes in the interacting nuclear-encoded proteins. The pattern holds after correction for phylogenetic relationships and considering protein conservation levels. Correlations are stronger for both nuclear-encoded OXPHOS proteins that are in contact with mitochondrial OXPHOS proteins and for nuclear-encoded mitochondrial ribosomal amino acids directly contacting the mitochondrial rRNAs. We find that ERC between mitochondrial- and nuclear-encoded proteins is a strong predictor of nuclear-encoded proteins known to interact with mitochondria, and ERC shows promise for identifying new candidate proteins with mitochondrial function. Twenty-three additional candidate nuclear-encoded proteins warrant further study for mitochondrial function based on this approach, including proteins in the minichromosome maintenance helicase complex.
Affiliation(s)
- Zhichao Yan
  - State Key Laboratory of Rice Biology & Ministry of Agricultural and Rural Affairs Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University, Hangzhou, China
  - Department of Biology, University of Rochester, Rochester, NY
- Gongyin Ye
  - State Key Laboratory of Rice Biology & Ministry of Agricultural and Rural Affairs Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University, Hangzhou, China
- John H Werren
  - Department of Biology, University of Rochester, Rochester, NY
46
Alié A, Hiebert LS, Simion P, Scelzo M, Prünster MM, Lotito S, Delsuc F, Douzery EJP, Dantec C, Lemaire P, Darras S, Kawamura K, Brown FD, Tiozzo S. Convergent Acquisition of Nonembryonic Development in Styelid Ascidians. Mol Biol Evol 2018; 35:1728-1743. [PMID: 29660002 PMCID: PMC5995219 DOI: 10.1093/molbev/msy068] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Indexed: 01/04/2023] Open
Abstract
Asexual propagation and whole body regeneration are forms of nonembryonic development (NED) widespread across animal phyla and central in life history and evolutionary diversification of metazoans. Whereas it is challenging to reconstruct the gains or losses of NED at large phylogenetic scale, comparative studies could benefit from being conducted at more restricted taxonomic scale, in groups for which phylogenetic relationships are well established. The ascidian family of Styelidae encompasses strictly sexually reproducing solitary forms as well as colonial species that combine sexual reproduction with different forms of NED. To date, the phylogenetic relationships between colonial and solitary styelids remain controversial and so is the pattern of NED evolution. In this study, we built an original pipeline to combine eight genomes with 18 de novo assembled transcriptomes and constructed data sets of unambiguously orthologous genes. Using a phylogenomic super-matrix of 4,908 genes from these 26 tunicates we provided a robust phylogeny of this family of chordates, which supports two convergent acquisitions of NED. This result prompted us to further describe the budding process in the species Polyandrocarpa zorritensis, leading to the discovery of a novel mechanism of asexual development. Whereas the pipeline and the data sets produced can be used for further phylogenetic reconstructions in tunicates, the phylogeny provided here sets an evolutionary framework for future experimental studies on the emergence and disappearance of complex characters such as asexual propagation and whole body regeneration.
Affiliation(s)
- Alexandre Alié
  - Sorbonne Université, CNRS, Laboratoire de Biologie du Développement de Villefranche-sur-mer (LBDV), 06230 Paris, France
- Laurel Sky Hiebert
  - Departamento de Zoologia - Instituto Biociências, Universidade de São Paulo, São Paulo, Brazil
  - Centro de Biologia Marinha (CEBIMar), Universidade de São Paulo, São Paulo, Brazil
- Paul Simion
  - ISEM, Université de Montpellier, CNRS, IRD, EPHE, Montpellier, France
- Marta Scelzo
  - Sorbonne Université, CNRS, Laboratoire de Biologie du Développement de Villefranche-sur-mer (LBDV), 06230 Paris, France
- Maria Mandela Prünster
  - Sorbonne Université, CNRS, Laboratoire de Biologie du Développement de Villefranche-sur-mer (LBDV), 06230 Paris, France
- Sonia Lotito
  - Sorbonne Université, CNRS, Laboratoire de Biologie du Développement de Villefranche-sur-mer (LBDV), 06230 Paris, France
- Frédéric Delsuc
  - ISEM, Université de Montpellier, CNRS, IRD, EPHE, Montpellier, France
- Sébastien Darras
  - Sorbonne Université, CNRS, Biologie Intégrative des Organismes Marins (BIOM), Observatoire Océanologique, Banyuls/Mer, 06230 Paris, France
- Kazuo Kawamura
  - Laboratory of Cellular and Molecular Biotechnology, Faculty of Science, Kochi University, Kochi, Japan
- Federico D Brown
  - Departamento de Zoologia - Instituto Biociências, Universidade de São Paulo, São Paulo, Brazil
  - Centro de Biologia Marinha (CEBIMar), Universidade de São Paulo, São Paulo, Brazil
  - Instituto Nacional de Ciência e Tecnologia em Estudos Interdisciplinares e Transdisciplinares em Ecologia e Evolução (IN-TREE), Salvador, Brazil
- Stefano Tiozzo
  - Sorbonne Université, CNRS, Laboratoire de Biologie du Développement de Villefranche-sur-mer (LBDV), 06230 Paris, France
47
Low AJ, Koziol AG, Manninger PA, Blais B, Carrillo CD. ConFindr: rapid detection of intraspecies and cross-species contamination in bacterial whole-genome sequence data. PeerJ 2019; 7:e6995. [PMID: 31183253 PMCID: PMC6546082 DOI: 10.7717/peerj.6995] [Citation(s) in RCA: 69] [Impact Index Per Article: 13.8] [Received: 01/22/2019] [Accepted: 04/20/2019] [Indexed: 12/16/2022] Open
Abstract
Whole-genome sequencing (WGS) of bacterial pathogens is currently widely used to support public-health investigations. The ability to assess WGS data quality is critical to underpin the reliability of downstream analyses. Sequence contamination is a quality issue that could potentially impact WGS-based findings; however, existing tools do not readily identify contamination from closely-related organisms. To address this gap, we have developed a computational pipeline, ConFindr, for detection of intraspecies contamination. ConFindr determines the presence of contaminating sequences based on the identification of multiple alleles of core, single-copy, ribosomal-protein genes in raw sequencing reads. The performance of this tool was assessed using simulated and lab-generated Illumina short-read WGS data with varying levels of contamination (0-20% of reads) and varying genetic distance between the designated target and contaminant strains. Intraspecies and cross-species contamination was reliably detected in datasets containing 5% or more reads from a second, unrelated strain. ConFindr detected intraspecies contamination with higher sensitivity than existing tools, while also being able to automatically detect cross-species contamination with similar sensitivity. The implementation of ConFindr in quality-control pipelines will help to improve the reliability of WGS databases as well as the accuracy of downstream analyses. ConFindr is written in Python, and is freely available under the MIT License at github.com/OLC-Bioinformatics/ConFindr.
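The core signal described in this abstract, more than one allele at positions in genes expected to be single-copy, can be sketched as follows. This is a simplified illustration and not ConFindr's implementation or API: the `pileup` input layout, the function names `multiallelic_sites` and `looks_contaminated`, and every threshold value are assumptions chosen for the example.

```python
from collections import Counter


def multiallelic_sites(pileup, min_count=2, min_fraction=0.05):
    """Return positions where a second base is well supported.

    pileup maps a position in a single-copy gene to the string of bases
    observed there across reads (a hypothetical, simplified input).
    min_count and min_fraction are illustrative thresholds that guard
    against counting isolated sequencing errors as a second allele.
    """
    flagged = []
    for position, bases in sorted(pileup.items()):
        top_two = Counter(bases).most_common(2)
        if len(top_two) < 2:
            continue  # zero or one distinct base: no evidence of mixing
        minor_count = top_two[1][1]
        if minor_count >= min_count and minor_count / len(bases) >= min_fraction:
            flagged.append(position)
    return flagged


def looks_contaminated(pileup, min_sites=2, **thresholds):
    """Flag a sample when enough single-copy positions carry two alleles."""
    return len(multiallelic_sites(pileup, **thresholds)) >= min_sites


# Toy data: positions 25 and 57 show a second base in several reads,
# while position 40 has a single stray base that is treated as noise.
example = {
    10: "AAAAAAAAAA",
    25: "AAAAAAGGGG",
    40: "CCCCCCCCCT",
    57: "TTTTTTTCCC",
}
print(multiallelic_sites(example))
print(looks_contaminated(example))
```

Requiring several supporting reads per site and several sites per sample is one plausible way to separate a second strain from sequencing error; the actual tool works from raw reads and may apply different criteria.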
Affiliation(s)
- Andrew J Low
  - Ottawa Laboratory (Carling), Canadian Food Inspection Agency, Ottawa, Ontario, Canada
- Adam G Koziol
  - Ottawa Laboratory (Carling), Canadian Food Inspection Agency, Ottawa, Ontario, Canada
- Paul A Manninger
  - Ottawa Laboratory (Carling), Canadian Food Inspection Agency, Ottawa, Ontario, Canada
- Burton Blais
  - Ottawa Laboratory (Carling), Canadian Food Inspection Agency, Ottawa, Ontario, Canada
- Catherine D Carrillo
  - Ottawa Laboratory (Carling), Canadian Food Inspection Agency, Ottawa, Ontario, Canada
48
Allio R, Scornavacca C, Nabholz B, Clamens AL, Sperling FAH, Condamine FL. Whole Genome Shotgun Phylogenomics Resolves the Pattern and Timing of Swallowtail Butterfly Evolution. Syst Biol 2019; 69:38-60. [DOI: 10.1093/sysbio/syz030] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 12/07/2018] [Revised: 04/26/2019] [Accepted: 04/28/2019] [Indexed: 01/20/2023] Open
Abstract
Evolutionary relationships have remained unresolved in many well-studied groups, even though advances in next-generation sequencing and analysis, using approaches such as transcriptomics, anchored hybrid enrichment, or ultraconserved elements, have brought systematics to the brink of whole genome phylogenomics. Recently, it has become possible to sequence the entire genomes of numerous nonbiological models in parallel at reasonable cost, particularly with shotgun sequencing. Here, we identify orthologous coding sequences from whole-genome shotgun sequences, which we then use to investigate the relevance and power of phylogenomic relationship inference and time-calibrated tree estimation. We study an iconic group of butterflies—swallowtails of the family Papilionidae—that has remained phylogenetically unresolved, with continued debate about the timing of their diversification. Low-coverage whole genomes were obtained using Illumina shotgun sequencing for all genera. Genome assembly coupled to BLAST-based orthology searches allowed extraction of 6621 orthologous protein-coding genes for 45 Papilionidae species and 16 outgroup species (with 32% missing data after cleaning phases). Supermatrix phylogenomic analyses were performed with both maximum-likelihood (IQ-TREE) and Bayesian mixture models (PhyloBayes) for amino acid sequences, which produced a fully resolved phylogeny providing new insights into controversial relationships. Species tree reconstruction from gene trees was performed with ASTRAL and SuperTriplets and recovered the same phylogeny. We estimated gene site concordant factors to complement traditional node-support measures, which strengthens the robustness of inferred phylogenies. 
Bayesian estimates of divergence times based on a reduced data set (760 orthologs and 12% missing data) indicate a mid-Cretaceous origin of Papilionoidea around 99.2 Ma (95% credibility interval: 68.6–142.7 Ma) and Papilionidae around 71.4 Ma (49.8–103.6 Ma), with subsequent diversification of modern lineages well after the Cretaceous-Paleogene event. These results show that shotgun sequencing of whole genomes, even when highly fragmented, represents a powerful approach to phylogenomics and molecular dating in a group that has previously been refractory to resolution.
Affiliation(s)
- Rémi Allio
  - Institut des Sciences de l’Evolution de Montpellier (Université de Montpellier, CNRS, IRD, EPHE), Place Eugène Bataillon, 34095 Montpellier, France
- Céline Scornavacca
  - Institut des Sciences de l’Evolution de Montpellier (Université de Montpellier, CNRS, IRD, EPHE), Place Eugène Bataillon, 34095 Montpellier, France
  - Institut de Biologie Computationnelle (IBC), Montpellier, France
- Benoit Nabholz
  - Institut des Sciences de l’Evolution de Montpellier (Université de Montpellier, CNRS, IRD, EPHE), Place Eugène Bataillon, 34095 Montpellier, France
- Anne-Laure Clamens
  - INRA, UMR 1062 Centre de Biologie pour la Gestion des Populations (INRA, IRD, CIRAD, Montpellier SupAgro), 755 Avenue du Campus Agropolis, 34988 Montferrier-sur-Lez, France
  - Department of Biological Sciences, University of Alberta, Edmonton T6G 2E9, AB, Canada
- Felix AH Sperling
  - Department of Biological Sciences, University of Alberta, Edmonton T6G 2E9, AB, Canada
- Fabien L Condamine
  - Institut des Sciences de l’Evolution de Montpellier (Université de Montpellier, CNRS, IRD, EPHE), Place Eugène Bataillon, 34095 Montpellier, France
  - Department of Biological Sciences, University of Alberta, Edmonton T6G 2E9, AB, Canada
49
Pérez-Rubio P, Lottaz C, Engelmann JC. FastqPuri: high-performance preprocessing of RNA-seq data. BMC Bioinformatics 2019; 20:226. [PMID: 31053060 PMCID: PMC6500068 DOI: 10.1186/s12859-019-2799-0] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 10/16/2018] [Accepted: 04/09/2019] [Indexed: 12/23/2022] Open
Abstract
Background: RNA sequencing (RNA-seq) has become the standard means of analyzing gene and transcript expression in high-throughput. While previously sequence alignment was a time demanding step, fast alignment methods and even more so transcript counting methods which avoid mapping and quantify gene and transcript expression by evaluating whether a read is compatible with a transcript, have led to significant speed-ups in data analysis. Now, the most time demanding step in the analysis of RNA-seq data is preprocessing the raw sequence data, such as running quality control and adapter, contamination and quality filtering before transcript or gene quantification. To do so, many researchers chain different tools, but a comprehensive, flexible and fast software that covers all preprocessing steps is currently missing.
Results: We here present FastqPuri, a light-weight and highly efficient preprocessing tool for fastq data. FastqPuri provides sequence quality reports on the sample and dataset level with new plots which facilitate decision making for subsequent quality filtering. Moreover, FastqPuri efficiently removes adapter sequences and sequences from biological contamination from the data. It accepts both single- and paired-end data in uncompressed or compressed fastq files. FastqPuri can be run stand-alone and is suitable to be run within pipelines. We benchmarked FastqPuri against existing tools and found that FastqPuri is superior in terms of speed, memory usage, versatility and comprehensiveness.
Conclusions: FastqPuri is a new tool which covers all aspects of short read sequence data preprocessing. It was designed for RNA-seq data to meet the needs for fast preprocessing of fastq data to allow transcript and gene counting, but it is suitable to process any short read sequencing data of which high sequence quality is needed, such as for genome assembly or SNV (single nucleotide variant) detection. FastqPuri is most flexible in filtering undesired biological sequences by offering two approaches to optimize speed and memory usage dependent on the total size of the potential contaminating sequences. FastqPuri is available at https://github.com/jengelmann/FastqPuri. It is implemented in C and R and licensed under GPL v3.
Affiliation(s)
- Paula Pérez-Rubio
  - Statistical Bioinformatics, Institute of Functional Genomics, University of Regensburg, Am BioPark 9, Regensburg, 93053, Germany
- Claudio Lottaz
  - Statistical Bioinformatics, Institute of Functional Genomics, University of Regensburg, Am BioPark 9, Regensburg, 93053, Germany
- Julia C Engelmann
  - Department of Marine Microbiology and Biogeochemistry, NIOZ Royal Netherlands Institute for Sea Research and Utrecht University, P.O. Box 59, Den Burg, 1790 AB, The Netherlands
50
Sangiovanni M, Granata I, Thind AS, Guarracino MR. From trash to treasure: detecting unexpected contamination in unmapped NGS data. BMC Bioinformatics 2019; 20:168. [PMID: 30999839 PMCID: PMC6472186 DOI: 10.1186/s12859-019-2684-x] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Indexed: 02/08/2023] Open
Abstract
Background: Next Generation Sequencing (NGS) experiments produce millions of short sequences that, mapped to a reference genome, provide biological insights at genomic, transcriptomic and epigenomic level. Typically the amount of reads that correctly maps to the reference genome ranges between 70% and 90%, leaving in some cases a consistent fraction of unmapped sequences. This ‘misalignment’ can be ascribed to low quality bases or sequence differences between the sample reads and the reference genome. Investigating the source of the unmapped reads is definitely important to better assess the quality of the whole experiment and to check for possible downstream or upstream ‘contamination’ from exogenous nucleic acids.
Results: Here we propose DecontaMiner, a tool to unravel the presence of contaminating sequences among the unmapped reads. It uses a subtraction approach to identify bacteria, fungi and viruses genome contamination. DecontaMiner generates several output files to track all the processed reads, and to provide a complete report of their characteristics. The good quality matches on microorganism genomes are counted and compared among samples. DecontaMiner builds an offline HTML page containing summary statistics and plots. The latter are obtained using the state-of-the-art D3 javascript libraries. DecontaMiner has been mainly used to detect contamination in human RNA-Seq data. The software is freely available at http://www-labgtp.na.icar.cnr.it/decontaminer.
Conclusions: DecontaMiner is a tool designed and developed to investigate the presence of contaminating sequences in unmapped NGS data. It can suggest the presence of contaminating organisms in sequenced samples, that might derive either from laboratory contamination or from their biological source, and in both cases can be considered as worthy of further investigation and experimental validation. The novelty of DecontaMiner is mainly represented by its easy integration with the standard procedures of NGS data analysis, while providing a complete, reliable, and automatic pipeline.
Affiliation(s)
- Mara Sangiovanni
  - Stazione Zoologica Anton Dohrn, Villa Comunale, Napoli, 80121, Italy
- Ilaria Granata
  - High Performance Computing and Networking Institute, National Research Council of Italy, Via P. Castellino, 111, Napoli, 80131, Italy
- Amarinder Singh Thind
  - High Performance Computing and Networking Institute, National Research Council of Italy, Via P. Castellino, 111, Napoli, 80131, Italy
- Mario Rosario Guarracino
  - High Performance Computing and Networking Institute, National Research Council of Italy, Via P. Castellino, 111, Napoli, 80131, Italy