1
Wang Q, Cole JR. Updated RDP taxonomy and RDP Classifier for more accurate taxonomic classification. Microbiol Resour Announc 2024; 13:e0106323. [PMID: 38436268 PMCID: PMC11008197 DOI: 10.1128/mra.01063-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Accepted: 02/13/2024] [Indexed: 03/05/2024]
Abstract
The RDP Classifier is one of the most popular machine learning approaches for taxonomic classification due to its robustness and relatively high accuracy. Both the RDP taxonomy and RDP Classifier have been updated to incorporate newly described taxa and recent changes to prokaryotic nomenclature.
Affiliation(s)
- Qiong Wang
  - Health & Biosciences, International Flavors & Fragrances, Inc., Wilmington, Delaware, USA
- James R. Cole
  - Department of Plant, Soil and Microbial Sciences, Center for Microbial Ecology, Michigan State University, East Lansing, Michigan, USA
2
Qing W, Shi Y, Chen R, Zou Y, Qi C, Zhang Y, Zhou Z, Li S, Hou Y, Zhou H, Chen M. Species-level resolution for the vaginal microbiota with short amplicons. mSystems 2024; 9:e0103923. [PMID: 38275296 PMCID: PMC10878104 DOI: 10.1128/msystems.01039-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Accepted: 12/15/2023] [Indexed: 01/27/2024]
Abstract
Specific bacterial species have been found to play important roles in the human vagina. Achieving high species-level resolution is vital for analyzing vaginal microbiota data. However, different methodological studies have yielded contradictory conclusions. A more comprehensive evaluation is needed to determine an optimal pipeline for vaginal microbiota. Based on the sequences of vaginal bacterial species downloaded from NCBI, we conducted simulated amplification with various primer sets targeting different 16S regions, as well as taxonomic classification of the amplicons using different combinations of algorithms (BLAST+, VSEARCH, and Sklearn) and reference databases (Greengenes2, SILVA, and RDP). Vaginal swabs were collected from participants with different vaginal microecology to construct 16S full-length sequenced mock communities. Both computational and experimental amplifications were performed on the mock samples. Classification accuracy of each pipeline was determined. Microbial profiles were compared between the full-length and partial 16S sequencing samples. The optimal pipeline was further validated in a multicenter cohort against the PCR results of common STI pathogens. Pipeline V1-V3_Sklearn_Combined had the highest accuracy for classifying the amplicons generated from both the NCBI downloaded data (84.20% ± 2.39%) and the full-length sequencing data (95.65% ± 3.04%). Vaginal samples amplified and sequenced targeting the V1-V3 region, using only the forward reads (223 bp) and classified with the optimal pipeline, resembled the mock communities most closely. The pipeline demonstrated high F1-scores for detecting STI pathogens within the validation cohort. We have determined an optimal pipeline to achieve high species-level resolution for vaginal microbiota with short amplicons, which will facilitate future studies.
IMPORTANCE: For vaginal microbiota studies, diverse 16S rRNA gene regions have been used for amplification and sequencing, which affects the comparability between different studies as well as the species-level resolution of taxonomic classification. We conducted a comprehensive evaluation of the methods that influence the accuracy of taxonomic classification and established an optimal pipeline to achieve high species-level resolution for vaginal microbiota with short amplicons, which will facilitate future studies.
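The abstract reports F1-scores for pathogen detection against PCR results. As a reminder of what that metric measures, here is a minimal sketch, assuming per-sample presence/absence calls for a single pathogen; the function name and the example calls are hypothetical and are not taken from the study.

```python
def f1_score(predicted, reference):
    """Return (precision, recall, F1) for paired boolean calls.

    predicted: pipeline calls (True = pathogen detected)
    reference: PCR results treated as the ground truth
    """
    tp = sum(1 for p, r in zip(predicted, reference) if p and r)
    fp = sum(1 for p, r in zip(predicted, reference) if p and not r)
    fn = sum(1 for p, r in zip(predicted, reference) if not p and r)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical calls for five samples (illustrative only, not study data).
pipeline_calls = [True, True, False, True, False]
pcr_results = [True, False, False, True, True]

precision, recall, f1 = f1_score(pipeline_calls, pcr_results)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

F1 is the harmonic mean of precision and recall, so it stays low unless the pipeline both avoids false detections and recovers most PCR-positive samples.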
Affiliation(s)
- Wei Qing
  - Microbiome Medicine Center, Division of Laboratory Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, China
- Yiya Shi
  - Microbiome Medicine Center, Division of Laboratory Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, China
  - Department of Medical Laboratory, The Central Hospital of Wuhan, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Rongdan Chen
  - Microbiome Medicine Center, Division of Laboratory Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, China
- Yin'ai Zou
  - Microbiome Medicine Center, Division of Laboratory Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, China
- Cancan Qi
  - Microbiome Medicine Center, Division of Laboratory Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, China
- Yingxuan Zhang
  - Microbiome Medicine Center, Division of Laboratory Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, China
- Zuyi Zhou
  - Microbiome Medicine Center, Division of Laboratory Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, China
- Shanshan Li
  - Microbiome Medicine Center, Division of Laboratory Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, China
- Yi Hou
  - Microbiome Medicine Center, Division of Laboratory Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, China
  - Department of Medical Laboratory, Shenzhen People’s Hospital, The Second Clinical Medical College of Jinan University, The First Affiliated Hospital of South University of Science and Technology, Shenzhen, Guangdong, China
- Hongwei Zhou
  - Microbiome Medicine Center, Division of Laboratory Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, China
- Muxuan Chen
  - Microbiome Medicine Center, Division of Laboratory Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, China
3
Muralidharan HS, Fox NY, Pop M. The impact of transitive annotation on the training of taxonomic classifiers. Front Microbiol 2024; 14:1240957. [PMID: 38235435 PMCID: PMC10792039 DOI: 10.3389/fmicb.2023.1240957] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 11/03/2023] [Indexed: 01/19/2024]
Abstract
Introduction: A common task in the analysis of microbial communities involves assigning taxonomic labels to the sequences derived from organisms found in the communities. Frequently, such labels are assigned using machine learning algorithms that are trained to recognize individual taxonomic groups based on training data sets that comprise sequences with known taxonomic labels. Ideally, the training data should rely on labels that are experimentally verified; formal taxonomic labels require knowledge of physical and biochemical properties of organisms that cannot be directly inferred from sequence alone. However, the labels associated with sequences in biological databases are most commonly computational predictions, which themselves may rely on computationally generated data, a process commonly referred to as "transitive annotation." Methods: In this manuscript we explore the implications of training a machine learning classifier (the Ribosomal Database Project's Bayesian classifier in our case) on data that itself has been computationally generated. We generate new training examples based on 16S rRNA data from a metagenomic experiment, and evaluate the extent to which the taxonomic labels predicted by the classifier change after re-training. Results: We demonstrate that even a few computationally generated training data points can significantly skew the output of the classifier, to the point where entire regions of the taxonomic space can be disturbed. Discussion and conclusions: We conclude with a discussion of key factors that affect the resilience of classifiers to transitively annotated training data, and propose best practices to avoid the artifacts described in our paper.
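To make the effect described in this abstract concrete, the sketch below implements a toy word-based naive Bayes classifier in the spirit of the RDP Classifier and shows how a single computationally labeled training read can flip a prediction. It is an illustration under stated assumptions, not the authors' experiment: the sequences, genus names, and word size `K = 4` are hypothetical (the RDP Classifier uses 8-mers and adds bootstrap confidence estimation, which is omitted here).

```python
import math
from collections import defaultdict

K = 4  # toy word size (assumption); the RDP Classifier uses 8-mers


def words(seq, k=K):
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def train(labeled):
    """Count word occurrences overall and per genus.

    labeled: list of (sequence, genus) pairs.
    """
    word_count = defaultdict(int)                        # n(w): sequences containing w
    genus_word = defaultdict(lambda: defaultdict(int))   # m(w): per-genus counts
    genus_size = defaultdict(int)                        # M: sequences per genus
    for seq, genus in labeled:
        genus_size[genus] += 1
        for w in words(seq):
            word_count[w] += 1
            genus_word[genus][w] += 1
    return len(labeled), word_count, genus_word, genus_size


def classify(model, seq):
    """Return the genus with the highest summed log word probability."""
    n_total, word_count, genus_word, genus_size = model
    best, best_score = None, -math.inf
    for genus, m_total in genus_size.items():
        score = 0.0
        for w in words(seq):
            # Word prior and genus-conditional probability, smoothed as in
            # the word-based naive Bayes formulation.
            prior = (word_count.get(w, 0) + 0.5) / (n_total + 1)
            score += math.log((genus_word[genus].get(w, 0) + prior) / (m_total + 1))
        if score > best_score:
            best, best_score = genus, score
    return best


# Hypothetical curated reference set (illustrative sequences only).
reference = [
    ("ACGTACGTACGTACGT", "GenusA"),
    ("TTGGCCAATTGGCCAA", "GenusB"),
]
# A read whose label came from a computational prediction, not an experiment.
transitive = [("ACGTACGTAGGCTTAA", "GenusB")]
query = "TACGTACGTAGGCTTA"

before = classify(train(reference), query)
after = classify(train(reference + transitive), query)
print(before, after)
```

In this toy setup the query shares words with the curated GenusA reference, so it is assigned to GenusA before retraining; once the computationally labeled read is added, the words it shares with the query outweigh that evidence and the assignment changes to GenusB.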
Affiliation(s)
- Harihara Subrahmaniam Muralidharan
  - Department of Computer Science, University of Maryland, College Park, MD, United States
  - Center for Bioinformatics and Computational Biology (CBCB), University of Maryland, College Park, MD, United States
- Noam Y. Fox
  - Department of Computer Science, University of Maryland, College Park, MD, United States
- Mihai Pop
  - Department of Computer Science, University of Maryland, College Park, MD, United States
  - Center for Bioinformatics and Computational Biology (CBCB), University of Maryland, College Park, MD, United States
4
Hall MB, Coin LJM. Pangenome databases improve host removal and mycobacteria classification from clinical metagenomic data. Gigascience 2024; 13:giae010. [PMID: 38573185 PMCID: PMC10993716 DOI: 10.1093/gigascience/giae010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 01/10/2024] [Accepted: 02/27/2024] [Indexed: 04/05/2024]
Abstract
BACKGROUND: Culture-free real-time sequencing of clinical metagenomic samples promises both rapid pathogen detection and antimicrobial resistance profiling. However, this approach introduces the risk of patient DNA leakage. To mitigate this risk, we need near-comprehensive removal of human DNA sequences at the point of sequencing, typically involving the use of resource-constrained devices. Existing benchmarks have largely focused on the use of standardized databases and largely ignored the computational requirements of depletion pipelines as well as the impact of human genome diversity. RESULTS: We benchmarked host removal pipelines on simulated and artificial real Illumina and Nanopore metagenomic samples. We found that construction of a custom kraken database containing diverse human genomes results in the best balance of accuracy and computational resource usage. In addition, we benchmarked pipelines using kraken and minimap2 for taxonomic classification of Mycobacterium reads using standard and custom databases. With a database representative of the Mycobacterium genus, both tools obtained improved specificity and sensitivity, compared to the standard databases for classification of Mycobacterium tuberculosis. Computational efficiency of these custom databases was superior to most standard approaches, allowing them to be executed on a laptop device. CONCLUSIONS: Customized pangenome databases provide the best balance of accuracy and computational efficiency when compared to standard databases for the task of human read removal and M. tuberculosis read classification from metagenomic samples. Such databases allow for execution on a laptop, without sacrificing accuracy, an especially important consideration in low-resource settings. We make all customized databases and pipelines freely available.
Affiliation(s)
- Michael B Hall
  - Department of Microbiology and Immunology, Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, 3000 Victoria, Australia
- Lachlan J M Coin
  - Department of Microbiology and Immunology, Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, 3000 Victoria, Australia
5
Román-Camacho JJ, García-García I, Santos-Dueñas IM, García-Martínez T, Mauricio JC. Latest Trends in Industrial Vinegar Production and the Role of Acetic Acid Bacteria: Classification, Metabolism, and Applications-A Comprehensive Review. Foods 2023; 12:3705. [PMID: 37835358 PMCID: PMC10572879 DOI: 10.3390/foods12193705] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/08/2023] [Revised: 10/03/2023] [Accepted: 10/06/2023] [Indexed: 10/15/2023]
Abstract
Vinegar is one of the most appreciated fermented foods in European and Asian countries. In industry, its elaboration depends on numerous factors, including the nature of starter culture and raw material, as well as the production system and operational conditions. Furthermore, vinegar is obtained by the action of acetic acid bacteria (AAB) on an alcoholic medium in which ethanol is transformed into acetic acid. Besides the highlighted oxidative metabolism of AAB, their versatility and metabolic adaptability make them a taxonomic group with several biotechnological uses. Due to new and rapid advances in this field, this review attempts to approach the current state of knowledge by firstly discussing fundamental aspects related to industrial vinegar production and then exploring aspects related to AAB: classification, metabolism, and applications. Emphasis has been placed on an exhaustive taxonomic review considering the progressive increase in the number of new AAB species and genera, especially those with recognized biotechnological potential.
Affiliation(s)
- Juan J. Román-Camacho
  - Department of Agricultural Chemistry, Edaphology and Microbiology, Agrifood Campus of International Excellence ceiA3, University of Córdoba, 14014 Córdoba, Spain
- Isidoro García-García
  - Department of Inorganic Chemistry and Chemical Engineering, Agrifood Campus of International Excellence ceiA3, Nano Chemistry Institute (IUNAN), University of Córdoba, 14014 Córdoba, Spain
- Inés M. Santos-Dueñas
  - Department of Inorganic Chemistry and Chemical Engineering, Agrifood Campus of International Excellence ceiA3, Nano Chemistry Institute (IUNAN), University of Córdoba, 14014 Córdoba, Spain
- Teresa García-Martínez
  - Department of Agricultural Chemistry, Edaphology and Microbiology, Agrifood Campus of International Excellence ceiA3, University of Córdoba, 14014 Córdoba, Spain
- Juan C. Mauricio
  - Department of Agricultural Chemistry, Edaphology and Microbiology, Agrifood Campus of International Excellence ceiA3, University of Córdoba, 14014 Córdoba, Spain
6
Sun S, Zhang X. Corrigendum: Genetic characteristics and integration specificity of Salmonella enterica temperate phages. Front Microbiol 2023; 14:1273462. [PMID: 37795299 PMCID: PMC10545893 DOI: 10.3389/fmicb.2023.1273462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Accepted: 09/04/2023] [Indexed: 10/06/2023]
Abstract
[This corrects the article DOI: 10.3389/fmicb.2023.1199843.].
Affiliation(s)
- Siqi Sun
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, China
  - Department of Life Sciences and Technology, Beijing University of Chemical Technology, Chaoyang, Beijing, China
- Xianglilan Zhang
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, China
7
Virtanen J, Hautaniemi M, Dutra L, Plyusnin I, Hautala K, Smura T, Vapalahti O, Sironen T, Kant R, Kinnunen PM. Partial Genome Characterization of Novel Parapoxvirus in Horse, Finland. Emerg Infect Dis 2023; 29:1941-1944. [PMID: 37610155 PMCID: PMC10461679 DOI: 10.3201/eid2909.230049] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 08/24/2023]
Abstract
We report a sequencing protocol and 121-kb poxvirus sequence from a clinical sample from a horse in Finland with dermatitis. Based on phylogenetic analyses, the virus is a novel parapoxvirus associated with a recent epidemic; previous data suggest zoonotic potential. Increased awareness of this virus and specific diagnostic protocols are needed.
8
Sun S, Zhang X. Genetic characteristics and integration specificity of Salmonella enterica temperate phages. Front Microbiol 2023; 14:1199843. [PMID: 37593543 PMCID: PMC10428622 DOI: 10.3389/fmicb.2023.1199843] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2023] [Accepted: 07/12/2023] [Indexed: 08/19/2023]
Abstract
Introduction: Temperate phages can engage in the horizontal transfer of functional genes to their bacterial hosts. Thus, their genetic material becomes an intimate part of bacterial genomes and plays essential roles in bacterial mutation and evolution. Specifically, temperate phages can naturally transmit genes by integrating their genomes into the bacterial host genomes via integrases. Our previous study showed that Salmonella enterica contains the largest number of temperate phages among all publicly available bacterial species. S. enterica is an important pathogen that can cause serious systemic infections and even fatalities. Methods: Initially, we extracted all S. enterica temperate phages from the extensively developed temperate phage database established in our previous study. Subsequently, we conducted an in-depth analysis of the genetic characteristics and integration specificity exhibited by these S. enterica temperate phages. Results: Here we identified 8,777 S. enterica temperate phages, all of which have integrases in their genomes. We found 491 non-redundant S. enterica temperate phage integrases (integrase entries). S. enterica temperate phage integrases were classified into three types: intA, intS, and phiRv2. Correlation analysis showed that the sequence lengths of S. enterica integrase and core regions of attB and attP were strongly correlated. Further phylogenetic analysis and taxonomic classification indicated that both the S. enterica temperate phage genomes and the integrase gene sequences were highly diverse. Discussion: Our work provides insight into the essential integration specificity and genetic diversity of S. enterica temperate phages. This study paves the way for a better understanding of the interactions between phages and S. enterica. By analyzing a large number of S. enterica temperate phages and their integrases, we provide valuable insights into the genetic diversity and prevalence of these elements. This knowledge has important implications for developing targeted therapeutic interventions, such as phage therapy, to combat S. enterica infections. Temperate phages can be engineered, or used in phage cocktails, to harness their lytic capabilities and specifically target and eradicate S. enterica strains, offering an alternative or complementary approach to traditional antibiotic treatments. Our study has implications for public health and holds potential significance in combating clinical infections caused by S. enterica.
Affiliation(s)
- Siqi Sun
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, China
  - Department of Life Sciences and Technology, Beijing University of Chemical Technology, Chaoyang, Beijing, China
- Xianglilan Zhang
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, China
9
Márquez-Villa JM, Rodríguez-Sierra JC, Amtanus Chequer N, Cob-Calan NN, García-Maldonado JQ, Cadena S, Hernández-Núñez E. Phenanthrene Degradation by Photosynthetic Bacterial Consortium Dominated by Fischerella sp. Life (Basel) 2023; 13:1108. [PMID: 37240753 DOI: 10.3390/life13051108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2023] [Revised: 04/20/2023] [Accepted: 04/27/2023] [Indexed: 05/28/2023]
Abstract
Microbial degradation of aromatic hydrocarbons is an emerging technology that is well recognized for its cost-effectiveness, efficiency, and safety; however, it remains little explored, and greater emphasis on cyanobacteria-bacterial mutualistic interactions is needed. We evaluated and characterized the phenanthrene biodegradation capacity of a consortium dominated by Fischerella sp. under holoxenic conditions with aerobic heterotrophic bacteria, and identified its members through 16S rRNA Illumina sequencing. Results indicated that our microbial consortium can degrade up to 92% of phenanthrene in five days. Bioinformatic analyses revealed that the consortium was dominated by Fischerella sp.; however, different members of Nostocaceae and Weeksellaceae, as well as several other bacteria, such as Chryseobacterium and Porphyrobacter, were found to be putatively involved in the biological degradation of phenanthrene. This work contributes to a better understanding of phenanthrene biodegradation by cyanobacteria and identifies the associated microbial diversity.
Affiliation(s)
- Nayem Amtanus Chequer
  - Department of Marine Resources, Centro de Investigación y de Estudios Avanzados del IPN, Merida 97310, Yucatan, Mexico
- Nubia Noemí Cob-Calan
  - Instituto Tecnológico Superior de Calkiní en el Estado de Campeche, Calkiní 24900, Campeche, Mexico
- Santiago Cadena
  - Department of Marine Resources, Centro de Investigación y de Estudios Avanzados del IPN, Merida 97310, Yucatan, Mexico
- Emanuel Hernández-Núñez
  - Department of Marine Resources, Centro de Investigación y de Estudios Avanzados del IPN, Merida 97310, Yucatan, Mexico
10
Cosic A, Leitner E, Petternel C, Galler H, Reinthaler FF, Herzog-Obereder KA, Tatscher E, Raffl S, Feierl G, Högenauer C, Zechner EL, Kienesberger S. Corrigendum: Variation in accessory genes within the Klebsiella oxytoca species complex delineates monophyletic members and simplifies coherent genotyping. Front Microbiol 2023; 14:1155851. [PMID: 36960282 PMCID: PMC10028735 DOI: 10.3389/fmicb.2023.1155851] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Accepted: 02/22/2023] [Indexed: 03/09/2023]
Abstract
[This corrects the article DOI: 10.3389/fmicb.2021.692453.].
Affiliation(s)
- Amar Cosic
  - Institute of Molecular Biosciences, University of Graz, Graz, Austria
  - BioTechMed-Graz, Graz, Austria
- Eva Leitner
  - Diagnostic and Research Institute of Hygiene, Microbiology and Environmental Medicine, Medical University of Graz, Graz, Austria
- Christian Petternel
  - Diagnostic and Research Institute of Hygiene, Microbiology and Environmental Medicine, Medical University of Graz, Graz, Austria
- Herbert Galler
  - Diagnostic and Research Institute of Hygiene, Microbiology and Environmental Medicine, Medical University of Graz, Graz, Austria
- Franz F. Reinthaler
  - Diagnostic and Research Institute of Hygiene, Microbiology and Environmental Medicine, Medical University of Graz, Graz, Austria
- Kathrin A. Herzog-Obereder
  - Division of Gastroenterology and Hepatology, Department of Internal Medicine, Medical University of Graz, Graz, Austria
- Elisabeth Tatscher
  - Division of Gastroenterology and Hepatology, Department of Internal Medicine, Medical University of Graz, Graz, Austria
- Sandra Raffl
  - Institute of Molecular Biosciences, University of Graz, Graz, Austria
- Gebhard Feierl
  - Diagnostic and Research Institute of Hygiene, Microbiology and Environmental Medicine, Medical University of Graz, Graz, Austria
- Christoph Högenauer
  - BioTechMed-Graz, Graz, Austria
  - Division of Gastroenterology and Hepatology, Department of Internal Medicine, Medical University of Graz, Graz, Austria
- Ellen L. Zechner
  - Institute of Molecular Biosciences, University of Graz, Graz, Austria
  - BioTechMed-Graz, Graz, Austria
  - Field of Excellence BioHealth, University of Graz, Graz, Austria
- Sabine Kienesberger
  - Institute of Molecular Biosciences, University of Graz, Graz, Austria
  - BioTechMed-Graz, Graz, Austria
  - Field of Excellence BioHealth, University of Graz, Graz, Austria
- *Correspondence: Sabine Kienesberger
11
Shen-Gunther J, Cai H, Wang Y. A Customized Monkeypox Virus Genomic Database (MPXV DB v1.0) for Rapid Sequence Analysis and Phylogenomic Discoveries in CLC Microbial Genomics. Viruses 2022; 15:40. [PMID: 36680080 PMCID: PMC9861985 DOI: 10.3390/v15010040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Accepted: 12/20/2022] [Indexed: 12/25/2022]
Abstract
Monkeypox has been a neglected, zoonotic tropical disease for over 50 years. Since the 2022 global outbreak, hundreds of human clinical samples have been subjected to next-generation sequencing (NGS) worldwide with raw data deposited in public repositories. However, sequence analysis for in-depth investigation of viral evolution remains hindered by the lack of a curated, whole genome Monkeypox virus (MPXV) database (DB) and efficient bioinformatics pipelines. To address this, we developed a customized MPXV DB for integration with "ready-to-use" workflows in the CLC Microbial Genomics Module for whole genomic and metagenomic analysis. After database construction (218 MPXV genomes), whole genome alignment, pairwise comparison, and evolutionary analysis of all genomes were analyzed to autogenerate tabular outputs and visual displays (collective runtime: 16 min). The clinical utility of the MPXV DB was demonstrated by using a Chimpanzee fecal, hybrid-capture NGS dataset (publicly available) for metagenomic, phylogenomic, and viral/host integration analysis. The clinically relevant MPXV DB embedded in CLC workflows proved to be a rapid method of sequence analysis useful for phylogenomic exploration and a wide range of applications in translational science.
Affiliation(s)
- Jane Shen-Gunther
  - Department of Clinical Investigation, Gynecologic Oncology & Clinical Investigation, Brooke Army Medical Center, Fort Sam Houston, TX 78234, USA
- Hong Cai
  - Department of Molecular Microbiology and Immunology, University of Texas at San Antonio, San Antonio, TX 78249, USA
  - South Texas Center for Emerging Infectious Diseases, University of Texas at San Antonio, San Antonio, TX 78249, USA
- Yufeng Wang
  - Department of Molecular Microbiology and Immunology, University of Texas at San Antonio, San Antonio, TX 78249, USA
  - South Texas Center for Emerging Infectious Diseases, University of Texas at San Antonio, San Antonio, TX 78249, USA
12
Govender KN, Eyre DW. Benchmarking taxonomic classifiers with Illumina and Nanopore sequence data for clinical metagenomic diagnostic applications. Microb Genom 2022; 8. [PMID: 36269282 PMCID: PMC9676057 DOI: 10.1099/mgen.0.000886] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
Culture-independent metagenomic detection of microbial species has the potential to provide rapid and precise real-time diagnostic results. However, it is potentially limited by sequencing and taxonomic classification errors. We use simulated and real-world data to benchmark rates of species misclassification using 100 reference genomes for each of the ten common bloodstream pathogens and six frequent blood-culture contaminants (n=1568, only 68 genomes were available for Micrococcus luteus). Simulating both with and without sequencing error for both the Illumina and Oxford Nanopore platforms, we evaluated commonly used classification tools including Kraken2, Bracken and Centrifuge, utilizing mini (8 GB) and standard (30–50 GB) databases. Bracken with the standard database performed best, the median percentage of reads across both sequencing platforms identified correctly to the species level was 97.8% (IQR 92.7:99.0) [range 5:100]. For Kraken2 with a mini database, a commonly used combination, median species-level identification was 86.4% (IQR 50.5:93.7) [range 4.3:100]. Classification performance varied by species, with Escherichia coli being more challenging to classify correctly (probability of reads being assigned to the correct species: 56.1–96.0%, varying by tool used). Human read misclassification was negligible. By filtering out shorter Nanopore reads we found performance similar or superior to Illumina sequencing, despite higher sequencing error rates. Misclassification was more common when the misclassified species had a higher average nucleotide identity to the true species. Our findings highlight taxonomic misclassification of sequencing data occurs and varies by sequencing and analysis workflow. To account for ‘bioinformatic contamination’ we present a contamination catalogue that can be used in metagenomic pipelines to ensure accurate results that can support clinical decision making.
Affiliation(s)
- Kumeren N Govender
  - Nuffield Department of Medicine, John Radcliffe Hospital, University of Oxford, Oxford, UK
- David W Eyre
  - Nuffield Department of Medicine, John Radcliffe Hospital, University of Oxford, Oxford, UK
  - Big Data Institute, Nuffield Department of Population Health, University of Oxford, Oxford, UK
13
Mock F, Kretschmer F, Kriese A, Böcker S, Marz M. Taxonomic classification of DNA sequences beyond sequence similarity using deep neural networks. Proc Natl Acad Sci U S A 2022; 119:e2122636119. [PMID: 36018838 DOI: 10.1073/pnas.2122636119] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 11/18/2022]
Abstract
Taxonomic classification, that is, the assignment to biological clades with shared ancestry, is a common task in genetics, mainly based on a genome similarity search of large genome databases. The classification quality depends heavily on the database, since representative relatives must be present. Many genomic sequences cannot be classified at all or only with a high misclassification rate. Here we present BERTax, a deep neural network program based on natural language processing to precisely classify the superkingdom and phylum of DNA sequences taxonomically without the need for a known representative relative from a database. We show BERTax to be at least on par with the state-of-the-art approaches when taxonomically similar species are part of the training data. For novel organisms, however, BERTax clearly outperforms any existing approach. Finally, we show that BERTax can also be combined with database approaches to further increase the prediction quality in almost all cases. Since BERTax is not based on similar entries in databases, it allows precise taxonomic classification of a broader range of genomic sequences, thus increasing the overall information gain.
|
14
|
Mastriani E, Bienes KM, Wong G, Berthet N. PIMGAVir and Vir-MinION: Two Viral Metagenomic Pipelines for Complete Baseline Analysis of 2nd and 3rd Generation Data. Viruses 2022; 14:1260. [PMID: 35746732 DOI: 10.3390/v14061260] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/29/2022] [Revised: 05/31/2022] [Accepted: 06/03/2022] [Indexed: 11/16/2022] Open
Abstract
The taxonomic classification of viral sequences is frequently used for the rapid identification of pathogens, which is critical when a viral outbreak occurs. Both Oxford Nanopore Technologies (ONT) MinION and Illumina next-generation sequencing (NGS) technology provide efficient methods to detect viral pathogens. Despite the availability of many strategies and software tools, matching them can be a very tedious and time-consuming task. As a result, we developed PIMGAVir and Vir-MinION, two metagenomics pipelines that automatically provide the user with a complete baseline analysis. The PIMGAVir and Vir-MinION pipelines work on 2nd and 3rd generation data, respectively, and provide the user with a taxonomic classification of the reads through three strategies: assembly-based, read-based, and clustering-based. The pipelines supply the scientist with comprehensive results in graphical and textual format for future analyses. Finally, the pipelines equip the user with a stand-alone platform with dedicated and varied viral databases, which is a requirement for working in field conditions without an internet connection.
|
15
|
Matiz-Ceron L, Reyes A, Anzola J. Taxonomical Evaluation of Plant Chloroplastic Markers by Bayesian Classifier. Front Plant Sci 2022; 12:782663. [PMID: 35185949 PMCID: PMC8850773 DOI: 10.3389/fpls.2021.782663] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2021] [Accepted: 12/29/2021] [Indexed: 06/14/2023]
Abstract
DNA barcodes are standardized sequences that range between 400 and 800 bp, vary at different taxonomic levels, and make it possible to assign sequences to species that have been previously taxonomically characterized. Several DNA barcodes have been postulated for plants; nonetheless, their classification potential has not been evaluated for metabarcoding, and as a result, none of them appears to excel above the others in this area. One tool that has been widely used and has served as a baseline when evaluating new approaches is the Naïve Bayesian Classifier (NBC). The present study aims at evaluating the classification power of several plant chloroplast genetic markers that have been proposed as barcodes (trnL, rpoB, rbcL, matK, psbA-trnH, and psbK) using an NBC. We performed the classification at different taxonomic levels and identified problematic genera when resolution was desired. We propose matK and trnL as potential candidate markers with resolution up to genus level. Some problematic genera within certain families could lead to misclassification no matter which marker is used (i.e., Aegilops, Gueldenstaedtia, Helianthus, Oryza, Shorea, Thysananthus, and Triticum). Finally, we suggest recommendations for the taxonomic identification of plants in samples with potential mixtures.
Affiliation(s)
- Luisa Matiz-Ceron
- Research Group in Computational Biology and Microbial Ecology, Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia
- Max Planck Tandem Group in Computational Biology, Universidad de los Andes, Bogotá, Colombia
| | - Alejandro Reyes
- Research Group in Computational Biology and Microbial Ecology, Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia
- Max Planck Tandem Group in Computational Biology, Universidad de los Andes, Bogotá, Colombia
| | - Juan Anzola
- Research Group in Computational Biology and Microbial Ecology, Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia
- Max Planck Tandem Group in Computational Biology, Universidad de los Andes, Bogotá, Colombia
- Department of Engineering and Natural Sciences, Universidad Central, Bogotá, Colombia
| |
|
16
|
Somenahally AC, Loeppert RH, Zhou J, Gentry TJ. Niche Differentiation of Arsenic-Transforming Microbial Groups in the Rice Rhizosphere Compartments as Impacted by Water Management and Soil-Arsenic Concentrations. Front Microbiol 2021; 12:736751. [PMID: 34803950 PMCID: PMC8602891 DOI: 10.3389/fmicb.2021.736751] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/05/2021] [Accepted: 10/06/2021] [Indexed: 12/02/2022] Open
Abstract
Arsenic (As) bioavailability in the rice rhizosphere is influenced by many microbial interactions, particularly by metal-transforming functional groups at the root-soil interface. This study was conducted to examine As-transforming microbes and As-speciation in the rice rhizosphere compartments, in response to two different water management practices (continuous and intermittently flooded), established on fields with high to low soil-As concentration. Microbial functional gene composition in the rhizosphere and root-plaque compartments was characterized using the GeoChip 4.0 microarray. Arsenic speciation and concentrations were analyzed in the rhizosphere soil, root-plaque, pore water, and grain samples. Results confirmed several As-biotransformation processes in the rice rhizosphere compartments, and a distinct assemblage of As-reducing and methylating bacteria was observed between the root-plaque and rhizosphere. Results confirmed higher potential for microbial As-reduction and As-methylation in continuously flooded, long term As-contaminated fields, which accumulated the highest concentrations of AsIII and methyl-As in pore water and rice grains. Water management treatment significantly altered As-speciation in the rhizosphere, and intermittent flooding reduced methyl-As and AsIII concentrations in the pore water, root-plaque and rice grain. Ordination and taxonomic analysis of detected gene-probes indicated that root-plaque and rhizosphere assembled significantly different microbial functional groups demonstrating niche separation. Taxonomic non-redundancy was evident, suggesting that As-reduction, -oxidation and -methylation processes were performed by different microbial functional groups. It was also evident that As transformation was coupled to different biogeochemical cycling processes (nutrient assimilation, carbon metabolism etc.) in the compartments and between treatments, revealing functional non-redundancy of rice-rhizosphere microbiome in response to local biogeochemical conditions and As contamination. This study provided novel insights on As-biotransformation processes and their implications on As-chemistry at the root-soil interface and their responses to water management, which could be applied for mitigating As-bioavailability and accumulation in rice grains.
Affiliation(s)
- Anil C Somenahally
- Texas A&M AgriLife Research, Overton, TX, United States.,Department of Soil and Crop Sciences, Texas A&M University, College Station, TX, United States
| | - Richard H Loeppert
- Department of Soil and Crop Sciences, Texas A&M University, College Station, TX, United States
| | - Jizhong Zhou
- Institute for Environmental Genomics, University of Oklahoma, Norman, OK, United States
| | - Terry J Gentry
- Department of Soil and Crop Sciences, Texas A&M University, College Station, TX, United States
| |
|
17
|
Hoffman C, Siddiqui NY, Fields I, Gregory WT, Simon HM, Mooney MA, Wolfe AJ, Karstens L. Species-Level Resolution of Female Bladder Microbiota from 16S rRNA Amplicon Sequencing. mSystems 2021; 6:e0051821. [PMID: 34519534 PMCID: PMC8547459 DOI: 10.1128/msystems.00518-21] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 04/29/2021] [Accepted: 08/18/2021] [Indexed: 01/04/2023] Open
Abstract
The human bladder contains bacteria, even in the absence of infection. Interest in studying these bacteria and their association with bladder conditions is increasing. However, the chosen experimental method can limit the resolution of the taxonomy that can be assigned to the bacteria found in the bladder. 16S rRNA amplicon sequencing is commonly used to identify bacteria in urinary specimens, but it is typically restricted to genus-level identification. Our primary aim here was to determine if accurate species-level identification of bladder bacteria is possible using 16S rRNA amplicon sequencing. We evaluated the ability of different classification schemes, each consisting of combinations of a reference database, a 16S rRNA gene variable region, and a taxonomic classification algorithm to correctly classify bladder bacteria. We show that species-level identification is possible and that the reference database chosen is the most important component, followed by the 16S variable region sequenced. IMPORTANCE Accurate species-level identification from culture-independent techniques is of importance for microbial niches that are less well characterized, such as that of the bladder. 16S rRNA amplicon sequencing, a common culture-independent way to identify bacteria, is often critiqued for lacking species-level resolution. Here, we extensively evaluate classification schemes for species-level bacterial annotation of 16S amplicon data from bladder bacteria. Our results show that the proper choice of taxonomic database and variable region of the 16S rRNA gene sequence makes species level identification possible. We also show that this improvement can be achieved through the more careful application of existing methods and resources. Species-level information may deepen our understanding of associations between bacteria in the bladder and bladder conditions such as lower urinary tract symptoms and urinary tract infections.
Affiliation(s)
- Carter Hoffman
- Division of Bioinformatics and Computational Biomedicine, Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, Oregon, USA
| | - Nazema Y. Siddiqui
- Division of Urogynecology and Reconstructive Pelvic Surgery, Department of Obstetrics and Gynecology, Duke University, Durham, North Carolina, USA
| | - Ian Fields
- Division of Urogynecology, Department of Obstetrics and Gynecology, Oregon Health & Science University, Portland, Oregon, USA
| | - W. Thomas Gregory
- Division of Urogynecology, Department of Obstetrics and Gynecology, Oregon Health & Science University, Portland, Oregon, USA
| | | | - Michael A. Mooney
- Division of Bioinformatics and Computational Biomedicine, Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, Oregon, USA
| | - Alan J. Wolfe
- Department of Microbiology & Immunology, Loyola University Chicago, Maywood, Illinois, USA
| | - Lisa Karstens
- Division of Bioinformatics and Computational Biomedicine, Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, Oregon, USA
- Division of Urogynecology, Department of Obstetrics and Gynecology, Oregon Health & Science University, Portland, Oregon, USA
| |
|
18
|
Maguvu TE, Bezuidenhout CC. Whole Genome Sequencing Based Taxonomic Classification, and Comparative Genomic Analysis of Potentially Human Pathogenic Enterobacter spp. Isolated from Chlorinated Wastewater in the North West Province, South Africa. Microorganisms 2021; 9:1928. [PMID: 34576823 DOI: 10.3390/microorganisms9091928] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/17/2021] [Revised: 09/04/2021] [Accepted: 09/05/2021] [Indexed: 11/17/2022] Open
Abstract
Comparative genomics, in particular, pan-genome analysis, provides an in-depth understanding of the genetic variability and dynamics of a bacterial species. Coupled with whole-genome-based taxonomic analysis, these approaches can help to provide comprehensive, detailed insights into a bacterial species. Here, we report whole-genome-based taxonomic classification and comparative genomic analysis of potential human pathogenic Enterobacter hormaechei subsp. hoffmannii isolated from chlorinated wastewater. Genome Blast Distance Phylogeny (GBDP), digital DNA-DNA hybridization (dDDH), and average nucleotide identity (ANI) confirmed the identity of the isolates. The algorithm PathogenFinder predicted the isolates to be human pathogens with a probability of greater than 0.78. The potential pathogenic nature of the isolates was supported by the presence of biosynthetic gene clusters (BGCs), aerobactin, and aryl polyenes (APEs), which are known to be associated with pathogenic/virulent strains. Moreover, analysis of the genome sequences of the isolates reflected the presence of an arsenal of virulence factors and antibiotic resistance genes that augment the predictions of the algorithm PathogenFinder. The study comprehensively elucidated the genomic features of pathogenic Enterobacter isolates from wastewaters, highlighting the role of wastewaters in the dissemination of pathogenic microbes, and the need for monitoring the effectiveness of the wastewater treatment process.
|
19
|
Weaver MA, Hoagland RE, Boyette CD, Brown SP. Taxonomic Evaluation of a Bioherbicidal Isolate of Albifimbria verrucaria, Formerly Myrothecium verrucaria. J Fungi (Basel) 2021; 7:jof7090694. [PMID: 34575732 PMCID: PMC8465294 DOI: 10.3390/jof7090694] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/09/2021] [Revised: 08/16/2021] [Accepted: 08/18/2021] [Indexed: 11/24/2022] Open
Abstract
The fungal genus Myrothecium was once polyphyletic, but a recent reconsideration of the family Stachybotryaceae split it into several genera. The ex-neotype specimen of the species Myrothecium verrucaria is now recognized as Albifimbria verrucaria. The well-studied plant pathogen and candidate bioherbicide CABI-IMI 368023, previously identified as M. verrucaria, was analyzed morphologically and genetically and found to be most consistently aligned with the other representatives of A. verrucaria.
Affiliation(s)
- Mark A. Weaver
- USDA-ARS, Biological Control of Pests Research Unit, Stoneville, MS 38776, USA;
- Correspondence:
| | - Robert E. Hoagland
- USDA-ARS, Crop Production Systems Research Unit, Stoneville, MS 38776, USA;
| | | | - Shawn P. Brown
- Department of Biological Sciences, University of Memphis, Memphis, TN 38152, USA;
| |
|
20
|
Shen-Gunther J, Xia Q, Cai H, Wang Y. HPV DeepSeq: An Ultra-Fast Method of NGS Data Analysis and Visualization Using Automated Workflows and a Customized Papillomavirus Database in CLC Genomics Workbench. Pathogens 2021; 10:1026. [PMID: 34451490 DOI: 10.3390/pathogens10081026] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/23/2021] [Revised: 08/09/2021] [Accepted: 08/11/2021] [Indexed: 02/06/2023] Open
Abstract
Next-generation sequencing (NGS) has actualized the human papillomavirus (HPV) virome profiling for in-depth investigation of viral evolution and pathogenesis. However, viral computational analysis remains a bottleneck due to semantic discrepancies between computational tools and curated reference genomes. To address this, we developed and tested automated workflows for HPV taxonomic profiling and visualization using a customized papillomavirus database in the CLC Microbial Genomics Module. HPV genomes from Papilloma Virus Episteme were customized and incorporated into CLC “ready-to-use” workflows for stepwise data processing to include: (1) Taxonomic Analysis, (2) Estimate Alpha/Beta Diversities, and (3) Map Reads to Reference. Low-grade (n = 95) and high-grade (n = 60) Pap smears were tested with ensuing collective runtimes: Taxonomic Analysis (36 min); Alpha/Beta Diversities (5 s); Map Reads (45 min). Tabular output conversion to visualizations entailed 1–2 keystrokes. Biodiversity analysis between low- (LSIL) and high-grade squamous intraepithelial lesions (HSIL) revealed loss of species richness and gain of dominance by HPV-16 in HSIL. Integrating clinically relevant, taxonomized HPV reference genomes within automated workflows proved to be an ultra-fast method of virome profiling. The entire process named “HPV DeepSeq” provides a simple, accurate and practical means of NGS data analysis for a broad range of applications in viral research.
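The richness and dominance comparison described in this abstract can be illustrated with a minimal Python sketch. It is not taken from the CLC workflows: the read counts are invented, and the choice of Berger-Parker dominance and the Shannon index as the diversity measures is an assumption, since the abstract does not name the indices used.

```python
import math


def richness(counts):
    """Number of taxa with at least one read."""
    return sum(1 for c in counts.values() if c > 0)


def berger_parker_dominance(counts):
    """Proportion of reads belonging to the most abundant taxon."""
    total = sum(counts.values())
    return max(counts.values()) / total if total else 0.0


def shannon_index(counts):
    """Shannon diversity (natural log) over taxa with non-zero counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total)
                for c in counts.values() if c > 0)


# Hypothetical per-type read counts; values are illustrative, not from the study.
lsil = {"HPV-16": 40, "HPV-18": 30, "HPV-31": 20, "HPV-52": 10}
hsil = {"HPV-16": 90, "HPV-18": 10, "HPV-31": 0, "HPV-52": 0}

for name, sample in (("LSIL", lsil), ("HSIL", hsil)):
    print(name, richness(sample),
          round(berger_parker_dominance(sample), 2),
          round(shannon_index(sample), 2))
```

In this toy data the HSIL sample has lower richness and higher dominance than the LSIL sample, which is the direction of change the abstract reports.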
|
21
|
Reza MS, Cai Y, Zhang L, Zhang X, Wei Y. Editorial: Computational Solutions for Microbiome and Metagenomics Sequencing Analyses. Front Mol Biosci 2021; 8:698384. [PMID: 34395529 PMCID: PMC8361491 DOI: 10.3389/fmolb.2021.698384] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2021] [Accepted: 07/21/2021] [Indexed: 12/04/2022] Open
Affiliation(s)
- Md Selim Reza
- Centre for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China.,Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences (CAS), Shenzhen, China
| | - YunPeng Cai
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences (CAS), Shenzhen, China
| | - Lu Zhang
- Department of Computer Science, Hong Kong Baptist University, Kowloon, Hong Kong, SAR China
| | - Xingyu Zhang
- University of Pittsburgh Medical Center, Pittsburgh, PA, United States
| | - Yanjie Wei
- Centre for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China.,Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences (CAS), Shenzhen, China
| |
|
22
|
Wylezich C, Höper D. Meta-Ribosomalomics: RNA Sequencing Is an Unbiased Method for Parasite Detection of Different Sample Types. Front Microbiol 2021; 12:614553. [PMID: 34234748 PMCID: PMC8256892 DOI: 10.3389/fmicb.2021.614553] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/06/2020] [Accepted: 05/26/2021] [Indexed: 01/23/2023] Open
Abstract
In this perspective article, we review the past use of ribosomal sequences to address scientific and diagnostic questions. We highlight a variety of sequencing approaches including metagenomics and DNA barcoding and their different demands and requirements. Meta-ribosomalomics is introduced as an unbiased approach to exploit high-throughput sequencing datasets for eukaryotic and prokaryotic ribosomal sequences. Prerequisites, benefits, drawbacks, and future perspectives are elaborated and compared to other sequencing approaches.
Affiliation(s)
- Claudia Wylezich
- Friedrich-Loeffler-Institut, Institute of Diagnostic Virology, Greifswald-Insel Riems, Germany
| | - Dirk Höper
- Friedrich-Loeffler-Institut, Institute of Diagnostic Virology, Greifswald-Insel Riems, Germany
| |
|
23
|
Ziemski M, Wisanwanichthan T, Bokulich NA, Kaehler BD. Beating Naive Bayes at Taxonomic Classification of 16S rRNA Gene Sequences. Front Microbiol 2021; 12:644487. [PMID: 34220738 PMCID: PMC8249850 DOI: 10.3389/fmicb.2021.644487] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/21/2020] [Accepted: 05/31/2021] [Indexed: 12/28/2022] Open
Abstract
Naive Bayes classifiers (NBC) have dominated the field of taxonomic classification of amplicon sequences for over a decade. Apart from having runtime requirements that allow them to be trained and used on modest laptops, they have persistently provided class-topping classification accuracy. In this work we compare NBC with random forest classifiers, neural network classifiers, and a perfect classifier that can only fail when different species have identical sequences, and find that in some practical scenarios there is little scope for improving on NBC for taxonomic classification of 16S rRNA gene sequences. Further improvements in taxonomy classification are unlikely to come from novel algorithms alone, and will need to leverage other technological innovations, such as ecological frequency information.
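For readers unfamiliar with the approach, the following is a minimal Python sketch of a k-mer naive Bayes classifier of the general kind the abstract refers to. It is not the authors' implementation: the toy sequences, the taxon labels, the word size k = 3, the uniform prior, and the add-one smoothing are all assumptions chosen for brevity.

```python
import math
from collections import Counter, defaultdict


def kmers(seq, k=3):
    """Return the overlapping k-mers of a sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def train(references, k=3):
    """Count k-mers per taxon from (taxon, sequence) pairs."""
    counts = defaultdict(Counter)
    for taxon, seq in references:
        counts[taxon].update(kmers(seq, k))
    return counts


def classify(query, counts, k=3):
    """Return the taxon with the highest smoothed log-likelihood (uniform prior)."""
    vocab_size = 4 ** k  # number of possible DNA k-mers
    best_taxon, best_score = None, -math.inf
    for taxon, kmer_counts in counts.items():
        total = sum(kmer_counts.values())
        # Add-one smoothing keeps unseen k-mers from zeroing the likelihood.
        score = sum(
            math.log((kmer_counts[w] + 1) / (total + vocab_size))
            for w in kmers(query, k)
        )
        if score > best_score:
            best_taxon, best_score = taxon, score
    return best_taxon


# Hypothetical toy reference set; real 16S references are far longer.
references = [
    ("TaxonA", "ACGTACGTACGTACGTACGT"),
    ("TaxonB", "GGCCTTAAGGCCTTAAGGCC"),
]
model = train(references)
print(classify("ACGTACGTAC", model))
print(classify("GGCCTTAAGG", model))
```

Two taxa with identical reference sequences would receive identical scores here, which is the situation in which the abstract's "perfect classifier" is also expected to fail.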
Affiliation(s)
- Michal Ziemski
- Laboratory of Food Systems Biotechnology, Institute of Food, Nutrition, and Health, ETH Zürich, Zurich, Switzerland
| | | | - Nicholas A. Bokulich
- Laboratory of Food Systems Biotechnology, Institute of Food, Nutrition, and Health, ETH Zürich, Zurich, Switzerland
| | | |
|
24
|
Parks DH, Rigato F, Vera-Wolf P, Krause L, Hugenholtz P, Tyson GW, Wood DLA. Evaluation of the Microba Community Profiler for Taxonomic Profiling of Metagenomic Datasets From the Human Gut Microbiome. Front Microbiol 2021; 12:643682. [PMID: 33959106 PMCID: PMC8093879 DOI: 10.3389/fmicb.2021.643682] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 12/18/2020] [Accepted: 03/11/2021] [Indexed: 12/12/2022] Open
Abstract
A fundamental goal of microbial ecology is to accurately determine the species composition in a given microbial ecosystem. In the context of the human microbiome, this is important for establishing links between microbial species and disease states. Here we benchmark the Microba Community Profiler (MCP) against other metagenomic classifiers using 140 moderate to complex in silico microbial communities and a standardized reference genome database. MCP generated accurate relative abundance estimates and made substantially fewer false positive predictions than other classifiers while retaining a high recall rate. We further demonstrated that the accuracy of species classification was substantially increased using the Microba Genome Database, which is more comprehensive than reference datasets used by other classifiers and illustrates the importance of including genomes of uncultured taxa in reference databases. Consequently, MCP classifies appreciably more reads than other classifiers when using their recommended reference databases. These results establish MCP as best-in-class with the ability to produce comprehensive and accurate species profiles of human gastrointestinal samples.
Affiliation(s)
| | - Fabio Rigato
- Microba Life Sciences Limited, Brisbane, QLD, Australia
| | | | - Lutz Krause
- Microba Life Sciences Limited, Brisbane, QLD, Australia
| | - Philip Hugenholtz
- Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, The University of Queensland, St. Lucia, QLD, Australia
| | - Gene W. Tyson
- Microba Life Sciences Limited, Brisbane, QLD, Australia
- Centre for Microbiome Research, School of Biomedical Sciences, Translational Research Institute, Queensland University of Technology, Woolloongabba, QLD, Australia
| | | |
|
25
|
Cosic A, Leitner E, Petternel C, Galler H, Reinthaler FF, Herzog-Obereder KA, Tatscher E, Raffl S, Feierl G, Högenauer C, Zechner EL, Kienesberger S. Variation in Accessory Genes Within the Klebsiella oxytoca Species Complex Delineates Monophyletic Members and Simplifies Coherent Genotyping. Front Microbiol 2021; 12:692453. [PMID: 34276625 PMCID: PMC8283571 DOI: 10.3389/fmicb.2021.692453] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/08/2021] [Accepted: 05/28/2021] [Indexed: 02/03/2023] Open
Abstract
Members of the Klebsiella oxytoca species complex (KoSC) are emerging human pathogens causing infections of increasing significance, especially in healthcare settings. KoSC strains are affiliated with distinct phylogroups based on genetic variation at the beta-lactamase gene (blaOXY), and it has been proposed that each major phylogroup represents a unique species. However, since the typing methods applied in clinical settings cannot differentiate every species within the complex, existing clinical, epidemiological and DNA sequence data are frequently misclassified. Here we systematically examined the phylogenetic relationship of KoSC strains to evaluate the robustness of existing typing methods and to provide a simple typing strategy for KoSC members that cannot be differentiated biochemically. Initial analysis of a collection of K. oxytoca, K. michiganensis, K. pasteurii, and K. grimontii strains of environmental origin showed robust correlation of core phylogeny and blaOXY grouping. Moreover, we identified species-specific accessory gene loci for these strains. Extension of species correlation using database entries initially failed. However, assessment of average nucleotide identities (ANI) and phylogenetic validations showed that nearly one third of isolates in public databases have been misidentified. Reclassification resulted in a robust reference strain set for reliable species identification of new isolates or for retyping of strains previously analyzed by multi-locus sequence typing (MLST). Finally, we show convergence of ANI, core gene phylogeny, and accessory gene content for available KoSC genomes. We also conclude that the monophyletic members K. oxytoca, K. michiganensis, K. pasteurii and K. grimontii can be simply differentiated by a PCR strategy targeting blaOXY and the accessory genes defined here.
Affiliation(s)
- Amar Cosic
- Institute of Molecular Biosciences, University of Graz, Graz, Austria
- BioTechMed-Graz, Graz, Austria
| | - Eva Leitner
- Diagnostic and Research Institute of Hygiene, Microbiology and Environmental Medicine, Medical University of Graz, Graz, Austria
| | - Christian Petternel
- Diagnostic and Research Institute of Hygiene, Microbiology and Environmental Medicine, Medical University of Graz, Graz, Austria
| | - Herbert Galler
- Diagnostic and Research Institute of Hygiene, Microbiology and Environmental Medicine, Medical University of Graz, Graz, Austria
| | - Franz F. Reinthaler
- Diagnostic and Research Institute of Hygiene, Microbiology and Environmental Medicine, Medical University of Graz, Graz, Austria
| | - Kathrin A. Herzog-Obereder
- Division of Gastroenterology and Hepatology, Department of Internal Medicine, Medical University of Graz, Graz, Austria
| | - Elisabeth Tatscher
- Division of Gastroenterology and Hepatology, Department of Internal Medicine, Medical University of Graz, Graz, Austria
| | - Sandra Raffl
- Institute of Molecular Biosciences, University of Graz, Graz, Austria
| | - Gebhard Feierl
- Diagnostic and Research Institute of Hygiene, Microbiology and Environmental Medicine, Medical University of Graz, Graz, Austria
| | - Christoph Högenauer
- BioTechMed-Graz, Graz, Austria
- Division of Gastroenterology and Hepatology, Department of Internal Medicine, Medical University of Graz, Graz, Austria
| | - Ellen L. Zechner
- Institute of Molecular Biosciences, University of Graz, Graz, Austria
- BioTechMed-Graz, Graz, Austria
- Field of Excellence BioHealth, University of Graz, Graz, Austria
| | - Sabine Kienesberger
- Institute of Molecular Biosciences, University of Graz, Graz, Austria
- BioTechMed-Graz, Graz, Austria
- Field of Excellence BioHealth, University of Graz, Graz, Austria
- *Correspondence: Sabine Kienesberger,
| |
|
26
|
Setubal JC, Stoye J, Dutilh BE. Editorial: Computational Methods for Microbiome Analysis. Front Genet 2020; 11:623897. [PMID: 33362871 PMCID: PMC7759558 DOI: 10.3389/fgene.2020.623897] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2020] [Accepted: 11/24/2020] [Indexed: 11/13/2022] Open
Affiliation(s)
- João C Setubal
- Department of Biochemistry, Institute of Chemistry, University of São Paulo, São Paulo, Brazil
| | - Jens Stoye
- Faculty of Technology and Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany
| | - Bas E Dutilh
- Theoretical Biology and Bioinformatics, Department of Biology, Science for Life, Utrecht University, Utrecht, Netherlands
| |
|
27
|
Poncheewin W, Hermes GDA, van Dam JCJ, Koehorst JJ, Smidt H, Schaap PJ. NG-Tax 2.0: A Semantic Framework for High-Throughput Amplicon Analysis. Front Genet 2020; 10:1366. [PMID: 32117417 PMCID: PMC6989550 DOI: 10.3389/fgene.2019.01366] [Citation(s) in RCA: 66] [Impact Index Per Article: 16.5] [Received: 03/05/2019] [Accepted: 12/12/2019] [Indexed: 12/20/2022] Open
Abstract
NG-Tax 2.0 is a semantic framework for FAIR high-throughput analysis and classification of marker gene amplicon sequences including bacterial and archaeal 16S ribosomal RNA (rRNA), eukaryotic 18S rRNA and ribosomal intergenic transcribed spacer sequences. It can directly use single or merged reads, paired-end reads and unmerged paired-end reads from long range fragments as input to generate de novo amplicon sequence variants (ASVs). Using the RDF data model, ASVs can be automatically stored in a graph database as objects that link ASV sequences with the full data-wise and element-wise provenance, thereby achieving the level of interoperability required to utilize such data to its full potential. The graph database can be directly queried, allowing for comparative analyses of thousands of samples, and is connected with an interactive Rshiny toolbox for analysis and visualization of (meta) data. Additionally, NG-Tax 2.0 exports an extended BIOM 1.0 (JSON) file as a starting point for further analyses by other means. The extended BIOM file is backwards compatible and contains new attribute types that record the command arguments used, the sequences of the ASVs formed, and classification confidence scores. The performance of NG-Tax 2.0 was compared with DADA2, using the plugin in the QIIME 2 analysis pipeline. Fourteen 16S rRNA gene amplicon mock community samples were obtained from the literature and evaluated. Precision of NG-Tax 2.0 was significantly higher, with an average of 0.95 vs 0.58 for QIIME2-DADA2, while recall was comparable, with averages of 0.85 and 0.77, respectively. NG-Tax 2.0 is written in Java. The code, the ontology, a Galaxy platform implementation, the analysis toolbox, tutorials and example SPARQL queries are freely available at http://wurssb.gitlab.io/ngtax under the MIT License.
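To make the precision and recall figures above concrete, here is a minimal Python sketch of how such metrics can be computed for a mock community whose true membership is known. The set-based, genus-level definition and the genus names are assumptions for illustration; the abstract does not state the exact unit (ASV, genus, or otherwise) at which the authors scored their results.

```python
def precision_recall(expected, observed):
    """Precision and recall of observed taxa against a mock community's known members."""
    expected, observed = set(expected), set(observed)
    true_positives = len(expected & observed)
    precision = true_positives / len(observed) if observed else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall


# Hypothetical mock community and pipeline output (names are illustrative only).
expected = {"Bacteroides", "Escherichia", "Lactobacillus", "Staphylococcus"}
observed = {"Bacteroides", "Escherichia", "Lactobacillus", "Pseudomonas", "Bacillus"}

precision, recall = precision_recall(expected, observed)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

With three of five reported genera correct and three of four expected genera recovered, this toy example yields a precision of 0.60 and a recall of 0.75; spurious taxa lower precision, while missed taxa lower recall.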
Affiliation(s)
- Wasin Poncheewin
  - Laboratory of Systems and Synthetic Biology, Wageningen University & Research, Wageningen, Netherlands
- Gerben D. A. Hermes
  - Laboratory of Microbiology, Wageningen University & Research, Wageningen, Netherlands
- Jesse C. J. van Dam
  - Laboratory of Systems and Synthetic Biology, Wageningen University & Research, Wageningen, Netherlands
- Jasper J. Koehorst
  - Laboratory of Systems and Synthetic Biology, Wageningen University & Research, Wageningen, Netherlands
- Hauke Smidt
  - Laboratory of Microbiology, Wageningen University & Research, Wageningen, Netherlands
- Peter J. Schaap
  - Laboratory of Systems and Synthetic Biology, Wageningen University & Research, Wageningen, Netherlands
|
28
|
Mogodiniyai Kasmaei K, Sundh J. Identification of Novel Putative Bacterial Feruloyl Esterases From Anaerobic Ecosystems by Use of Whole-Genome Shotgun Metagenomics and Genome Binning. Front Microbiol 2019; 10:2673. [PMID: 31824458 PMCID: PMC6879456 DOI: 10.3389/fmicb.2019.02673] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 06/24/2019] [Accepted: 11/04/2019] [Indexed: 12/20/2022] Open
Abstract
Feruloyl esterases (FAEs) can reduce the recalcitrance of lignocellulosic biomass to enzymatic hydrolysis, thereby enhancing biorefinery potentials or animal feeding values of the biomass. In addition, ferulic acid, a product of FAE activity, has applications in pharmaceutical and food/beverage industries. It is therefore of great interest to identify new FAEs to enhance understanding about this enzyme family. For this purpose, we used whole-genome shotgun metagenomics and genome binning to explore rumens of dairy cows, large intestines of horses, sediments of freshwater and forest topsoils to identify novel prokaryotic FAEs and trace the responsible microorganisms. A number of prokaryotic genomes were recovered, of which genomes of the order Clostridiales and the genus Candidatus Rhabdochlamydia showed FAE coding capacities. In total, five sequences were deemed putative FAEs. The BLASTP search against the non-redundant protein database of NCBI indicated that these putative FAEs represented novel sequences within this enzyme family. The phylogenetic analysis showed that at least three putative sequences shared evolutionary lineage with FAEs of type A and thus could possess specific activities similar to this type of FAE, something not previously found outside the fungal kingdom. We nominate the genus Candidatus Rhabdochlamydia as a novel FAE-producing taxonomic unit.
Affiliation(s)
- Kamyar Mogodiniyai Kasmaei
  - Department of Animal Nutrition and Management, Swedish University of Agricultural Sciences, Uppsala, Sweden
- John Sundh
  - Science for Life Laboratory, Department of Biochemistry and Biophysics, National Bioinformatics Infrastructure Sweden, Stockholm University, Solna, Sweden
|
29
|
Zhang C, Li Q, Meng Q, Wang W, Cheng Y, Wu X. Sequence and phylogenetic analysis of the complete mitochondrial genome for Hepu mitten crab (Eriocheir hepuensis) from Nanjiujiang River basin. Mitochondrial DNA B Resour 2019; 4:3890-3891. [PMID: 33366237 PMCID: PMC7707772 DOI: 10.1080/23802359.2019.1688117] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/06/2022]
Abstract
The taxonomic classification of Eriocheir hepuensis has long been ambiguous and controversial. In this study, the whole mitochondrial genome of E. hepuensis was determined to be 16,397 bp, including 13 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA genes, and 1 control region. A total of 20 intergenic gaps were detected, and the AT content of the whole mitochondrial genome was 71.78%. Phylogenetic analysis indicated that E. hepuensis, E. sinensis, and E. japonica are most likely three species with the same taxonomic status. The whole mitogenome of this species will be useful for future evolutionary, phylogenetic, and genomic studies in the genus Eriocheir.
Affiliation(s)
- Cheng Zhang
  - Key Laboratory of Freshwater Aquatic Genetic Resources, Ministry of Agriculture, Shanghai Ocean University, Shanghai, China; Shanghai Collaborative Innovation Center for Aquatic Animal Genetics and Breeding, Shanghai Ocean University, Shanghai, China
- Qingqing Li
  - Key Laboratory of Freshwater Aquatic Genetic Resources, Ministry of Agriculture, Shanghai Ocean University, Shanghai, China; Shanghai Collaborative Innovation Center for Aquatic Animal Genetics and Breeding, Shanghai Ocean University, Shanghai, China
- Qingguo Meng
  - Jiangsu Key Laboratory for Biodiversity & Biotechnology and Jiangsu Key Laboratory for Aquatic Crustacean Diseases, College of Life Sciences, Nanjing Normal University, Nanjing, PR China
- Wen Wang
  - Jiangsu Key Laboratory for Biodiversity & Biotechnology and Jiangsu Key Laboratory for Aquatic Crustacean Diseases, College of Life Sciences, Nanjing Normal University, Nanjing, PR China
- Yongxu Cheng
  - Key Laboratory of Freshwater Aquatic Genetic Resources, Ministry of Agriculture, Shanghai Ocean University, Shanghai, China; Shanghai Collaborative Innovation Center for Aquatic Animal Genetics and Breeding, Shanghai Ocean University, Shanghai, China; National Demonstration Centre for Experimental Fisheries Science Education, Shanghai Ocean University, Shanghai, China
- XuGan Wu
  - Key Laboratory of Freshwater Aquatic Genetic Resources, Ministry of Agriculture, Shanghai Ocean University, Shanghai, China; Shanghai Collaborative Innovation Center for Aquatic Animal Genetics and Breeding, Shanghai Ocean University, Shanghai, China; National Demonstration Centre for Experimental Fisheries Science Education, Shanghai Ocean University, Shanghai, China
|
30
|
Mandonnet E, Sarubbo S, Petit L. Response: Commentary: The Nomenclature of Human White Matter Association Pathways: Proposal for a Systematic Taxonomic Anatomical Classification. Front Neuroanat 2019; 13:91. [PMID: 31680882 PMCID: PMC6811495 DOI: 10.3389/fnana.2019.00091] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/29/2019] [Accepted: 10/03/2019] [Indexed: 11/30/2022] Open
Affiliation(s)
- Silvio Sarubbo
  - Division of Neurosurgery, Structural and Functional Connectivity Lab, Azienda Provinciale per i Servizi Sanitari, Trento, Italy
- Laurent Petit
  - Groupe d'Imagerie Neurofonctionnelle, Institut des Maladies Neurodégénératives-UMR 5293, CNRS, CEA University of Bordeaux, Bordeaux, France
|
31
|
Abstract
Here, we describe MetaErg, a standalone and fully automated metagenome and metaproteome annotation pipeline. Annotation of metagenomes is challenging. First, metagenomes contain sequence data of many organisms from all domains of life. Second, many of these are from understudied lineages, encoding genes with low similarity to experimentally validated reference genes. Third, assembly and binning are not perfect, sometimes resulting in artifactual hybrid contigs or genomes. To address these challenges, MetaErg provides graphical summaries of annotation outcomes, both for the complete metagenome and for individual metagenome-assembled genomes (MAGs). It performs a comprehensive annotation of each gene, including taxonomic classification, enabling functional inferences despite low similarity to reference genes, as well as detection of potential assembly or binning artifacts. When provided with metaproteome information, it visualizes gene and pathway activity using sequencing coverage and proteomic spectral counts, respectively. For visualization, MetaErg provides an HTML interface, bringing all annotation results together, and producing sortable and searchable tables, collapsible trees, and other graphic representations enabling intuitive navigation of complex data. MetaErg, implemented in Perl, HTML, and JavaScript, is a fully open source application, distributed under Academic Free License at https://github.com/xiaoli-dong/metaerg. MetaErg is also available as a docker image at https://hub.docker.com/r/xiaolidong/docker-metaerg.
Affiliation(s)
- Xiaoli Dong
  - Department of Geoscience, University of Calgary, Calgary, AB, Canada
- Marc Strous
  - Department of Geoscience, University of Calgary, Calgary, AB, Canada
|
32
|
Fang Y, Wang Y, Liu Z, Dai H, Cai H, Li Z, Du Z, Wang X, Jing H, Wei Q, Kan B, Wang D. Multilocus Sequence Analysis, a Rapid and Accurate Tool for Taxonomic Classification, Evolutionary Relationship Determination, and Population Biology Studies of the Genus Shewanella. Appl Environ Microbiol 2019; 85:e03126-18. [PMID: 30902862 DOI: 10.1128/AEM.03126-18] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 01/04/2019] [Accepted: 03/19/2019] [Indexed: 02/02/2023] Open
Abstract
The genus Shewanella comprises a group of marine-dwelling species with worldwide distribution. Several species are regarded as causative agents of food spoilage and opportunistic pathogens of human diseases. In this study, a standard multilocus sequence analysis (MLSA) based on six protein-coding genes (gyrA, gyrB, infB, recN, rpoA, and topA) was established as a rapid and accurate identification tool in 59 Shewanella type strains. This method yielded sufficient resolving power in regard to enough informative sites, adequate sequence divergences, and distinct interspecies branches. The stability of phylogenetic topology was supported by high bootstrap values and concordance with different methods. The reliability of the MLSA scheme was further validated by identical phylogenies and high correlations of genomes. The MLSA approach provided a robust system to exhibit evolutionary relationships in the Shewanella genus. The split network tree proposed twelve distinct monophyletic clades with identical G+C contents and high genetic similarities. A total of 86 tested strains were investigated to explore the population biology of the Shewanella genus in China. The most prevalent Shewanella species was Shewanella algae, followed by Shewanella xiamenensis, Shewanella chilikensis, Shewanella indica, Shewanella seohaensis, and Shewanella carassii. The strains frequently isolated from clinical and food samples highlighted the importance of increasing the surveillance of Shewanella species. Based on the combined genetic, genomic, and phenotypic analyses, Shewanella upenei should be considered a synonym of S. algae, and Shewanella pacifica should be reclassified as a synonym of Shewanella japonica.
IMPORTANCE: The MLSA scheme based on six housekeeping genes (HKGs) (gyrA, gyrB, infB, recN, rpoA, and topA) is well established as a reliable tool for taxonomic, evolutionary, and population diversity analyses of the genus Shewanella in this study. The standard MLSA method allows researchers to make rapid, economical, and precise identification of Shewanella strains. The robust phylogenetic network of MLSA provides profound insight into the evolutionary structure of the genus Shewanella. The population genetics of Shewanella species determined by the MLSA approach plays a pivotal role in clinical diagnosis and routine monitoring. Further studies on remaining species and genomic analysis will enhance a more comprehensive understanding of the microbial systematics, phylogenetic relationships, and ecological status of the genus Shewanella.
|
33
|
Tapinos A, Constantinides B, Phan MVT, Kouchaki S, Cotten M, Robertson DL. The Utility of Data Transformation for Alignment, De Novo Assembly and Classification of Short Read Virus Sequences. Viruses 2019; 11:E394. [PMID: 31035503 DOI: 10.3390/v11050394] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 03/30/2019] [Revised: 04/19/2019] [Accepted: 04/22/2019] [Indexed: 01/07/2023] Open
Abstract
Advances in DNA sequencing technology are facilitating genomic analyses of unprecedented scope and scale, widening the gap between our abilities to generate and fully exploit biological sequence data. Comparable analytical challenges are encountered in other data-intensive fields involving sequential data, such as signal processing, in which dimensionality reduction (i.e., compression) methods are routinely used to lessen the computational burden of analyses. In this work, we explored the application of dimensionality reduction methods to numerically represent high-throughput sequence data for three important biological applications of virus sequence data: reference-based mapping, short sequence classification and de novo assembly. Leveraging highly compressed sequence transformations to accelerate sequence comparison, our approach yielded comparable accuracy to existing approaches, further demonstrating its suitability for sequences originating from diverse virus populations. We assessed the application of our methodology using both synthetic and real viral pathogen sequences. Our results show that the use of highly compressed sequence approximations can provide accurate results, with analytical performance retained and even enhanced through appropriate dimensionality reduction of sequence data.
|
34
|
Khawaldeh S, Pervaiz U, Elsharnoby M, Alchalabi AE, Al-Zubi N. Taxonomic Classification for Living Organisms Using Convolutional Neural Networks. Genes (Basel) 2017; 8:genes8110326. [PMID: 29149087 PMCID: PMC5704239 DOI: 10.3390/genes8110326] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 09/11/2017] [Revised: 11/05/2017] [Accepted: 11/14/2017] [Indexed: 12/11/2022] Open
Abstract
Taxonomic classification has a wide range of applications, such as finding out more about evolutionary history. Compared to the estimated number of organisms that nature harbors, humanity lacks a thorough understanding of which specific classes they belong to. The classification of living organisms can be done with many machine learning techniques. In this study, however, it is performed using convolutional neural networks. Moreover, a DNA encoding technique is incorporated in the algorithm to increase performance and avoid misclassifications. The proposed algorithm outperformed the state-of-the-art algorithms in terms of accuracy and sensitivity, which illustrates a high potential for using it in many other applications in genome analysis.
Affiliation(s)
- Saed Khawaldeh
  - Erasmus+ Joint Master Program in Medical Imaging and Applications, University of Burgundy, 21000 Dijon, France.
  - Erasmus+ Joint Master Program in Medical Imaging and Applications, UNICLAM, 03043 Cassino FR, Italy.
  - Erasmus+ Joint Master Program in Medical Imaging and Applications, University of Girona, 17004 Girona, Spain.
  - Graduate School of Natural and Applied Sciences, Istanbul Sehir University, 34865 Kartal/İstanbul, Turkey.
  - Department of Electrical Engineering and Automation, Aalto University, 02150 Espoo, Finland.
- Usama Pervaiz
  - Erasmus+ Joint Master Program in Medical Imaging and Applications, University of Burgundy, 21000 Dijon, France.
  - Erasmus+ Joint Master Program in Medical Imaging and Applications, UNICLAM, 03043 Cassino FR, Italy.
  - Erasmus+ Joint Master Program in Medical Imaging and Applications, University of Girona, 17004 Girona, Spain.
- Mohammed Elsharnoby
  - Graduate School of Natural and Applied Sciences, Istanbul Sehir University, 34865 Kartal/İstanbul, Turkey.
- Alaa Eddin Alchalabi
  - Graduate School of Natural and Applied Sciences, Istanbul Sehir University, 34865 Kartal/İstanbul, Turkey.
- Nayel Al-Zubi
  - Department of Computer Engineering, Al-Balqa' Applied University, 19117 Al-Salt, Jordan.
|
35
|
Abstract
OBJECTIVE: I explore the origins, theoretical underpinnings, applications, and importance of vigilance in a world ever more dominated by semiautomated, automated, and autonomous machines. BACKGROUND: The empirical genesis of vigilance is taken as a case study in the etiology of the application of the behavioral sciences to the human culture of technology. The subsequent taxonomic ordering and theoretical clarification of its causal antecedents are set in the overall context of contemporary human-machine systems research. METHOD: The methods exercised in this work are historical analysis and informational synthesis in combination with projected theoretical implications and impact. RESULTS: The profile of evolution of the concept of vigilance is clarified and cast in the light of critical events, such as the promulgation of the vigilance taxonomy, its linkage to attentional resource theory, and the recognition that the attendant performance decrement is as indicative of iatrogenic sources as it is a shortfall or limitation of the observer's processing capacity. CONCLUSION: Vigilance is alive and growing in importance. Understanding sustained attention will become ever more critical in the humanization of automation-dominated systems. APPLICATION: The application of vigilance is widespread and potentially ubiquitous for semiautomated, automated, and autonomous system interaction.
|
36
|
Jovel J, Patterson J, Wang W, Hotte N, O'Keefe S, Mitchel T, Perry T, Kao D, Mason AL, Madsen KL, Wong GKS. Characterization of the Gut Microbiome Using 16S or Shotgun Metagenomics. Front Microbiol 2016; 7:459. [PMID: 27148170 PMCID: PMC4837688 DOI: 10.3389/fmicb.2016.00459] [Citation(s) in RCA: 496] [Impact Index Per Article: 62.0] [Received: 01/08/2016] [Accepted: 03/21/2016] [Indexed: 02/06/2023] Open
Abstract
The advent of next generation sequencing (NGS) has enabled investigations of the gut microbiome with unprecedented resolution and throughput. This has stimulated the development of sophisticated bioinformatics tools to analyze the massive amounts of data generated. Researchers therefore need a clear understanding of the key concepts required for the design, execution and interpretation of NGS experiments on microbiomes. We conducted a literature review and used our own data to determine which approaches work best. The two main approaches for analyzing the microbiome, 16S ribosomal RNA (rRNA) gene amplicons and shotgun metagenomics, are illustrated with analyses of libraries designed to highlight their strengths and weaknesses. Several methods for taxonomic classification of bacterial sequences are discussed. We present simulations to assess the number of sequences that are required to perform reliable appraisals of bacterial community structure. To the extent that fluctuations in the diversity of gut bacterial populations correlate with health and disease, we emphasize various techniques for the analysis of bacterial communities within samples (α-diversity) and between samples (β-diversity). Finally, we demonstrate techniques to infer the metabolic capabilities of a bacterial community from these 16S and shotgun data.
Affiliation(s)
- Juan Jovel
  - Department of Medicine, University of Alberta, Edmonton, AB, Canada
- Jordan Patterson
  - Department of Medicine, University of Alberta, Edmonton, AB, Canada
- Weiwei Wang
  - Department of Medicine, University of Alberta, Edmonton, AB, Canada
- Naomi Hotte
  - Department of Medicine, University of Alberta, Edmonton, AB, Canada
- Sandra O'Keefe
  - Department of Medicine, University of Alberta, Edmonton, AB, Canada
- Troy Mitchel
  - Department of Medicine, University of Alberta, Edmonton, AB, Canada
- Troy Perry
  - Department of Medicine, University of Alberta, Edmonton, AB, Canada
- Dina Kao
  - Department of Medicine, University of Alberta, Edmonton, AB, Canada
- Andrew L. Mason
  - Department of Medicine, University of Alberta, Edmonton, AB, Canada
- Karen L. Madsen
  - Department of Medicine, University of Alberta, Edmonton, AB, Canada
- Gane K.-S. Wong
  - Department of Medicine, University of Alberta, Edmonton, AB, Canada
  - Department of Biological Sciences, University of Alberta, Edmonton, AB, Canada
  - BGI-Shenzhen, Shenzhen, China
|
37
|
Kahl SM, Ulrich A, Kirichenko AA, Müller MEH. Phenotypic and phylogenetic segregation of Alternaria infectoria from small-spored Alternaria species isolated from wheat in Germany and Russia. J Appl Microbiol 2015; 119:1637-50. [PMID: 26381081 DOI: 10.1111/jam.12951] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 07/15/2015] [Revised: 08/31/2015] [Accepted: 09/05/2015] [Indexed: 12/19/2022]
Abstract
AIMS: To identify the taxonomic differences between phytopathogenic small-spored Alternaria strains isolated from wheat kernels in Germany and Russia by a polyphasic approach. METHODS AND RESULTS: Ninety-five Alternaria (A.) strains were characterized by their colony colour, their three-dimensional sporulation patterns, mycotoxin production and phylogenetic relationships based on sequence variation in translation elongation factor 1-α (TEF1-α). The examination of toxin profiles and the phylogenetic features via TEF1-α resulted in two distinct clusters, in each case containing Alternaria infectoria isolates (92 and 96% respectively) in the first and the Alternaria alternata, Alternaria arborescens and Alternaria tenuissima isolates (77 and 79% respectively) in the other combined cluster. The production of Alternariol, Altertoxin and Altenuene has not been reported previously in the A. infectoria species group. The isolates from Germany and Russia differ slightly in species composition and mycotoxin production capacity. CONCLUSIONS: We identified that the A. infectoria species group can be differentiated from the A. alternata, A. arborescens and A. tenuissima species group by colour, low mycotoxin production and by the sequence variation in TEF1-α gene. SIGNIFICANCE AND IMPACT OF THE STUDY: These results allow a reliable toxic risk assessment when detecting different Alternaria fungi on cereals.
Affiliation(s)
- S M Kahl
  - Leibniz-Centre for Agricultural Landscape Research (ZALF), Institute of Landscape Biogeochemistry, Müncheberg, Germany; Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
- A Ulrich
  - Leibniz-Centre for Agricultural Landscape Research (ZALF), Institute of Landscape Biogeochemistry, Müncheberg, Germany
- A A Kirichenko
  - Novosibirsk State Agricultural University (NSAU), Novosibirsk, Russia
- M E H Müller
  - Leibniz-Centre for Agricultural Landscape Research (ZALF), Institute of Landscape Biogeochemistry, Müncheberg, Germany; Berlin-Brandenburg Institute of Advanced Biodiversity Research (BBIB), Berlin, Germany
|
38
|
Catalano SR, Whittington ID, Donnellan SC, Bertozzi T, Gillanders BM. First comparative insight into the architecture of COI mitochondrial minicircle molecules of dicyemids reveals marked inter-species variation. Parasitology 2015; 142:1066-79. [PMID: 25877339 DOI: 10.1017/S0031182015000384] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/06/2022]
Abstract
Dicyemids, poorly known parasites of benthic cephalopods, are one of the few phyla in which mitochondrial (mt) genome architecture departs from the typical ~16 kb circular metazoan genome. In addition to a putative circular genome, a series of mt minicircles that each comprise the mt encoded units (I-III) of the cytochrome c oxidase complex have been reported. Whether the structure of the mt minicircles is a consistent feature among dicyemid species is unknown. Here we analyse the complete cytochrome c oxidase subunit I (COI) minicircle molecule, containing the COI gene and an associated non-coding region (NCR), for ten dicyemid species, allowing, for the first time, comparisons between species of minicircle architecture, NCR function and inferences of minicircle replication. Divergence in COI nucleotide sequences between dicyemid species was high (average net divergence = 31.6%) while within-species diversity was lower (average net divergence = 0.2%). The NCR and putative 5' section of the COI gene were highly divergent between dicyemid species (average net nucleotide divergence of putative 5' COI section = 61.1%). No tRNA genes were found in the NCR, although palindrome sequences with the potential to form stem-loop structures were identified in some species, which may play a role in transcription or other biological processes.
|
39
|
Scheuch M, Höper D, Beer M. RIEMS: a software pipeline for sensitive and comprehensive taxonomic classification of reads from metagenomics datasets. BMC Bioinformatics 2015; 16:69. [PMID: 25886935 PMCID: PMC4351923 DOI: 10.1186/s12859-015-0503-6] [Citation(s) in RCA: 68] [Impact Index Per Article: 7.6] [Received: 07/03/2014] [Accepted: 02/20/2015] [Indexed: 01/28/2023] Open
Abstract
BACKGROUND: Fuelled by the advent and subsequent development of next generation sequencing technologies, metagenomics became a powerful tool for the analysis of microbial communities both scientifically and diagnostically. The biggest challenge is the extraction of relevant information from the huge sequence datasets generated for metagenomics studies. Although a plethora of tools are available, data analysis is still a bottleneck. RESULTS: To overcome the bottleneck of data analysis, we developed an automated computational workflow called RIEMS - Reliable Information Extraction from Metagenomic Sequence datasets. RIEMS assigns every individual read sequence within a dataset taxonomically by cascading different sequence analyses with decreasing stringency of the assignments using various software applications. After completion of the analyses, the results are summarised in a clearly structured result protocol organised taxonomically. The high accuracy and performance of RIEMS analyses were proven in comparison with other tools for metagenomics data analysis using simulated sequencing read datasets. CONCLUSIONS: RIEMS has the potential to fill the gap that still exists with regard to data analysis for metagenomics studies. The usefulness and power of RIEMS for the analysis of genuine sequencing datasets was demonstrated with an early version of RIEMS in 2011 when it was used to detect the orthobunyavirus sequences leading to the discovery of Schmallenberg virus.
Affiliation(s)
- Matthias Scheuch
  - Institute of Diagnostic Virology, Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health, Südufer 10, 17493, Greifswald - Insel Riems, Germany.
- Dirk Höper
  - Institute of Diagnostic Virology, Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health, Südufer 10, 17493, Greifswald - Insel Riems, Germany.
- Martin Beer
  - Institute of Diagnostic Virology, Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health, Südufer 10, 17493, Greifswald - Insel Riems, Germany.
|
40
|
Hou T, Liu F, Liu Y, Zou QY, Zhang X, Wang K. Classification of metagenomics data at lower taxonomic level using a robust supervised classifier. Evol Bioinform Online 2015; 11:3-10. [PMID: 25673967 PMCID: PMC4309676 DOI: 10.4137/ebo.s20523] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/28/2014] [Revised: 11/25/2014] [Accepted: 12/14/2014] [Indexed: 11/11/2022] Open
Abstract
As more and more completely sequenced genomes become available, the taxonomic classification of metagenomic data will benefit greatly from supervised classifiers that can be updated instantaneously in response to new genomes. Some supervised classifiers have already been developed to assign metagenomic sequences to their source organisms. We have found that the existing supervised classifiers usually cannot discriminate the training data from different classes accurately when the data contain outliers. However, the training genomic data (bacterial and archaeal genomes) usually contain a portion of outliers, which come from sequencing errors, phage invasions, highly expressed genes, and other sources. These outliers, which act as noise, hinder the development of classifiers with better prediction accuracy. To solve the problem, we present a robust supervised classifier, weighted support vector domain description (WSVDD), which can eliminate the interference from outliers in the training genomic data and then generate more accurate data domain descriptions for each taxonomic class. The experimental results demonstrate that WSVDD is more robust than other classifiers for simulated Sanger and 454 reads with different outlier rates. In addition, in experiments performed on simulated metagenomes and real gut metagenomes, WSVDD also achieved better prediction accuracy than other classifiers.
Affiliation(s)
- Tao Hou
  - College of Communications Engineering, Jilin University, Changchun, China
- Fu Liu
  - College of Communications Engineering, Jilin University, Changchun, China
- Yun Liu
  - College of Communications Engineering, Jilin University, Changchun, China
- Qing Yu Zou
  - College of Communications Engineering, Jilin University, Changchun, China
- Xiao Zhang
  - College of Communications Engineering, Jilin University, Changchun, China
- Ke Wang
  - College of Communications Engineering, Jilin University, Changchun, China
|
41
|
Melcher U, Verma R, Schneider WL. Metagenomic search strategies for interactions among plants and multiple microbes. Front Plant Sci 2014; 5:268. [PMID: 24966863 PMCID: PMC4052219 DOI: 10.3389/fpls.2014.00268] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 02/15/2014] [Accepted: 05/24/2014] [Indexed: 05/22/2023]
Abstract
Plants harbor multiple microbes. Metagenomics can facilitate understanding of the significance, for the plant, of the microbes, and of the interactions among them. However, current approaches to metagenomic analysis of plants are computationally time consuming. Efforts to speed the discovery process include improvement of computational speed, condensing the sequencing reads into smaller datasets before BLAST searches, simplifying the target database of BLAST searches, and flipping the roles of metagenomic and reference datasets. The latter is exemplified by the e-probe diagnostic nucleic acid analysis approach originally devised for improving analysis during plant quarantine.
Affiliation(s)
- Ulrich Melcher
- Department of Biochemistry and Molecular Biology, Oklahoma State University, Stillwater, OK, USA
- Ruchi Verma
- Department of Biochemistry and Molecular Biology, Oklahoma State University, Stillwater, OK, USA
- William L. Schneider
- Foreign Disease-Weed Science Research Unit, United States Department of Agriculture – Agricultural Research Service, Fort Detrick, MD, USA
42
Tuzhikov A, Panchin A, Shestopalov VI. TUIT, a BLAST-based tool for taxonomic classification of nucleotide sequences. Biotechniques 2014; 56:78-84. [PMID: 24502797 DOI: 10.2144/000114135] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 09/12/2013] [Accepted: 12/20/2013] [Indexed: 11/23/2022]
Abstract
Pyrosequencing of 16S ribosomal RNA (rRNA) genes has become the gold standard in human microbiome studies. The routine task of taxonomic classification using 16S rRNA reads is commonly performed by the Ribosomal Database Project (RDP) II Classifier, a robust tool that relies on a set of well-characterized reference sequences. However, the RDP II Classifier may be unable to classify a significant part of the data set due to the absence of proper reference sequences. The taxonomic classification of some unclassified sequences might still be performed using BLAST searches against large and frequently updated nucleotide databases. Here we introduce TUIT (Taxonomic Unit Identification Tool), an efficient, open-source, platform-independent application that can perform taxonomic classification on its own or can be used in combination with the RDP II Classifier to maximize the taxonomic identification rate. Using a set of simulated DNA sequences, we demonstrate that the algorithm performs taxonomic classification with high specificity for sequences as short as 125 base pairs. TUIT is applicable to 16S rRNA gene sequence classification; however, it is not restricted to 16S rRNA sequences. In addition, TUIT may be used as a complementary tool for effective taxonomic classification of nucleotide sequences generated by many current platforms, such as Roche 454 and Illumina. Stand-alone TUIT is available online at http://sourceforge.net/projects/tuit/.
43
Vázquez-Castellanos JF, García-López R, Pérez-Brocal V, Pignatelli M, Moya A. Comparison of different assembly and annotation tools on analysis of simulated viral metagenomic communities in the gut. BMC Genomics 2014; 15:37. [PMID: 24438450 PMCID: PMC3901335 DOI: 10.1186/1471-2164-15-37] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.5] [Received: 03/13/2013] [Accepted: 01/16/2014] [Indexed: 02/07/2023]
Abstract
BACKGROUND: The main limitations in the analysis of viral metagenomes are perhaps the high genetic variability and the lack of information in extant databases. To address these issues, several bioinformatic tools have been specifically designed or adapted for metagenomics by improving read assembly and creating more sensitive methods for homology detection. This study compares the performance of different available assemblers and taxonomic annotation software using simulated viral-metagenomic data.
RESULTS: We simulated two 454 viral metagenomes using genomes from NCBI's RefSeq database based on the list of actual viruses found in previously published metagenomes. Three different assembly strategies, spanning six assemblers, were tested for performance: the overlap-layout-consensus algorithms Newbler, Celera and Minimo; the de Bruijn graph algorithms Velvet and MetaVelvet; and the read probabilistic model Genovo. The performance of the assemblies was measured by the length of the resulting contigs (using N50), the percentage of reads assembled and the overall accuracy when compared against the corresponding reference genomes. Additionally, the number of chimeras per contig and the lowest common ancestor were estimated in order to assess the effect of assembly on taxonomic and functional annotation. The functional classification of the reads was evaluated by counting the reads that correctly matched the functional data previously reported for the original genomes and calculating the number of over-represented functional categories in chimeric contigs. The sensitivity and specificity of tBLASTx, PhymmBL and the k-mer frequencies were measured by accurate predictions when comparing simulated reads against the NCBI Virus genomes RefSeq database.
CONCLUSIONS: Assembling improves functional annotation by increasing accurate assignments and decreasing ambiguous hits between viruses and bacteria. However, the success is limited by the chimeric contigs occurring at all taxonomic levels. The assembler and its parameters should be selected based on the focus of each study. Minimo's non-chimeric contigs and Genovo's long contigs excelled in taxonomic assignment and functional annotation, respectively. tBLASTx stood out as the best approach for taxonomic annotation for virus identification. PhymmBL proved useful in datasets in which no related sequences are present, as it uses genomic features that may help identify distant taxa. The k-mer frequencies underperformed in all viral datasets.
Affiliation(s)
- Jorge F Vázquez-Castellanos
- Fundación para el Fomento de la Investigación Sanitaria y Biomédica de la Comunidad Valencia (FISABIO)-Salud Pública, Avenida de Cataluña 21, 46020 Valencia, Spain
- Institut Cavanilles de Biodiversitat i Biologia Evolutiva, (ICBiBE) Universitat de València, Apartado Postal 22085, 46071 Valencia, Spain
- Rodrigo García-López
- Fundación para el Fomento de la Investigación Sanitaria y Biomédica de la Comunidad Valencia (FISABIO)-Salud Pública, Avenida de Cataluña 21, 46020 Valencia, Spain
- Institut Cavanilles de Biodiversitat i Biologia Evolutiva, (ICBiBE) Universitat de València, Apartado Postal 22085, 46071 Valencia, Spain
- Vicente Pérez-Brocal
- Fundación para el Fomento de la Investigación Sanitaria y Biomédica de la Comunidad Valencia (FISABIO)-Salud Pública, Avenida de Cataluña 21, 46020 Valencia, Spain
- Institut Cavanilles de Biodiversitat i Biologia Evolutiva, (ICBiBE) Universitat de València, Apartado Postal 22085, 46071 Valencia, Spain
- CIBER en Epidemiología y Salud Pública (CIBERESP), Madrid, Spain
- Miguel Pignatelli
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Andrés Moya
- Fundación para el Fomento de la Investigación Sanitaria y Biomédica de la Comunidad Valencia (FISABIO)-Salud Pública, Avenida de Cataluña 21, 46020 Valencia, Spain
- Institut Cavanilles de Biodiversitat i Biologia Evolutiva, (ICBiBE) Universitat de València, Apartado Postal 22085, 46071 Valencia, Spain
- CIBER en Epidemiología y Salud Pública (CIBERESP), Madrid, Spain
44
Abstract
As part of a long-term investigation on the evolution of Passiflora L., we investigated the divergence ages of the genus and diversification of its subgenera, relating them with biogeographical and/or historical events, and other characteristics of this taxon. The main aim of the present work was to evaluate the biogeographic distribution of this genus to better understand its evolutionary history. This is the first time that representatives from South American and Old World Passifloraceae genera have been studied as a group comprising a total of 106 widely distributed species, with representative samples of the four suggested subgenera. Seven DNA regions were studied, comprising 7,431 nucleotides from plastidial, mitochondrial and nuclear genomes. Divergence time estimates were obtained by using a Bayesian Markov Chain Monte Carlo method and a random local clock model for each partition. Three major subgenera have been shown to be monophyletic and here we are proposing to include another subgenus in the Passiflora infrageneric classification. In general, divergence among the four subgenera in Passiflora is very ancient, ranging from ∼32 to ∼38 Mya, and Passifloraceae seems to follow a biogeographic scenario proposed for several plant groups, originating in Africa, crossing to Europe/Asia and arriving in the New World by way of land bridges. Our results indicated that Passiflora ancestors arrived in Central America and diversified quickly from there, with many long distance dispersion events.
Affiliation(s)
- Valéria C Muschner
- Laboratório de Evolução Molecular, Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil; Departamento de Botânica, Universidade Federal do Paraná, Curitiba, PR, Brazil