Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Chen W, Cheng Y, Zhang C, Zhang S, Zhao H. MSClust: A Multi-Seeds based Clustering algorithm for microbiome profiling using 16S rRNA sequence. J Microbiol Methods 2013;94:347-55. [PMID: 23899776 DOI: 10.1016/j.mimet.2013.07.004] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 05/02/2013] [Revised: 07/06/2013] [Accepted: 07/07/2013] [Indexed: 11/21/2022]

Cited by Other Article(s)

Cao M, Peng Q, Wei ZG, Liu F, Hou YF. EdClust: A heuristic sequence clustering method with higher sensitivity. J Bioinform Comput Biol 2021;20:2150036. [PMID: 34939905 DOI: 10.1142/s0219720021500360] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]

Wei ZG, Zhang XD, Cao M, Liu F, Qian Y, Zhang SW. Comparison of Methods for Picking the Operational Taxonomic Units From Amplicon Sequences. Front Microbiol 2021;12:644012. [PMID: 33841367 PMCID: PMC8024490 DOI: 10.3389/fmicb.2021.644012] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 12/19/2020] [Accepted: 02/17/2021] [Indexed: 12/31/2022] [Open Access]

Xia Y. Correlation and association analyses in microbiome study integrating multiomics in health and disease. Prog Mol Biol Transl Sci 2020;171:309-491. [PMID: 32475527 DOI: 10.1016/bs.pmbts.2020.04.003] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Indexed: 02/07/2023]

Abstract

Correlation and association analyses are among the most widely used statistical methods in research fields, including microbiome and integrative multiomics studies. Correlation and association have two implications: dependence and co-occurrence. Microbiome data are structured as a phylogenetic tree and have several unique characteristics, including high dimensionality, compositionality, sparsity with excess zeros, and heterogeneity. These characteristics cause several statistical issues when analyzing microbiome data and integrating multiomics data, such as large p and small n, dependency, overdispersion, and zero-inflation. In microbiome research, classic correlation and association methods are still applied in real studies and used to develop new methods, while new methods have been developed to target the statistical issues arising from the unique characteristics of microbiome data. Here, we first provide a comprehensive view of classic and newly developed univariate correlation and association-based methods. We discuss the appropriateness and limitations of the classic methods and demonstrate how the newly developed methods mitigate the issues of microbiome data. Second, we emphasize that the concepts of correlation and association analysis have shifted with the introduction of network analysis, microbe-metabolite interactions, functional analysis, etc. Third, we introduce multivariate correlation and association-based methods, organized into the categories of exploratory, interpretive, and discriminatory analyses and classification methods. Fourth, we focus on hypothesis testing in univariate and multivariate regression-based association methods, including alpha and beta diversity-based, count-based, and relative abundance (or compositional)-based association analyses. We demonstrate the characteristics and limitations of each approach. Fifth, we introduce two specific microbiome-based methods: phylogenetic tree-based association analysis and testing for survival outcomes. Sixth, we provide an overall view of longitudinal methods for analyzing microbiome and omics data, which cover standard, static, regression-based time series methods, principal trend analysis, and newly developed univariate overdispersed and zero-inflated as well as multivariate distance/kernel-based longitudinal models. Finally, we comment on current association analysis and its future direction in microbiome and multiomics studies.

Wei ZG, Zhang SW. DMSC: A Dynamic Multi-Seeds Method for Clustering 16S rRNA Sequences Into OTUs. Front Microbiol 2019;10:428. [PMID: 30915052 PMCID: PMC6422886 DOI: 10.3389/fmicb.2019.00428] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/05/2019] [Accepted: 02/19/2019] [Indexed: 12/30/2022] [Open Access]

Abstract

Next-generation sequencing (NGS)-based 16S rRNA sequencing, which combines PCR amplification with NGS technology, is a cost-effective technique that has been successfully used to study the phylogeny and taxonomy of samples from complex microbiomes or environments. Clustering 16S rRNA sequences into operational taxonomic units (OTUs) is often the first step for many downstream analyses. Heuristic clustering is one of the most widely employed approaches for generating OTUs. However, most heuristic OTU clustering methods select a single seed sequence to represent each cluster, so their outcomes suffer from either overestimation of the number of OTUs or sensitivity to sequencing errors. In this paper, we present a novel dynamic multi-seeds clustering method (namely DMSC) to pick OTUs. DMSC first heuristically generates clusters according to the distance threshold. When the size of a cluster reaches the pre-defined minimum size, DMSC selects the multi-core sequences (MCS) as the seeds; these are defined as n-core sequences (n ≥ 3) in which the distance between any two sequences is less than the distance threshold. A new sequence is assigned to the corresponding cluster depending on its average distance to the MCS and the standard deviation of the distances within the MCS. Whenever a new sequence is added to the cluster, DMSC dynamically updates the MCS, until no more sequences are merged into the cluster. DMSC was tested on several simulated and real-life sequence datasets and compared with traditional heuristic methods such as CD-HIT, UCLUST, and DBH. Experimental results in terms of the inferred number of OTUs, normalized mutual information (NMI), and Matthews correlation coefficient (MCC) demonstrate that DMSC can produce higher quality clusters with low memory usage and reduce OTU overestimation. Additionally, DMSC is robust to sequencing errors. The DMSC software can be freely downloaded from https://github.com/NWPU-903PR/DMSC.
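
The multi-seed idea described in this abstract can be illustrated with a small Python sketch. This is a toy illustration under stated assumptions, not the DMSC implementation: the abstract does not give the exact assignment rule, so the acceptance test used here (average distance to the core below the threshold plus the core's standard deviation) is a hypothetical stand-in, the distance is a simple mismatch fraction over equal-length strings, and the names hamming_fraction, pick_core, and cluster are invented for this example.

```python
from itertools import combinations
from statistics import mean, pstdev


def hamming_fraction(a, b):
    """Toy distance: fraction of mismatched positions (assumes equal lengths)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def pick_core(members, threshold, n=3):
    """Return the first n members whose pairwise distances are all below
    the threshold, or None if no such group exists."""
    for combo in combinations(members, n):
        if all(hamming_fraction(a, b) < threshold
               for a, b in combinations(combo, 2)):
            return list(combo)
    return None


def cluster(seqs, threshold=0.25, min_size=3):
    """Greedy clustering that switches from one seed to a multi-sequence core."""
    clusters = []  # each cluster: {"members": [...], "core": list or None}
    for s in seqs:
        placed = False
        for c in clusters:
            if c["core"] is None:
                # Single-seed phase: compare against the first member only.
                ok = hamming_fraction(s, c["members"][0]) < threshold
            else:
                # Multi-seed phase. Hypothetical rule (an assumption): accept
                # if the average distance to the core is below the threshold
                # plus the standard deviation of distances within the core.
                core_d = [hamming_fraction(a, b)
                          for a, b in combinations(c["core"], 2)]
                avg = mean(hamming_fraction(s, m) for m in c["core"])
                ok = avg < threshold + pstdev(core_d)
            if ok:
                c["members"].append(s)
                if len(c["members"]) >= min_size:
                    # Recompute the core after every addition ("dynamic" update).
                    c["core"] = pick_core(c["members"], threshold) or c["core"]
                placed = True
                break
        if not placed:
            clusters.append({"members": [s], "core": None})
    return clusters


demo = cluster(
    ["AAAAAAAAAA", "AAAAAAAAAT", "AAAAAAAATT", "CCCCCCCCCC", "CCCCCCCCCG"],
    threshold=0.25,
)
print([len(c["members"]) for c in demo])
```

In this toy run the three A-rich sequences form one cluster that acquires a three-sequence core, while the two C-rich sequences form a second cluster that stays in the single-seed phase. The exhaustive search in pick_core is combinatorial and is only meant for tiny inputs.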

Zheng W, Mao Q, Genco RJ, Wactawski-Wende J, Buck M, Cai Y, Sun Y. A parallel computational framework for ultra-large-scale sequence clustering analysis. Bioinformatics 2019;35:380-388. [PMID: 30010718 PMCID: PMC6931356 DOI: 10.1093/bioinformatics/bty617] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 10/19/2017] [Revised: 06/14/2018] [Accepted: 07/11/2018] [Indexed: 12/30/2022] [Open Access]

Wei ZG, Zhang SW, Zhang YZ. DMclust, a Density-based Modularity Method for Accurate OTU Picking of 16S rRNA Sequences. Mol Inform 2017;36. [PMID: 28586119 DOI: 10.1002/minf.201600059] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 04/19/2017] [Accepted: 04/25/2017] [Indexed: 11/08/2022]

DBH: A de Bruijn graph-based heuristic method for clustering large-scale 16S rRNA sequences into OTUs. J Theor Biol 2017;425:80-87. [PMID: 28454900 DOI: 10.1016/j.jtbi.2017.04.019] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 11/10/2016] [Revised: 03/28/2017] [Accepted: 04/20/2017] [Indexed: 12/22/2022]

Cai Y, Zheng W, Yao J, Yang Y, Mai V, Mao Q, Sun Y. ESPRIT-Forest: Parallel clustering of massive amplicon sequence data in subquadratic time. PLoS Comput Biol 2017;13:e1005518. [PMID: 28437450 PMCID: PMC5421816 DOI: 10.1371/journal.pcbi.1005518] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 10/23/2016] [Revised: 05/08/2017] [Accepted: 04/13/2017] [Indexed: 12/30/2022] [Open Access]

Exploring the interaction patterns among taxa and environments from marine metagenomic data. Quant Biol 2016. [DOI: 10.1007/s40484-016-0071-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 11/27/2022]

Wei ZG, Zhang SW. MtHc: a motif-based hierarchical method for clustering massive 16S rRNA sequences into OTUs. Mol Biosyst 2016;11:1907-13. [PMID: 25912934 DOI: 10.1039/c5mb00089k] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Indexed: 12/14/2022]

Franzén O, Hu J, Bao X, Itzkowitz SH, Peter I, Bashir A. Improved OTU-picking using long-read 16S rRNA gene amplicon sequencing and generic hierarchical clustering. Microbiome 2015;3:43. [PMID: 26434730 PMCID: PMC4593230 DOI: 10.1186/s40168-015-0105-6] [Citation(s) in RCA: 62] [Impact Index Per Article: 6.9] [Received: 01/05/2015] [Accepted: 08/31/2015] [Indexed: 05/05/2023]

Flynn JM, Brown EA, Chain FJJ, MacIsaac HJ, Cristescu ME. Toward accurate molecular identification of species in complex environmental samples: testing the performance of sequence filtering and clustering methods. Ecol Evol 2015;5:2252-66. [PMID: 26078860 PMCID: PMC4461425 DOI: 10.1002/ece3.1497] [Citation(s) in RCA: 89] [Impact Index Per Article: 9.9] [Received: 02/10/2015] [Revised: 03/05/2015] [Accepted: 03/10/2015] [Indexed: 11/05/2022] [Open Access]

Schmidt TSB, Matias Rodrigues JF, von Mering C. Limits to robustness and reproducibility in the demarcation of operational taxonomic units. Environ Microbiol 2014;17:1689-706. [PMID: 25156547 DOI: 10.1111/1462-2920.12610] [Citation(s) in RCA: 81] [Impact Index Per Article: 8.1] [Received: 04/25/2014] [Accepted: 08/21/2014] [Indexed: 11/27/2022]