1
Freire B, Ladra S, Paramá JR. Memory-Efficient Assembly Using Flye. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:3564-3577. [PMID: 34469305] [DOI: 10.1109/tcbb.2021.3108843]
Abstract
In the past decade, next-generation sequencing (NGS) has enabled the generation of genomic data in a cost-effective, high-throughput manner. The most recent third-generation sequencing technologies produce longer reads, but at much higher error rates, which complicates the assembly process and makes long-read assemblers demanding in both time and space. Moreover, advances in these technologies now allow portable, real-time DNA sequencing, enabling in-field analysis. In such scenarios, it becomes crucial to have more efficient solutions that can run on computers or mobile devices with minimal hardware requirements. We re-implemented an existing long-read assembler, namely Flye, using compressed data structures. We then compared our version with the original software on real datasets, evaluating their performance in terms of memory requirements, execution speed, and energy consumption. The assembly results are unaffected, as the core of the algorithm is maintained, but the use of advanced compact data structures reduces memory consumption by 22% to 47% and processing time by up to 25% (on a par with the original in the worst case). These improvements also reduce energy consumption by around 3-8%, with decreases of up to 26% on some datasets.
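The abstract credits the savings to compact data structures without detailing them here. As a generic illustration of the idea (not the succinct structures actually used in the Flye re-implementation), the Python sketch below packs DNA into 2 bits per base, a fourfold saving over one-byte ASCII characters:

```python
# Minimal sketch of compact sequence storage: 2 bits per base instead
# of 8-bit ASCII. Illustrative only; the paper's succinct structures
# (and Flye's internals) are more sophisticated than this.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASE = "ACGT"

def pack(seq: str) -> bytearray:
    """Pack a DNA string into 2 bits per base (4 bases per byte)."""
    buf = bytearray((len(seq) + 3) // 4)
    for i, ch in enumerate(seq):
        buf[i // 4] |= CODE[ch] << (2 * (i % 4))
    return buf

def unpack(buf: bytearray, n: int) -> str:
    """Recover the first n bases from a packed buffer."""
    return "".join(BASE[(buf[i // 4] >> (2 * (i % 4))) & 3] for i in range(n))

seq = "GATTACA"
packed = pack(seq)
assert unpack(packed, len(seq)) == seq  # 7 bases stored in 2 bytes
```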
2
Koppad S, Annappa B, Gkoutos GV, Acharjee A. Cloud Computing Enabled Big Multi-Omics Data Analytics. Bioinform Biol Insights 2021; 15:11779322211035921. [PMID: 34376975] [PMCID: PMC8323418] [DOI: 10.1177/11779322211035921]
Abstract
High-throughput experiments enable researchers to explore complex multifactorial diseases through large-scale analysis of omics data. Challenges for such high-dimensional data sets include storage, analysis, and sharing. Recent innovations in computational technologies and approaches, especially in cloud computing, offer a promising, low-cost, and highly flexible solution in the bioinformatics domain. Cloud computing is rapidly proving useful in molecular modeling, omics data analytics (eg, RNA sequencing, metabolomics, or proteomics data sets), and the integration, analysis, and interpretation of phenotypic data. We review the adoption of advanced cloud-based and big data technologies for processing and analyzing omics data and provide insights into state-of-the-art cloud bioinformatics applications.
Affiliation(s)
- Saraswati Koppad
- Department of Computer Science and Engineering, National Institute of Technology Karnataka, Surathkal, India
- Annappa B
- Department of Computer Science and Engineering, National Institute of Technology Karnataka, Surathkal, India
- Georgios V Gkoutos
- Institute of Cancer and Genomic Sciences and Centre for Computational Biology, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK; Institute of Translational Medicine, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK; NIHR Surgical Reconstruction and Microbiology Research Centre, University Hospitals Birmingham, Birmingham, UK; MRC Health Data Research UK (HDR UK), London, UK; NIHR Experimental Cancer Medicine Centre, Birmingham, UK; NIHR Biomedical Research Centre, University Hospitals Birmingham, Birmingham, UK
- Animesh Acharjee
- Institute of Cancer and Genomic Sciences and Centre for Computational Biology, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK; Institute of Translational Medicine, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK; NIHR Surgical Reconstruction and Microbiology Research Centre, University Hospitals Birmingham, Birmingham, UK
3
Guo G, Chen H, Yan D, Cheng J, Chen JY, Chong Z. Scalable De Novo Genome Assembly Using a Pregel-Like Graph-Parallel System. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:731-744. [PMID: 31180898] [DOI: 10.1109/tcbb.2019.2920912]
Abstract
De novo genome assembly is the process of stitching short DNA sequences together to generate longer DNA sequences, without using any reference sequence for alignment. It enables high-throughput genome sequencing and thus accelerates the discovery of new genomes. In this paper, we present a toolkit, called PPA-assembler, for de novo genome assembly in a distributed setting. The operations in our toolkit provide strong performance guarantees and can be combined to implement various assembly strategies. PPA-assembler adopts the popular de Bruijn graph based approach, and each operation is implemented as a program in Google's Pregel framework, which can be easily deployed on a generic cluster. Experiments on large real and simulated datasets demonstrate that PPA-assembler is much more efficient than state-of-the-art assemblers while providing comparable sequencing quality. PPA-assembler has been open-sourced at https://github.com/yaobaiwei/PPA-Assembler.
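The Pregel model this toolkit builds on is vertex-centric: each vertex runs a small compute function per superstep and exchanges messages with its neighbors until no messages remain. The single-process Python toy below illustrates that pattern (it is a sketch of the model, not of PPA-assembler's actual operations; the function and graph are hypothetical) by labeling connected chains with their minimum vertex ID:

```python
# Toy "think like a vertex" loop in the Pregel style: synchronized
# supersteps, message passing, vote-to-halt when inboxes are empty.
def pregel_min_label(adj):
    label = {v: v for v in adj}
    # Superstep 0: every vertex announces its ID to its neighbours.
    messages = {}
    for v in adj:
        for u in adj[v]:
            messages.setdefault(u, []).append(v)
    while messages:                            # run until no vertex is active
        next_msgs = {}
        for v, inbox in messages.items():      # compute() on active vertices
            smallest = min(inbox)
            if smallest < label[v]:
                label[v] = smallest
                for u in adj[v]:               # forward the improvement
                    next_msgs.setdefault(u, []).append(smallest)
        messages = next_msgs                   # synchronization barrier
    return label

# Two undirected chains: {0,1,2} and {3,4}.
adj = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}
print(pregel_min_label(adj))  # {0: 0, 1: 0, 2: 0, 3: 3, 4: 3}
```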
4
Pan T, Nihalani R, Aluru S. Fast de Bruijn Graph Compaction in Distributed Memory Environments. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:136-148. [PMID: 30072337] [DOI: 10.1109/tcbb.2018.2858797]
Abstract
De Bruijn graph based genome assembly has gained popularity as short-read sequencers have become ubiquitous. A core assembly operation is the generation of unitigs, the sequences corresponding to chains in the graph. Unitigs are used as building blocks for generating longer sequences in many assemblers, and can facilitate graph compression. Chain compaction, by which unitigs are generated, remains a critical computational task. In this paper, we present a distributed-memory parallel algorithm for the simultaneous compaction of all chains in bi-directed de Bruijn graphs. The key advantages of our algorithm are that it bounds the chain-compaction runtime to a number of iterations logarithmic in the length of the longest chain, and that it differentiates cycles from chains within a number of iterations logarithmic in the length of the longest cycle. Our algorithm scales to thousands of cores: it can compact the whole-genome de Bruijn graph of a human sequence read set in 7.3 seconds using 7,680 distributed-memory cores, and in 12.9 minutes using 64 shared-memory cores, 3.7× and 2.0× faster than the equivalent steps in state-of-the-art tools for distributed- and shared-memory environments, respectively. An implementation of the algorithm is available at https://github.com/ParBLiSS/bruno.
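The logarithmic iteration bound stated above is characteristic of pointer jumping (pointer doubling): every node's successor pointer is repeatedly replaced by its successor's successor, halving each chain's remaining length per round. The sequential Python toy below illustrates that primitive under the simplifying assumption that chain ends point to themselves; the paper's distributed, bi-directed-graph algorithm is considerably more involved:

```python
# Pointer doubling over a chain represented as a successor map.
def chain_ends(succ):
    """succ[v]: next node in v's chain; a chain end points to itself.
    Returns every node's chain end in O(log L) rounds, L = chain length."""
    rounds = 0
    while True:
        nxt = {v: succ[succ[v]] for v in succ}  # jump two hops at once
        rounds += 1
        if nxt == succ:                          # fixed point: all resolved
            return nxt, rounds
        succ = nxt

# A chain 1 -> 2 -> 3 -> 4 (4 is the end).
print(chain_ends({1: 2, 2: 3, 3: 4, 4: 4}))  # ({1: 4, 2: 4, 3: 4, 4: 4}, 3)
```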
5
Ge J, Meng J, Guo N, Wei Y, Balaji P, Feng S. Counting Kmers for Biological Sequences at Large Scale. Interdiscip Sci 2019; 12:99-108. [PMID: 31734873] [DOI: 10.1007/s12539-019-00348-5]
Abstract
Counting the abundance of all distinct kmers in biological sequence data is a fundamental step in bioinformatics, with applications in de novo genome assembly, error correction, and more. With the development of sequencing technology, the sequence data in a single project can reach Terabyte or Petabyte scale. Counting kmer abundances at this scale is beyond the memory and computing capacity of a single node, and processing such data efficiently on a high-performance computing cluster is a challenge. We therefore propose SWAPCounter, a highly scalable distributed approach to kmer counting. The approach embeds an MPI streaming I/O module for loading huge data sets at high speed, and a counting Bloom filter module for both memory and communication efficiency. By overlapping all counting steps, SWAPCounter achieves high scalability with high parallel efficiency. The experimental results indicate that SWAPCounter is competitive with two shared-memory tools, KMC2 and MSPKmerCounter, and shows the highest scalability in strong-scaling experiments: on the Cetus supercomputer, it scales to 32,768 cores with 79% parallel efficiency (using 2,048 cores as baseline) when processing 4 TB of sequence data from the 1000 Genomes Project. The source code of SWAPCounter is publicly available at https://github.com/mengjintao/SWAPCounter.
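Distributed kmer counting of this kind typically relies on hash partitioning: each kmer is routed to a single owning rank, so per-rank counts are exact and no kmer is split across nodes. The single-process Python sketch below simulates that routing (a real run would exchange the shards via MPI collectives); it illustrates the partitioning idea only, not SWAPCounter's streaming I/O or counting Bloom filter:

```python
# Hash-partitioned kmer counting, simulated in one process. All names
# here are illustrative; the "exchange" would be an MPI alltoallv.
from collections import Counter
from hashlib import blake2b

def owner(kmer: str, nprocs: int) -> int:
    # Stable hash so every rank routes a given kmer to the same owner.
    return int.from_bytes(blake2b(kmer.encode(), digest_size=8).digest(),
                          "big") % nprocs

def count_kmers_distributed(reads, k, nprocs=4):
    shards = [[] for _ in range(nprocs)]
    for read in reads:                           # map phase: shard kmers
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            shards[owner(kmer, nprocs)].append(kmer)
    return [Counter(shard) for shard in shards]  # per-rank local counting

counts = count_kmers_distributed(["ACGTACGT", "CGTACGTA"], k=4)
total = sum(counts, Counter())
print(total["CGTA"])  # 3
```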
Affiliation(s)
- Jianqiu Ge
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
- Jintao Meng
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
- Ning Guo
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
- Yanjie Wei
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
- Pavan Balaji
- Mathematics and Computer Science Division, Argonne National Laboratory, Lemont, IL 60439-4844, USA
- Shengzhong Feng
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
6
Ghosh P, Kalyanaraman A. FastEtch: A Fast Sketch-Based Assembler for Genomes. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2019; 16:1091-1106. [PMID: 28910776] [DOI: 10.1109/tcbb.2017.2737999]
Abstract
De novo genome assembly describes the process of reconstructing an unknown genome from a large collection of short (or long) reads sequenced from that genome. A single run of a Next-Generation Sequencing (NGS) instrument can produce billions of short reads, making genome assembly computationally demanding in both memory and time. One of the major computational steps in modern short-read assemblers is the construction and use of a string data structure called the de Bruijn graph. In fact, most short-read assemblers build the complete de Bruijn graph for the set of input reads and subsequently traverse it, pruning low-quality edges, in order to generate genomic "contigs", the output of assembly. These graph construction and traversal steps account for well over 90 percent of the runtime and memory usage. In this paper, we present a fast algorithm, FastEtch, that uses sketching to build an approximate version of the de Bruijn graph for the purpose of generating an assembly. The algorithm uses the Count-Min sketch, a probabilistic data structure for streaming data sets. The result is an approximate de Bruijn graph that stores information pertaining only to a selected subset of nodes that are most likely to contribute to contig generation. In addition, edges are not stored; instead, the fraction that contributes to contig generation is detected on the fly. This approximate approach is intended to significantly improve performance (both execution time and memory footprint) while possibly compromising output assembly quality. We present two main versions of the assembler: one that generates an assembly in which each contig represents a contiguous genomic region from one strand of the DNA, and another in which contigs can straddle either of the two strands. For further scalability, we have implemented a multi-threaded parallel code. Experimental results on the E. coli, Yeast, C. elegans, and Human (Chr2 and Chr2+3) genomes show that our method yields one of the best time-memory-quality trade-offs when compared against many state-of-the-art genome assemblers.
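A Count-Min sketch tallies items in a fixed number of counter rows, one hash function per row; a query returns the minimum across rows, which can overestimate under hash collisions but never undercounts. A minimal Python version, with illustrative parameters rather than FastEtch's actual configuration:

```python
# Minimal Count-Min sketch for kmer multiplicities in fixed memory.
import hashlib

class CountMinSketch:
    def __init__(self, width=1 << 14, depth=4):
        self.width, self.depth = width, depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, item: str, row: int) -> int:
        # One independent hash per row via blake2b personalization.
        h = hashlib.blake2b(item.encode(), digest_size=8, person=bytes([row]))
        return int.from_bytes(h.digest(), "big") % self.width

    def add(self, item: str):
        for r in range(self.depth):
            self.rows[r][self._index(item, r)] += 1

    def estimate(self, item: str) -> int:
        # Minimum across rows bounds the collision-induced overestimate.
        return min(self.rows[r][self._index(item, r)]
                   for r in range(self.depth))

cms = CountMinSketch()
for kmer in ("ACGT", "ACGT", "TTTT"):
    cms.add(kmer)
print(cms.estimate("ACGT"))  # 2 (never less; may be more under collisions)
```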
7
Guo R, Zhao Y, Zou Q, Fang X, Peng S. Bioinformatics applications on Apache Spark. Gigascience 2018; 7:giy098. [PMID: 30101283] [PMCID: PMC6113509] [DOI: 10.1093/gigascience/giy098]
Abstract
With the rapid development of next-generation sequencing technology, ever-increasing quantities of genomic data pose a tremendous challenge to data processing, creating an urgent need for highly scalable and powerful computational systems. Among state-of-the-art parallel computing platforms, Apache Spark is a fast, general-purpose, in-memory, iterative computing framework for large-scale data processing that ensures high fault tolerance and high scalability by introducing the resilient distributed dataset (RDD) abstraction. In terms of performance, Spark can be up to 100 times faster than Hadoop for memory access and 10 times faster for disk access. Moreover, it provides advanced application programming interfaces in Java, Scala, Python, and R, and supports advanced components including Spark SQL for structured data processing, MLlib for machine learning, GraphX for graph computation, and Spark Streaming for stream computing. We surveyed Spark-based applications in next-generation sequencing and other biological domains, such as epigenetics, phylogeny, and drug discovery. The results of this survey provide a comprehensive guideline for bioinformatics researchers who wish to apply Spark in their own fields.
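The RDD abstraction mentioned above is easiest to see in a classic map/reduce pipeline. The PySpark sketch below counts kmers from a read file; it assumes a local Spark installation, and "reads.txt" (one read per line) is a hypothetical input file:

```python
# Minimal PySpark kmer count: reads -> kmers -> (kmer, 1) -> aggregate.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("kmer-count").getOrCreate()
k = 21

reads = spark.sparkContext.textFile("reads.txt")  # RDD of read strings
counts = (reads
          .flatMap(lambda r: [r[i:i + k] for i in range(len(r) - k + 1)])
          .map(lambda kmer: (kmer, 1))
          .reduceByKey(lambda a, b: a + b))       # shuffle + aggregate

print(counts.take(5))
spark.stop()
```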
Affiliation(s)
- Runxin Guo
- College of Computer, National University of Defense Technology, No. 109 Deya Road, Kaifu District, Changsha, 410073, China
- Yi Zhao
- Institute of Computing Technology, Chinese Academy of Sciences, No. 6 South Road of the Academy of Sciences, Haidian District, Beijing, 100190, China
- Quan Zou
- School of Computer Science and Technology, Tianjin University, No. 135 Yaguan Road, Jinnan District, Tianjin, 300050, China
- Xiaodong Fang
- BGI Genomics, BGI-Shenzhen, No. 21 Mingzhu Road, Yantian District, Shenzhen, 518083, China
- Shaoliang Peng
- College of Computer, National University of Defense Technology, No. 109 Deya Road, Kaifu District, Changsha, 410073, China; College of Computer Science and Electronic Engineering & National Supercomputer Centre in Changsha, Hunan University, No. 252 Shannan Road, Yuelu District, Changsha, 410082, China
8
Chikhi R, Limasset A, Medvedev P. Compacting de Bruijn graphs from sequencing data quickly and in low memory. Bioinformatics 2016; 32:i201-i208. [PMID: 27307618] [PMCID: PMC4908363] [DOI: 10.1093/bioinformatics/btw279]
Abstract
MOTIVATION As the quantity of data per sequencing experiment increases, the challenges of fragment assembly are becoming increasingly computational. The de Bruijn graph is a widely used data structure in fragment assembly algorithms, used to represent the information from a set of reads. Compaction, in which long simple paths are merged into single vertices, is an important data reduction step in most de Bruijn graph based algorithms. Compaction has recently become the bottleneck in assembly pipelines, and improving its running time and memory usage is an important problem. RESULTS We present an algorithm and a tool, bcalm 2, for the compaction of de Bruijn graphs. bcalm 2 is a parallel algorithm that distributes the input based on a minimizer hashing technique, allowing for good balance of memory usage throughout its execution. For human sequencing data, bcalm 2 reduces the computational burden of compacting the de Bruijn graph to roughly an hour and 3 GB of memory. We also applied bcalm 2 to the 22 Gbp loblolly pine and 20 Gbp white spruce sequencing datasets; compacted graphs were constructed from raw reads in less than 2 days and 40 GB of memory on a single machine. Hence, bcalm 2 is at least an order of magnitude more efficient than other available methods. AVAILABILITY AND IMPLEMENTATION Source code of bcalm 2 is freely available at https://github.com/GATB/bcalm. CONTACT rayan.chikhi@univ-lille1.fr
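The minimizer hashing technique the abstract refers to routes each kmer to a bucket keyed by its minimizer, the smallest m-mer it contains, so overlapping kmers usually land in the same bucket and most compaction work stays local. A minimal Python sketch with plain lexicographic ordering (bcalm 2 itself uses more refined, frequency-aware orderings):

```python
# Bucket kmers by lexicographic minimizer; a toy version of the
# partitioning idea, not bcalm 2's actual ordering or pipeline.
def minimizer(kmer: str, m: int) -> str:
    """Smallest m-mer contained in the kmer."""
    return min(kmer[i:i + m] for i in range(len(kmer) - m + 1))

def bucket_kmers(kmers, m):
    buckets = {}
    for kmer in kmers:
        buckets.setdefault(minimizer(kmer, m), []).append(kmer)
    return buckets

kmers = ["ACGTA", "CGTAC", "GTACG"]  # consecutive kmers along one path
print(bucket_kmers(kmers, m=3))
# {'ACG': ['ACGTA', 'GTACG'], 'CGT': ['CGTAC']}
```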
Affiliation(s)
- Paul Medvedev
- Department of Computer Science and Engineering, The Pennsylvania State University, USA; Department of Biochemistry and Molecular Biology, The Pennsylvania State University, USA; Genome Sciences Institute of the Huck, The Pennsylvania State University, USA
9
Le VV, Tran LV, Tran HV. A novel semi-supervised algorithm for the taxonomic assignment of metagenomic reads. BMC Bioinformatics 2016; 17:22. [PMID: 26740458] [PMCID: PMC4702387] [DOI: 10.1186/s12859-015-0872-x]
Abstract
BACKGROUND Taxonomic assignment is a crucial step in a metagenomic project, which aims to identify the origin of sequences in an environmental sample. Because composition-based algorithms are insufficient for classifying short reads, recent algorithms rely either on similarity alone or on combined features dominated by similarity. However, those algorithms are computationally expensive because similarity search is very time-consuming. In addition, the lack of similarity information between short reads and reference sequences significantly reduces classification quality. RESULTS This paper presents a novel taxonomic assignment algorithm, called SeMeta, which is based on semi-supervised learning to produce fast and highly accurate classification of short reads with sufficient mutual overlap. The proposed algorithm first separates reads into clusters using their composition features. It then labels the clusters with the support of an efficient filtering technique applied to the results of a similarity search between their reads and reference databases. Furthermore, instead of performing the similarity search for all reads in a cluster, SeMeta does so only for reads in a subgroup, exploiting sequence-overlap information. The experimental results demonstrate that SeMeta outperforms two other similarity-based algorithms on several aspects. CONCLUSIONS By using a semi-supervised method and taking advantage of multiple features, the proposed algorithm not only achieves high classification quality but also greatly reduces computational cost. The source code of the algorithm can be downloaded at http://it.hcmute.edu.vn/bioinfo/metapro/SeMeta.html
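The composition feature used in SeMeta's clustering phase is not specified in the abstract; a common choice in metagenomic binning, assumed here purely for illustration, is the normalized frequency vector of all 4-mers, which a clustering algorithm such as k-means can then group:

```python
# Tetranucleotide composition vector: an assumed, illustrative feature,
# not necessarily the one SeMeta actually uses.
from itertools import product
from collections import Counter

KMERS = ["".join(p) for p in product("ACGT", repeat=4)]  # 256 features

def composition_vector(read: str, k: int = 4):
    """Normalized 4-mer frequency vector of a read."""
    counts = Counter(read[i:i + k] for i in range(len(read) - k + 1))
    total = sum(counts.values()) or 1
    return [counts[kmer] / total for kmer in KMERS]

vec = composition_vector("ACGTACGTGACGTTACG")
print(len(vec), max(vec))  # 256-dimensional feature vector, ready to cluster
```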
Affiliation(s)
- Vinh Van Le
- Faculty of Computer Science and Engineering, HCMC University of Technology, 268 Ly Thuong Kiet, Q10, HCM City, Vietnam
- Faculty of Information Technology, HCMC University of Technology and Education, 1 Vo Van Ngan, Thu Duc, HCM City, Vietnam
- Lang Van Tran
- Institute of Applied Mechanics and Informatics, Vietnam Academy of Science and Technology, 01 Mac Dinh Chi, Q1, HCM City, Vietnam
- Faculty of Information Technology, Lac Hong University, 10 Huynh Van Nghe, Bien Hoa, Dong Nai, Vietnam
- Hoai Van Tran
- Faculty of Computer Science and Engineering, HCMC University of Technology, 268 Ly Thuong Kiet, Q10, HCM City, Vietnam