Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Zhao K, Chu X. G-BLASTN: accelerating nucleotide alignment by graphics processors. Bioinformatics 2014;30:1384-91. [PMID: 24463183 DOI: 10.1093/bioinformatics/btu047] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.9] [Indexed: 11/13/2022]

Cited by Other Article(s)

Schmidt B, Hildebrandt A. From GPUs to AI and quantum: three waves of acceleration in bioinformatics. Drug Discov Today 2024;29:103990. [PMID: 38663581 DOI: 10.1016/j.drudis.2024.103990] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Revised: 04/05/2024] [Accepted: 04/17/2024] [Indexed: 05/01/2024]

Zhao X, Sun C, Jin M, Chen J, Xing L, Yan J, Wang H, Liu Z, Chen WH. Enrichment Culture but Not Metagenomic Sequencing Identified a Highly Prevalent Phage Infecting Lactiplantibacillus plantarum in Human Feces. Microbiol Spectr 2023;11:e0434022. [PMID: 36995238 PMCID: PMC10269749 DOI: 10.1128/spectrum.04340-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Accepted: 03/07/2023] [Indexed: 03/31/2023]

Abstract

Lactiplantibacillus plantarum (previously known as Lactobacillus plantarum) is increasingly used as a probiotic to treat human diseases, but its phages in the human gut remain unexplored. Here, we report its first gut phage, Gut-P1, which we systematically screened using metagenomic sequencing, virus-like particle (VLP) sequencing, and enrichment culture from 35 fecal samples. Gut-P1 is virulent, belongs to the Douglaswolinvirus genus, and is highly prevalent in the gut (~11% prevalence); it has a genome of 79,928 bp consisting of 125 protein coding genes and displaying low sequence similarities to public L. plantarum phages. Physiochemical characterization shows that it has a short latent period and adapts to broad ranges of temperatures and pHs. Furthermore, Gut-P1 strongly inhibits the growth of L. plantarum strains at a multiplicity of infection (MOI) of 1e-6. Together, these results indicate that Gut-P1 can greatly impede the application of L. plantarum in humans. Strikingly, Gut-P1 was identified only in the enrichment culture, not in our metagenomic or VLP sequencing data nor in any public human phage databases, indicating the inefficiency of bulk sequencing in recovering low-abundance but highly prevalent phages and pointing to the unexplored hidden diversity of the human gut virome despite recent large-scale sequencing and bioinformatics efforts.

IMPORTANCE

As Lactiplantibacillus plantarum (previously known as Lactobacillus plantarum) is increasingly used as a probiotic to treat human gut-related diseases, its bacteriophages may pose a certain threat to their further application and should be identified and characterized more often from the human intestine. Here, we isolated and identified the first gut L. plantarum phage that is prevalent in a Chinese population. This phage, Gut-P1, is virulent and can strongly inhibit the growth of multiple L. plantarum strains at low MOIs. Our results also show that bulk sequencing is inefficient at recovering low-abundance but highly prevalent phages such as Gut-P1, suggesting that the hidden diversity of human enteroviruses has not yet been explored. Our results call for innovative approaches to isolate and identify intestinal phages from the human gut and to rethink our current understanding of the enterovirus, particularly its underestimated diversity and overestimated individual specificity.

Affiliation(s)

Xueyang Zhao: College of Life Science, Henan Normal University, Xinxiang, Henan, China; Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
Chuqing Sun: Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
Menglu Jin: College of Life Science, Henan Normal University, Xinxiang, Henan, China; Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
Jingchao Chen: Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
Lulu Xing: College of Life Science, Henan Normal University, Xinxiang, Henan, China; Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
Jin Yan: Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
Hailei Wang: College of Life Science, Henan Normal University, Xinxiang, Henan, China
Zhi Liu: Department of Biotechnology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
Wei-Hua Chen: Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China; Institution of Medical Artificial Intelligence, Binzhou Medical University, Yantai, China

Cheng T, Chin PJ, Cha K, Petrick N, Mikailov M. Profiling the BLAST bioinformatics application for load balancing on high-performance computing clusters. BMC Bioinformatics 2022;23:544. [PMID: 36526957 PMCID: PMC9758941 DOI: 10.1186/s12859-022-05029-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2022] [Accepted: 10/31/2022] [Indexed: 12/23/2022]

Abstract

BACKGROUND

The Basic Local Alignment Search Tool (BLAST) is a suite of commonly used algorithms for identifying matches between biological sequences. The user supplies a database file and query file of sequences for BLAST to find identical sequences between the two. The typical millions of database and query sequences make BLAST computationally challenging but also well suited for parallelization on high-performance computing clusters. The efficacy of parallelization depends on the data partitioning, where the optimal data partitioning relies on an accurate performance model. In previous studies, a BLAST job was sped up by 27 times by partitioning the database and query among thousands of processor nodes. However, the optimality of the partitioning method was not studied. Unlike BLAST performance models proposed in the literature that usually have problem size and hardware configuration as the only variables, the execution time of a BLAST job is a function of database size, query size, and hardware capability. In this work, the nucleotide BLAST application BLASTN was profiled using three methods: shell-level profiling with the Unix "time" command, code-level profiling with the built-in "profiler" module, and system-level profiling with the Unix "gprof" program. The runtimes were measured for six node types, using six different database files and 15 query files, on a heterogeneous HPC cluster with 500+ nodes. The empirical measurement data were fitted with quadratic functions to develop performance models that were used to guide the data parallelization for BLASTN jobs.

RESULTS

Profiling results showed that BLASTN contains more than 34,500 different functions, but a single function, RunMTBySplitDB, takes 99.12% of the total runtime. Among its 53 child functions, five core functions were identified that make up 92.12% of the overall BLASTN runtime. Based on the performance models, static load balancing algorithms can be applied to the BLASTN input data to minimize the runtime of the longest job on an HPC cluster. Four test cases were run on homogeneous and heterogeneous clusters. Experimental results showed that the runtime can be reduced by 81% on a homogeneous cluster and by 20% on a heterogeneous cluster by redistributing the workload.
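As an illustration of the idea in this abstract (fit a performance model, then distribute work so that the longest job is as short as possible), the sketch below pairs a quadratic runtime model with a greedy longest-processing-time heuristic. The abstract does not say which balancing algorithm or model form the authors used, so `quadratic_runtime`, `balance`, the coefficients, and the relative node speeds are all hypothetical and chosen only for the example.

```python
def quadratic_runtime(size, coeffs):
    """Hypothetical quadratic performance model: a*size^2 + b*size + c.

    `size` stands for some measure of input size; the coefficients would
    come from fitting measured runtimes (values here are illustrative).
    """
    a, b, c = coeffs
    return a * size ** 2 + b * size + c


def balance(jobs, node_speeds):
    """Greedy static load balancing (longest-processing-time heuristic).

    jobs: list of (job_id, predicted runtime on a reference node)
    node_speeds: relative speed per node (1.0 = reference node)
    Returns (assignment, makespan), where assignment maps a node index to
    its job ids and makespan is the finish time of the busiest node.
    This is a heuristic and does not guarantee an optimal partition.
    """
    loads = [0.0] * len(node_speeds)
    assignment = {i: [] for i in range(len(node_speeds))}
    # Place the longest jobs first, each on the node that finishes it earliest.
    for job_id, runtime in sorted(jobs, key=lambda j: j[1], reverse=True):
        best = min(
            range(len(node_speeds)),
            key=lambda i: loads[i] + runtime / node_speeds[i],
        )
        loads[best] += runtime / node_speeds[best]
        assignment[best].append(job_id)
    return assignment, max(loads)


if __name__ == "__main__":
    jobs = [("a", 5), ("b", 4), ("c", 3), ("d", 2)]
    print(balance(jobs, [1.0, 1.0]))   # homogeneous nodes
    print(balance(jobs, [2.0, 1.0]))   # one node twice as fast
```

Dividing each predicted runtime by a per-node speed is one simple way to account for a heterogeneous cluster; a per-node-type model, as the abstract suggests the authors measured, would replace that scaling.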

DISCUSSION

Optimal data partitioning can improve BLASTN's overall runtime 5.4-fold in comparison with dividing the database and query into the same number of fragments. The proposed methodology can be used in the other applications in the BLAST+ suite or any other application as long as source code is available.

Xiao C, Sun T, Yang Z, Zou L, Deng J, Yang X. Whole transcriptome RNA Sequencing Reveals the Global Molecular Responses and circRNA/lncRNA-miRNA-mRNA ceRNA Regulatory Network in Chicken Fat Deposition. Poult Sci 2022;101:102121. [PMID: 36116349 PMCID: PMC9485216 DOI: 10.1016/j.psj.2022.102121] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 11/01/2021] [Revised: 03/21/2022] [Accepted: 08/03/2022] [Indexed: 11/29/2022]

GPrimer: a fast GPU-based pipeline for primer design for qPCR experiments. BMC Bioinformatics 2021;22:220. [PMID: 33926379 PMCID: PMC8082839 DOI: 10.1186/s12859-021-04133-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/17/2020] [Accepted: 04/14/2021] [Indexed: 11/10/2022]

Wang Y, Zhao B, Lu Z, Shi Y, Li J. The complete chloroplast genome provides insight into the polymorphism and adaptive evolution of Garcinia paucinervis. BIOTECHNOL BIOTEC EQ 2021. [DOI: 10.1080/13102818.2021.1879676] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/22/2022]

Carroll TC, Ojiaku JT, Wong PWH. Semiglobal Sequence Alignment with Gaps Using GPU. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2020;17:2086-2097. [PMID: 31056513 DOI: 10.1109/tcbb.2019.2914105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]

Sha SP, Suryavanshi MV, Jani K, Sharma A, Shouche Y, Tamang JP. Diversity of Yeasts and Molds by Culture-Dependent and Culture-Independent Methods for Mycobiome Surveillance of Traditionally Prepared Dried Starters for the Production of Indian Alcoholic Beverages. Front Microbiol 2018;9:2237. [PMID: 30319566 PMCID: PMC6169615 DOI: 10.3389/fmicb.2018.02237] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 04/12/2018] [Accepted: 09/03/2018] [Indexed: 12/25/2022]

Wang X, Cheng F, Rohlsen D, Bi C, Wang C, Xu Y, Wei S, Ye Q, Yin T, Ye N. Organellar genome assembly methods and comparative analysis of horticultural plants. HORTICULTURE RESEARCH 2018;5:3. [PMID: 29423233 PMCID: PMC5798811 DOI: 10.1038/s41438-017-0002-1] [Citation(s) in RCA: 43] [Impact Index Per Article: 7.2] [Received: 05/03/2017] [Revised: 11/20/2017] [Accepted: 11/26/2017] [Indexed: 05/31/2023]

Abstract

Although organellar genomes (including chloroplast and mitochondrial genomes) are smaller than nuclear genomes in size and gene number, organellar genomes are very important for the investigation of plant evolution and molecular ecology mechanisms. Few studies have focused on the organellar genomes of horticultural plants. Approximately 1193 chloroplast genomes and 199 mitochondrial genomes of land plants are available in the National Center for Biotechnology Information (NCBI), of which only 39 are from horticultural plants. In this paper, we report an innovative and efficient method for high-quality horticultural organellar genome assembly from next-generation sequencing (NGS) data. Sequencing reads were first assembled by Newbler, Amos, and Minimus software with default parameters. The remaining gaps were then filled through BLASTN search and PCR. The complete DNA sequence was corrected based on Illumina sequencing data using BWA (Burrows-Wheeler Alignment tool) software. The advantage of this approach is that there is no need to isolate organellar DNA from total DNA during sample preparation. Using this procedure, the complete mitochondrial and chloroplast genomes of an ornamental plant, Salix suchowensis, and a fruit tree, Ziziphus jujuba, were identified. This study shows that horticultural plants have similar mitochondrial and chloroplast sequence organization to other seed plants. Most horticultural plants demonstrate a slight bias toward A+T rich features in the mitochondrial genome. In addition, a phylogenetic analysis of 39 horticultural plants based on 15 protein-coding genes showed that some mitochondrial genes are horizontally transferred from chloroplast DNA. Our study will provide an important reference for organellar genome assembly in other horticultural plants. Furthermore, phylogenetic analysis of the organellar genomes of horticultural plants could accurately clarify the unanticipated relationships among these plants.

Nobile MS, Cazzaniga P, Tangherloni A, Besozzi D. Graphics processing units in bioinformatics, computational biology and systems biology. Brief Bioinform 2017;18:870-885. [PMID: 27402792 PMCID: PMC5862309 DOI: 10.1093/bib/bbw058] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 03/30/2016] [Indexed: 01/18/2023]

Accelerating Wright-Fisher Forward Simulations on the Graphics Processing Unit. G3-GENES GENOMES GENETICS 2017;7:3229-3236. [PMID: 28768689 PMCID: PMC5592947 DOI: 10.1534/g3.117.300103] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 01/09/2023]

Abstract

Forward Wright–Fisher simulations are powerful in their ability to model complex demography and selection scenarios, but suffer from slow execution on the Central Processor Unit (CPU), thus limiting their usefulness. However, the single-locus Wright–Fisher forward algorithm is exceedingly parallelizable, with many steps that are so-called “embarrassingly parallel,” consisting of a vast number of individual computations that are all independent of each other and thus capable of being performed concurrently. The rise of modern Graphics Processing Units (GPUs) and programming languages designed to leverage the inherent parallel nature of these processors have allowed researchers to dramatically speed up many programs that have such high arithmetic intensity and intrinsic concurrency. The presented GPU Optimized Wright–Fisher simulation, or “GO Fish” for short, can be used to simulate arbitrary selection and demographic scenarios while running over 250-fold faster than its serial counterpart on the CPU. Even modest GPU hardware can achieve an impressive speedup of over two orders of magnitude. With simulations so accelerated, one can not only do quick parametric bootstrapping of previously estimated parameters, but also use simulated results to calculate the likelihoods and summary statistics of demographic and selection models against real polymorphism data, all without restricting the demographic and selection scenarios that can be modeled or requiring approximations to the single-locus forward algorithm for efficiency. Further, as many of the parallel programming techniques used in this simulation can be applied to other computationally intensive algorithms important in population genetics, GO Fish serves as an exciting template for future research into accelerating computation in evolution. GO Fish is part of the Parallel PopGen Package available at: http://dl42.github.io/ParallelPopGen/.
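To make the "embarrassingly parallel" structure mentioned in this abstract concrete, here is a minimal CPU-only sketch of a single-locus forward Wright-Fisher step applied to many independent loci. It is not the GO Fish implementation: the genic selection formula, the function names, and the parameter values are assumptions picked for illustration, and binomial drift is sampled naively with the standard library.

```python
import random


def wright_fisher_generation(freq, pop_size, s, rng):
    """Advance one biallelic locus by one generation.

    Assumes genic selection (fitness 1+s for the derived allele, 1 for the
    ancestral allele, s > -1) followed by binomial drift over 2N gene copies.
    """
    # Deterministic change in allele frequency due to selection.
    expected = freq * (1 + s) / (1 + s * freq)
    # Drift: each of the 2N gene copies is drawn independently.
    copies = sum(rng.random() < expected for _ in range(2 * pop_size))
    return copies / (2 * pop_size)


def simulate(initial_freqs, pop_size, s, generations, seed=0):
    """Simulate many independent loci forward in time."""
    rng = random.Random(seed)
    freqs = list(initial_freqs)
    for _ in range(generations):
        # Each locus is updated without reference to any other locus, so
        # this comprehension is the part a GPU could map to one thread per locus.
        freqs = [wright_fisher_generation(f, pop_size, s, rng) for f in freqs]
    return freqs


if __name__ == "__main__":
    print(simulate([0.0, 1.0, 0.5, 0.1], pop_size=50, s=0.1, generations=20))
```

Lost (0.0) and fixed (1.0) alleles stay where they are because no mutation step is included; a fuller model would add mutation and migration before the drift step.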

Zhang J, Wang H, Feng WC. cuBLASTP: Fine-Grained Parallelization of Protein Sequence Search on CPU+GPU. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2017;14:830-843. [PMID: 26469393 DOI: 10.1109/tcbb.2015.2489662] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/05/2023]

Ye W, Chen Y, Zhang Y, Xu Y. H-BLAST: a fast protein sequence alignment toolkit on heterogeneous computers with GPUs. Bioinformatics 2017;33:1130-1138. [PMID: 28087515 DOI: 10.1093/bioinformatics/btw769] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 08/06/2016] [Accepted: 12/12/2016] [Indexed: 11/15/2022]

Manconi A, Moscatelli M, Armano G, Gnocchi M, Orro A, Milanesi L. Removing duplicate reads using graphics processing units. BMC Bioinformatics 2016;17:346. [PMID: 28185553 PMCID: PMC5123249 DOI: 10.1186/s12859-016-1192-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 12/30/2022]

Abstract

Background

During library construction polymerase chain reaction is used to enrich the DNA before sequencing. Typically, this process generates duplicate read sequences. Removal of these artifacts is mandatory, as they can affect the correct interpretation of data in several analyses. Ideally, duplicate reads should be characterized by identical nucleotide sequences. However, due to sequencing errors, duplicates may also be nearly-identical. Removing nearly-identical duplicates can result in a notable computational effort. To deal with this challenge, we recently proposed a GPU method aimed at removing identical and nearly-identical duplicates generated with an Illumina platform.

The method implements an approach based on prefix-suffix comparison. Read sequences with identical prefix are considered potential duplicates. Then, their suffixes are compared to identify and remove those that are actually duplicated.

Although the method can be efficiently used to remove duplicates, there are some limitations that need to be overcome. In particular, it cannot detect potential duplicates in the event that prefixes are longer than 27 bases, and it does not provide support for paired-end read libraries. Moreover, large clusters of potential duplicates are split into smaller ones with the aim of guaranteeing a reasonable computing time. This heuristic may affect the accuracy of the analysis.
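The prefix-suffix comparison described in this Background section can be sketched on the CPU as follows. This is only an illustration of the general idea, not the authors' GPU implementation: the function names, the default prefix length, and the use of Hamming distance with a mismatch threshold to define "nearly identical" are assumptions made for the example.

```python
from collections import defaultdict


def hamming(a, b):
    """Count mismatching positions; assumes the strings have equal length."""
    return sum(x != y for x, y in zip(a, b))


def remove_duplicates(reads, prefix_len=10, max_mismatches=1):
    """Keep one representative per group of identical or nearly identical reads.

    Step 1: reads sharing the same prefix form a cluster of potential duplicates.
    Step 2: within a cluster, a read is dropped when its suffix differs from the
    suffix of an already kept read by at most `max_mismatches` bases.
    """
    clusters = defaultdict(list)
    for read in reads:
        clusters[read[:prefix_len]].append(read)

    kept = []
    for cluster in clusters.values():
        representatives = []
        for read in cluster:
            suffix = read[prefix_len:]
            is_duplicate = any(
                len(rep) == len(read)
                and hamming(rep[prefix_len:], suffix) <= max_mismatches
                for rep in representatives
            )
            if not is_duplicate:
                representatives.append(read)
        kept.extend(representatives)
    return kept


if __name__ == "__main__":
    reads = [
        "ACGTACGTAAGG",
        "ACGTACGTAAGG",  # exact duplicate of the first read
        "ACGTACGTAAGC",  # one mismatch in the suffix
        "ACGTACGTTTTT",  # same prefix, clearly different suffix
        "TTTTACGTAAGG",  # different prefix, never compared with the others
    ]
    print(remove_duplicates(reads, prefix_len=8, max_mismatches=1))
```

Because only reads within the same prefix cluster are compared, the cost of the pairwise suffix checks depends on cluster size, which is consistent with the abstract's remark that large clusters were previously split to keep computing time reasonable.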

Results

In this work we propose GPU-DupRemoval, a new implementation of our method able to (i) cluster reads without constraints on the maximum length of the prefixes, (ii) support both single- and paired-end read libraries, and (iii) analyze large clusters of potential duplicates.

Conclusions

Due to the massive parallelization obtained by exploiting graphics cards, GPU-DupRemoval removes duplicate reads faster than other cutting-edge solutions, while outperforming most of them in terms of the amount of duplicate reads removed.

Zhang J, Misra S, Wang H, Feng WC. muBLASTP: database-indexed protein sequence search on multicore CPUs. BMC Bioinformatics 2016;17:443. [PMID: 27809763 PMCID: PMC5096327 DOI: 10.1186/s12859-016-1302-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 02/09/2016] [Accepted: 10/21/2016] [Indexed: 11/16/2022]

Suzuki S, Kakuta M, Ishida T, Akiyama Y. GPU-Acceleration of Sequence Homology Searches with Database Subsequence Clustering. PLoS One 2016;11:e0157338. [PMID: 27482905 PMCID: PMC4970815 DOI: 10.1371/journal.pone.0157338] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 09/09/2015] [Accepted: 05/27/2016] [Indexed: 11/30/2022]

Bonnici V, Busato F, Micale G, Bombieri N, Pulvirenti A, Giugno R. APPAGATO: an APproximate PArallel and stochastic GrAph querying TOol for biological networks. Bioinformatics 2016;32:2159-66. [DOI: 10.1093/bioinformatics/btw223] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 10/20/2015] [Accepted: 04/10/2016] [Indexed: 02/02/2023]

Langenkämper D, Jakobi T, Feld D, Jelonek L, Goesmann A, Nattkemper TW. Comparison of Acceleration Techniques for Selected Low-Level Bioinformatics Operations. Front Genet 2016;7:5. [PMID: 26904094 PMCID: PMC4748744 DOI: 10.3389/fgene.2016.00005] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/12/2015] [Accepted: 01/17/2016] [Indexed: 12/27/2022]

Abstract

In recent years, clock rates of modern processors have stagnated while the demand for computing power has continued to grow. This applies particularly to the fields of life sciences and bioinformatics, where new technologies keep creating rapidly growing piles of raw data with increasing speed. The number of cores per processor increased in an attempt to compensate for slight increments of clock rates. This technological shift demands changes in software development, especially in the field of high performance computing, where parallelization techniques are gaining in importance due to the pressing issue of large datasets generated by, e.g., modern genomics. This paper presents an overview of state-of-the-art manual and automatic acceleration techniques and lists some applications employing these in different areas of sequence informatics. Furthermore, we provide examples for automatic acceleration of two use cases to show typical problems and gains of transforming a serial application to a parallel one. The paper should aid the reader in deciding on a suitable technique for the problem at hand. We compare four different state-of-the-art automatic acceleration approaches (OpenMP, PluTo-SICA, PPCG, and OpenACC). Their performance as well as their applicability for selected use cases is discussed. While optimizations targeting the CPU worked better in the complex k-mer use case, optimizers for Graphics Processing Units (GPUs) performed better in the matrix multiplication example, but performance is only superior at a certain problem size due to data migration overhead. We show that automatic code parallelization is feasible with current compiler software and yields significant increases in execution speed. Automatic optimizers for CPU are mature and usually no additional manual adjustment is required. In contrast, some automatic parallelizers targeting GPUs still lack maturity and are limited to simple statements and structures.

Zhang K. Gctf: Real-time CTF determination and correction. J Struct Biol 2015;193:1-12. [PMID: 26592709 PMCID: PMC4711343 DOI: 10.1016/j.jsb.2015.11.003] [Citation(s) in RCA: 2541] [Impact Index Per Article: 282.3] [Received: 09/14/2015] [Revised: 11/08/2015] [Accepted: 11/11/2015] [Indexed: 11/26/2022]

Chen Y, Ye W, Zhang Y, Xu Y. High speed BLASTN: an accelerated MegaBLAST search tool. Nucleic Acids Res 2015;43:7762-8. [PMID: 26250111 PMCID: PMC4652774 DOI: 10.1093/nar/gkv784] [Citation(s) in RCA: 272] [Impact Index Per Article: 30.2] [Received: 01/27/2015] [Accepted: 07/22/2015] [Indexed: 11/14/2022]

Metabolic and metagenomic outcomes from early-life pulsed antibiotic treatment. Nat Commun 2015;6:7486. [PMID: 26123276 PMCID: PMC4491183 DOI: 10.1038/ncomms8486] [Citation(s) in RCA: 257] [Impact Index Per Article: 28.6] [Received: 02/25/2015] [Accepted: 05/13/2015] [Indexed: 12/12/2022]

Manconi A, Manca E, Moscatelli M, Gnocchi M, Orro A, Armano G, Milanesi L. G-CNV: A GPU-Based Tool for Preparing Data to Detect CNVs with Read-Depth Methods. Front Bioeng Biotechnol 2015;3:28. [PMID: 25806367 PMCID: PMC4354384 DOI: 10.3389/fbioe.2015.00028] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 12/10/2014] [Accepted: 02/19/2015] [Indexed: 11/23/2022]

Abstract

Copy number variations (CNVs) are the most prevalent types of structural variations (SVs) in the human genome and are involved in a wide range of common human diseases. Different computational methods have been devised to detect this type of SVs and to study how they are implicated in human diseases. Recently, computational methods based on high-throughput sequencing (HTS) are increasingly used. The majority of these methods focus on mapping short-read sequences generated from a donor against a reference genome to detect signatures distinctive of CNVs. In particular, read-depth based methods detect CNVs by analyzing genomic regions with significantly different read-depth from the other ones. The pipeline analysis of these methods consists of four main stages: (i) data preparation, (ii) data normalization, (iii) CNV regions identification, and (iv) copy number estimation. However, available tools do not support most of the operations required at the first two stages of this pipeline. Typically, they start the analysis by building the read-depth signal from pre-processed alignments. Therefore, third-party tools must be used to perform most of the preliminary operations required to build the read-depth signal. These data-intensive operations can be efficiently parallelized on graphics processing units (GPUs). In this article, we present G-CNV, a GPU-based tool devised to perform the common operations required at the first two stages of the analysis pipeline. G-CNV is able to filter low-quality read sequences, to mask low-quality nucleotides, to remove adapter sequences, to remove duplicated read sequences, to map the short-reads, to resolve multiple mapping ambiguities, to build the read-depth signal, and to normalize it. G-CNV can be efficiently used as a third-party tool able to prepare data for the subsequent read-depth signal generation and analysis. Moreover, it can also be integrated in CNV detection tools to generate read-depth signals.

CLAST: CUDA implemented large-scale alignment search tool. BMC Bioinformatics 2014;15:406. [PMID: 25495907 PMCID: PMC4271471 DOI: 10.1186/s12859-014-0406-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 06/05/2014] [Accepted: 12/02/2014] [Indexed: 01/13/2023]

Abstract

BACKGROUND

Metagenomics is a powerful methodology to study microbial communities, but it is highly dependent on nucleotide sequence similarity searching against sequence databases. Metagenomic analyses with next-generation sequencing technologies produce enormous numbers of reads from microbial communities, and many reads are derived from microbes whose genomes have not yet been sequenced, limiting the usefulness of existing sequence similarity search tools. Therefore, there is a clear need for a sequence similarity search tool that can rapidly detect weak similarity in large datasets.

RESULTS

We developed a tool, which we named CLAST (CUDA implemented large-scale alignment search tool), that enables analyses of millions of reads and thousands of reference genome sequences, and runs on NVIDIA Fermi architecture graphics processing units. CLAST has four main advantages over existing alignment tools. First, CLAST was capable of identifying sequence similarities ~80.8 times faster than BLAST and 9.6 times faster than BLAT. Second, CLAST executes global alignment as the default (local alignment is also an option), enabling CLAST to assign reads to taxonomic and functional groups based on evolutionarily distant nucleotide sequences with high accuracy. Third, CLAST does not need a preprocessed sequence database like Burrows-Wheeler Transform-based tools, and this enables CLAST to incorporate large, frequently updated sequence databases. Fourth, CLAST requires <2 GB of main memory, making it possible to run CLAST on a standard desktop computer or server node.

CONCLUSIONS

CLAST achieved very high speed (similar to the Burrows-Wheeler Transform-based Bowtie 2 for long reads) and sensitivity (equal to BLAST, BLAT, and FR-HIT) without the need for extensive database preprocessing or a specialized computing platform. Our results demonstrate that CLAST has the potential to be one of the most powerful and realistic approaches to analyze the massive amount of sequence data from next-generation sequencing technologies.

Fürstberger A, Maucher M, Kestler HA. Extended pairwise local alignment of wild card DNA/RNA sequences using dynamic programming. J STAT COMPUT SIM 2014. [DOI: 10.1080/00949655.2014.928294] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 01/08/2023]

Li Y, Chi H, Xia L, Chu X. Accelerating the scoring module of mass spectrometry-based peptide identification using GPUs. BMC Bioinformatics 2014;15:121. [PMID: 24773593 PMCID: PMC4049470 DOI: 10.1186/1471-2105-15-121] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 10/31/2012] [Accepted: 04/23/2014] [Indexed: 11/10/2022]