Reference Citation Analysis

For: Stivala AD, Stuckey PJ, Wirth AI. Fast and accurate protein substructure searching with simulated annealing and GPUs. BMC Bioinformatics 2010;11:446. [PMID: 20813068 PMCID: PMC2944279 DOI: 10.1186/1471-2105-11-446] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Received: 05/27/2010] [Accepted: 09/03/2010] [Indexed: 11/10/2022] Open

Cited by Other Article(s)

Yang B, Bao W, Chen B. PGRNIG: novel parallel gene regulatory network identification algorithm based on GPU. Brief Funct Genomics 2022;21:441-454. [PMID: 36064791 DOI: 10.1093/bfgp/elac028] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/09/2022] [Revised: 07/30/2022] [Accepted: 08/03/2022] [Indexed: 12/14/2022] Open

Mining folded proteomes in the era of accurate structure prediction. PLoS Comput Biol 2022;18:e1009930. [PMID: 35333855 PMCID: PMC8986115 DOI: 10.1371/journal.pcbi.1009930] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/12/2021] [Revised: 04/06/2022] [Accepted: 02/16/2022] [Indexed: 01/02/2023] Open

Abstract

Protein structure fundamentally underpins the function and processes of numerous biological systems. Fold recognition algorithms offer a sensitive and robust tool to detect structural, and thereby functional, similarities between distantly related homologs. In the era of accurate structure prediction owing to advances in machine learning techniques and a wealth of experimentally determined structures, previously curated sequence databases have become a rich source of biological information. Here, we use bioinformatic fold recognition algorithms to scan the entire AlphaFold structure database to identify novel protein family members, infer function and group predicted protein structures. As an example of the utility of this approach, we identify novel, previously unknown members of various pore-forming protein families, including MACPFs, GSDMs and aerolysin-like proteins.

Virtually every cellular process in all organisms on Earth is driven by molecular nano-machines known as proteins. The diverse functions of proteins are the result of the unique three-dimensional shape adopted by a given protein molecule. It is therefore important to determine the shape of a given protein, which, unlike DNA and our genes, cannot be known from its sequence alone. Since two proteins with similar shapes typically have a similar function, knowing a protein shape provides crucial clues about its function. By virtue of decades of experimental work and advances in artificial intelligence, this complex shape can now be computationally predicted for any protein whose composition is known. Scientists have used these and other methods to produce enormous libraries of protein shapes consisting of nearly a million unique entries. However, these libraries are too large and too complex for researchers to ‘read’. We use shape-comparison algorithms to carefully check these shape-libraries to gain insight into the potential function and biological role of previously unknown proteins. Furthermore, we identified new members of protein families using this technique. We show that shape-matching algorithms and computationally generated shape-libraries can be used effectively together to yield new insights and expedite scientific endeavours.

Dulcey CE, López de Los Santos Y, Létourneau M, Déziel E, Doucet N. Semi-rational evolution of the 3-(3-hydroxyalkanoyloxy)alkanoate (HAA) synthase RhlA to improve rhamnolipid production in Pseudomonas aeruginosa and Burkholderia glumae. FEBS J 2019;286:4036-4059. [PMID: 31177633 DOI: 10.1111/febs.14954] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 11/07/2018] [Revised: 04/12/2019] [Accepted: 06/06/2019] [Indexed: 12/15/2022]

High-throughput and scalable protein function identification with Hadoop and Map-only pattern of the MapReduce processing model. Knowl Inf Syst 2018. [DOI: 10.1007/s10115-018-1245-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/28/2022]

Warris S, Timal NRN, Kempenaar M, Poortinga AM, van de Geest H, Varbanescu AL, Nap JP. pyPaSWAS: Python-based multi-core CPU and GPU sequence alignment. PLoS One 2018;13:e0190279. [PMID: 29293576 PMCID: PMC5749749 DOI: 10.1371/journal.pone.0190279] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 07/22/2017] [Accepted: 12/11/2017] [Indexed: 11/18/2022] Open

Nobile MS, Cazzaniga P, Tangherloni A, Besozzi D. Graphics processing units in bioinformatics, computational biology and systems biology. Brief Bioinform 2017;18:870-885. [PMID: 27402792 PMCID: PMC5862309 DOI: 10.1093/bib/bbw058] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 03/30/2016] [Indexed: 01/18/2023] Open

Yan X, Li J, Gu Q, Xu J. gWEGA: GPU-accelerated WEGA for molecular superposition and shape comparison. J Comput Chem 2014;35:1122-30. [PMID: 24729358 DOI: 10.1002/jcc.23603] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 01/03/2014] [Revised: 03/06/2014] [Accepted: 03/14/2014] [Indexed: 01/13/2023]

Mrozek D, Brożek M, Małysiak-Mrozek B. Parallel implementation of 3D protein structure similarity searches using a GPU and the CUDA. J Mol Model 2014;20:2067. [PMID: 24481593 PMCID: PMC3936136 DOI: 10.1007/s00894-014-2067-1] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 08/07/2013] [Accepted: 10/11/2013] [Indexed: 01/16/2023]

Going over the three dimensional protein structure similarity problem. Artif Intell Rev 2013. [DOI: 10.1007/s10462-013-9416-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022]

Kirshner DA, Nilmeier JP, Lightstone FC. Catalytic site identification--a web server to identify catalytic site structural matches throughout PDB. Nucleic Acids Res 2013;41:W256-65. [PMID: 23680785 PMCID: PMC3692059 DOI: 10.1093/nar/gkt403] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Indexed: 01/11/2023] Open

Park S, Shin SY, Hwang KB. CFMDS: CUDA-based fast multidimensional scaling for genome-scale data. BMC Bioinformatics 2013;13 Suppl 17:S23. [PMID: 23282007 PMCID: PMC3521231 DOI: 10.1186/1471-2105-13-s17-s23] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/18/2022] Open

Wang JJY, Bensmail H, Gao X. Multiple graph regularized protein domain ranking. BMC Bioinformatics 2012;13:307. [PMID: 23157331 PMCID: PMC3583823 DOI: 10.1186/1471-2105-13-307] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Received: 05/22/2012] [Accepted: 10/29/2012] [Indexed: 11/10/2022] Open

GSA: a GPU-accelerated structure similarity algorithm and its application in progressive virtual screening. Mol Divers 2012;16:759-69. [DOI: 10.1007/s11030-012-9403-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 06/20/2012] [Accepted: 10/08/2012] [Indexed: 12/21/2022]

Ho HK, Gange G, Kuiper MJ, Ramamohanarao K. BetaSearch: a new method for querying β-residue motifs. BMC Res Notes 2012;5:391. [PMID: 22839199 PMCID: PMC3532365 DOI: 10.1186/1756-0500-5-391] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/10/2012] [Accepted: 06/15/2012] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Searching for structural motifs across known protein structures can be useful for identifying unrelated proteins with similar function and characterising secondary structures such as β-sheets. This is infeasible using conventional sequence alignment because linear protein sequences do not contain spatial information. β-residue motifs are β-sheet substructures that can be represented as graphs and queried using existing graph indexing methods; however, these approaches are designed for general graphs, do not incorporate the inherent structural constraints of β-sheets, and require computationally expensive filtering and verification procedures. 3D substructure search methods, on the other hand, allow β-residue motifs to be queried in a three-dimensional context but at significant computational cost.

FINDINGS

We developed a new method for querying β-residue motifs, called BetaSearch, which leverages the natural planar constraints of β-sheets by indexing them as 2D matrices, thus avoiding much of the computational complexity involved in structural and graph querying. BetaSearch exhibits faster filtering, verification, and overall query time than existing graph indexing approaches whilst producing comparable index sizes. Compared to 3D substructure search methods, BetaSearch achieves 33-fold and 240-fold speedups over index-based and pairwise alignment-based approaches, respectively. Furthermore, we have presented case studies to demonstrate its capability of motif matching in sequentially dissimilar proteins and described a method for using BetaSearch to predict β-strand pairing.

CONCLUSIONS

We have demonstrated that BetaSearch is a fast method for querying substructure motifs. The improvements in speed over existing approaches make it useful for efficiently performing high-volume exploratory querying of possible protein substructural motifs or conformations. BetaSearch was used to identify a nearly identical β-residue motif between an entirely synthetic (Top7) and a naturally-occurring protein (Charcot-Leyden crystal protein), as well as identifying structural similarities between biotin-binding domains of avidin, streptavidin and the lipocalin gamma subunit of human C8.

Anand P, Yeturu K, Chandra N. PocketAnnotate: towards site-based function annotation. Nucleic Acids Res 2012;40:W400-8. [PMID: 22618878 PMCID: PMC3394344 DOI: 10.1093/nar/gks421] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Indexed: 11/14/2022] Open

Hirschfeld JA, Lustfeld H. Finding stable minima using a nudged-elastic-band-based optimization scheme. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2012;85:056709. [PMID: 23004905 DOI: 10.1103/physreve.85.056709] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/13/2012] [Revised: 05/08/2012] [Indexed: 06/01/2023]

Pang B, Zhao N, Becchi M, Korkin D, Shyu CR. Accelerating large-scale protein structure alignments with graphics processing units. BMC Res Notes 2012;5:116. [PMID: 22357132 PMCID: PMC3309952 DOI: 10.1186/1756-0500-5-116] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 09/01/2011] [Accepted: 02/22/2012] [Indexed: 11/24/2022] Open

Liu P, Agrafiotis DK, Rassokhin DN, Yang E. Accelerating Chemical Database Searching Using Graphics Processing Units. J Chem Inf Model 2011;51:1807-16. [DOI: 10.1021/ci200164g] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 01/01/2023]

Farber RM. Topical perspective on massive threading and parallelism. J Mol Graph Model 2011;30:82-9. [PMID: 21764615 DOI: 10.1016/j.jmgm.2011.06.007] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 05/09/2011] [Revised: 06/15/2011] [Accepted: 06/17/2011] [Indexed: 10/18/2022]

Abstract

Unquestionably, computer architectures have undergone a recent and noteworthy paradigm shift that now delivers multi- and many-core systems with tens to many thousands of concurrent hardware processing elements per workstation or supercomputer node. GPGPU (General Purpose Graphics Processor Unit) technology in particular has attracted significant attention as new software development capabilities, namely CUDA (Compute Unified Device Architecture) and OpenCL™, have made it possible for students as well as small and large research organizations to achieve excellent speedup for many applications over more conventional computing architectures. The current scientific literature reflects this shift with numerous examples of GPGPU applications that have achieved one, two, and in some special cases, three orders of magnitude increased computational performance through the use of massive threading to exploit parallelism. Multi-core architectures are also evolving quickly to exploit both massive threading and massive parallelism, such as the 1.3-million-thread Blue Waters supercomputer. The challenge confronting scientists in planning future experimental and theoretical research efforts--be they individual efforts with one computer or collaborative efforts proposing to use the largest supercomputers in the world--is how to capitalize on these new massively threaded computational architectures, especially as not all computational problems will scale to massive parallelism. In particular, the costs associated with restructuring software (and potentially redesigning algorithms) to exploit the parallelism of these multi- and many-threaded machines must be considered along with application scalability and lifespan. This perspective is an overview of the current state of threading and parallelism with some insight into the future.
