Chen W, Yao C, Guo Y, Wang Y, Xue Z. pmTM-align: scalable pairwise and multiple structure alignment with Apache Spark and OpenMP.
BMC Bioinformatics 2020;
21:426. [PMID:
32993484 PMCID:
PMC7526426 DOI:
10.1186/s12859-020-03757-2]
[Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/27/2019] [Accepted: 09/16/2020] [Indexed: 12/18/2022] Open
Abstract
BACKGROUND
Structure comparison can provide useful information to identify functional and evolutionary relationship between proteins. With the dramatic increase of protein structure data in the Protein Data Bank, computation time quickly becomes the bottleneck for large scale structure comparisons. To more efficiently deal with informative multiple structure alignment tasks, we propose pmTM-align, a parallel protein structure alignment approach based on mTM-align/TM-align. pmTM-align contains two stages to handle pairwise structure alignments with Spark and the phylogenetic tree-based multiple structure alignment task on a single computer with OpenMP.
RESULTS
Experiments with the SABmark dataset showed that parallelization along with data structure optimization provided considerable speedup for mTM-align. The Spark-based structure alignments achieved near ideal scalability with large datasets, and the OpenMP-based construction of the phylogenetic tree accelerated the incremental alignment of multiple structures and metrics computation by a factor of about 2-5.
CONCLUSIONS
pmTM-align enables scalable pairwise and multiple structure alignment computing and offers more timely responses for medium to large-sized input data than existing alignment tools such as mTM-align.
Collapse