Han GS, Li Q, Li Y. Nucleosome positioning based on DNA sequence embedding and deep learning.
BMC Genomics 2022;
23:301. [PMID:
35418074 PMCID:
PMC9006412 DOI:
10.1186/s12864-022-08508-6]
[Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/28/2022] [Accepted: 03/28/2022] [Indexed: 11/25/2022] Open
Abstract
Background
Nucleosome positioning is the precise determination of the location of nucleosomes on DNA sequence. With the continuous advancement of biotechnology and computer technology, biological data is showing explosive growth. It is of practical significance to develop an efficient nucleosome positioning algorithm. Indeed, convolutional neural networks (CNN) can capture local features in DNA sequences, but ignore the order of bases. While the bidirectional recurrent neural network can make up for CNN's shortcomings in this regard and extract the long-term dependent features of DNA sequence.
Results
In this work, we use word vectors to represent DNA sequences and propose three new deep learning models for nucleosome positioning, and the integrative model NP_CBiR reaches a better prediction performance. The overall accuracies of NP_CBiR on H. sapiens, C. elegans, and D. melanogaster datasets are 86.18%, 89.39%, and 85.55% respectively.
Conclusions
Benefited by different network structures, NP_CBiR can effectively extract local features and bases order features of DNA sequences, thus can be considered as a complementary tool for nucleosome positioning.
Collapse