Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Cheng X, Wang J, Li Q, Liu T. BiLSTM-5mC: A Bidirectional Long Short-Term Memory-Based Approach for Predicting 5-Methylcytosine Sites in Genome-Wide DNA Promoters. Molecules 2021;26:7414. [PMID: 34946497 PMCID: PMC8704614 DOI: 10.3390/molecules26247414] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 10/30/2021] [Accepted: 12/04/2021] [Indexed: 12/04/2022] [Open Access]

Cited by Other Article(s)

Hou A, Luo H, Liu H, Luo L, Ding P. Multi-scale DNA language model improves 6 mA binding sites prediction. Comput Biol Chem 2024;112:108129. [PMID: 39067351 DOI: 10.1016/j.compbiolchem.2024.108129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2024] [Revised: 06/05/2024] [Accepted: 06/10/2024] [Indexed: 07/30/2024]

Yi M, Zhou F, Deng Y. STM-ac4C: a hybrid model for identification of N4-acetylcytidine (ac4C) in human mRNA based on selective kernel convolution, temporal convolutional network, and multi-head self-attention. Front Genet 2024;15:1408688. [PMID: 38873109 PMCID: PMC11169723 DOI: 10.3389/fgene.2024.1408688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Accepted: 05/14/2024] [Indexed: 06/15/2024] [Open Access]

Abstract

N4-acetylcytidine (ac4C) is a chemical modification in mRNAs that alters the structure and function of mRNA by adding an acetyl group to the N4 position of cytosine. Researchers have shown that ac4C is closely associated with the occurrence and development of various cancers. Therefore, accurate prediction of ac4C modification sites on human mRNA is crucial for revealing its role in diseases and developing new diagnostic and therapeutic strategies. However, existing deep learning models still have limitations in prediction accuracy and generalization ability, which restrict their effectiveness in handling complex biological sequence data. This paper introduces a deep learning-based model, STM-ac4C, for predicting ac4C modification sites on human mRNA. The model combines the advantages of selective kernel convolution, temporal convolutional networks, and multi-head self-attention mechanisms to effectively extract and integrate multi-level features of RNA sequences, thereby achieving high-precision prediction of ac4C sites. On the independent test dataset, STM-ac4C showed improvements of 1.81%, 3.5%, and 0.37% in accuracy, Matthews correlation coefficient, and area under the curve, respectively, compared with existing state-of-the-art methods. Moreover, its performance on additional balanced and imbalanced datasets also confirmed the model's robustness and generalization ability. Various experimental results indicate that STM-ac4C outperforms existing methods in predictive performance. In summary, STM-ac4C excels in predicting ac4C modification sites on human mRNA, providing a powerful new tool for a deeper understanding of the biological significance of mRNA modifications and cancer treatment. Additionally, the model reveals key sequence features that influence the prediction of ac4C sites through sequence region impact analysis, offering new perspectives for future research.
The source code and experimental data are available at https://github.com/ymy12341/STM-ac4C.

Yu X, Ren J, Long H, Zeng R, Zhang G, Bilal A, Cui Y. iDNA-OpenPrompt: OpenPrompt learning model for identifying DNA methylation. Front Genet 2024;15:1377285. [PMID: 38689652 PMCID: PMC11058834 DOI: 10.3389/fgene.2024.1377285] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2024] [Accepted: 03/07/2024] [Indexed: 05/02/2024] [Open Access]

Jia J, Lei R, Qin L, Wei X. i5mC-DCGA: an improved hybrid network framework based on the CBAM attention mechanism for identifying promoter 5mC sites. BMC Genomics 2024;25:242. [PMID: 38443802 PMCID: PMC10913688 DOI: 10.1186/s12864-024-10154-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Accepted: 02/22/2024] [Indexed: 03/07/2024] [Open Access]

Jia J, Deng Y, Yi M, Zhu Y. 4mCPred-GSIMP: Predicting DNA N4-methylcytosine sites in the mouse genome with multi-scale adaptive features extraction and fusion. Math Biosci Eng 2024;21:253-271. [PMID: 38303422 DOI: 10.3934/mbe.2024012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2024]

Wang S, Liu Y, Liu Y, Zhang Y, Zhu X. BERT-5mC: an interpretable model for predicting 5-methylcytosine sites of DNA based on BERT. PeerJ 2023;11:e16600. [PMID: 38089911 PMCID: PMC10712318 DOI: 10.7717/peerj.16600] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Accepted: 11/15/2023] [Indexed: 12/18/2023] [Open Access]

Zhuo L, Wang R, Fu X, Yao X. StableDNAm: towards a stable and efficient model for predicting DNA methylation based on adaptive feature correction learning. BMC Genomics 2023;24:742. [PMID: 38053026 DOI: 10.1186/s12864-023-09802-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 11/11/2023] [Indexed: 12/07/2023] [Open Access]

Jia J, Cao X, Wei Z. DLC-ac4C: A Prediction Model for N4-acetylcytidine Sites in Human mRNA Based on DenseNet and Bidirectional LSTM Methods. Curr Genomics 2023;24:171-186. [PMID: 38178985 PMCID: PMC10761336 DOI: 10.2174/0113892029270191231013111911] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Revised: 09/13/2023] [Accepted: 09/21/2023] [Indexed: 01/06/2024] [Open Access]

Abstract

Introduction

N4-acetylcytidine (ac4C) is a highly conserved nucleoside modification that is essential for the regulation of immune functions in organisms. Currently, ac4C is identified primarily by biological methods, which can be time-consuming and labor-intensive. By contrast, computational methods have become a more efficient way to classify and predict ac4C sites accurately.

Aim

To the best of our knowledge, although there are several computational methods for ac4C locus prediction, the performance of the models they constructed is poor, and the network structure they used is relatively simple and suffers from the disadvantage of network degradation. This study aims to improve these limitations by proposing a predictive model based on integrated deep learning to better help identify ac4C sites.

Methods

In this study, we propose a new integrated deep learning prediction framework, DLC-ac4C. First, we encode RNA sequences based on three feature encoding schemes, namely C2 encoding, nucleotide chemical property (NCP) encoding, and nucleotide density (ND) encoding. Second, one-dimensional convolutional layers and densely connected convolutional networks (DenseNet) are used to learn local features, and bi-directional long short-term memory networks (Bi-LSTM) are used to learn global features. Third, a channel attention mechanism is introduced to determine the importance of sequence characteristics. Finally, a homomorphic integration strategy is used to limit the generalization error of the model, which further improves the performance of the model.
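Two of the feature encodings named above, NCP and ND, can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the NCP triple used here (ring structure, functional group, hydrogen bonding) and the prefix-frequency definition of ND are the conventions commonly used in the literature and are assumed to match the paper; the function names are hypothetical. C2 encoding is omitted because the abstract does not define it.

```python
# Assumed NCP convention: (purine?, amino group?, weak hydrogen bonding?)
NCP = {
    "A": (1, 1, 1),
    "C": (0, 1, 0),
    "G": (1, 0, 0),
    "U": (0, 0, 1),
    "T": (0, 0, 1),  # treat T like U so DNA-style input also works
}


def ncp_encode(seq):
    """Map each nucleotide to its 3-bit chemical-property vector."""
    return [NCP[base] for base in seq.upper()]


def nd_encode(seq):
    """Nucleotide density: for position i (1-based), the fraction of the
    first i positions occupied by the nucleotide found at position i."""
    counts = {}
    densities = []
    for i, base in enumerate(seq.upper(), start=1):
        counts[base] = counts.get(base, 0) + 1
        densities.append(counts[base] / i)
    return densities


def ncp_nd_encode(seq):
    """Concatenate NCP (3 values) and ND (1 value) per position."""
    return [list(n) + [d] for n, d in zip(ncp_encode(seq), nd_encode(seq))]


if __name__ == "__main__":
    for row in ncp_nd_encode("ACAC"):
        print(row)
```

Each position becomes a 4-dimensional vector, so a sequence of length L yields an L x 4 matrix that a convolutional layer could consume.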

Results

On the independent test data, the DLC-ac4C model achieved a sensitivity (Sn) of 86.23%, specificity (Sp) of 79.71%, accuracy (Acc) of 82.97%, Matthews correlation coefficient (MCC) of 66.08%, and area under the curve (AUC) of 90.42%, which was significantly better than the prediction accuracy of the existing methods.

Conclusion

Our model not only combines DenseNet and Bi-LSTM, but also uses the channel attention mechanism to better capture hidden information features from a sequence perspective, and can identify ac4C sites more effectively.

Wang S, Wang L, Li F, Bai F. DeepSA: a deep-learning driven predictor of compound synthesis accessibility. J Cheminform 2023;15:103. [PMID: 37919805 PMCID: PMC10621138 DOI: 10.1186/s13321-023-00771-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2023] [Accepted: 10/20/2023] [Indexed: 11/04/2023] [Open Access]

Wang Z, Xiang S, Zhou C, Xu Q. DeepMethylation: a deep learning based framework with GloVe and Transformer encoder for DNA methylation prediction. PeerJ 2023;11:e16125. [PMID: 37780374 PMCID: PMC10538282 DOI: 10.7717/peerj.16125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Accepted: 08/27/2023] [Indexed: 10/03/2023] [Open Access]

Yassi M, Chatterjee A, Parry M. Application of deep learning in cancer epigenetics through DNA methylation analysis. Brief Bioinform 2023;24:bbad411. [PMID: 37985455 PMCID: PMC10661960 DOI: 10.1093/bib/bbad411] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/22/2023] [Revised: 10/08/2023] [Accepted: 10/25/2023] [Indexed: 11/22/2023] [Open Access]

Hu W, Guan L, Li M. Prediction of DNA Methylation based on Multi-dimensional feature encoding and double convolutional fully connected convolutional neural network. PLoS Comput Biol 2023;19:e1011370. [PMID: 37639434 PMCID: PMC10461834 DOI: 10.1371/journal.pcbi.1011370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Accepted: 07/18/2023] [Indexed: 08/31/2023] [Open Access]

Alakuş TB. A Novel Repetition Frequency-Based DNA Encoding Scheme to Predict Human and Mouse DNA Enhancers with Deep Learning. Biomimetics (Basel) 2023;8:218. [PMID: 37366813 DOI: 10.3390/biomimetics8020218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2023] [Revised: 05/18/2023] [Accepted: 05/22/2023] [Indexed: 06/28/2023] [Open Access]

Abstract

Recent studies have shown that DNA enhancers play an important role in the regulation of gene expression. They are involved in important biological processes such as development, homeostasis, and embryogenesis. However, experimental identification of DNA enhancers is time-consuming and costly because it requires laboratory work. Researchers have therefore begun applying computational deep learning algorithms to this problem. Yet the inconsistent and often poor prediction performance of computational approaches across cell lines has prompted further investigation of these approaches as well. In this study, a novel DNA encoding scheme was proposed to address these problems, and DNA enhancers were predicted with BiLSTM. The study consisted of four stages applied to two scenarios. In the first stage, DNA enhancer data were obtained. In the second stage, DNA sequences were converted to numerical representations by both the proposed encoding scheme and various existing DNA encoding schemes, including EIIP, integer number, and atomic number. In the third stage, the BiLSTM model was designed, and the data were classified. In the final stage, the performance of the DNA encoding schemes was evaluated by accuracy, precision, recall, F1-score, CSI, MCC, G-mean, Kappa coefficient, and AUC scores. In the first scenario, it was determined whether the DNA enhancers belonged to humans or mice. The highest performance was achieved with the proposed DNA encoding scheme, with an accuracy of 92.16% and an AUC score of 0.85. The closest accuracy was obtained with the EIIP DNA encoding scheme at 89.14%, with an AUC score of 0.87. Among the remaining DNA encoding schemes, the atomic number scheme reached an accuracy of 86.61%, while the integer scheme dropped to 76.96%. The AUC values of these schemes were 0.84 and 0.82, respectively. In the second scenario, it was determined whether a DNA enhancer was present and, if so, to which species it belonged. In this scenario, the highest accuracy was again obtained with the proposed DNA encoding scheme (84.59%), with an AUC score of 0.92. The EIIP and integer DNA encoding schemes showed accuracy scores of 77.80% and 73.68%, respectively, while their AUC scores were close to 0.90. The least effective prediction was obtained with the atomic number scheme, with an accuracy of 68.27% and an AUC score of 0.81. Overall, the proposed DNA encoding scheme was successful and effective in predicting DNA enhancers.
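The baseline numeric encodings mentioned in this abstract are simple lookup tables. The sketch below illustrates EIIP and atomic-number encoding using the values commonly cited in the genomic signal processing literature; whether the paper used exactly these tables is an assumption, and the function name is hypothetical. The integer scheme is omitted because several incompatible conventions exist, and the proposed repetition frequency-based scheme is not described in enough detail here to reproduce.

```python
# Commonly cited electron-ion interaction pseudopotential (EIIP) values.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}

# Commonly cited atomic-number mapping for nucleotides.
ATOMIC_NUMBER = {"A": 70, "C": 58, "G": 78, "T": 66}


def encode_numeric(seq, table):
    """Replace each nucleotide with its value from the lookup table.

    Raises KeyError on symbols outside A/C/G/T (for example N), so
    ambiguous bases must be filtered or handled by the caller.
    """
    return [table[base] for base in seq.upper()]


if __name__ == "__main__":
    print(encode_numeric("ACGT", EIIP))
    print(encode_numeric("ACGT", ATOMIC_NUMBER))
```

The resulting one-dimensional numeric signal is what a sequence model such as a BiLSTM would receive as input, typically after padding or truncating to a fixed length.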

Jia J, Qin L, Lei R. DGA-5mC: A 5-methylcytosine site prediction model based on an improved DenseNet and bidirectional GRU method. Math Biosci Eng 2023;20:9759-9780. [PMID: 37322910 DOI: 10.3934/mbe.2023428] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/17/2023]

Abstract

5-Methylcytosine (5mC) in the promoter region plays a significant role in biological processes and diseases. Researchers often use high-throughput sequencing technologies and traditional machine learning algorithms to detect 5mC modification sites. However, high-throughput identification is laborious, time-consuming, and expensive, and the traditional machine learning algorithms have limited capability. Therefore, there is an urgent need for a more efficient computational approach to replace these methods. Because deep learning algorithms offer powerful computational advantages, we constructed a novel prediction model, called DGA-5mC, to identify 5mC modification sites in promoter regions using a deep learning algorithm based on an improved densely connected convolutional network (DenseNet) and a bidirectional GRU. Furthermore, we added a self-attention module to evaluate the importance of various 5mC features. The deep learning-based DGA-5mC model automatically handles highly unbalanced proportions of positive and negative samples, highlighting the model's reliability and superiority. So far as the authors are aware, this is the first time that the combination of an improved DenseNet and bidirectional GRU has been used to predict 5mC modification sites in promoter regions. Using a combination of one-hot coding, nucleotide chemical property coding, and nucleotide density coding, the DGA-5mC model achieved a sensitivity of 90.19%, specificity of 92.74%, accuracy of 92.54%, Matthews correlation coefficient (MCC) of 64.64%, area under the curve of 96.43%, and Gmean of 91.46% on the independent test dataset. In addition, all datasets and source codes for the DGA-5mC model are freely accessible at https://github.com/lulukoss/DGA-5mC.

Zeng W, Gautam A, Huson DH. MuLan-Methyl-multiple transformer-based language models for accurate DNA methylation prediction. Gigascience 2022;12:giad054. [PMID: 37489753 PMCID: PMC10367125 DOI: 10.1093/gigascience/giad054] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 02/19/2023] [Revised: 05/09/2023] [Accepted: 07/18/2023] [Indexed: 07/26/2023] [Open Access]

Zhou J, Wang X, Wei Z, Meng J, Huang D. 4acCPred: Weakly supervised prediction of N4-acetyldeoxycytosine DNA modification from sequences. Mol Ther Nucleic Acids 2022;30:337-345. [DOI: 10.1016/j.omtn.2022.10.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2022] [Accepted: 10/12/2022] [Indexed: 11/06/2022]

Jin J, Yu Y, Wang R, Zeng X, Pang C, Jiang Y, Li Z, Dai Y, Su R, Zou Q, Nakai K, Wei L. iDNA-ABF: multi-scale deep biological language learning model for the interpretable prediction of DNA methylations. Genome Biol 2022;23:219. [PMID: 36253864 PMCID: PMC9575223 DOI: 10.1186/s13059-022-02780-1] [Citation(s) in RCA: 44] [Impact Index Per Article: 22.0] [Received: 02/22/2022] [Accepted: 10/03/2022] [Indexed: 11/29/2022] [Open Access]

Affiliation(s)

Junru Jin: School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China
Yingying Yu: School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China
Ruheng Wang: School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China
Xin Zeng: Human Genome Center, The Institute of Medical Science, The University of Tokyo, Tokyo, 108-8639, Japan; Department of Computational Biology and Medical Sciences, The University of Tokyo, Kashiwa, 277-8563, Japan
Chao Pang: School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China
Yi Jiang: School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China
Zhongshen Li: School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China
Yutong Dai: Human Genome Center, The Institute of Medical Science, The University of Tokyo, Tokyo, 108-8639, Japan; Department of Computational Biology and Medical Sciences, The University of Tokyo, Kashiwa, 277-8563, Japan
Ran Su: College of Intelligence and Computing, Tianjin University, Tianjin, 300350, China
Quan Zou: Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, 610054, China
Kenta Nakai: Human Genome Center, The Institute of Medical Science, The University of Tokyo, Tokyo, 108-8639, Japan; Department of Computational Biology and Medical Sciences, The University of Tokyo, Kashiwa, 277-8563, Japan
Leyi Wei: School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China

Zhou Y, Jia E, Shi H, Liu Z, Sheng Y, Pan M, Tu J, Ge Q, Lu Z. Prediction of Time-Series Transcriptomic Gene Expression Based on Long Short-Term Memory with Empirical Mode Decomposition. Int J Mol Sci 2022;23:7532. [PMID: 35886880 PMCID: PMC9322773 DOI: 10.3390/ijms23147532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2022] [Revised: 07/03/2022] [Accepted: 07/04/2022] [Indexed: 02/01/2023] [Open Access]