Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Cited by Other Article(s)

González-Iglesias A, Arcas A, Domingo-Muelas A, Mancini E, Galcerán J, Valcárcel J, Fariñas I, Nieto MA. Intron detention tightly regulates the stemness/differentiation switch in the adult neurogenic niche. Nat Commun 2024;15:2837. [PMID: 38565566 PMCID: PMC10987655 DOI: 10.1038/s41467-024-47092-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2021] [Accepted: 03/13/2024] [Indexed: 04/04/2024]

Affiliation(s)

Ainara González-Iglesias Instituto de Neurociencias (CSIC-UMH), Sant Joan d'Alacant, 03550, Spain
Aida Arcas Instituto de Neurociencias (CSIC-UMH), Sant Joan d'Alacant, 03550, Spain Department of Gene Therapy and Regulation of Gene Expression, Center for Applied Medical Research, University of Navarra, Pamplona, 31008, Spain
Ana Domingo-Muelas Departamento de Biología Celular, Biología Funcional y Antropología Física and Instituto de Biotecnología y Biomedicina, Universidad de Valencia, Burjassot, 46100, Spain Centro de Investigación Biomédica en Red sobre Enfermedades Neurodegenerativas (CIBERNED), 28029, Madrid, Spain Carlos Simon Foundation, 46980, Paterna, Valencia, Spain Department of Cell and Developmental Biology, Institute for Regenerative Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA Igenomix Foundation, 46980, Paterna, Valencia, Spain
Estefania Mancini Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, 08003, Spain
Joan Galcerán Instituto de Neurociencias (CSIC-UMH), Sant Joan d'Alacant, 03550, Spain Centro de Investigación Biomédica en Red sobre Enfermedades Raras (CIBERER), 28029, Madrid, Spain
Juan Valcárcel Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, 08003, Spain Universitat Pompeu Fabra (UPF), 08003, Barcelona, Spain Institució Catalana de Recerca i Estudis Avançats (ICREA), 08010, Barcelona, Spain
Isabel Fariñas Departamento de Biología Celular, Biología Funcional y Antropología Física and Instituto de Biotecnología y Biomedicina, Universidad de Valencia, Burjassot, 46100, Spain Centro de Investigación Biomédica en Red sobre Enfermedades Neurodegenerativas (CIBERNED), 28029, Madrid, Spain
M Angela Nieto Instituto de Neurociencias (CSIC-UMH), Sant Joan d'Alacant, 03550, Spain. Centro de Investigación Biomédica en Red sobre Enfermedades Raras (CIBERER), 28029, Madrid, Spain.

Wang M, Ali H, Xu Y, Xie J, Xu S. BiPSTP: Sequence feature encoding method for identifying different RNA modifications with bidirectional position-specific trinucleotides propensities. J Biol Chem 2024;300:107140. [PMID: 38447795 PMCID: PMC10997841 DOI: 10.1016/j.jbc.2024.107140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 02/17/2024] [Accepted: 02/25/2024] [Indexed: 03/08/2024]

Abstract

RNA modification, a posttranscriptional regulatory mechanism, significantly influences RNA biogenesis and function. The accurate identification of modification sites is paramount for investigating their biological implications. Methods for encoding RNA sequences into numerical data play a crucial role in developing robust models for predicting modification sites. However, existing techniques suffer from limitations, including inadequate information representation, challenges in effectively integrating positional and sequential information, and the generation of irrelevant or redundant features when combining multiple approaches. These deficiencies hinder the effectiveness of machine learning models in addressing the performance challenges associated with predicting RNA modification sites. Here, we introduce a novel RNA sequence feature representation method, named BiPSTP, which utilizes bidirectional trinucleotide position-specific propensities. We employ the parameter ξ to denote the interval between the current nucleotide and its adjacent forward or backward dinucleotide, enabling the extraction of positional and sequential information from RNA sequences. Leveraging the BiPSTP method, we have developed the prediction model mRNAPred using a support vector machine classifier to identify multiple types of RNA modification sites. We evaluate the performance of our BiPSTP method and mRNAPred model across 12 distinct RNA modification types. Our experimental results demonstrate the superiority of the mRNAPred model compared to state-of-the-art models in the domain of RNA modification site identification. Importantly, our BiPSTP method enhances the robustness and generalization performance of prediction models. Notably, it can be applied to feature extraction from DNA sequences to predict other biological modification sites.
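
As a rough illustration of the encoding idea in this abstract, the following Python sketch builds gapped trinucleotides from the current nucleotide and a dinucleotide lying ξ positions ahead of or behind it, then scores each position by the difference in frequency between positive and negative training sequences. The scoring rule (positive frequency minus negative frequency), the zero padding at sequence ends, and all function names are assumptions made for this example; the abstract does not specify them, so this should not be read as the authors' implementation.

```python
from collections import Counter


def gapped_trinucleotides(seq, xi, direction):
    """One gapped trinucleotide per position; None where it would run off the sequence."""
    n = len(seq)
    out = []
    for i in range(n):
        if direction == "forward":
            j = i + xi + 1  # start of the dinucleotide xi positions ahead
            tri = seq[i] + seq[j:j + 2] if j + 2 <= n else None
        else:
            j = i - xi - 2  # start of the dinucleotide xi positions behind
            tri = seq[j:j + 2] + seq[i] if j >= 0 else None
        out.append(tri)
    return out


def position_frequencies(seqs, xi, direction):
    """Per-position relative frequency of each gapped trinucleotide (equal-length seqs)."""
    counts = [Counter() for _ in range(len(seqs[0]))]
    for s in seqs:
        for i, tri in enumerate(gapped_trinucleotides(s, xi, direction)):
            if tri is not None:
                counts[i][tri] += 1
    return [{t: c / len(seqs) for t, c in pos.items()} for pos in counts]


def fit_propensities(pos_seqs, neg_seqs, xi):
    """Store positive and negative frequency tables for both directions."""
    return {
        d: (position_frequencies(pos_seqs, xi, d), position_frequencies(neg_seqs, xi, d))
        for d in ("forward", "backward")
    }


def encode(seq, model, xi):
    """Feature vector: forward propensities followed by backward propensities."""
    features = []
    for d in ("forward", "backward"):
        f_pos, f_neg = model[d]
        for i, tri in enumerate(gapped_trinucleotides(seq, xi, d)):
            if tri is None:
                features.append(0.0)  # assumed padding value at sequence ends
            else:
                features.append(f_pos[i].get(tri, 0.0) - f_neg[i].get(tri, 0.0))
    return features


model = fit_propensities(["ACGUA", "ACGUA"], ["UUUUU", "UUUUU"], xi=0)
print(encode("ACGUA", model, xi=0))
```

A vector of this kind would then be passed to a classifier such as a support vector machine; varying ξ yields additional feature groups that could be concatenated.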

Meng Q, Schatten H, Zhou Q, Chen J. Crosstalk between m6A and coding/non-coding RNA in cancer and detection methods of m6A modification residues. Aging (Albany NY) 2023;15:6577-6619. [PMID: 37437245 PMCID: PMC10373953 DOI: 10.18632/aging.204836] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/01/2023] [Accepted: 06/15/2023] [Indexed: 07/14/2023]

Cheng J, Li G, Wang W, Stovall DB, Sui G, Li D. Circular RNAs with protein-coding ability in oncogenesis. Biochim Biophys Acta Rev Cancer 2023;1878:188909. [PMID: 37172651 DOI: 10.1016/j.bbcan.2023.188909] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 02/26/2023] [Revised: 05/08/2023] [Accepted: 05/08/2023] [Indexed: 05/15/2023]

Acera Mateos P, Zhou Y, Zarnack K, Eyras E. Concepts and methods for transcriptome-wide prediction of chemical messenger RNA modifications with machine learning. Brief Bioinform 2023;24:7150742. [PMID: 37139545 DOI: 10.1093/bib/bbad163] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/06/2023] [Revised: 03/03/2023] [Indexed: 05/05/2023]

Wang R, Chung CR, Huang HD, Lee TY. Identification of species-specific RNA N6-methyladinosine modification sites from RNA sequences. Brief Bioinform 2023;24:7008797. [PMID: 36715277 DOI: 10.1093/bib/bbac573] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2022] [Revised: 11/11/2022] [Accepted: 11/24/2022] [Indexed: 01/31/2023]

M6A-BERT-Stacking: A Tissue-Specific Predictor for Identifying RNA N6-Methyladenosine Sites Based on BERT and Stacking Strategy. Symmetry (Basel) 2023. [DOI: 10.3390/sym15030731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/17/2023]

Taguchi YH. Bioinformatic tools for epitranscriptomics. Am J Physiol Cell Physiol 2023;324:C447-C457. [PMID: 36468841 DOI: 10.1152/ajpcell.00437.2022] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 12/12/2022]

Zhang S, Wang J, Li X, Liang Y. M6A-GSMS: Computational identification of N⁶-methyladenosine sites with GBDT and stacking learning in multiple species. J Biomol Struct Dyn 2022;40:12380-12391. [PMID: 34459713 DOI: 10.1080/07391102.2021.1970628] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 12/24/2022]

Zou J, Liu H, Tan W, Chen YQ, Dong J, Bai SY, Wu ZX, Zeng Y. Dynamic regulation and key roles of ribonucleic acid methylation. Front Cell Neurosci 2022;16:1058083. [PMID: 36601431 PMCID: PMC9806184 DOI: 10.3389/fncel.2022.1058083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Accepted: 11/28/2022] [Indexed: 12/23/2022]

Affiliation(s)

Jia Zou Community Health Service Center, Geriatric Hospital Affiliated to Wuhan University of Science and Technology, Wuhan, China; Brain Science and Advanced Technology Institute, School of Medicine, Wuhan University of Science and Technology, Wuhan, China
Hui Liu Community Health Service Center, Geriatric Hospital Affiliated to Wuhan University of Science and Technology, Wuhan, China; Brain Science and Advanced Technology Institute, School of Medicine, Wuhan University of Science and Technology, Wuhan, China
Wei Tan Community Health Service Center, Geriatric Hospital Affiliated to Wuhan University of Science and Technology, Wuhan, China
Yi-qi Chen Community Health Service Center, Geriatric Hospital Affiliated to Wuhan University of Science and Technology, Wuhan, China; Brain Science and Advanced Technology Institute, School of Medicine, Wuhan University of Science and Technology, Wuhan, China
Jing Dong Community Health Service Center, Geriatric Hospital Affiliated to Wuhan University of Science and Technology, Wuhan, China; Brain Science and Advanced Technology Institute, School of Medicine, Wuhan University of Science and Technology, Wuhan, China
Shu-yuan Bai Community Health Service Center, Geriatric Hospital Affiliated to Wuhan University of Science and Technology, Wuhan, China; Brain Science and Advanced Technology Institute, School of Medicine, Wuhan University of Science and Technology, Wuhan, China
Zhao-xia Wu Community Health Service Center, Wuchang Hospital, Wuhan, China
Yan Zeng Community Health Service Center, Geriatric Hospital Affiliated to Wuhan University of Science and Technology, Wuhan, China; Brain Science and Advanced Technology Institute, School of Medicine, Wuhan University of Science and Technology, Wuhan, China; School of Public Health, Wuhan University of Science and Technology, Wuhan, China. *Correspondence: Yan Zeng

Luo Z, Lou L, Qiu W, Xu Z, Xiao X. Predicting N6-Methyladenosine Sites in Multiple Tissues of Mammals through Ensemble Deep Learning. Int J Mol Sci 2022;23:ijms232415490. [PMID: 36555143 PMCID: PMC9778682 DOI: 10.3390/ijms232415490] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/03/2022] [Revised: 12/03/2022] [Accepted: 12/05/2022] [Indexed: 12/13/2022]

Abstract

N6-methyladenosine (m⁶A) is the most abundant modification within eukaryotic messenger RNA and plays an essential regulatory role in the control of cellular functions and gene expression. However, it remains an outstanding challenge to detect mRNA m⁶A transcriptome-wide at base resolution via experimental approaches, which are generally time-consuming and expensive. Developing computational methods is a good strategy for accurate in silico detection of m⁶A modification sites from the large amount of RNA sequence data. Unfortunately, existing computational models usually predict m⁶A sites in a single species only, without considering the tissue level, and most of them are built on low-confidence data generated by m⁶A antibody immunoprecipitation (IP)-based sequencing, which restricts the reliability and generalizability of the proposed models. Here, we review recent advances in computational prediction of m⁶A sites and construct a new computational approach named im6APred using ensemble deep learning to accurately identify m⁶A sites based on high-confidence data in multiple tissues of mammals. Our model im6APred builds upon a comprehensive evaluation of multiple classification methods, including four traditional classification algorithms and three deep learning methods and their ensembles. The optimal base-classifier combinations are then chosen by a five-fold cross-validation test to achieve an effective stacked model. Our model im6APred achieves an area under the receiver operating characteristic curve (AUROC) in the range of 0.82-0.91 on independent tests, indicating that our model has the ability to learn general methylation rules on RNA bases and generalize to m⁶A transcriptome-wide identification. Moreover, AUROCs in the range of 0.77-0.96 were achieved using cross-species/tissues validation on the benchmark dataset, demonstrating differences in predictive performance at the tissue level and the need for constructing tissue-specific models for m⁶A site prediction.
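
For readers unfamiliar with the AUROC metric quoted in this abstract, the short Python sketch below computes it directly from its rank interpretation: the probability that a randomly chosen positive site receives a higher score than a randomly chosen negative site, with ties counted as one half. The function name and the toy labels and scores are made up for illustration and are unrelated to the im6APred data.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 class labels; scores: predicted scores, higher means
    more likely positive. Ties between a positive and a negative count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ordered correctly.
print(auroc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

The pairwise loop is quadratic in the number of examples, which is fine for a demonstration; a sort-based rank computation would be the usual choice for large benchmark sets.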

Liao J, Wang Q, Wu F, Huang Z. In Silico Methods for Identification of Potential Active Sites of Therapeutic Targets. Molecules 2022;27:7103. [PMID: 36296697 PMCID: PMC9609013 DOI: 10.3390/molecules27207103] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/30/2022] [Revised: 08/12/2022] [Accepted: 08/25/2022] [Indexed: 07/30/2023]

Wang H, Zhao S, Cheng Y, Bi S, Zhu X. MTDeepM6A-2S: A two-stage multi-task deep learning method for predicting RNA N6-methyladenosine sites of Saccharomyces cerevisiae. Front Microbiol 2022;13:999506. [PMID: 36274691 PMCID: PMC9579691 DOI: 10.3389/fmicb.2022.999506] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/21/2022] [Accepted: 09/16/2022] [Indexed: 11/13/2022]

PSP-PJMI: An innovative feature representation algorithm for identifying DNA N4-methylcytosine sites. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.05.060] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/25/2022]

Ma L, He LN, Kang S, Gu B, Gao S, Zuo Z. Advances in detecting N6-methyladenosine modification in circRNAs. Methods 2022;205:234-246. [PMID: 35878749 DOI: 10.1016/j.ymeth.2022.07.011] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/18/2022] [Revised: 07/15/2022] [Accepted: 07/18/2022] [Indexed: 12/14/2022]

CNNLSTMac4CPred: A Hybrid Model for N4-Acetylcytidine Prediction. Interdiscip Sci 2022;14:439-451. [PMID: 35106702 DOI: 10.1007/s12539-021-00500-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/25/2021] [Revised: 12/04/2021] [Accepted: 12/13/2021] [Indexed: 12/23/2022]

Yu B, Zhang Y, Wang X, Gao H, Sun J, Gao X. Identification of DNA modification sites based on elastic net and bidirectional gated recurrent unit with convolutional neural network. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103566] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 02/06/2023]

Identification of D Modification Sites Using a Random Forest Model Based on Nucleotide Chemical Properties. Int J Mol Sci 2022;23:ijms23063044. [PMID: 35328461 PMCID: PMC8950657 DOI: 10.3390/ijms23063044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2022] [Revised: 02/25/2022] [Accepted: 03/09/2022] [Indexed: 12/03/2022]

Wang H, Wang S, Zhang Y, Bi S, Zhu X. A brief review of machine learning methods for RNA methylation sites prediction. Methods 2022;203:399-421. [DOI: 10.1016/j.ymeth.2022.03.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/31/2021] [Revised: 02/15/2022] [Accepted: 03/01/2022] [Indexed: 02/07/2023]

Cui C, Wu X, Zhou Y. GlyinsRNA: a webserver for predicting glycosylation sites on small RNAs. RNA Biol 2021;18:600-603. [PMID: 34559595 DOI: 10.1080/15476286.2021.1982574] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]

Wang X, Lin X, Wang R, Han N, Fan K, Han L, Ding Z. A Feature Fusion Predictor for RNA Pseudouridine Sites with Particle Swarm Optimizer Based Feature Selection and Ensemble Learning Approach. Curr Issues Mol Biol 2021;43:1844-1858. [PMID: 34889887 PMCID: PMC8929013 DOI: 10.3390/cimb43030129] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/30/2021] [Revised: 10/17/2021] [Accepted: 10/19/2021] [Indexed: 01/28/2023]

Zhou Y, Yang J, Tian Z, Zeng J, Shen W. Research progress concerning m⁶A methylation and cancer. Oncol Lett 2021;22:775. [PMID: 34589154 PMCID: PMC8442141 DOI: 10.3892/ol.2021.13036] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/28/2021] [Accepted: 08/20/2021] [Indexed: 12/12/2022]

BERT-m7G: A Transformer Architecture Based on BERT and Stacking Ensemble to Identify RNA N7-Methylguanosine Sites from Sequence Information. Comput Math Methods Med 2021;2021:7764764. [PMID: 34484416 PMCID: PMC8413034 DOI: 10.1155/2021/7764764] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/16/2021] [Accepted: 08/13/2021] [Indexed: 01/19/2023]

Wang M, Xie J, Xu S. M6A-BiNP: predicting N⁶-methyladenosine sites based on bidirectional position-specific propensities of polynucleotides and pointwise joint mutual information. RNA Biol 2021;18:2498-2512. [PMID: 34161188 PMCID: PMC8632114 DOI: 10.1080/15476286.2021.1930729] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/13/2022]

Wang Y, Guo R, Huang L, Yang S, Hu X, He K. m6AGE: A Predictor for N6-Methyladenosine Sites Identification Utilizing Sequence Characteristics and Graph Embedding-Based Geometrical Information. Front Genet 2021;12:670852. [PMID: 34122525 PMCID: PMC8191635 DOI: 10.3389/fgene.2021.670852] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/22/2021] [Accepted: 04/29/2021] [Indexed: 11/30/2022]

Epigenetics: Roles and therapeutic implications of non-coding RNA modifications in human cancers. Mol Ther Nucleic Acids 2021;25:67-82. [PMID: 34188972 PMCID: PMC8217334 DOI: 10.1016/j.omtn.2021.04.021] [Citation(s) in RCA: 51] [Impact Index Per Article: 17.0] [Indexed: 02/06/2023]

Zhang L, Qin X, Liu M, Xu Z, Liu G. DNN-m6A: A Cross-Species Method for Identifying RNA N6-Methyladenosine Sites Based on Deep Neural Network with Multi-Information Fusion. Genes (Basel) 2021;12:354. [PMID: 33670877 PMCID: PMC7997228 DOI: 10.3390/genes12030354] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/22/2021] [Revised: 02/22/2021] [Accepted: 02/25/2021] [Indexed: 12/16/2022]

Zhuang J, Liu D, Lin M, Qiu W, Liu J, Chen S. PseUdeep: RNA Pseudouridine Site Identification with Deep Learning Algorithm. Front Genet 2021;12:773882. [PMID: 34868261 PMCID: PMC8637112 DOI: 10.3389/fgene.2021.773882] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2021] [Accepted: 10/04/2021] [Indexed: 11/16/2022]

Ao C, Yu L, Zou Q. Prediction of bio-sequence modifications and the associations with diseases. Brief Funct Genomics 2020;20:1-18. [PMID: 33313647 DOI: 10.1093/bfgp/elaa023] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Received: 10/25/2020] [Revised: 11/09/2020] [Accepted: 11/10/2020] [Indexed: 12/22/2022]

Chen X, Xiong Y, Liu Y, Chen Y, Bi S, Zhu X. m5CPred-SVM: a novel method for predicting m5C sites of RNA. BMC Bioinformatics 2020;21:489. [PMID: 33126851 PMCID: PMC7602301 DOI: 10.1186/s12859-020-03828-4] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 06/30/2020] [Accepted: 10/21/2020] [Indexed: 02/08/2023]

Abstract

BACKGROUND

As one of the most common post-transcriptional modifications (PTCM) in RNA, 5-cytosine-methylation plays important roles in many biological functions such as RNA metabolism and cell fate decision. Through accurate identification of 5-methylcytosine (m5C) sites on RNA, researchers can better understand the exact role of 5-cytosine-methylation in these biological functions. In recent years, computational methods of predicting m5C sites have attracted considerable interest because of their efficiency and low cost. However, neither the accuracy nor the efficiency of these methods is satisfactory yet, and both need further improvement.

RESULTS

In this work, we have developed a new computational method, m5CPred-SVM, to identify m5C sites in three species, H. sapiens, M. musculus and A. thaliana. To build this model, we first collected benchmark datasets following three recently published methods. Then, six types of sequence-based features were generated based on RNA segments and the sequential forward feature selection strategy was used to obtain the optimal feature subset. After that, the performance of models based on different learning algorithms was compared, and the model based on the support vector machine provided the highest prediction accuracy. Finally, our proposed method, m5CPred-SVM, was compared with several existing methods, and the results showed that m5CPred-SVM offered substantially higher prediction accuracy than previously published methods. It is expected that our method, m5CPred-SVM, can become a useful tool for accurate identification of m5C sites.

CONCLUSION

In this study, by introducing position-specific propensity related features, we built a new model, m5CPred-SVM, to predict RNA m5C sites of three different species. The results show that our model outperformed the existing state-of-the-art models. Our model is available for users through a web server at https://zhulab.ahu.edu.cn/m5CPred-SVM.
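
The sequential forward feature selection strategy mentioned in this abstract can be sketched generically as a greedy loop: start from an empty subset, add whichever remaining feature improves a scoring function most, and stop when no addition helps. In the Python sketch below, the `score` callback stands in for whatever cross-validated accuracy the authors used; it, the stopping rule, and the toy feature weights are assumptions for illustration only.

```python
def sequential_forward_selection(features, score, max_features=None):
    """Greedy forward selection.

    features: candidate feature names.
    score: callable taking a list of feature names and returning a number
           (higher is better), e.g. a cross-validated accuracy (hypothetical here).
    Returns the selected subset and its score.
    """
    remaining = list(features)
    selected = []
    best_score = float("-inf")
    limit = len(remaining) if max_features is None else max_features
    while remaining and len(selected) < limit:
        # Evaluate every one-feature extension of the current subset.
        trials = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(trials, key=lambda t: t[0])
        if top_score <= best_score:
            break  # no remaining feature improves the score
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected, best_score


# Toy scoring function: each feature contributes a fixed, made-up amount.
weights = {"a": 0.3, "b": 0.2, "c": -0.1}
subset, value = sequential_forward_selection(weights, lambda fs: sum(weights[f] for f in fs))
print(subset, value)
```

With a real classifier, each call to `score` would retrain and evaluate a model, so the cost grows roughly with the square of the number of candidate features.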

Khan F, Khan M, Iqbal N, Khan S, Muhammad Khan D, Khan A, Wei DQ. Prediction of Recombination Spots Using Novel Hybrid Feature Extraction Method via Deep Learning Approach. Front Genet 2020;11:539227. [PMID: 33093842 PMCID: PMC7527634 DOI: 10.3389/fgene.2020.539227] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 03/10/2020] [Accepted: 08/13/2020] [Indexed: 01/20/2023]

Ahmed S, Kabir M, Arif M, Khan ZU, Yu DJ. DeepPPSite: A deep learning-based model for analysis and prediction of phosphorylation sites using efficient sequence information. Anal Biochem 2020;612:113955. [PMID: 32949607 DOI: 10.1016/j.ab.2020.113955] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 02/20/2020] [Revised: 08/30/2020] [Accepted: 09/11/2020] [Indexed: 12/29/2022]

Abstract

Phosphorylation is a ubiquitous type of post-translational modification (PTM) that occurs in both eukaryotic and prokaryotic cells, wherein a phosphate group binds with amino acid residues. These specific residues, i.e., serine (S), threonine (T), and tyrosine (Y), exhibit diverse functions at the molecular level. Recent studies have determined that some diseases such as cancer, diabetes, and neurodegenerative diseases are caused by abnormal phosphorylation. Based on its potential applications in biological research and drug development, the large-scale identification of phosphorylation sites has attracted interest. Existing wet-lab technologies for targeting phosphorylation sites are costly and time-consuming. Thus, computational algorithms that can efficiently accelerate the annotation of phosphorylation sites from massive protein sequences are needed. Numerous machine learning-based methods have been implemented for phosphorylation site prediction. However, despite extensive efforts, existing computational approaches continue to have inadequate performance, particularly in terms of overall accuracy (ACC), Matthews correlation coefficient (MCC), and area under the curve (AUC). In this paper, we report a novel deep learning-based predictor to overcome these performance hurdles, DeepPPSite, which was constructed using a stacked long short-term memory recurrent network for predicting phosphorylation sites. The proposed technique expediently learns the protein representations from conjoint protein descriptors. The experimental results indicated that our model achieved superior performance on the training dataset for S, T and Y, with MCC values of 0.608, 0.602, and 0.558, respectively, using a 10-fold cross-validation test. We further determined the generalization efficacy of the proposed predictor DeepPPSite by conducting a rigorous independent test. The predictive MCC values were 0.358, 0.356, and 0.350 for the S, T, and Y phosphorylation sites, respectively. Rigorous cross-validation and independent validation tests for the three types of phosphorylation sites demonstrated that the designed DeepPPSite tool significantly outperforms state-of-the-art methods.
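
Since this abstract reports its headline results as MCC values, a small Python sketch of the Matthews correlation coefficient computed from confusion-matrix counts may help interpret them. The formula is the standard one; the function name, the convention of returning 0.0 when a marginal count is zero, and the example counts are choices made for this illustration and are not taken from the paper.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (chance level) to +1 (perfect).
    Returns 0.0 when any marginal total is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Made-up counts for a balanced test set of 200 sites.
print(round(mcc(tp=80, tn=70, fp=30, fn=20), 3))
```

Unlike plain accuracy, MCC uses all four cells of the confusion matrix, which is why it is often preferred when positive and negative sites are imbalanced.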

Karthiya R, Khandelia P. m6A RNA Methylation: Ramifications for Gene Expression and Human Health. Mol Biotechnol 2020;62:467-484. [PMID: 32840728 DOI: 10.1007/s12033-020-00269-5] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Accepted: 08/14/2020] [Indexed: 12/12/2022]

Liu L, Song B, Ma J, Song Y, Zhang SY, Tang Y, Wu X, Wei Z, Chen K, Su J, Rong R, Lu Z, de Magalhães JP, Rigden DJ, Zhang L, Zhang SW, Huang Y, Lei X, Liu H, Meng J. Bioinformatics approaches for deciphering the epitranscriptome: Recent progress and emerging topics. Comput Struct Biotechnol J 2020;18:1587-1604. [PMID: 32670500 PMCID: PMC7334300 DOI: 10.1016/j.csbj.2020.06.010] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 02/08/2020] [Revised: 06/02/2020] [Accepted: 06/07/2020] [Indexed: 12/13/2022]

Affiliation(s)

Lian Liu School of Computer Sciences, Shaanxi Normal University, Xi’an, Shaanxi 710119, China
Bowen Song Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu, 215123, China Institute of Integrative Biology, University of Liverpool, L69 7ZB Liverpool, United Kingdom
Jiani Ma School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
Yi Song Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu, 215123, China Institute of Integrative Biology, University of Liverpool, L69 7ZB Liverpool, United Kingdom
Song-Yao Zhang Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi’an, Shaanxi 710072, China
Yujiao Tang Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu, 215123, China Institute of Integrative Biology, University of Liverpool, L69 7ZB Liverpool, United Kingdom
Xiangyu Wu Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu, 215123, China Institute of Ageing & Chronic Disease, University of Liverpool, L7 8TX, Liverpool, United Kingdom
Zhen Wei Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu, 215123, China Institute of Ageing & Chronic Disease, University of Liverpool, L7 8TX, Liverpool, United Kingdom
Kunqi Chen Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu, 215123, China Institute of Ageing & Chronic Disease, University of Liverpool, L7 8TX, Liverpool, United Kingdom
Jionglong Su Department of Mathematical Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu, 215123, China
Rong Rong Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu, 215123, China Institute of Integrative Biology, University of Liverpool, L69 7ZB Liverpool, United Kingdom
Zhiliang Lu Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu, 215123, China Institute of Integrative Biology, University of Liverpool, L69 7ZB Liverpool, United Kingdom
João Pedro de Magalhães Institute of Ageing & Chronic Disease, University of Liverpool, L7 8TX, Liverpool, United Kingdom
Daniel J. Rigden Institute of Integrative Biology, University of Liverpool, L69 7ZB Liverpool, United Kingdom
Lin Zhang School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
Shao-Wu Zhang School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
Yufei Huang Department of Electrical and Computer Engineering, University of Texas at San Antonio, San Antonio, TX, 78249, USA Department of Epidemiology and Biostatistics, University of Texas Health Science Center at San Antonio, San Antonio, TX 78229, USA
Xiujuan Lei School of Computer Sciences, Shaanxi Normal University, Xi’an, Shaanxi 710119, China
Hui Liu School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
Jia Meng Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu, 215123, China AI University Research Centre, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China Institute of Integrative Biology, University of Liverpool, L69 7ZB Liverpool, United Kingdom

Dou L, Li X, Ding H, Xu L, Xiang H. Prediction of m5C Modifications in RNA Sequences by Combining Multiple Sequence Features. Mol Ther Nucleic Acids 2020;21:332-342. [PMID: 32645685 PMCID: PMC7340967 DOI: 10.1016/j.omtn.2020.06.004] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 02/18/2020] [Revised: 06/03/2020] [Accepted: 06/04/2020] [Indexed: 12/14/2022]

Liu L, Lei X, Fang Z, Tang Y, Meng J, Wei Z. LITHOPHONE: Improving lncRNA Methylation Site Prediction Using an Ensemble Predictor. Front Genet 2020;11:545. [PMID: 32582286 PMCID: PMC7297269 DOI: 10.3389/fgene.2020.00545] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 09/19/2019] [Accepted: 05/06/2020] [Indexed: 12/31/2022] Open

Abstract

N6-methyladenosine (m6A) is one of the most widely studied epigenetic modifications, which plays an important role in many biological processes, such as splicing, RNA localization, and degradation. Studies have shown that m6A on lncRNA has important functions, including regulating the expression and functions of lncRNA, regulating the synthesis of pre-mRNA, promoting the proliferation of cancer cells, and affecting cell differentiation and many others. Although a number of methods have been proposed to predict m6A RNA methylation sites, most of these methods aimed at general m6A sites prediction without noticing the uniqueness of the lncRNA methylation prediction problem. Since many lncRNAs do not have a polyA tail and cannot be captured in the polyA selection step of the most widely adopted RNA-seq library preparation protocol, lncRNA methylation sites cannot be effectively captured and are thus likely to be significantly underrepresented in existing experimental data affecting the accuracy of existing predictors. In this paper, we propose a new computational framework, LITHOPHONE, which stands for long noncoding RNA methylation sites prediction from sequence characteristics and genomic information with an ensemble predictor. We show that the methylation sites of lncRNA and mRNA have different patterns exhibited in the extracted features and should be differently handled when making predictions. Due to the used experiment protocols, the number of known lncRNA m6A sites is limited, and insufficient to train a reliable predictor; thus, the performance can be improved by combining both lncRNA and mRNA data using an ensemble predictor. We show that the newly developed LITHOPHONE approach achieved a reasonably good performance when tested on independent datasets (AUC: 0.966 and 0.835 under full transcript and mature mRNA modes, respectively), marking a substantial improvement compared with existing methods. Additionally, LITHOPHONE was applied to scan the entire human lncRNAome for all possible lncRNA m6A sites, and the results are freely accessible at: http://180.208.58.19/lith/.

Zhu X, He J, Zhao S, Tao W, Xiong Y, Bi S. A comprehensive comparison and analysis of computational predictors for RNA N6-methyladenosine sites of Saccharomyces cerevisiae. Brief Funct Genomics 2020;18:367-376. [PMID: 31609411 DOI: 10.1093/bfgp/elz018] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 05/20/2019] [Revised: 07/07/2019] [Accepted: 07/15/2019] [Indexed: 12/16/2022] Open

Zhu ZM, Huo FC, Pei DS. Function and evolution of RNA N6-methyladenosine modification. Int J Biol Sci 2020;16:1929-1940. [PMID: 32398960 PMCID: PMC7211178 DOI: 10.7150/ijbs.45231] [Citation(s) in RCA: 63] [Impact Index Per Article: 15.8] [Received: 02/23/2020] [Accepted: 04/05/2020] [Indexed: 02/06/2023] Open

Govindaraj RG, Subramaniyam S, Manavalan B. Extremely-randomized-tree-based Prediction of N⁶-Methyladenosine Sites in Saccharomyces cerevisiae. Curr Genomics 2020;21:26-33. [PMID: 32655295 PMCID: PMC7324895 DOI: 10.2174/1389202921666200219125625] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 11/01/2019] [Revised: 12/28/2019] [Accepted: 01/24/2020] [Indexed: 02/07/2023] Open

Li Y, Wang J, Huang C, Shen M, Zhan H, Xu K. RNA N6-methyladenosine: a promising molecular target in metabolic diseases. Cell Biosci 2020;10:19. [PMID: 32110378 PMCID: PMC7035649 DOI: 10.1186/s13578-020-00385-4] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 11/28/2019] [Accepted: 02/11/2020] [Indexed: 12/12/2022] Open

Zhao J, Cao Y, Zhang L. Exploring the computational methods for protein-ligand binding site prediction. Comput Struct Biotechnol J 2020;18:417-426. [PMID: 32140203 PMCID: PMC7049599 DOI: 10.1016/j.csbj.2020.02.008] [Citation(s) in RCA: 82] [Impact Index Per Article: 20.5] [Received: 10/14/2019] [Revised: 01/23/2020] [Accepted: 02/11/2020] [Indexed: 12/21/2022] Open

Wu P, Mo Y, Peng M, Tang T, Zhong Y, Deng X, Xiong F, Guo C, Wu X, Li Y, Li X, Li G, Zeng Z, Xiong W. Emerging role of tumor-related functional peptides encoded by lncRNA and circRNA. Mol Cancer 2020;19:22. [PMID: 32019587 PMCID: PMC6998289 DOI: 10.1186/s12943-020-1147-3] [Citation(s) in RCA: 320] [Impact Index Per Article: 80.0] [Received: 11/19/2019] [Accepted: 01/28/2020] [Indexed: 02/08/2023] Open

Affiliation(s)

Pan Wu NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Translational Radiation Oncology, Hunan Cancer Hospital and The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, Hunan, China.,Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China.,Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Disease Genome Research Center, the Third Xiangya Hospital, Central South University, Changsha, Hunan, China
Yongzhen Mo Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
Miao Peng Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
Ting Tang Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
Yu Zhong Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
Xiangying Deng Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
Fang Xiong Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
Can Guo Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
Xu Wu NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Translational Radiation Oncology, Hunan Cancer Hospital and The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, Hunan, China.,Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
Yong Li Department of Medicine, Dan L Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas, USA
Xiaoling Li Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
Guiyuan Li NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Translational Radiation Oncology, Hunan Cancer Hospital and The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, Hunan, China.,Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China.,Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Disease Genome Research Center, the Third Xiangya Hospital, Central South University, Changsha, Hunan, China
Zhaoyang Zeng NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Translational Radiation Oncology, Hunan Cancer Hospital and The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, Hunan, China.,Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China.,Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Disease Genome Research Center, the Third Xiangya Hospital, Central South University, Changsha, Hunan, China
Wei Xiong NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Translational Radiation Oncology, Hunan Cancer Hospital and The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, Hunan, China.,Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China.,Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Disease Genome Research Center, the Third Xiangya Hospital, Central South University, Changsha, Hunan, China

Liu L, Lei X, Meng J, Wei Z. WITMSG: Large-scale Prediction of Human Intronic m⁶A RNA Methylation Sites from Sequence and Genomic Features. Curr Genomics 2020;21:67-76. [PMID: 32655300 PMCID: PMC7324894 DOI: 10.2174/1389202921666200211104140] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 10/24/2019] [Revised: 01/14/2020] [Accepted: 01/27/2020] [Indexed: 02/07/2023] Open

Zhou Y, Cui Q, Zhou Y. NmSEER V2.0: a prediction tool for 2'-O-methylation sites based on random forest and multi-encoding combination. BMC Bioinformatics 2019;20:690. [PMID: 31874624 PMCID: PMC6929462 DOI: 10.1186/s12859-019-3265-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/03/2022] Open

Liu Z, Dong W, Luo W, Jiang W, Li Q, He Z. HLMethy: a machine learning-based model to identify the hidden labels of m⁶A candidates. Plant Mol Biol 2019;101:575-584. [PMID: 31722090 DOI: 10.1007/s11103-019-00930-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 03/22/2019] [Accepted: 11/01/2019] [Indexed: 06/10/2023]

Zhao W, Qi X, Liu L, Ma S, Liu J, Wu J. Epigenetic Regulation of m⁶A Modifications in Human Cancer. Mol Ther Nucleic Acids 2019;19:405-412. [PMID: 31887551 PMCID: PMC6938965 DOI: 10.1016/j.omtn.2019.11.022] [Citation(s) in RCA: 141] [Impact Index Per Article: 28.2] [Received: 09/09/2019] [Revised: 11/03/2019] [Accepted: 11/22/2019] [Indexed: 01/22/2023]

Chen Z, Zhao P, Li F, Wang Y, Smith AI, Webb GI, Akutsu T, Baggag A, Bensmail H, Song J. Comprehensive review and assessment of computational methods for predicting RNA post-transcriptional modification sites from RNA sequences. Brief Bioinform 2019;21:1676-1696. [DOI: 10.1093/bib/bbz112] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Received: 07/04/2019] [Revised: 07/31/2019] [Accepted: 08/07/2019] [Indexed: 12/14/2022] Open

Abstract

RNA post-transcriptional modifications play a crucial role in a myriad of biological processes and cellular functions. To date, more than 160 RNA modifications have been discovered; therefore, accurate identification of RNA-modification sites is fundamental for a better understanding of RNA-mediated biological functions and mechanisms. However, due to limitations in experimental methods, systematic identification of different types of RNA-modification sites remains a major challenge. Recently, more than 20 computational methods have been developed to identify RNA-modification sites in tandem with high-throughput experimental methods, with most of these capable of predicting only single types of RNA-modification sites. These methods show high diversity in their dataset size, data quality, core algorithms, features extracted and feature selection techniques and evaluation strategies. Therefore, there is an urgent need to revisit these methods and summarize their methodologies, in order to improve and further develop computational techniques to identify and characterize RNA-modification sites from the large amounts of sequence data. With this goal in mind, first, we provide a comprehensive survey on a large collection of 27 state-of-the-art approaches for predicting N1-methyladenosine and N6-methyladenosine sites. We cover a variety of important aspects that are crucial for the development of successful predictors, including the dataset quality, operating algorithms, sequence and genomic features, feature selection, model performance evaluation and software utility. In addition, we also provide our thoughts on potential strategies to improve the model performance. Second, we propose a computational approach called DeepPromise based on deep learning techniques for simultaneous prediction of N1-methyladenosine and N6-methyladenosine. To extract the sequence context surrounding the modification sites, three feature encodings, including enhanced nucleic acid composition, one-hot encoding, and RNA embedding, were used as the input to seven consecutive layers of convolutional neural networks (CNNs), respectively. Moreover, DeepPromise further combined the prediction score of the CNN-based models and achieved around 43% higher area under receiver-operating curve (AUROC) for m1A site prediction and 2–6% higher AUROC for m6A site prediction, respectively, when compared with several existing state-of-the-art approaches on the independent test. In-depth analyses of characteristic sequence motifs identified from the convolution-layer filters indicated that nucleotide presentation at proximal positions surrounding the modification sites contributed most to the classification, whereas those at distal positions also affected classification but to different extents. To maximize user convenience, a web server was developed as an implementation of DeepPromise and made publicly available at http://DeepPromise.erc.monash.edu/, with the server accepting both RNA sequences and genomic sequences to allow prediction of two types of putative RNA-modification sites.

Chen K, Wei Z, Zhang Q, Wu X, Rong R, Lu Z, Su J, de Magalhães JP, Rigden DJ, Meng J. WHISTLE: a high-accuracy map of the human N6-methyladenosine (m6A) epitranscriptome predicted using a machine learning approach. Nucleic Acids Res 2019;47:e41. [PMID: 30993345 PMCID: PMC6468314 DOI: 10.1093/nar/gkz074] [Citation(s) in RCA: 137] [Impact Index Per Article: 27.4] [Received: 01/11/2019] [Revised: 01/27/2019] [Accepted: 02/01/2019] [Indexed: 12/24/2022] Open

Affiliation(s)

Kunqi Chen Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China.,Institute of Ageing & Chronic Disease, University of Liverpool, L7 8TX Liverpool, UK
Zhen Wei Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China.,Institute of Ageing & Chronic Disease, University of Liverpool, L7 8TX Liverpool, UK
Qing Zhang Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China
Xiangyu Wu Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China.,Institute of Ageing & Chronic Disease, University of Liverpool, L7 8TX Liverpool, UK
Rong Rong Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China.,Research Center for Precision Medicine, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China.,Institute of Integrative Biology, University of Liverpool, L7 8TX Liverpool, UK
Zhiliang Lu Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China.,Research Center for Precision Medicine, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China.,Institute of Integrative Biology, University of Liverpool, L7 8TX Liverpool, UK
Jionglong Su Research Center for Precision Medicine, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China.,Department of Mathematical Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China
João Pedro de Magalhães Institute of Ageing & Chronic Disease, University of Liverpool, L7 8TX Liverpool, UK
Daniel J Rigden Institute of Integrative Biology, University of Liverpool, L7 8TX Liverpool, UK
Jia Meng Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China.,Research Center for Precision Medicine, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China.,Institute of Integrative Biology, University of Liverpool, L7 8TX Liverpool, UK

Fang T, Zhang Z, Sun R, Zhu L, He J, Huang B, Xiong Y, Zhu X. RNAm5CPred: Prediction of RNA 5-Methylcytosine Sites Based on Three Different Kinds of Nucleotide Composition. Mol Ther Nucleic Acids 2019;18:739-747. [PMID: 31726390 PMCID: PMC6859278 DOI: 10.1016/j.omtn.2019.10.008] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 08/08/2018] [Revised: 10/11/2019] [Accepted: 10/11/2019] [Indexed: 12/11/2022]

Abstract

5-methylcytosine (m5C) is one of the most common and abundant post-transcriptional modifications (PTCMs) in RNA. Recent studies showed that m5C plays important roles in many biological functions such as RNA metabolism and cell fate decision. Because most experimental methods that determine m5C sites across the transcriptome are time-consuming and expensive, it is urgent to develop accurate computational methods to identify m5C sites effectively. A benchmark dataset is important for developing and evaluating computational methods. In this work, we constructed four different datasets according to the data redundancy and imbalance. Based on these datasets, we generated three different kinds of features, i.e., KNFs (K-nucleotide frequencies), KSNPFs (K-spaced nucleotide pair frequencies), and pseDNC (pseudo-dinucleotide composition), and then used a support vector machine (SVM) to build our models. Based on the imbalanced and nonredundant dataset, Met935, we extensively studied the three kinds of features and determined an optimal combination of the features. Based on the feature combination, we built models on the three different datasets and compared them with state-of-the-art models. According to the predictive results of the stringent jackknife test, the models based on the three features, 4NF, 1SNPF, and pseDNC, are superior or comparable to other methods. To determine the best model between the models based on the imbalanced dataset Met935 and the balanced dataset Met240, we further evaluated the two models on an independent test set Test1157. Our results demonstrate that the model based on the balanced dataset Met240 achieved the highest recall (68.79%) and the highest Matthews correlation coefficient (MCC) (0.154). In addition, the model is also superior to other state-of-the-art methods according to the integrated parameter MCC on the independent test set. Thus, we selected the model based on Met240 as our final model, which was named RNAm5CPred. In addition, a web server for RNAm5CPred (http://zhulab.ahu.edu.cn/RNAm5CPred/) has been provided to facilitate experimental research.

Liu K, Chen W, Lin H. XG-PseU: an eXtreme Gradient Boosting based method for identifying pseudouridine sites. Mol Genet Genomics 2019;295:13-21. [DOI: 10.1007/s00438-019-01600-9] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 06/16/2019] [Accepted: 07/29/2019] [Indexed: 01/08/2023]