1
|
Huang J, Wang X, Xia R, Yang D, Liu J, Lv Q, Yu X, Meng J, Chen K, Song B, Wang Y. Domain-knowledge enabled ensemble learning of 5-formylcytosine (f5C) modification sites. Comput Struct Biotechnol J 2024; 23:3175-3185. [PMID: 39253057 PMCID: PMC11381828 DOI: 10.1016/j.csbj.2024.08.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2024] [Revised: 08/07/2024] [Accepted: 08/07/2024] [Indexed: 09/11/2024] Open
Abstract
5-formylcytidine (f5C) is a unique post-transcriptional RNA modification found in mRNA and tRNA at the wobble site, playing a crucial role in mitochondrial protein synthesis and potentially contributing to the regulation of translation. Recent studies have unveiled that the f5C modifications may drive mitochondrial mRNA translation to power cancer metastasis. Accurate identification of f5C sites is essential for further unraveling their molecular functions and regulatory mechanisms, but there are currently no computational methods available for predicting their locations. In this study, we introduce an innovative ensemble approach, successfully enabling the computational recognition of Saccharomyces cerevisiae f5C. We conducted a comprehensive model selection process that involved multiple basic machine learning and deep learning algorithms such as recurrent neural networks, convolutional neural networks and Transformer-based models. Initially trained only on sequence information, these individual models achieved an AUROC ranging from 0.7104 to 0.7492. Through the integration of 32 novel domain-derived genomic features, the performance of individual models has significantly improved to an AUROC between 0.7309 and 0.8076. To further enhance accuracy and robustness, we then constructed the ensembles of these individual models with different combinations. The best performance attained by our ensemble models reached an AUROC of 0.8391. Shapley additive explanations were conducted to explain the significant contributions of genomic features, providing insights into the putative distribution of f5C across various topological regions and potentially paving the way for revealing their functional relevance within distinct genomic contexts. A freely accessible web server that allows real-time analysis of user-uploaded sites can be accessed at: www.rnamd.org/Resf5C-Pred.
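As a rough illustration of the evaluation described in this abstract, the sketch below computes AUROC from pairwise score comparisons and averages the outputs of two models (soft voting). It is a toy example under stated assumptions: the labels and scores are invented, the names `auroc` and `soft_vote` are hypothetical, and plain probability averaging is only one possible way to combine models; the abstract does not say how the authors built their ensembles.

```python
from typing import List, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive site outscores
    a random negative site (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def soft_vote(model_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-site probabilities predicted by several models."""
    n_models = len(model_scores)
    return [sum(column) / n_models for column in zip(*model_scores)]


# Invented toy data: 1 = modified site, 0 = unmodified site.
labels = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.4, 0.7, 0.5, 0.2, 0.3]  # hypothetical model outputs
model_b = [0.6, 0.8, 0.3, 0.2, 0.5, 0.1]  # hypothetical model outputs
ensemble = soft_vote([model_a, model_b])

print(auroc(labels, model_a), auroc(labels, model_b), auroc(labels, ensemble))
```

In this toy case each model misranks one positive/negative pair, but the two models err on different sites, so the averaged scores rank every positive above every negative. That complementarity is the usual motivation for ensembling.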
Affiliation(s)
- Jiaming Huang
  - Jiangsu Key Laboratory for Functional Substance of Chinese Medicine, School of Pharmacy, Nanjing University of Chinese Medicine, Nanjing 210023, China
  - Department of Biological Sciences, School of Science, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
- Xuan Wang
  - Department of Biological Sciences, School of Science, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
- Rong Xia
  - Department of Biological Sciences, School of Science, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
  - School of AI and Advanced Computing, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
- Dongqing Yang
  - Department of Public Health, School of Medicine, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Jian Liu
  - Jiangsu Key Laboratory for Functional Substance of Chinese Medicine, School of Pharmacy, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Qi Lv
  - Jiangsu Key Laboratory for Functional Substance of Chinese Medicine, School of Pharmacy, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Xiaoxuan Yu
  - Department of Pharmacology, School of Medicine, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Jia Meng
  - Department of Biological Sciences, School of Science, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
  - AI University Research Centre, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L7 8TX, United Kingdom
- Kunqi Chen
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, School of Basic Medical Sciences, Fujian Medical University, Fuzhou 350004, China
- Bowen Song
  - Department of Public Health, School of Medicine, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Yue Wang
  - Jiangsu Key Laboratory for Functional Substance of Chinese Medicine, School of Pharmacy, Nanjing University of Chinese Medicine, Nanjing 210023, China
|
2
|
Luo Z, Yu L, Xu Z, Liu K, Gu L. Comprehensive Review and Assessment of Computational Methods for Prediction of N6-Methyladenosine Sites. BIOLOGY 2024; 13:777. [PMID: 39452086 PMCID: PMC11504118 DOI: 10.3390/biology13100777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2024] [Revised: 09/19/2024] [Accepted: 09/23/2024] [Indexed: 10/26/2024]
Abstract
N6-methyladenosine (m6A) plays a crucial regulatory role in the control of cellular functions and gene expression. Recent advances in sequencing techniques for transcriptome-wide m6A mapping have accelerated the accumulation of m6A site information at a single-nucleotide level, providing more high-confidence training data to develop computational approaches for m6A site prediction. However, it is still a major challenge to precisely predict m6A sites using in silico approaches. To advance the computational support for m6A site identification, here, we curated 13 up-to-date benchmark datasets from nine different species (i.e., H. sapiens, M. musculus, Rat, S. cerevisiae, Zebrafish, A. thaliana, Pig, Rhesus, and Chimpanzee). This will assist the research community in conducting an unbiased evaluation of alternative approaches and support future research on m6A modification. We revisited 52 computational approaches published since 2015 for m6A site identification, including 30 traditional machine learning-based, 14 deep learning-based, and 8 ensemble learning-based methods. We comprehensively reviewed these computational approaches in terms of their training datasets, calculated features, computational methodologies, performance evaluation strategy, and webserver/software usability. Using these benchmark datasets, we benchmarked nine predictors with available online websites or stand-alone software and assessed their prediction performance. We found that deep learning and traditional machine learning approaches generally outperformed scoring function-based approaches. In summary, the curated benchmark dataset repository and the systematic assessment in this study serve to inform the design and implementation of state-of-the-art computational approaches for m6A identification and facilitate more rigorous comparisons of new methods in the future.
Affiliation(s)
- Zhengtao Luo
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei 230036, China
  - Anhui Provincial Key Laboratory of Smart Agriculture Technology and Equipment, Anhui Agricultural University, Hefei 230036, China
- Liyi Yu
  - Computer Department, Jingdezhen Ceramic University, Jingdezhen 333403, China
- Zhaochun Xu
  - Computer Department, Jingdezhen Ceramic University, Jingdezhen 333403, China
  - School for Interdisciplinary Medicine and Engineering, Harbin Medical University, Harbin 150076, China
- Kening Liu
  - Computer Department, Jingdezhen Ceramic University, Jingdezhen 333403, China
- Lichuan Gu
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei 230036, China
  - Anhui Provincial Key Laboratory of Smart Agriculture Technology and Equipment, Anhui Agricultural University, Hefei 230036, China
|
3
|
Elsebaie HA, Nafie MS, Tawfik HO, Belal A, Ghoneim MM, Obaidullah AJ, Shaaban S, Ayed AA, El-Naggar M, Mehany ABM, Shaldam MA. Discovery of new 1,3-diphenylurea appended aryl pyridine derivatives as apoptosis inducers through c-MET and VEGFR-2 inhibition: design, synthesis, in vivo and in silico studies. RSC Med Chem 2024; 15:2553-2569. [PMID: 39026631 PMCID: PMC11253870 DOI: 10.1039/d4md00280f] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2024] [Accepted: 06/11/2024] [Indexed: 07/20/2024] Open
Abstract
Interest has been generated in VEGFR-2 and c-MET as potential receptors for the treatment of different malignancies. Using aryl pyridine derivatives with 1,3-diphenylurea attached, a number of promising dual VEGFR-2 and c-MET inhibitors were developed and synthesized. Regarding the molecular target, compounds 2d, 2f, 2j, 2k, and 2n had potent IC50 values of 65, 24, 150, 170, and 18 nM against c-MET, respectively. Additionally, they had potent IC50 values of 310, 35, 290, 320, and 24 nM against VEGFR-2, respectively. Regarding cytotoxicity, compounds 2d, 2f, 2j, 2k and 2n exhibited potent cytotoxicity against MCF-7 with IC50 values in the range 0.76-21.5 μM, and they showed promising cytotoxic activity against PC-3 with IC50 values in the range 1.85-3.42 μM compared to cabozantinib (IC50 = 1.06 μM against MCF-7 and 2.01 μM against PC-3). Regarding cell death, compound 2n caused cell death in MCF-7 cells by 87.34-fold; it induced total apoptosis by 33.19% (8.04% for late apoptosis, 25.15% for early apoptosis), stopping their growth in the G2/M phase, affecting the expression of apoptosis-related genes P53, Bax, caspases 3 and 9 and the anti-apoptotic gene, Bcl-2. In vivo study illustrated the anticancer activity of compound 2n by reduction of tumor mass and volume, and the tumor inhibition ratio reached 56.1% with an improvement of hematological parameters. Accordingly, compound 2n can be further developed as a selective target-oriented chemotherapeutic against breast cancer.
Affiliation(s)
- Heba A Elsebaie
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, Tanta University Tanta 31527 Egypt
- Mohamed S Nafie
  - Department of Chemistry, College of Sciences, University of Sharjah Sharjah 27272 United Arab Emirates
  - Chemistry Department, Faculty of Science, Suez Canal University Ismailia 41522 Egypt
- Haytham O Tawfik
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, Tanta University Tanta 31527 Egypt
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, AlSalam University in Egypt Kafr Al Zaiyat 6615062 Egypt
- Amany Belal
  - Department of Pharmaceutical Chemistry, College of Pharmacy, Taif University P.O. Box 11099 Taif 21944 Saudi Arabia
- Mohammed M Ghoneim
  - Department of Pharmacy Practice, College of Pharmacy, AlMaarefa University Ad Diriyah Riyadh 13713 Saudi Arabia
- Ahmad J Obaidullah
  - Department of Pharmaceutical Chemistry, College of Pharmacy, King Saud University P.O. Box 2457 Riyadh 11451 Saudi Arabia
- Salwa Shaaban
  - Department of Microbiology & Immunology, Faculty of Pharmacy, Beni-Suef University Beni-Suef Egypt
  - Department of Clinical Laboratory Sciences, Faculty of Applied Medical Sciences, King Khalid University Abha Saudi Arabia
- Abdelmoneim A Ayed
  - Department of Chemistry, Faculty of Science, Cairo University Giza Cairo 12613 Egypt
- Mohamed El-Naggar
  - Chemistry Department, Faculty of Sciences, Pure and Applied Chemistry Group, University of Sharjah P. O. Box 27272 Sharjah United Arab Emirates
- Ahmed B M Mehany
  - Zoology Department, Faculty of Science (Boys), Al-Azhar University Cairo 11884 Egypt
- Moataz A Shaldam
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, AlSalam University in Egypt Kafr Al Zaiyat 6615062 Egypt
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, Kafrelsheikh University P.O. Box 33516 Kafrelsheikh Egypt
|
4
|
Abdel-Maksoud MA, Askar MA, Abdel-rahman IY, Gharib M, Aufy M. Integrating Network Pharmacology and Molecular Docking Approach to Elucidate the Mechanism of Commiphora wightii for the Treatment of Rheumatoid Arthritis. Bioinform Biol Insights 2024; 18:11779322241247634. [PMID: 38765022 PMCID: PMC11102677 DOI: 10.1177/11779322241247634] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2023] [Accepted: 03/28/2024] [Indexed: 05/21/2024] Open
Abstract
Background: Rheumatoid arthritis (RA) is a chronic inflammatory condition with no definitive cure. Synovial inflammation and synovial pannus are crucial in the onset of RA. The "tumor-like" invading proliferation of new arteries is a hallmark of RA. Commiphora wightii (C wightii) is a perennial, deciduous, and trifoliate plant used in several areas of southeast Asia to cure numerous ailments, including arthritis, diabetes, obesity, and asthma. Several in vitro investigations have indicated C wightii's therapeutic efficacy in the treatment of arthritis. However, the precise molecular mechanism is not yet known. Material and methods: In this study, a network pharmacology approach was applied to uncover potential targets, active therapeutic ingredients, and signaling pathways of C wightii in the treatment of arthritis. We examined the active constituent-compound-target-pathway network and found that Guggulsterol-V, Myrrhahnone B, and Campesterol contributed decisively to the development of arthritis by affecting the tumor necrosis factor (TNF), PIK3CA, and MAPK3 genes. Docking was then employed to confirm the efficiency of the active components against the potential targets. Results: According to the molecular docking analysis, several potential targets of RA bind tightly with the corresponding key active ingredient of C wightii. With the aid of network pharmacology techniques, we conclude that the signaling pathways and biological processes involved in C wightii had an impact on the prevention of arthritis. The outcomes of molecular docking also serve as strong recommendations for future research. In the context of this study, network pharmacology combined with molecular docking analysis showed that C wightii acted on arthritis-related signaling pathways to exhibit a promising preventive impact on arthritis. Conclusion: These results serve as the basis for grasping the mechanism of the antiarthritis activity of C wightii. However, further in vivo/in vitro study is needed to verify the reliability of these targets for the treatment of arthritis.
Affiliation(s)
- Mostafa A Abdel-Maksoud
  - Department of Botany and Microbiology, College of Science, King Saud University, Riyadh, Saudi Arabia
- Mostafa A Askar
  - Radiation Biology Department, National Centre for Radiation Research and Technology (NCRRT), Egyptian Atomic Energy Authority (EAEA), Cairo, Egypt
- Ibrahim Y Abdel-rahman
  - Radiation Biology Department, National Centre for Radiation Research and Technology (NCRRT), Egyptian Atomic Energy Authority (EAEA), Cairo, Egypt
- Mustafa Gharib
  - Radiation Biology Department, National Centre for Radiation Research and Technology (NCRRT), Egyptian Atomic Energy Authority (EAEA), Cairo, Egypt
- Mohammed Aufy
  - Division of Pharmacology and Toxicology, Department of Pharmaceutical Sciences, University of Vienna, Vienna, Austria
|
5
|
Baddam SR, Avula MK, Akula R, Battula VR, Kalagara S, Buchikonda R, Ganta S, Venkatesan S, Allaka TR. Design, synthesis and in silico molecular docking evaluation of novel 1,2,3-triazole derivatives as potent antimicrobial agents. Heliyon 2024; 10:e27773. [PMID: 38590856 PMCID: PMC10999864 DOI: 10.1016/j.heliyon.2024.e27773] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Revised: 03/06/2024] [Accepted: 03/06/2024] [Indexed: 04/10/2024] Open
Abstract
Chalcone and triazole scaffolds have demonstrated a crucial role in the advancement of science and technology. Due to their significance, research has proceeded on the design and development of novel benzooxepine connected to 1,2,3-triazolyl chalcone structures. The new chalcone derivatives produced from benzooxepine triazole methyl ketone 2 and different aromatic carbonyl compounds 3 are discussed in this paper. The structures of all prepared compounds were established by a variety of spectral approaches, including mass analysis, 1H NMR, 13C NMR, and IR. Among the tested compounds, hybrids 4c, 4d, 4i, and 4k exhibited exceptional antibacterial activity, with MICs in the range of 3.59-10.30 μM against the tested S. aureus strain. Compounds 4c and 4d displayed superior antifungal activity against F. oxysporum, with MICs of 3.25 and 4.89 μM, respectively, compared with fluconazole (MIC = 3.83 μM). On the other hand, analogues 4d, 4f, and 4k demonstrated equivalent antitubercular action against the H37Rv strain, with MICs in the range of 2.16-4.90 μM. The capacity of ligand 4f to form a stable compound in the active site of CYP51 from M. tuberculosis (1EA1) was confirmed by docking studies, which showed interactions with amino acids Leu321(A), Pro77(A), Phe83(A), Lys74(A), Tyr76(A), Ala73(A), Arg96(A), Thr80(A), Met79(A), His259(A), and Gln72(A). Additionally, the ADME (absorption, distribution, metabolism, and excretion) properties, molecular characteristics, toxicity estimates, and bioactivity parameters of the chalcone‒1,2,3‒triazole hybrids were assessed.
Affiliation(s)
- Sudhakar Reddy Baddam
  - University of Massachusetts Chan Medical School, RNA Therapeutic Institute, Worcester, MA, 01655, United States
- Mahesh Kumar Avula
  - Technology Development Center, Custom Pharmaceutical Services, Dr. Reddy's Laboratories Pvt. Ltd., Hyderabad, Telangana, 500049, India
  - Department of Organic Chemistry and FDW, Andhra University, Visakhapatnam, Andhra Pradesh, 530003, India
- Raghunadh Akula
  - Technology Development Center, Custom Pharmaceutical Services, Dr. Reddy's Laboratories Pvt. Ltd., Hyderabad, Telangana, 500049, India
- Venkateswara Rao Battula
  - Department of Chemistry, AU College of Engineering (A), Andhra University, Visakhapatnam, Andhra Pradesh, 530003, India
- Sudhakar Kalagara
  - Department of Chemistry and Biochemistry, University of Texas at El Paso, El Paso, TX, 79968, United States
- Ravinder Buchikonda
  - Technology Development Center, Custom Pharmaceutical Services, Dr. Reddy's Laboratories Pvt. Ltd., Hyderabad, Telangana, 500049, India
- Srinivas Ganta
  - ScieGen Pharmaceutical Inc., Hauppauge, NY, 11788, United States
- Srinivasadesikan Venkatesan
  - Department of Chemistry, School of Applied Science and Humanities, VIGNAN's Foundation for Science, Technology and Research, Vadlamudi, Andhra Pradesh, 522213, India
- Tejeswara Rao Allaka
  - Centre for Chemical Sciences and Technology, Institute of Science and Technology, Jawaharlal Nehru Technological University Hyderabad, Kukatpally, Hyderabad, Telangana, 500085, India
|
6
|
Li G, Zhao B, Su X, Yang Y, Hu P, Zhou X, Hu L. Discovering Consensus Regions for Interpretable Identification of RNA N6-Methyladenosine Modification Sites via Graph Contrastive Clustering. IEEE J Biomed Health Inform 2024; 28:2362-2372. [PMID: 38265898 DOI: 10.1109/jbhi.2024.3357979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2024]
Abstract
As a pivotal post-transcriptional modification of RNA, N6-methyladenosine (m6A) has a substantial influence on gene expression modulation and cellular fate determination. Although a variety of computational models have been developed to accurately identify potential m6A modification sites, few of them are capable of interpreting the identification process with insights gained from consensus knowledge. To overcome this problem, we propose a deep learning model, namely M6A-DCR, by discovering consensus regions for interpretable identification of m6A modification sites. In particular, M6A-DCR first constructs an instance graph for each RNA sequence by integrating specific positions and types of nucleotides. The discovery of consensus regions is then formulated as a graph clustering problem in light of aggregating all instance graphs. After that, M6A-DCR adopts a motif-aware graph reconstruction optimization process to learn high-quality embeddings of input RNA sequences, thus achieving the identification of m6A modification sites in an end-to-end manner. Experimental results demonstrate the superior performance of M6A-DCR by comparing it with several state-of-the-art identification models. The consideration of consensus regions empowers our model to make interpretable predictions at the motif level. The analysis of cross validation through different species and tissues further verifies the consistency between the identification results of M6A-DCR and the evolutionary relationships among species.
|
7
|
Tu G, Wang X, Xia R, Song B. m6A-TCPred: a web server to predict tissue-conserved human m6A sites using machine learning approach. BMC Bioinformatics 2024; 25:127. [PMID: 38528499 PMCID: PMC10962094 DOI: 10.1186/s12859-024-05738-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Accepted: 03/11/2024] [Indexed: 03/27/2024] Open
Abstract
BACKGROUND: N6-methyladenosine (m6A) is the most prevalent post-transcriptional modification in eukaryotic cells; it plays a crucial role in regulating various biological processes, and dysregulation of m6A status is involved in multiple human diseases, including cancer. A number of prediction frameworks have been proposed for high-accuracy identification of putative m6A sites; however, none has targeted the direct prediction of tissue-conserved m6A-modified residues, as opposed to non-conserved ones, at base resolution. RESULTS: We report here m6A-TCPred, a computational tool for predicting tissue-conserved m6A residues using m6A profiling data from 23 human tissues. By taking advantage of traditional sequence-based characteristics and additional genome-derived information, m6A-TCPred successfully captured distinct patterns between potentially tissue-conserved m6A modifications and non-conserved ones, with an average AUROC of 0.871 and 0.879 on cross-validation and independent datasets, respectively. CONCLUSION: Our results have been integrated into an online platform comprising a database holding 268,115 high-confidence m6A sites with their conservation information across 23 human tissues, and a web server to predict the conserved status of user-provided m6A collections. The web interface of m6A-TCPred is freely accessible at: www.rnamd.org/m6ATCPred.
Affiliation(s)
- Gang Tu
  - Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, 215123, China
- Xuan Wang
  - Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, 215123, China
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, L7 8TX, UK
- Rong Xia
  - Department of Financial and Actuarial Mathematics, Xi'an Jiaotong-Liverpool University, Suzhou, 215123, China
- Bowen Song
  - Department of Public Health, School of Medicine, Nanjing University of Chinese Medicine, Nanjing, 210023, China
|
8
|
Zhang Y, Wang Z, Zhang Y, Li S, Guo Y, Song J, Yu DJ. Interpretable prediction models for widespread m6A RNA modification across cell lines and tissues. Bioinformatics 2023; 39:btad709. [PMID: 37995291 PMCID: PMC10697738 DOI: 10.1093/bioinformatics/btad709] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2023] [Revised: 11/01/2023] [Accepted: 11/22/2023] [Indexed: 11/25/2023] Open
Abstract
MOTIVATION: RNA N6-methyladenosine (m6A) in Homo sapiens plays vital roles in a variety of biological functions. Precise identification of m6A modifications is thus essential to the elucidation of their biological functions and underlying molecular-level mechanisms. Currently available high-throughput single-nucleotide-resolution m6A modification data have considerably accelerated the identification of RNA modification sites through the development of data-driven computational methods. Nevertheless, existing methods cover only a limited range of cell lines at single-nucleotide resolution and offer poor model interpretability, which limits their applicability. RESULTS: In this study, we present CLSM6A, a set of deep learning-based models designed for predicting single-nucleotide-resolution m6A RNA modification sites across eight different cell lines and three tissues. Extensive benchmarking experiments on well-curated datasets show that CLSM6A achieves performance superior to that of current state-of-the-art methods. Furthermore, CLSM6A can interpret the prediction decision-making process by excavating critical motifs activated by filters and pinpointing highly concerned positions in both forward and backward propagations. CLSM6A exhibits better portability on similar cross-cell line/tissue datasets, reveals a strong association between highly activated motifs and high-impact motifs, and demonstrates complementary attributes of different interpretation strategies. AVAILABILITY AND IMPLEMENTATION: The webserver is available at http://csbio.njust.edu.cn/bioinf/clsm6a. The datasets and code are available at https://github.com/zhangying-njust/CLSM6A/.
Affiliation(s)
- Ying Zhang
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
- Zhikang Wang
  - Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Yiwen Zhang
  - School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC 3004, Australia
- Shanshan Li
  - School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC 3004, Australia
- Yuming Guo
  - School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC 3004, Australia
- Jiangning Song
  - Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
  - Monash Data Futures Institute, Monash University, Melbourne, VIC 3800, Australia
- Dong-Jun Yu
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
|
9
|
Yang Y, Liu Z, Lu J, Sun Y, Fu Y, Pan M, Xie X, Ge Q. Analysis approaches for the identification and prediction of N6-methyladenosine sites. Epigenetics 2023; 18:2158284. [PMID: 36562485 PMCID: PMC9980620 DOI: 10.1080/15592294.2022.2158284] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 12/24/2022] Open
Abstract
The global dynamics of a variety of biological processes can be revealed by mapping transcriptional m6A sites, in particular across the full transcriptome. Individual m6A sites also contribute to biological function, which can be evaluated using stoichiometric information obtained at single-nucleotide resolution. Currently, m6A sites are identified mainly by experimental and prediction methods, based on high-throughput sequencing and machine learning models, respectively. This review summarizes recent topics and progress in bioinformatics methods for deciphering m6A methylation, including the experimental detection of m6A methylation sites, data analysis techniques, approaches to predicting m6A methylation sites, m6A methylation databases, and the detection of m6A modification in circRNA. It closes with a brief discussion of development prospects in this area.
Affiliation(s)
- Yuwei Yang
  - State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, People's Republic of China
- Zhiyu Liu
  - State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, People's Republic of China
- Junru Lu
  - State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, People's Republic of China
- Yuqing Sun
  - State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, People's Republic of China
- Yue Fu
  - State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, People's Republic of China
- Min Pan
  - Department of Pathology and Pathophysiology, School of Medicine, Southeast University, Nanjing, China
- Xueying Xie
  - State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, People's Republic of China
- Qinyu Ge
  - State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, People's Republic of China
|
10
|
Jia J, Cao X, Wei Z. DLC-ac4C: A Prediction Model for N4-acetylcytidine Sites in Human mRNA Based on DenseNet and Bidirectional LSTM Methods. Curr Genomics 2023; 24:171-186. [PMID: 38178985 PMCID: PMC10761336 DOI: 10.2174/0113892029270191231013111911] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Revised: 09/13/2023] [Accepted: 09/21/2023] [Indexed: 01/06/2024] Open
Abstract
Introduction: N4-acetylcytidine (ac4C) is a highly conserved nucleoside modification that is essential for the regulation of immune functions in organisms. Currently, the identification of ac4C is primarily achieved using biological methods, which can be time-consuming and labor-intensive. In contrast, accurate identification of ac4C by computational methods has become a more effective method for classification and prediction.
Aim: To the best of our knowledge, although there are several computational methods for ac4C locus prediction, the performance of the models they constructed is poor, and the network structure they used is relatively simple and suffers from the disadvantage of network degradation. This study aims to address these limitations by proposing a predictive model based on integrated deep learning to better help identify ac4C sites.
Methods: In this study, we propose a new integrated deep learning prediction framework, DLC-ac4C. First, we encode RNA sequences based on three feature encoding schemes, namely C2 encoding, nucleotide chemical property (NCP) encoding, and nucleotide density (ND) encoding. Second, one-dimensional convolutional layers and densely connected convolutional networks (DenseNet) are used to learn local features, and bi-directional long short-term memory networks (Bi-LSTM) are used to learn global features. Third, a channel attention mechanism is introduced to determine the importance of sequence characteristics. Finally, a homomorphic integration strategy is used to limit the generalization error of the model, which further improves the performance of the model.
Results: The DLC-ac4C model performed well in terms of sensitivity (Sn), specificity (Sp), accuracy (Acc), Matthews correlation coefficient (MCC), and area under the curve (AUC) for the independent test data with 86.23%, 79.71%, 82.97%, 66.08%, and 90.42%, respectively, which was significantly better than the prediction accuracy of the existing methods.
Conclusion: Our model not only combines DenseNet and Bi-LSTM, but also uses the channel attention mechanism to better capture hidden information features from a sequence perspective, and can identify ac4C sites more effectively.
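The NCP and ND encodings named in the Methods can be illustrated with a minimal sketch. It assumes the commonly used NCP triples (ring structure, functional group, hydrogen bonding) and the usual prefix-frequency definition of nucleotide density; the abstract itself does not spell out these details, and the function name `encode_ncp_nd` is hypothetical rather than part of DLC-ac4C.

```python
# Commonly used NCP triples: (ring structure, functional group, hydrogen bonding).
# Assumption: these values are the conventional ones, not taken from the abstract.
NCP = {"A": (1, 1, 1), "C": (0, 1, 0), "G": (1, 0, 0), "U": (0, 0, 1)}


def encode_ncp_nd(seq):
    """Return one [ncp1, ncp2, ncp3, density] row per nucleotide.

    Density at position i (1-based) is the number of times the current
    nucleotide has occurred in seq[:i], divided by i.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input
    counts = {}
    rows = []
    for i, base in enumerate(seq, start=1):
        counts[base] = counts.get(base, 0) + 1
        density = counts[base] / i
        # Unknown symbols (e.g. N) get an all-zero NCP triple.
        rows.append(list(NCP.get(base, (0, 0, 0))) + [density])
    return rows


if __name__ == "__main__":
    for row in encode_ncp_nd("ACAU"):
        print(row)
```

Each row is a four-dimensional vector, so a sequence of length n becomes an n × 4 matrix that could be concatenated with other encodings before being passed to a convolutional layer.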
Affiliation(s)
- Jianhua Jia
  - School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen, 333403, China
- Xiaojing Cao
  - School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen, 333403, China
- Zhangying Wei
  - School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen, 333403, China
|
11
|
Xu Z, Wang X, Meng J, Zhang L, Song B. m5U-GEPred: prediction of RNA 5-methyluridine sites based on sequence-derived and graph embedding features. Front Microbiol 2023; 14:1277099. [PMID: 37937221 PMCID: PMC10627201 DOI: 10.3389/fmicb.2023.1277099] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/14/2023] [Accepted: 10/02/2023] [Indexed: 11/09/2023] Open
Abstract
5-Methyluridine (m5U) is one of the most common post-transcriptional RNA modifications, which is involved in a variety of important biological processes and disease development. The precise identification of the m5U sites allows for a better understanding of the biological processes of RNA and contributes to the discovery of new RNA functional and therapeutic targets. Here, we present m5U-GEPred, a prediction framework, to combine sequence characteristics and graph embedding-based information for m5U identification. The graph embedding approach was introduced to extract the global information of training data that complemented the local information represented by conventional sequence features, thereby enhancing the prediction performance of m5U identification. m5U-GEPred outperformed the state-of-the-art m5U predictors built on two independent species, with an average AUROC of 0.984 and 0.985 tested on human and yeast transcriptomes, respectively. To further validate the performance of our newly proposed framework, the experimentally validated m5U sites identified from Oxford Nanopore Technology (ONT) were collected as independent testing data, and in this project, m5U-GEPred achieved reasonable prediction performance with ACC of 91.84%. We hope that m5U-GEPred should make a useful computational alternative for m5U identification.
Affiliation(s)
- Zhongxing Xu
  - Department of Public Health, School of Medicine and Holistic Integrative Medicine, Nanjing University of Chinese Medicine, Nanjing, China
  - School of AI and Advanced Computing, Xi'an Jiaotong-Liverpool University, Suzhou, China
- Xuan Wang
  - Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, China
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, United Kingdom
- Jia Meng
  - Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, China
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, United Kingdom
  - AI University Research Centre, Xi'an Jiaotong-Liverpool University, Suzhou, China
- Lin Zhang
  - School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
- Bowen Song
  - Department of Public Health, School of Medicine and Holistic Integrative Medicine, Nanjing University of Chinese Medicine, Nanjing, China
|
12
|
Song B, Huang D, Zhang Y, Wei Z, Su J, Pedro de Magalhães J, Rigden DJ, Meng J, Chen K. m6A-TSHub: Unveiling the Context-specific m6A Methylation and m6A-affecting Mutations in 23 Human Tissues. Genomics Proteomics Bioinformatics 2023; 21:678-694. [PMID: 36096444 PMCID: PMC10787194 DOI: 10.1016/j.gpb.2022.09.001] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 10/12/2021] [Revised: 08/19/2022] [Accepted: 09/02/2022] [Indexed: 06/15/2023]
Abstract
As the most pervasive epigenetic marker present on mRNAs and long non-coding RNAs (lncRNAs), N6-methyladenosine (m6A) RNA methylation has been shown to participate in essential biological processes. Recent studies have revealed the distinct patterns of m6A methylome across human tissues, and a major challenge remains in elucidating the tissue-specific presence and circuitry of m6A methylation. We present here a comprehensive online platform, m6A-TSHub, for unveiling the context-specific m6A methylation and genetic mutations that potentially regulate m6A epigenetic mark. m6A-TSHub consists of four core components, including (1) m6A-TSDB, a comprehensive database of 184,554 functionally annotated m6A sites derived from 23 human tissues and 499,369 m6A sites from 25 tumor conditions, respectively; (2) m6A-TSFinder, a web server for high-accuracy prediction of m6A methylation sites within a specific tissue from RNA sequences, which was constructed using multi-instance deep neural networks with gated attention; (3) m6A-TSVar, a web server for assessing the impact of genetic variants on tissue-specific m6A RNA modifications; and (4) m6A-CAVar, a database of 587,983 The Cancer Genome Atlas (TCGA) cancer mutations (derived from 27 cancer types) that were predicted to affect m6A modifications in the primary tissue of cancers. The database should make a useful resource for studying the m6A methylome and the genetic factors of epitranscriptome disturbance in a specific tissue (or cancer type). m6A-TSHub is accessible at www.xjtlu.edu.cn/biologicalsciences/m6ats.
Affiliation(s)
- Bowen Song
  - Key Laboratory of Gastrointestinal Cancer (Fujian Medical University), Ministry of Education, School of Basic Medical Sciences, Fujian Medical University, Fuzhou 350004, China; Department of Mathematical Sciences, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China; Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Daiyun Huang
  - Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China; Department of Computer Science, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Yuxin Zhang
  - Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
- Zhen Wei
  - Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China; Institute of Ageing & Chronic Disease, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Jionglong Su
  - School of AI and Advanced Computing, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
- João Pedro de Magalhães
  - Institute of Ageing & Chronic Disease, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Daniel J Rigden
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Jia Meng
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom; Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China; AI University Research Centre, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
- Kunqi Chen
  - Key Laboratory of Gastrointestinal Cancer (Fujian Medical University), Ministry of Education, School of Basic Medical Sciences, Fujian Medical University, Fuzhou 350004, China
|
13
|
Hu W, Guan L, Li M. Prediction of DNA Methylation based on Multi-dimensional feature encoding and double convolutional fully connected convolutional neural network. PLoS Comput Biol 2023; 19:e1011370. [PMID: 37639434 PMCID: PMC10461834 DOI: 10.1371/journal.pcbi.1011370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Accepted: 07/18/2023] [Indexed: 08/31/2023] Open
Abstract
DNA methylation takes on critical significance to the regulation of gene expression by affecting the stability of DNA and changing the structure of chromosomes. DNA methylation modification sites should be identified, which lays a solid basis for gaining more insights into their biological functions. Existing machine learning-based methods of predicting DNA methylation have not fully exploited the hidden multidimensional information in DNA gene sequences, such that the prediction accuracy of models is significantly limited. Besides, most models have been built in terms of a single methylation type. To address the above-mentioned issues, a deep learning-based method was proposed in this study for DNA methylation site prediction, termed the MEDCNN model. The MEDCNN model is capable of extracting feature information from gene sequences in three dimensions (i.e., positional information, biological information, and chemical information). Moreover, the proposed method employs a convolutional neural network model with double convolutional layers and double fully connected layers while iteratively updating the gradient descent algorithm using the cross-entropy loss function to increase the prediction accuracy of the model. Besides, the MEDCNN model can predict different types of DNA methylation sites. As indicated by the experimental results, the deep learning method based on coding from multiple dimensions outperformed single coding methods, and the MEDCNN model was highly applicable and outperformed existing models in predicting DNA methylation between different species. As revealed by the above-described findings, the MEDCNN model can be effective in predicting DNA methylation sites.
Affiliation(s)
- Wenxing Hu
  - College of Physics and Electronic Information, Gannan Normal University, Ganzhou, Jiangxi, China
- Lixin Guan
  - College of Physics and Electronic Information, Gannan Normal University, Ganzhou, Jiangxi, China
- Mengshan Li
  - College of Physics and Electronic Information, Gannan Normal University, Ganzhou, Jiangxi, China
|
14
|
Meng Q, Schatten H, Zhou Q, Chen J. Crosstalk between m6A and coding/non-coding RNA in cancer and detection methods of m6A modification residues. Aging (Albany NY) 2023; 15:6577-6619. [PMID: 37437245 PMCID: PMC10373953 DOI: 10.18632/aging.204836] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/01/2023] [Accepted: 06/15/2023] [Indexed: 07/14/2023]
Abstract
N6-methyladenosine (m6A) is one of the most common and well-known internal RNA modifications that occur on mRNAs or ncRNAs. It affects various aspects of RNA metabolism, including splicing, stability, translocation, and translation. An abundance of evidence demonstrates that m6A plays a crucial role in various pathological and biological processes, especially in tumorigenesis and tumor progression. In this article, we introduce the potential functions of m6A regulators, including "writers" that install m6A marks, "erasers" that demethylate m6A, and "readers" that determine the fate of m6A-modified targets. We have conducted a review on the molecular functions of m6A, focusing on both coding and noncoding RNAs. Additionally, we have compiled an overview of the effects noncoding RNAs have on m6A regulators and explored the dual roles of m6A in the development and advancement of cancer. Our review also includes a detailed summary of the most advanced databases for m6A, state-of-the-art experimental and sequencing detection methods, and machine learning-based computational predictors for identifying m6A sites.
Affiliation(s)
- Qingren Meng
  - National Clinical Research Center for Infectious Diseases, Shenzhen Third People’s Hospital, The Second Hospital Affiliated with the Southern University of Science and Technology, Shenzhen, Guangdong Province, China
- Heide Schatten
  - Department of Veterinary Pathobiology, University of Missouri, Columbia, MO 65211, USA
- Qian Zhou
  - International Cancer Center, Shenzhen University Medical School, Shenzhen, Guangdong Province, China
- Jun Chen
  - National Clinical Research Center for Infectious Diseases, Shenzhen Third People’s Hospital, The Second Hospital Affiliated with the Southern University of Science and Technology, Shenzhen, Guangdong Province, China
|
15
|
Luo Z, Lou L, Qiu W, Xu Z, Xiao X. Predicting N6-Methyladenosine Sites in Multiple Tissues of Mammals through Ensemble Deep Learning. Int J Mol Sci 2022; 23:15490. [PMID: 36555143 PMCID: PMC9778682 DOI: 10.3390/ijms232415490] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/03/2022] [Revised: 12/03/2022] [Accepted: 12/05/2022] [Indexed: 12/13/2022] Open
Abstract
N6-methyladenosine (m6A) is the most abundant modification within eukaryotic messenger RNA and plays an essential regulatory role in the control of cellular functions and gene expression. However, it remains an outstanding challenge to detect mRNA m6A transcriptome-wide at base resolution via experimental approaches, which are generally time-consuming and expensive. Developing computational methods is a good strategy for accurate in silico detection of m6A modification sites from the large amount of RNA sequence data. Unfortunately, the existing computational models are usually only for m6A site prediction in a single species, without considering the tissue level of species, while most of them are constructed based on low-confidence level data generated by an m6A antibody immunoprecipitation (IP)-based sequencing method, thereby restricting reliability and generalizability of proposed models. Here, we review recent advances in computational prediction of m6A sites and construct a new computational approach named im6APred using ensemble deep learning to accurately identify m6A sites based on high-confidence level data in multiple tissues of mammals. Our model im6APred builds upon a comprehensive evaluation of multiple classification methods, including four traditional classification algorithms and three deep learning methods and their ensembles. The optimal base-classifier combinations are then chosen by five-fold cross-validation test to achieve an effective stacked model. Our model im6APred can produce the area under the receiver operating characteristic curve (AUROC) in the range of 0.82-0.91 on independent tests, indicating that our model has the ability to learn general methylation rules on RNA bases and generalize to m6A transcriptome-wide identification. Moreover, AUROCs in the range of 0.77-0.96 were achieved using cross-species/tissues validation on the benchmark dataset, demonstrating differences in predictive performance at the tissue level and the need for constructing tissue-specific models for m6A site prediction.
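The stacking step mentioned in this abstract relies on out-of-fold predictions: each base classifier is trained on k-1 folds and scores the held-out fold, and the collected scores become input features for a meta-classifier. The following is only a sketch of that general idea, not im6APred's implementation; the `fit(X_train, y_train) -> predict` interface and all function names are hypothetical.

```python
def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds over range(n)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, val
        start += size


def out_of_fold_predictions(fit, X, y, k=5):
    """Score every sample with a model that never saw it during training.

    `fit(X_train, y_train)` must return a callable `predict(x) -> float`.
    """
    oof = [0.0] * len(X)
    for train, val in k_fold_indices(len(X), k):
        predict = fit([X[i] for i in train], [y[i] for i in train])
        for i in val:
            oof[i] = predict(X[i])
    return oof


def stack_features(fits, X, y, k=5):
    """Build the meta-classifier input: one column per base learner."""
    columns = [out_of_fold_predictions(fit, X, y, k) for fit in fits]
    return [list(row) for row in zip(*columns)]


if __name__ == "__main__":
    # Toy base learner: always predicts the mean training label.
    def fit_mean(X_train, y_train):
        mean = sum(y_train) / len(y_train)
        return lambda x: mean

    X = list(range(10))
    y = [0, 1] * 5
    print(stack_features([fit_mean], X, y, k=5))
```

In practice the folds would usually be shuffled and stratified by label; contiguous folds are used here only to keep the sketch short.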
Affiliation(s)
- Zhaochun Xu
  - Computer Department, Jingdezhen Ceramic University, Jingdezhen 333403, China
- Xuan Xiao
  - Computer Department, Jingdezhen Ceramic University, Jingdezhen 333403, China
|
16
|
Zhang T, Tang Q, Nie F, Zhao Q, Chen W. DeepLncPro: an interpretable convolutional neural network model for identifying long non-coding RNA promoters. Brief Bioinform 2022; 23:6754194. [PMID: 36209437 DOI: 10.1093/bib/bbac447] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 07/19/2022] [Revised: 09/14/2022] [Accepted: 09/17/2022] [Indexed: 12/14/2022] Open
Abstract
Long non-coding RNA (lncRNA) plays important roles in a series of biological processes. The transcription of lncRNA is regulated by its promoter. Hence, accurate identification of lncRNA promoters will be helpful to understand their regulatory mechanisms. Since experimental techniques remain time-consuming for genome-wide promoter identification, developing computational tools to identify promoters is necessary. However, only a few computational methods have been proposed for lncRNA promoter prediction, and their performance still has room for improvement. In the present work, a convolutional neural network based model, called DeepLncPro, was proposed to identify lncRNA promoters in human and mouse. Comparative results demonstrated that DeepLncPro was superior to both state-of-the-art machine learning methods and existing models for identifying lncRNA promoters. Furthermore, DeepLncPro has the ability to extract and analyze transcription factor binding motifs from lncRNAs, which makes it an interpretable model. These results indicate that DeepLncPro can serve as a powerful tool for identifying lncRNA promoters. An open-source tool for DeepLncPro was provided at https://github.com/zhangtian-yang/DeepLncPro.
Affiliation(s)
- Tianyang Zhang
  - School of Life Sciences, North China University of Science and Technology
- Qiang Tang
  - School of Basic Medical Sciences, Chengdu University of Traditional Chinese Medicine
- Fulei Nie
  - School of Life Sciences, North China University of Science and Technology
- Qi Zhao
  - School of Computer Science and Software Engineering, University of Science and Technology Liaoning
- Wei Chen
  - Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine
|
17
|
RNADSN: Transfer-Learning 5-Methyluridine (m5U) Modification on mRNAs from Common Features of tRNA. Int J Mol Sci 2022; 23:ijms232113493. [PMID: 36362279 PMCID: PMC9655583 DOI: 10.3390/ijms232113493] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2022] [Revised: 09/24/2022] [Accepted: 09/29/2022] [Indexed: 11/06/2022] Open
Abstract
One of the most abundant non-canonical bases widely occurring on various RNA molecules is 5-methyluridine (m5U). Recent studies have revealed its influences on the development of breast cancer, systemic lupus erythematosus, and the regulation of stress responses. The accurate identification of m5U sites is crucial for understanding their biological functions. We propose RNADSN, the first transfer learning deep neural network that learns common features between tRNA m5U and mRNA m5U to enhance the prediction of mRNA m5U. Without seeing the experimentally detected mRNA m5U sites, RNADSN has already outperformed the state-of-the-art method, m5UPred. Using mRNA m5U classification as an additional layer of supervision, our model achieved another distinct improvement and presented an average area under the receiver operating characteristic curve (AUC) of 0.9422 and an average precision (AP) of 0.7855. The robust performance of RNADSN was also verified by cross-technical and cross-cellular validation. The interpretation of RNADSN also revealed the sequence motif of common features. Therefore, RNADSN should be a useful tool for studying m5U modification.
|
18
|
RNA modifications in aging-associated cardiovascular diseases. Aging (Albany NY) 2022; 14:8110-8136. [PMID: 36178367 PMCID: PMC9596201 DOI: 10.18632/aging.204311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2022] [Accepted: 09/17/2022] [Indexed: 11/25/2022]
Abstract
Cardiovascular disease (CVD) is a leading cause of morbidity and mortality worldwide that bears an enormous healthcare burden, and aging is a major contributing factor to CVDs. The functional gene expression network during aging is regulated by mRNAs transcriptionally and by non-coding RNAs epi-transcriptionally. RNA modifications alter the stability and function of both mRNAs and non-coding RNAs and are involved in differentiation, development, and diseases. Here we review major chemical RNA modifications on mRNAs and non-coding RNAs, including N6-adenosine methylation, N1-adenosine methylation, 5-methylcytidine, pseudouridylation, 2′-O-ribose-methylation, and N7-methylguanosine, in the aging process with an emphasis on cardiovascular aging. We also summarize the currently available methods to detect RNA modifications and the bioinformatic tools to study RNA modifications. More importantly, we discuss the specific implication of the RNA modifications on mRNAs and non-coding RNAs in the pathogenesis of aging-associated CVDs, including atherosclerosis, hypertension, coronary heart diseases, congestive heart failure, atrial fibrillation, peripheral artery disease, venous insufficiency, and stroke.
|
19
|
Huang D, Chen K, Song B, Wei Z, Su J, Coenen F, de Magalhães JP, Rigden DJ, Meng J. Geographic encoding of transcripts enabled high-accuracy and isoform-aware deep learning of RNA methylation. Nucleic Acids Res 2022; 50:10290-10310. [PMID: 36155798 PMCID: PMC9561283 DOI: 10.1093/nar/gkac830] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 03/21/2022] [Revised: 08/26/2022] [Accepted: 09/15/2022] [Indexed: 12/25/2022] Open
Abstract
As the most pervasive epigenetic mark present on mRNA and lncRNA, N6-methyladenosine (m6A) RNA methylation regulates all stages of RNA life in various biological processes and disease mechanisms. Computational methods for deciphering RNA modification have achieved great success in recent years; nevertheless, their potential remains underexploited. One reason for this is that existing models usually consider only the sequence of transcripts, ignoring the various regions (or geography) of transcripts such as 3′UTR and intron, where the epigenetic mark forms and functions. Here, we developed three simple yet powerful encoding schemes for transcripts to capture the submolecular geographic information of RNA, which is largely independent from sequences. We show that m6A prediction models based on geographic information alone can achieve comparable performances to classic sequence-based methods. Importantly, geographic information substantially enhances the accuracy of sequence-based models, enables isoform- and tissue-specific prediction of m6A sites, and improves m6A signal detection from direct RNA sequencing data. The geographic encoding schemes we developed have exhibited strong interpretability, and are applicable to not only m6A but also N1-methyladenosine (m1A), and can serve as a general and effective complement to the widely used sequence encoding schemes in deep learning applications concerning RNA transcripts.
Affiliation(s)
- Daiyun Huang
  - Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou 215123, PR China; Department of Computer Sciences, University of Liverpool, Liverpool L69 7ZB, UK
- Kunqi Chen
  - Key Laboratory of Gastrointestinal Cancer (Fujian Medical University), Ministry of Education, School of Basic Medical Sciences, Fujian Medical University, Fuzhou 350004, PR China
- Bowen Song
  - Department of Mathematical Sciences, Xi'an Jiaotong-Liverpool University, Suzhou 215123, PR China; Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, UK
- Zhen Wei
  - Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou 215123, PR China; Institute of Life Course and Medical Sciences, University of Liverpool, Liverpool L69 7ZB, UK
- Jionglong Su
  - Department of Mathematical Sciences, Xi'an Jiaotong-Liverpool University, Suzhou 215123, PR China; School of AI and Advanced Computing, Xi'an Jiaotong-Liverpool University, Suzhou 215123, PR China
- Frans Coenen
  - Department of Computer Sciences, University of Liverpool, Liverpool L69 7ZB, UK
- João Pedro de Magalhães
  - Institute of Life Course and Medical Sciences, University of Liverpool, Liverpool L69 7ZB, UK
- Daniel J Rigden
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, UK
- Jia Meng
  - Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou 215123, PR China; Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, UK; AI University Research Centre, Xi'an Jiaotong-Liverpool University, Suzhou 215123, PR China
|
20
|
DLm6Am: A Deep-Learning-Based Tool for Identifying N6,2′-O-Dimethyladenosine Sites in RNA Sequences. Int J Mol Sci 2022; 23:ijms231911026. [PMID: 36232325 PMCID: PMC9570463 DOI: 10.3390/ijms231911026] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/08/2022] [Revised: 09/10/2022] [Accepted: 09/15/2022] [Indexed: 11/25/2022] Open
Abstract
N6,2′-O-dimethyladenosine (m6Am) is a post-transcriptional modification that may be associated with regulatory roles in the control of cellular functions. Therefore, it is crucial to accurately identify transcriptome-wide m6Am sites to understand underlying m6Am-dependent mRNA regulation mechanisms and biological functions. Here, we used three sequence-based feature-encoding schemes, including one-hot, nucleotide chemical property (NCP), and nucleotide density (ND), to represent RNA sequence samples. Additionally, we proposed an ensemble deep learning framework, named DLm6Am, to identify m6Am sites. DLm6Am consists of three similar base classifiers, each of which contains a multi-head attention module, an embedding module with two parallel deep learning sub-modules, a convolutional neural network (CNN) and a Bi-directional long short-term memory (BiLSTM), and a prediction module. To demonstrate the superior performance of our model’s architecture, we compared multiple model frameworks with our method by analyzing the training data and independent testing data. Additionally, we compared our model with the existing state-of-the-art computational methods, m6AmPred and MultiRM. The accuracy (ACC) for the DLm6Am model was improved by 6.45% and 8.42% compared to that of m6AmPred and MultiRM on independent testing data, respectively, while the area under receiver operating characteristic curve (AUROC) for the DLm6Am model was increased by 4.28% and 5.75%, respectively. All the results indicate that DLm6Am achieved the best prediction performance in terms of ACC, Matthews correlation coefficient (MCC), AUROC, and the area under precision and recall curves (AUPR). To further assess the generalization performance of our proposed model, we implemented chromosome-level leave-out cross-validation, and found that the obtained AUROC values were greater than 0.83, indicating that our proposed method is robust and can accurately predict m6Am sites.
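The chromosome-level leave-out cross-validation mentioned at the end of this abstract can be sketched as follows: every site on one chromosome is held out for testing while the model is trained on sites from all other chromosomes, and the procedure is repeated per chromosome. This is a generic illustration rather than the DLm6Am code; the `(chromosome, position, label)` record layout and the function name are assumptions.

```python
from collections import defaultdict


def leave_one_chromosome_out(sites):
    """Yield (held_out_chrom, train_idx, test_idx) for each chromosome.

    `sites` is a list of (chromosome, position, label) tuples; this layout
    is hypothetical and only serves the illustration.
    """
    by_chrom = defaultdict(list)
    for idx, (chrom, _pos, _label) in enumerate(sites):
        by_chrom[chrom].append(idx)
    for chrom in sorted(by_chrom):
        test_idx = by_chrom[chrom]
        held_out = set(test_idx)
        train_idx = [i for i in range(len(sites)) if i not in held_out]
        yield chrom, train_idx, test_idx


if __name__ == "__main__":
    toy_sites = [("chr1", 10, 1), ("chr2", 20, 0), ("chr1", 30, 0), ("chrX", 5, 1)]
    for chrom, train_idx, test_idx in leave_one_chromosome_out(toy_sites):
        print(chrom, train_idx, test_idx)
```

Splitting by chromosome rather than at random keeps neighbouring or overlapping sequence windows out of both training and test sets at the same time, which is one reason this scheme is used to assess generalization.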
|
21
|
Liu C, Guo Z, Yang Y, Hu B, Zhu L, Li M, Gu Z, Xin Y, Sun H, Guan Y, Zhang L. Identification of dipeptidyl peptidase-IV inhibitory peptides from yak bone collagen by in silico and in vitro analysis. Eur Food Res Technol 2022. [DOI: 10.1007/s00217-022-04111-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
|
22
|
Charoute H, Elkarhat Z, Elkhattabi L, El Fahime E, Oukkache N, Rouba H, Barakat A. Computational screening of potential drugs against COVID-19 disease: the Neuropilin-1 receptor as molecular target. Virusdisease 2022; 33:23-31. [PMID: 35079600 PMCID: PMC8776366 DOI: 10.1007/s13337-021-00751-x] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 04/24/2021] [Accepted: 11/01/2021] [Indexed: 12/28/2022] Open
Abstract
The transmembrane receptor Neuropilin-1 (NRP-1) was reported to serve as a host cell entry factor for the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causal agent of COVID-19 disease. Therefore, molecular compounds interfering with SARS-CoV-2 binding to NRP-1 seem to be potential candidates as new antiviral drugs. In this study, the NRP-1 receptor was targeted using a library of 1167 compounds previously analyzed in COVID-19 related studies. The results show the effectiveness of Nafamostat, Y96, Selinexor, Ebastine and UGS in binding to the NRP-1 receptor, with docking scores lower than −8.2 kcal/mol. These molecules interact with key residues of the NRP-1 receptor, which makes them promising candidates for further biological assays to explore their potential use in the treatment of COVID-19. Supplementary Information: The online version contains supplementary material available at 10.1007/s13337-021-00751-x.
Affiliation(s)
- Hicham Charoute
  - Research Unit of Epidemiology, Biostatistics and Bioinformatics, 1, Place Louis Pasteur, Institut Pasteur du Maroc, 20360 Casablanca, Morocco
  - Laboratory of Genomics and Human Genetics, Institut Pasteur du Maroc, Casablanca, Morocco
- Zouhair Elkarhat
  - Laboratory of Genomics and Human Genetics, Institut Pasteur du Maroc, Casablanca, Morocco
- Lamiae Elkhattabi
  - Laboratory of Genomics and Human Genetics, Institut Pasteur du Maroc, Casablanca, Morocco
- Elmostafa El Fahime
  - Molecular Biology and Functional Genomics Platform, National Center for Scientific and Technical Research, Rabat, Morocco
- Naoual Oukkache
  - Laboratory of Venoms and Toxins, Institut Pasteur du Maroc, Casablanca, Morocco
- Hassan Rouba
  - Laboratory of Genomics and Human Genetics, Institut Pasteur du Maroc, Casablanca, Morocco
- Abdelhamid Barakat
  - Laboratory of Genomics and Human Genetics, Institut Pasteur du Maroc, Casablanca, Morocco
|
23
|
Ao C, Zou Q, Yu L. NmRF: identification of multispecies RNA 2'-O-methylation modification sites from RNA sequences. Brief Bioinform 2021; 23:6446272. [PMID: 34850821 DOI: 10.1093/bib/bbab480] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 08/09/2021] [Revised: 10/05/2021] [Accepted: 10/18/2021] [Indexed: 12/12/2022] Open
Abstract
2'-O-methylation (Nm) is a post-transcriptional modification of RNA that is catalyzed by 2'-O-methyltransferase and involves replacing the H on the 2'-hydroxyl group with a methyl group. The 2'-O-methylation modification site is detected in a variety of RNA types (miRNA, tRNA, mRNA, etc.), plays an important role in biological processes and is associated with different diseases. There are few functional mechanisms developed at present, and traditional high-throughput experiments are time-consuming and expensive to explore functional mechanisms. For a deeper understanding of relevant biological mechanisms, it is necessary to develop efficient and accurate recognition tools based on machine learning. Based on this, we constructed a predictor called NmRF based on optimal mixed features and random forest classifier to identify 2'-O-methylation modification sites. The predictor can identify modification sites of multiple species at the same time. To obtain a better prediction model, a two-step strategy is adopted; that is, the optimal hybrid feature set is obtained by combining the light gradient boosting algorithm and incremental feature selection strategy. In 10-fold cross-validation, the accuracies of Homo sapiens and Saccharomyces cerevisiae were 89.069 and 93.885%, and the AUC were 0.9498 and 0.9832, respectively. The rigorous 10-fold cross-validation and independent tests confirm that the proposed method is significantly better than existing tools. A user-friendly web server is accessible at http://lab.malab.cn/~acy/NmRF.
Affiliation(s)
- Chunyan Ao
  - School of Computer Science and Technology, Xidian University, Xi'an, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Liang Yu
  - School of Computer Science and Technology, Xidian University, Xi'an, China
|
24
|
Dung DTM, Park EJ, Anh DT, Hai PT, Huy LD, Jun HW, Kwon JH, Young Ji A, Kang JS, Tung TT, Dung PTP, Han SB, Nam NH. Design, synthesis, and evaluation of novel (E)-N'-(3-allyl-2-hydroxy)benzylidene-2-(4-oxoquinazolin-3(4H)-yl)acetohydrazides as antitumor agents. Arch Pharm (Weinheim) 2021; 355:e2100216. [PMID: 34674294 DOI: 10.1002/ardp.202100216] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/15/2021] [Revised: 09/29/2021] [Accepted: 10/01/2021] [Indexed: 12/12/2022]
Abstract
In our continuing search for novel small-molecule anticancer agents, we designed and synthesized a series of novel (E)-N'-(3-allyl-2-hydroxy)benzylidene-2-(4-oxoquinazolin-3(4H)-yl)acetohydrazides (5), focusing on modification of the substituents on the quinazolin-4(3H)-one moiety. Biological evaluation showed that all 13 designed and synthesized compounds displayed significant cytotoxicity against three human cancer cell lines (SW620, colon cancer; PC-3, prostate cancer; NCI-H23, lung cancer). The most potent compound, 5l, was up to 213-fold more cytotoxic than 5-fluorouracil and 87-fold more cytotoxic than PAC-1, the first procaspase-activating compound. Structure-activity relationship analysis revealed that substitution with either electron-withdrawing or electron-releasing groups at position 6 or 7 of the quinazolin-4(3H)-one moiety increased the cytotoxicity of the compounds, with substitution at position 6 appearing more favorable. In the caspase activation assay, compound 5l was found to activate caspase activity by 291% in comparison to PAC-1, which was used as a control. Docking simulation further suggested that this compound may be a potent allosteric inhibitor of procaspase-3 through chelation of the inhibitory zinc ion. Physicochemical and ADMET calculations for 5l indicated a suitable absorption profile and some toxicological effects that require further optimization before the compound can be developed as a promising anticancer agent.
Affiliation(s)
- Do T M Dung
  - Department of Pharmaceutical Chemistry, Hanoi University of Pharmacy, Hanoi, Vietnam
- Eun J Park
  - College of Pharmacy, Chungbuk National University, Cheongju, Chungbuk, Republic of Korea
- Duong T Anh
  - Department of Pharmaceutical Chemistry, Hanoi University of Pharmacy, Hanoi, Vietnam
- Pham-The Hai
  - Department of Pharmaceutical Chemistry, Hanoi University of Pharmacy, Hanoi, Vietnam
- Le D Huy
  - Department of Pharmaceutical Chemistry, Hanoi University of Pharmacy, Hanoi, Vietnam
- Hye W Jun
  - College of Pharmacy, Chungbuk National University, Cheongju, Chungbuk, Republic of Korea
- Joo-Hee Kwon
  - Korea Research Institute of Bioscience and Biotechnology, Cheongju, Chungbuk, Republic of Korea
- A Young Ji
  - College of Pharmacy, Chungbuk National University, Cheongju, Chungbuk, Republic of Korea
- Jong S Kang
  - Korea Research Institute of Bioscience and Biotechnology, Cheongju, Chungbuk, Republic of Korea
- Truong T Tung
  - Faculty of Pharmacy, PHENIKAA University, Hanoi, Vietnam
  - PHENIKAA Institute for Advanced Study (PIAS), PHENIKAA University, Hanoi, Vietnam
- Phan T P Dung
  - Department of Pharmaceutical Chemistry, Hanoi University of Pharmacy, Hanoi, Vietnam
- Sang-Bae Han
  - College of Pharmacy, Chungbuk National University, Cheongju, Chungbuk, Republic of Korea
- Nguyen-Hai Nam
  - Department of Pharmaceutical Chemistry, Hanoi University of Pharmacy, Hanoi, Vietnam
|