1
Huang J, Wang X, Xia R, Yang D, Liu J, Lv Q, Yu X, Meng J, Chen K, Song B, Wang Y. Domain-knowledge enabled ensemble learning of 5-formylcytosine (f5C) modification sites. Comput Struct Biotechnol J 2024; 23:3175-3185. [PMID: 39253057 PMCID: PMC11381828 DOI: 10.1016/j.csbj.2024.08.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2024] [Revised: 08/07/2024] [Accepted: 08/07/2024] [Indexed: 09/11/2024] [Imported: 10/05/2024] Open
Abstract
5-formylcytidine (f5C) is a unique post-transcriptional RNA modification found in mRNA and tRNA at the wobble site, playing a crucial role in mitochondrial protein synthesis and potentially contributing to the regulation of translation. Recent studies have unveiled that the f5C modifications may drive mitochondrial mRNA translation to power cancer metastasis. Accurate identification of f5C sites is essential for further unraveling their molecular functions and regulatory mechanisms, but there are currently no computational methods available for predicting their locations. In this study, we introduce an innovative ensemble approach, successfully enabling the computational recognition of Saccharomyces cerevisiae f5C. We conducted a comprehensive model selection process that involved multiple basic machine learning and deep learning algorithms such as recurrent neural networks, convolutional neural networks and Transformer-based models. Initially trained only on sequence information, these individual models achieved an AUROC ranging from 0.7104 to 0.7492. Through the integration of 32 novel domain-derived genomic features, the performance of individual models has significantly improved to an AUROC between 0.7309 and 0.8076. To further enhance accuracy and robustness, we then constructed the ensembles of these individual models with different combinations. The best performance attained by our ensemble models reached an AUROC of 0.8391. Shapley additive explanations were conducted to explain the significant contributions of genomic features, providing insights into the putative distribution of f5C across various topological regions and potentially paving the way for revealing their functional relevance within distinct genomic contexts. A freely accessible web server that allows real-time analysis of user-uploaded sites can be accessed at: www.rnamd.org/Resf5C-Pred.
2
Xu X, Wu S, Zhang Y, Fan W, Lin X, Chen K, Lin X. m6A modification of VEGFA mRNA by RBM15/YTHDF2/IGF2BP3 contributes to angiogenesis of hepatocellular carcinoma. Mol Carcinog 2024; 63:2174-2189. [PMID: 39092767 DOI: 10.1002/mc.23802] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Revised: 07/19/2024] [Accepted: 07/24/2024] [Indexed: 08/04/2024] [Imported: 10/05/2024]
Abstract
Vascular endothelial growth factor A (VEGFA) plays a critical role as a potent angiogenesis factor and is highly expressed in hepatocellular carcinoma (HCC). Although the expression of VEGFA has been strongly linked to the aggressive nature of HCC, the specific posttranscriptional modifications that might contribute to VEGFA expression and HCC angiogenesis are not yet well understood. In this study, we aimed to investigate the epitranscriptome regulation of VEGFA in HCC. A comprehensive analysis integrating MeRIP-seq, RNA-seq, and crosslinking-immunoprecipitation-seq data revealed that VEGFA was hypermethylated in HCC and identified the potential m6A regulators of VEGFA, including an m6A methyltransferase complex component, RBM15, and the two readers, YTHDF2 and IGF2BP3. Through rigorous cell and molecular biology experiments, RBM15 was validated as a key component of the methyltransferase complex responsible for m6A methylation of VEGFA, which was subsequently recognized and stabilized by IGF2BP3 and YTHDF2, leading to enhanced VEGFA expression and VEGFA-related functions such as human umbilical vein endothelial cell (HUVEC) migration and tube formation. In the HCC xenograft model, knockdown of RBM15, IGF2BP3, or YTHDF2 resulted in reduced expression of VEGFA, accompanied by significant inhibition of tumor growth closely associated with VEGFA expression and angiogenesis. Furthermore, our analysis of HCC clinical samples identified positive correlations between the expression levels of VEGFA and the regulators RBM15, IGF2BP3, and YTHDF2. Collectively, these findings offer novel insights into the posttranscriptional modulation of VEGFA and provide potential avenues for alternative approaches to antiangiogenesis therapy targeting VEGFA.
MESH Headings
- Humans
- Carcinoma, Hepatocellular/genetics
- Carcinoma, Hepatocellular/pathology
- Carcinoma, Hepatocellular/metabolism
- RNA-Binding Proteins/genetics
- RNA-Binding Proteins/metabolism
- Liver Neoplasms/genetics
- Liver Neoplasms/pathology
- Liver Neoplasms/metabolism
- Vascular Endothelial Growth Factor A/genetics
- Vascular Endothelial Growth Factor A/metabolism
- Animals
- Neovascularization, Pathologic/genetics
- Neovascularization, Pathologic/metabolism
- Neovascularization, Pathologic/pathology
- Mice
- RNA, Messenger/genetics
- Gene Expression Regulation, Neoplastic
- Human Umbilical Vein Endothelial Cells
- Mice, Nude
- Cell Line, Tumor
- Adenosine/metabolism
- Adenosine/genetics
- Adenosine/analogs & derivatives
- Cell Proliferation/genetics
- Mice, Inbred BALB C
- Xenograft Model Antitumor Assays
- Male
- Cell Movement/genetics
- Angiogenesis
3
Lin X, Zhang J, Wu Z, Shi Y, Chen M, Li M, Hu H, Tian K, Lv X, Li C, Liu Y, Gao X, Yang Q, Chen K, Zhu A. Involvement of autophagy in mesaconitine-induced neurotoxicity in HT22 cells revealed through integrated transcriptomic, proteomic, and m6A epitranscriptomic profiling. Front Pharmacol 2024; 15:1393717. [PMID: 38939838 PMCID: PMC11208636 DOI: 10.3389/fphar.2024.1393717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Accepted: 05/20/2024] [Indexed: 06/29/2024] [Imported: 10/05/2024] Open
Abstract
Background: Mesaconitine (MA), a diester-diterpenoid alkaloid extracted from the medicinal herb Aconitum carmichaelii, is commonly used to treat various diseases. Previous studies have indicated the potent toxicity of aconitum despite its pharmacological activities, with limited understanding of its effects on the nervous system and the underlying mechanisms. Methods: HT22 cells and zebrafish were used to investigate the neurotoxic effects of MA both in vitro and in vivo, employing multi-omics techniques to explore the potential mechanisms of toxicity. Results: Our results demonstrated that treatment with MA induces neurotoxicity in zebrafish and HT22 cells. Subsequent analysis revealed that MA induced oxidative stress, as well as structural and functional damage to mitochondria in HT22 cells, accompanied by an upregulation of mRNA and protein expression related to autophagic and lysosomal pathways. Furthermore, methylated RNA immunoprecipitation sequencing (MeRIP-seq) showed a correlation between the expression of autophagy-related genes and N6-methyladenosine (m6A) modification following MA treatment. In addition, we identified METTL14 as a potential regulator of m6A methylation in HT22 cells after exposure to MA. Conclusion: Our study has contributed to a thorough mechanistic elucidation of the neurotoxic effects caused by MA, and has provided valuable insights for optimizing the rational utilization of traditional Chinese medicine formulations containing aconitum in clinical practice.
4
Wu Z, Zhang J, Wu Y, Chen M, Hu H, Gao X, Li C, Li M, Zhang Y, Lin X, Yang Q, Chen L, Chen K, Zheng L, Zhu A. Gelsenicine disrupted the intestinal barrier of Caenorhabditis elegans. Chem Biol Interact 2024; 395:111036. [PMID: 38705443 DOI: 10.1016/j.cbi.2024.111036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Revised: 04/23/2024] [Accepted: 05/03/2024] [Indexed: 05/07/2024] [Imported: 10/05/2024]
Abstract
Gelsemium elegans Benth. (G. elegans) is a traditional medicinal herb that has anti-inflammatory, analgesic, sedative, and detumescence effects. However, it can also cause intestinal side effects such as abdominal pain and diarrhea. The toxicological mechanisms of gelsenicine are still unclear. The objective of this study was to assess enterotoxicity induced by gelsenicine in the nematode Caenorhabditis elegans (C. elegans). The nematodes were treated with gelsenicine, and subsequently their growth, development, and locomotion behavior were evaluated. The targets of gelsenicine were predicted using PharmMapper. mRNA-seq was performed to verify the predicted targets. Intestinal permeability, ROS generation, and lipofuscin accumulation were measured. Additionally, the fluorescence intensities of GFP-labeled proteins involved in oxidative stress and unfolded protein response in endoplasmic reticulum (UPRER) were quantified. As a result, gelsenicine treatment reduced nematode lifespan, as well as body length, width, and locomotion behavior. A total of 221 targets were predicted by PharmMapper, and 731 differentially expressed genes were screened out by mRNA-seq. GO and KEGG enrichment analysis revealed involvement in redox process and transmembrane transport. The permeability assay showed leakage of blue dye from the intestinal lumen into the body cavity. Abnormal mRNA expression of gem-4, hmp-1, fil-2, and pho-1, which regulate intestinal development, absorption and catabolism, transmembrane transport, and apical junctions, was observed. Intestinal lipofuscin and ROS were increased, while sod-2 and isp-1 expressions were decreased. Multiple proteins in the SKN-1/DAF-16 pathway were found to bind stably with gelsenicine in a predictive model. There was an up-regulation in the expression of SKN-1:GFP, while the nuclear translocation of DAF-16:GFP exhibited abnormality. The UPRER biomarker HSP-4:GFP was down-regulated.
In conclusion, gelsenicine treatment increased nematode intestinal permeability. The toxicological mechanisms underlying this effect involved the disruption of intestinal barrier integrity, an imbalance between oxidative and antioxidant processes mediated by the SKN-1/DAF-16 pathway, and an abnormal unfolded protein response.
5
Shen X, Chen M, Zhang J, Lin Y, Gao X, Tu J, Chen K, Zhu A, Xu S. Unveiling the Impact of ApoF Deficiency on Liver and Lipid Metabolism: Insights from Transcriptome-Wide m6A Methylome Analysis in Mice. Genes (Basel) 2024; 15:347. [PMID: 38540406 PMCID: PMC10970566 DOI: 10.3390/genes15030347] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2024] [Revised: 02/23/2024] [Accepted: 03/07/2024] [Indexed: 04/02/2024] [Imported: 10/05/2024] Open
Abstract
Lipid metabolism participates in various physiological processes and has been shown to be connected to the development and progression of multiple diseases, especially metabolic hepatopathy. Apolipoproteins (Apos) act as vectors that combine with lipids, such as cholesterol and triglycerides (TGs). Despite being involved in lipid transportation and metabolism, the critical role of Apos in the maintenance of lipid metabolism has still not been fully revealed. This study sought to clarify variations related to the m6A methylome in ApoF gene knockout mice with disordered lipid metabolism based on the bioinformatics method of transcriptome-wide m6A methylome epitranscriptomics. High-throughput methylated RNA immunoprecipitation sequencing (MeRIP-seq) was conducted in both wild-type (WT) and ApoF knockout (KO) mice. As a result, the liver histopathology presented vacuolization and steatosis, and the serum biochemical assays reported abnormal lipid content in KO mice. The m6A-modified mRNAs conformed to the consensus sequence in eukaryotes, and the m6A distribution was enriched within the coding sequences and 3' non-coding regions. In KO mice, the functional annotation terms of the differentially expressed genes (DEGs) included cholesterol, steroid and lipid metabolism, and lipid storage. In the differentially m6A-methylated mRNAs, the functional annotation terms included cholesterol, TG, and long-chain fatty acid metabolic processes; lipid transport; and liver development. The overlapping DEGs and differential m6A-modified mRNAs were also enriched in terms of lipid metabolism disorder. In conclusion, transcriptome-wide MeRIP sequencing in ApoF KO mice demonstrated the role of this crucial apolipoprotein in liver health and lipid metabolism.
6
Gao Y, Ren J, Chen K, Guan G. Construction and validation of a prognostic signature for mucinous colonic adenocarcinoma based on N7-methylguanosine-related long non-coding RNAs. J Gastrointest Oncol 2024; 15:203-219. [PMID: 38482248 PMCID: PMC10932661 DOI: 10.21037/jgo-23-980] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Accepted: 02/21/2024] [Indexed: 10/05/2024] [Imported: 10/05/2024] Open
Abstract
BACKGROUND Mucinous colonic adenocarcinoma remains a challenging disease due to its high propensity for metastasis and recurrence. N7-methylguanosine (m7G) and long non-coding RNA (lncRNA) are closely associated with the occurrence and progression of tumors. However, research on m7G-related lncRNA in mucinous colonic adenocarcinoma is lacking. Therefore, we sought to explore the prognostic impact of m7G-related lncRNAs in mucinous adenocarcinoma (MC) patients. METHODS In this study, Pearson analysis was used to identify m7G-related lncRNAs from transcriptome data in The Cancer Genome Atlas (TCGA). Univariate Cox regression analysis and least absolute shrinkage and selection operator (LASSO) regression were used to further screen m7G-related lncRNAs and incorporate them into a prognostic signature. Based on the risk model, patients were divided into low- and high-risk groups and randomly assigned to the training and test sets in a 6:4 ratio. Kaplan-Meier, receiver operating characteristic (ROC) curve, multivariate regression, and nomogram analyses were used to confirm the accuracy of the signature. The CIBERSORT algorithm was used to calculate the degree of immune cell infiltration (ICI). Finally, the correlation of the prognostic signature with tumor mutational burden (TMB) and immunophenotype score (IPS) was evaluated. RESULTS A total of 432 m7G-related lncRNAs were identified by Pearson analysis. Univariate Cox regression, LASSO regression and survival analysis were performed to further select six m7G-related lncRNAs (P<0.05): AC254629.1, LINC01133, LINC01134, MHENCR, SMIM2-AS1, and XACT. Based on the risk model, heat maps, Kaplan-Meier curves, and ROC curves were constructed, and the results showed that there were significant differences in expression levels and survival status between the two risk groups. The area under the ROC curve (AUC) values for 3-, 5-, and 10-year survival were 0.944, 0.957, and 1.000 in the training set and 0.964, 1.000, and 1.000 in the test set, respectively. Subsequently, univariate and multivariate regression analyses of clinical characteristics and risk score were performed. The corresponding results for the risk score were [hazard ratio (HR): 6.458, 95% confidence interval (CI): 2.708-15.403, P<0.001; HR: 7.280, 95% CI: 2.500-21.203, P<0.001], respectively. With the risk score as an independent prognostic factor, its AUC at 3, 5, and 10 years was 0.911, 0.955, and 0.961, respectively. Calibration plots for the nomogram show that the model calibration line is very close to the ideal calibration line, indicating good calibration. The level of ICI was significantly different in the different risk groups. Survival analysis showed that, regardless of TMB risk, patients with MC and a high-risk score consistently had a poor overall survival (OS). CONCLUSIONS The m7G-related lncRNA prognostic signature has potential value for the prognosis of mucinous colonic adenocarcinoma.
7
Liang Z, Ye H, Ma J, Wei Z, Wang Y, Zhang Y, Huang D, Song B, Meng J, Rigden DJ, Chen K. m6A-Atlas v2.0: updated resources for unraveling the N6-methyladenosine (m6A) epitranscriptome among multiple species. Nucleic Acids Res 2024; 52:D194-D202. [PMID: 37587690 PMCID: PMC10768109 DOI: 10.1093/nar/gkad691] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 07/17/2023] [Revised: 08/02/2023] [Accepted: 08/10/2023] [Indexed: 08/18/2023] [Imported: 10/05/2024] Open
Abstract
N6-Methyladenosine (m6A) is one of the most abundant internal chemical modifications on eukaryote mRNA and is involved in numerous essential molecular functions and biological processes. To facilitate the study of this important post-transcriptional modification, we present here m6A-Atlas v2.0, an updated version of m6A-Atlas. It was expanded to include a total of 797 091 reliable m6A sites from 13 high-resolution technologies and two single-cell m6A profiles. Additionally, three methods (exomePeaks2, MACS2 and TRESS) were used to identify >16 million m6A enrichment peaks from 2712 MeRIP-seq experiments covering 651 conditions in 42 species. Quality control results of MeRIP-seq samples were also provided to help users to select reliable peaks. We also estimated the condition-specific quantitative m6A profiles (i.e. differential methylation) under 172 experimental conditions for 19 species. Further, to provide insights into potential functional circuitry, the m6A epitranscriptomics were annotated with various genomic features, interactions with RNA-binding proteins and microRNA, potentially linked splicing events and single nucleotide polymorphisms. The collected m6A sites and their functional annotations can be freely queried and downloaded via a user-friendly graphical interface at: http://rnamd.org/m6a.
8
Wang X, Zhang Y, Chen K, Liang Z, Ma J, Xia R, de Magalhães JP, Rigden DJ, Meng J, Song B. m7GHub V2.0: an updated database for decoding the N7-methylguanosine (m7G) epitranscriptome. Nucleic Acids Res 2024; 52:D203-D212. [PMID: 37811871 PMCID: PMC10767970 DOI: 10.1093/nar/gkad789] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 07/17/2023] [Revised: 08/18/2023] [Accepted: 09/18/2023] [Indexed: 10/10/2023] [Imported: 10/05/2024] Open
Abstract
With recent progress in mapping N7-methylguanosine (m7G) RNA methylation sites, tens of thousands of experimentally validated m7G sites have been discovered in various species, shedding light on the significant role of m7G modification in regulating numerous biological processes including disease pathogenesis. An integrated resource that enables the sharing, annotation and customized analysis of m7G data will greatly facilitate m7G studies under various physiological contexts. We previously developed the m7GHub database to host mRNA m7G sites identified in the human transcriptome. Here, we present m7GHub v.2.0, an updated resource for a comprehensive collection of m7G modifications in various types of RNA across multiple species: an m7GDB database containing 430 898 putative m7G sites identified in 23 species, collected from both widely applied next-generation sequencing (NGS) and the emerging Oxford Nanopore direct RNA sequencing (ONT) techniques; an m7GDiseaseDB hosting 156 206 m7G-associated variants (involving addition or removal of an m7G site), including 3238 disease-relevant m7G-SNPs that may function through epitranscriptome disturbance; and two enhanced analysis modules to perform interactive analyses on the collections of m7G sites (m7GFinder) and functional variants (m7GSNPer). We expect that m7GHub v.2.0 should serve as a valuable centralized resource for studying m7G modification. It is freely accessible at: www.rnamd.org/m7GHub2.
9
Chen X, Hu H, Lin X, Chen M, Bao W, Wu Y, Li C, Gao Y, Hou S, Yang Q, Chen L, Zhang J, Chen K, Wang Q, Zhu A. Euphorbia factor L1 inhibited transport channel and energy metabolism in human colon adenocarcinoma cell line Caco-2. Biomed Pharmacother 2023; 169:115919. [PMID: 37992574 DOI: 10.1016/j.biopha.2023.115919] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/17/2023] [Revised: 11/10/2023] [Accepted: 11/20/2023] [Indexed: 11/24/2023] [Imported: 10/05/2024] Open
Abstract
Euphorbia factor L1 (EFL1) is a lathyrane-type diterpenoid isolated from the medicinal herb Euphorbia lathyris L. (Euphorbiaceae); it has been reported to cause intestinal irritation, but the underlying mechanisms are still obscure. The objective of this study was to assess the EFL1-induced intestinal cytotoxicity in human colon adenocarcinoma Caco-2 cells. The Caco-2 cells were treated with EFL1, and the intracellular calcium ion concentration, mitochondrial membrane potential (MMP), mitochondrial permeability transition pore (mPTP), adenosine 5'-triphosphate (ATP) content, ATPase activities, TGF-β1 concentration, and transepithelial electrical resistance (TEER) were detected. The interaction between EFL1 and the tight junction proteins Occludin, Claudin-4, Tricellulin, ZO-1, JAM-1, and E-cadherin was simulated by molecular docking. The expression of proteins involved in the energy metabolism, the ion transporters and aquaporins, the tight junction, and the F-actin cytoskeleton was detected by Western blotting and cell immunofluorescence. As a result, EFL1 decreased the intracellular Ca2+, MMP, mPTP, ATP content, and ATPase activities in the Caco-2 cells. The AMPK/SIRT1/PGC-1α signaling pathway, which regulates the energy metabolism, was inhibited. The ion transporters NEH and CFTR, as well as the aquaporins in the Caco-2 cells, were decreased. The tight junction proteins were down-regulated, and the integrity of the intestinal barrier was injured; TGF-β1 was compensatively increased; so, the intestinal permeability was increased and was characterized by decreased TEER. The morphology of the F-actin cytoskeleton was destroyed. These findings indicated that EFL1 caused cytotoxicity in the human intestinal Caco-2 cells through mitochondrial damage, inhibition of the energy metabolism, and suppression of the ion and water molecule transporters, as well as the down-regulation of tight junction and cytoskeleton proteins.
10
Song B, Huang D, Zhang Y, Wei Z, Su J, de Magalhães JP, Rigden DJ, Meng J, Chen K. m6A-TSHub: Unveiling the Context-specific m6A Methylation and m6A-affecting Mutations in 23 Human Tissues. Genomics Proteomics Bioinformatics 2023; 21:678-694. [PMID: 36096444 PMCID: PMC10787194 DOI: 10.1016/j.gpb.2022.09.001] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 10/12/2021] [Revised: 08/19/2022] [Accepted: 09/02/2022] [Indexed: 06/15/2023] [Imported: 10/05/2024]
Abstract
As the most pervasive epigenetic marker present on mRNAs and long non-coding RNAs (lncRNAs), N6-methyladenosine (m6A) RNA methylation has been shown to participate in essential biological processes. Recent studies have revealed the distinct patterns of m6A methylome across human tissues, and a major challenge remains in elucidating the tissue-specific presence and circuitry of m6A methylation. We present here a comprehensive online platform, m6A-TSHub, for unveiling the context-specific m6A methylation and genetic mutations that potentially regulate m6A epigenetic mark. m6A-TSHub consists of four core components, including (1) m6A-TSDB, a comprehensive database of 184,554 functionally annotated m6A sites derived from 23 human tissues and 499,369 m6A sites from 25 tumor conditions, respectively; (2) m6A-TSFinder, a web server for high-accuracy prediction of m6A methylation sites within a specific tissue from RNA sequences, which was constructed using multi-instance deep neural networks with gated attention; (3) m6A-TSVar, a web server for assessing the impact of genetic variants on tissue-specific m6A RNA modifications; and (4) m6A-CAVar, a database of 587,983 The Cancer Genome Atlas (TCGA) cancer mutations (derived from 27 cancer types) that were predicted to affect m6A modifications in the primary tissue of cancers. The database should make a useful resource for studying the m6A methylome and the genetic factors of epitranscriptome disturbance in a specific tissue (or cancer type). m6A-TSHub is accessible at www.xjtlu.edu.cn/biologicalsciences/m6ats.
11
Chen K, Picardi E, Han X, Nigita G. Editorial: RNA modifications and epitranscriptomics, Volume II. Front Genet 2023; 14:1229046. [PMID: 37351345 PMCID: PMC10282930 DOI: 10.3389/fgene.2023.1229046] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/25/2023] [Accepted: 05/30/2023] [Indexed: 06/24/2023] [Imported: 10/05/2024] Open
12
Song B, Wang X, Liang Z, Ma J, Huang D, Wang Y, de Magalhães JP, Rigden DJ, Meng J, Liu G, Chen K, Wei Z. RMDisease V2.0: an updated database of genetic variants that affect RNA modifications with disease and trait implication. Nucleic Acids Res 2023; 51:D1388-D1396. [PMID: 36062570 PMCID: PMC9825452 DOI: 10.1093/nar/gkac750] [Citation(s) in RCA: 23] [Impact Index Per Article: 23.0] [Received: 06/15/2022] [Revised: 08/02/2022] [Accepted: 08/24/2022] [Indexed: 01/30/2023] [Imported: 10/05/2024] Open
Abstract
Recent advances in epitranscriptomics have unveiled functional associations between RNA modifications (RMs) and multiple human diseases, but distinguishing the functional or disease-related single nucleotide variants (SNVs) from the majority of 'silent' variants remains a major challenge. We previously developed the RMDisease database for unveiling the association between genetic variants and RMs concerning human disease pathogenesis. In this work, we present RMDisease v2.0, an updated database with expanded coverage. Using deep learning models and from 873 819 experimentally validated RM sites, we identified a total of 1 366 252 RM-associated variants that may affect (add or remove an RM site) 16 different types of RNA modifications (m6A, m5C, m1A, m5U, Ψ, m6Am, m7G, A-to-I, ac4C, Am, Cm, Um, Gm, hm5C, D and f5C) in 20 organisms (human, mouse, rat, zebrafish, maize, fruit fly, yeast, fission yeast, Arabidopsis, rice, chicken, goat, sheep, pig, cow, rhesus monkey, tomato, chimpanzee, green monkey and SARS-CoV-2). Among them, 14 749 disease- and 2441 trait-associated genetic variants may function via the perturbation of epitranscriptomic markers. RMDisease v2.0 should serve as a useful resource for studying the genetic drivers of phenotypes that lie within the epitranscriptome layer circuitry, and is freely accessible at: www.rnamd.org/rmdisease2.
13
Hu J, Pan D, Li G, Chen K, Hu X. Regulation of programmed cell death by Brd4. Cell Death Dis 2022; 13:1059. [PMID: 36539410 PMCID: PMC9767942 DOI: 10.1038/s41419-022-05505-1] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 09/14/2022] [Revised: 12/04/2022] [Accepted: 12/07/2022] [Indexed: 12/24/2022] [Imported: 10/05/2024]
Abstract
Epigenetic factor Brd4 has emerged as a key regulator of cancer cell proliferation. Targeted inhibition of Brd4 suppresses growth and induces apoptosis of various cancer cells. In addition to apoptosis, Brd4 has also been shown to regulate several other forms of programmed cell death (PCD), including autophagy, necroptosis, pyroptosis, and ferroptosis, with different biological outcomes. PCD plays key roles in development and tissue homeostasis by eliminating unnecessary or detrimental cells. Dysregulation of PCD is associated with various human diseases, including cancer, neurodegenerative and infectious diseases. In this review, we discussed some recent findings on how Brd4 actively regulates different forms of PCD and the therapeutic potentials of targeting Brd4 in PCD-related human diseases. A better understanding of PCD regulation would provide not only new insights into pathophysiological functions of PCD but also provide new avenues for therapy by targeting Brd4-regulated PCD.
14
Zhang Y, Jiang J, Ma J, Wei Z, Wang Y, Song B, Meng J, Jia G, de Magalhães JP, Rigden D, Hang D, Chen K. DirectRMDB: a database of post-transcriptional RNA modifications unveiled from direct RNA sequencing technology. Nucleic Acids Res 2022; 51:D106-D116. [PMID: 36382409 PMCID: PMC9825532 DOI: 10.1093/nar/gkac1061] [Citation(s) in RCA: 42] [Impact Index Per Article: 21.0] [Received: 08/07/2022] [Revised: 10/20/2022] [Accepted: 10/25/2022] [Indexed: 11/17/2022] [Imported: 10/05/2024] Open
Abstract
With advanced technologies to map RNA modifications, our understanding of them has been revolutionized, and they are seen to be far more widespread and important than previously thought. Current next-generation sequencing (NGS)-based modification profiling methods are blind to RNA modifications and thus require selective chemical treatment or antibody immunoprecipitation methods for particular modification types. They also face the problem of short read length, isoform ambiguities, biases and artifacts. Direct RNA sequencing (DRS) technologies, commercialized by Oxford Nanopore Technologies (ONT), enable the direct interrogation of any given modification present in individual transcripts and promise to address the limitations of previous NGS-based methods. Here, we present the first ONT-based database of quantitative RNA modification profiles, DirectRMDB, which includes 16 types of modification and a total of 904,712 modification sites in 25 species identified from 39 independent studies. In addition to standard functions adopted by existing databases, such as gene annotations and post-transcriptional association analysis, we provide a fresh view of RNA modifications, which enables exploration of the epitranscriptome in an isoform-specific manner. The DirectRMDB database is freely available at: http://www.rnamd.org/directRMDB/.
15
Huang D, Chen K, Song B, Wei Z, Su J, Coenen F, de Magalhães JP, Rigden DJ, Meng J. Geographic encoding of transcripts enabled high-accuracy and isoform-aware deep learning of RNA methylation. Nucleic Acids Res 2022; 50:10290-10310. [PMID: 36155798 PMCID: PMC9561283 DOI: 10.1093/nar/gkac830] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 03/21/2022] [Revised: 08/26/2022] [Accepted: 09/15/2022] [Indexed: 12/25/2022] [Imported: 10/05/2024] Open
Abstract
As the most pervasive epigenetic mark present on mRNA and lncRNA, N6-methyladenosine (m6A) RNA methylation regulates all stages of RNA life in various biological processes and disease mechanisms. Computational methods for deciphering RNA modification have achieved great success in recent years; nevertheless, their potential remains underexploited. One reason for this is that existing models usually consider only the sequence of transcripts, ignoring the various regions (or geography) of transcripts such as 3′UTR and intron, where the epigenetic mark forms and functions. Here, we developed three simple yet powerful encoding schemes for transcripts to capture the submolecular geographic information of RNA, which is largely independent from sequences. We show that m6A prediction models based on geographic information alone can achieve comparable performances to classic sequence-based methods. Importantly, geographic information substantially enhances the accuracy of sequence-based models, enables isoform- and tissue-specific prediction of m6A sites, and improves m6A signal detection from direct RNA sequencing data. The geographic encoding schemes we developed have exhibited strong interpretability, and are applicable to not only m6A but also N1-methyladenosine (m1A), and can serve as a general and effective complement to the widely used sequence encoding schemes in deep learning applications concerning RNA transcripts.
|
16
|
Liu L, Song B, Chen K, Zhang Y, de Magalhães JP, Rigden DJ, Lei X, Wei Z. WHISTLE server: A high-accuracy genomic coordinate-based machine learning platform for RNA modification prediction. Methods 2022; 203:378-382. [PMID: 34245870 DOI: 10.1016/j.ymeth.2021.07.003] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 06/08/2021] [Revised: 06/28/2021] [Accepted: 07/05/2021] [Indexed: 01/12/2023] [Imported: 10/05/2024]
Abstract
The primary sequences of DNA, RNA and protein have been used as the dominant information source of existing machine learning tools, especially for contexts not fully explored by wet-experimental approaches. Since molecular markers are profoundly orchestrated in living organisms, those markers that cannot be unambiguously recovered from the primary sequence often help to predict other biological events. To the best of our knowledge, there is no current tool to build and deploy machine learning models that consider genomic evidence. We therefore developed the WHISTLE server, the first machine learning platform based on genomic coordinates. It features convenient covariate extraction and model web deployment with 46 distinct genomic features integrated along with the conventional sequence features. We showed that, when predicting m6A sites from the SRAMP project, the model integrating genomic features substantially outperformed those based on only sequence features. The WHISTLE server should be a useful tool for studying biological attributes specifically associated with genomic coordinates, and is freely accessible at: www.xjtlu.edu.cn/biologicalsciences/whi2.
|
17
|
Zhang Y, Huang D, Wei Z, Chen K. Primary sequence-assisted prediction of m6A RNA methylation sites from Oxford nanopore direct RNA sequencing data. Methods 2022; 203:62-69. [PMID: 35429629 DOI: 10.1016/j.ymeth.2022.04.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/10/2021] [Revised: 03/27/2022] [Accepted: 04/11/2022] [Indexed: 11/28/2022] [Imported: 10/05/2024]
|
18
|
Ma J, Song B, Wei Z, Huang D, Zhang Y, Su J, de Magalhães JP, Rigden DJ, Meng J, Chen K. m5C-Atlas: a comprehensive database for decoding and annotating the 5-methylcytosine (m5C) epitranscriptome. Nucleic Acids Res 2022; 50:D196-D203. [PMID: 34986603 PMCID: PMC8728298 DOI: 10.1093/nar/gkab1075] [Citation(s) in RCA: 65] [Impact Index Per Article: 32.5] [Received: 08/11/2021] [Revised: 10/11/2021] [Accepted: 10/22/2021] [Indexed: 01/19/2023] [Imported: 10/05/2024]
Abstract
5-Methylcytosine (m5C) is one of the most prevalent covalent modifications on RNA. It is known to regulate a broad variety of RNA functions, including nuclear export, RNA stability and translation. Here, we present m5C-Atlas, a database for comprehensive collection and annotation of RNA 5-methylcytosine. The database contains 166 540 m5C sites in 13 species identified from 5 base-resolution epitranscriptome profiling technologies. Moreover, condition-specific methylation levels are quantified from 351 RNA bisulfite sequencing samples gathered from 22 different studies via an integrative pipeline. The database also presents several novel features, such as the evolutionary conservation of an m5C locus, its association with SNPs, and any relevance to RNA secondary structure. All m5C-Atlas data are accessible through a user-friendly interface, in which the m5C epitranscriptomes can be freely explored, shared, and annotated with putative post-transcriptional mechanisms (e.g. RBP intermolecular interaction with RNA, microRNA interaction and splicing sites). Together, these resources offer unprecedented opportunities for exploring m5C epitranscriptomes. The m5C-Atlas database is freely accessible at https://www.xjtlu.edu.cn/biologicalsciences/m5c-atlas.
|
19
|
Wang Y, Chen K, Wei Z, Coenen F, Su J, Meng J. MetaTX: deciphering the distribution of mRNA-related features in the presence of isoform ambiguity, with applications in epitranscriptome analysis. Bioinformatics 2021; 37:1285-1291. [PMID: 33135046 DOI: 10.1093/bioinformatics/btaa938] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/06/2020] [Revised: 09/23/2020] [Accepted: 10/24/2020] [Indexed: 01/09/2023] [Imported: 10/05/2024]
Abstract
Motivation: The distribution of biological features strongly indicates their functional relevance. Compared to DNA-related features, deciphering the distribution of mRNA-related features is non-trivial due to the existence of isoform ambiguity and compositional diversity of mRNAs. Results: We propose here a rigorous statistical framework, MetaTX, for deciphering the distribution of mRNA-related features. Through a standardized mRNA model, MetaTX first unifies various mRNA transcripts of diverse compositions, and then corrects the isoform ambiguity by incorporating the overall distribution pattern of the features through an EM algorithm. MetaTX was tested on both simulated and real data. Results suggested that MetaTX substantially outperformed existing direct methods on simulated datasets, and that a more informative distribution pattern was produced for all three datasets tested, which contain N6-Methyladenosine sites generated by different technologies. MetaTX should make a useful tool for studying the distribution and functions of mRNA-related biological features, especially for mRNA modifications such as N6-Methyladenosine. Availability and implementation: The MetaTX R package is freely available at GitHub: https://github.com/yue-wang-biomath/MetaTX.1.0. Supplementary information: Supplementary data are available at Bioinformatics online.
|
20
|
Song Z, Huang D, Song B, Chen K, Song Y, Liu G, Su J, Magalhães JPD, Rigden DJ, Meng J. Attention-based multi-label neural networks for integrated prediction and interpretation of twelve widely occurring RNA modifications. Nat Commun 2021; 12:4011. [PMID: 34188054 PMCID: PMC8242015 DOI: 10.1038/s41467-021-24313-3] [Citation(s) in RCA: 51] [Impact Index Per Article: 17.0] [Received: 11/21/2020] [Accepted: 06/07/2021] [Indexed: 02/08/2023] [Imported: 10/05/2024]
Abstract
Recent studies suggest that epi-transcriptome regulation via post-transcriptional RNA modifications is vital for all RNA types. Precise identification of RNA modification sites is essential for understanding the functions and regulatory mechanisms of RNAs. Here, we present MultiRM, a method for the integrated prediction and interpretation of post-transcriptional RNA modifications from RNA sequences. Built upon an attention-based multi-label deep learning framework, MultiRM not only simultaneously predicts the putative sites of twelve widely occurring transcriptome modifications (m6A, m1A, m5C, m5U, m6Am, m7G, Ψ, I, Am, Cm, Gm, and Um), but also returns the key sequence contents that contribute most to the positive predictions. Importantly, our model revealed a strong association among different types of RNA modifications from the perspective of their associated sequence contexts. Our work provides a solution for detecting multiple RNA modifications, enabling an integrated analysis of these RNA modifications, and gaining a better understanding of sequence-based RNA modification mechanisms. RNA modifications appear to play a role in determining RNA structure and function. Here, the authors develop a deep learning model that predicts the location of 12 RNA modifications using primary sequence, and show that several modifications are associated, which suggests dependencies between them.
|
21
|
Song B, Chen K, Tang Y, Wei Z, Su J, de Magalhães JP, Rigden DJ, Meng J. ConsRM: collection and large-scale prediction of the evolutionarily conserved RNA methylation sites, with implications for the functional epitranscriptome. Brief Bioinform 2021; 22:6276017. [PMID: 33993206 DOI: 10.1093/bib/bbab088] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 12/11/2020] [Revised: 02/04/2021] [Accepted: 02/24/2021] [Indexed: 12/15/2022] [Imported: 10/05/2024]
Abstract
Motivation: N6-methyladenosine (m6A) is the most prevalent RNA modification on mRNAs and lncRNAs. Evidence increasingly demonstrates its crucial importance in essential molecular mechanisms and various diseases. With recent advances in sequencing techniques, tens of thousands of m6A sites are identified in a typical high-throughput experiment, posing a key challenge to distinguish the functional m6A sites from the remaining 'passenger' (or 'silent') sites. Results: We performed a comparative conservation analysis of the human and mouse m6A epitranscriptomes at single site resolution. A novel scoring framework, ConsRM, was devised to quantitatively measure the degree of conservation of individual m6A sites. ConsRM integrates multiple information sources and a positive-unlabeled learning framework, which integrated genomic and sequence features to trace subtle hints of epitranscriptome layer conservation. With a series of validation experiments in mouse, fly and zebrafish, we showed that ConsRM outperformed well-adopted conservation scores (phastCons and phyloP) in distinguishing the conserved and unconserved m6A sites. Additionally, the m6A sites with a higher ConsRM score are more likely to be functionally important. An online database was developed containing the conservation metrics of 177 998 distinct human m6A sites to support conservation analysis and functional prioritization of individual m6A sites. It is freely accessible at: https://www.xjtlu.edu.cn/biologicalsciences/con.
|
22
|
Jiang J, Song B, Chen K, Lu Z, Rong R, Zhong Y, Meng J. m6AmPred: Identifying RNA N6,2'-O-dimethyladenosine (m6Am) sites based on sequence-derived information. Methods 2021; 203:328-334. [PMID: 33540081 DOI: 10.1016/j.ymeth.2021.01.007] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 12/28/2020] [Revised: 01/14/2021] [Accepted: 01/20/2021] [Indexed: 12/11/2022] [Imported: 10/05/2024]
Abstract
N6,2'-O-dimethyladenosine (m6Am) is a reversible modification that occurs widely on various RNA molecules. The biological function of m6Am remains largely unknown, although recent studies have revealed its influence on cellular mRNA fate. Precise identification of m6Am sites on RNA is vital for the understanding of its biological functions. We present here m6AmPred, the first web server for in silico identification of m6Am sites from the primary sequences of RNA. Built upon the eXtreme Gradient Boosting with Dart algorithm (XgbDart) and EIIP-PseEIIP encoding scheme, m6AmPred achieved promising prediction performance with AUCs greater than 0.954 when tested by 10-fold cross-validation and independent testing datasets. To critically test and validate the performance of m6AmPred, the experimentally verified m6Am sites from two data sources were cross-validated. The m6AmPred web server is freely accessible at: https://www.xjtlu.edu.cn/biologicalsciences/m6am, and it should make a useful tool for researchers who are interested in the N6,2'-O-dimethyladenosine RNA modification.
|
23
|
Chen K, Song B, Tang Y, Wei Z, Xu Q, Su J, de Magalhães JP, Rigden DJ, Meng J. RMDisease: a database of genetic variants that affect RNA modifications, with implications for epitranscriptome pathogenesis. Nucleic Acids Res 2021; 49:D1396-D1404. [PMID: 33010174 PMCID: PMC7778951 DOI: 10.1093/nar/gkaa790] [Citation(s) in RCA: 66] [Impact Index Per Article: 22.0] [Received: 07/20/2020] [Revised: 09/08/2020] [Accepted: 09/11/2020] [Indexed: 12/11/2022] [Imported: 10/05/2024]
Abstract
Deciphering the biological impacts of millions of single nucleotide variants remains a major challenge. Recent studies suggest that RNA modifications play versatile roles in essential biological mechanisms, and are closely related to the progression of various diseases including multiple cancers. To comprehensively unveil the association between disease-associated variants and their epitranscriptome disturbance, we built RMDisease, a database of genetic variants that can affect RNA modifications. By integrating the prediction results of 18 different RNA modification prediction tools and also 303,426 experimentally-validated RNA modification sites, RMDisease identified a total of 202,307 human SNPs that may affect (add or remove) sites of eight types of RNA modifications (m6A, m5C, m1A, m5U, Ψ, m6Am, m7G and Nm). These include 4,289 disease-associated variants that may imply disease pathogenesis functioning at the epitranscriptome layer. These SNPs were further annotated with essential information such as post-transcriptional regulations (sites for miRNA binding, interaction with RNA-binding proteins and alternative splicing) revealing putative regulatory circuits. A convenient graphical user interface was constructed to support the query, exploration and download of the relevant information. RMDisease should make a useful resource for studying the epitranscriptome impact of genetic variants via multiple RNA modifications with emphasis on their potential disease relevance. RMDisease is freely accessible at: www.xjtlu.edu.cn/biologicalsciences/rmd.
|
24
|
Tang Y, Chen K, Song B, Ma J, Wu X, Xu Q, Wei Z, Su J, Liu G, Rong R, Lu Z, de Magalhães J, Rigden DJ, Meng J. m6A-Atlas: a comprehensive knowledgebase for unraveling the N6-methyladenosine (m6A) epitranscriptome. Nucleic Acids Res 2021; 49:D134-D143. [PMID: 32821938 PMCID: PMC7779050 DOI: 10.1093/nar/gkaa692] [Citation(s) in RCA: 189] [Impact Index Per Article: 63.0] [Received: 06/20/2020] [Revised: 08/05/2020] [Accepted: 08/09/2020] [Indexed: 12/25/2022] [Imported: 10/05/2024]
Abstract
N6-Methyladenosine (m6A) is the most prevalent RNA modification on mRNAs and lncRNAs. It plays a pivotal role during various biological processes and disease pathogenesis. We present here a comprehensive knowledgebase, m6A-Atlas, for unraveling the m6A epitranscriptome. Compared to existing databases, m6A-Atlas features a high-confidence collection of 442 162 reliable m6A sites identified from seven base-resolution technologies and the quantitative (rather than binary) epitranscriptome profiles estimated from 1363 high-throughput sequencing samples. It also offers novel features, such as the conservation of m6A sites among seven vertebrate species (including human, mouse and chimp), the m6A epitranscriptomes of 10 virus species (including HIV, KSHV and DENV), the putative biological functions of individual m6A sites predicted from epitranscriptome data, and the potential pathogenesis of m6A sites inferred from disease-associated genetic mutations that can directly destroy m6A-directing sequence motifs. A user-friendly graphical user interface was constructed to support the query, visualization and sharing of the m6A epitranscriptomes annotated with sites specifying their interaction with post-transcriptional machinery (RBP-binding, microRNA interaction and splicing sites) and to interactively display the landscape of multiple RNA modifications. These resources provide fresh opportunities for unraveling the m6A epitranscriptomes. m6A-Atlas is freely accessible at: www.xjtlu.edu.cn/biologicalsciences/atlas.
|
25
|
Jiang J, Song B, Tang Y, Chen K, Wei Z, Meng J. m5UPred: A Web Server for the Prediction of RNA 5-Methyluridine Sites from Sequences. Mol Ther Nucleic Acids 2020; 22:742-747. [PMID: 33230471 PMCID: PMC7595847 DOI: 10.1016/j.omtn.2020.09.031] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 07/07/2020] [Accepted: 09/25/2020] [Indexed: 11/16/2022] [Imported: 10/05/2024]
Abstract
As one of the widely occurring RNA modifications, 5-methyluridine (m5U) has recently been shown to play critical roles in various biological functions and disease pathogenesis, such as under stress response and during breast cancer development. Precise identification of m5U sites on RNA is vital for the understanding of the regulatory mechanisms of RNA life. We present here m5UPred, the first web server for in silico identification of m5U sites from the primary sequences of RNA. Built upon the support vector machine (SVM) algorithm and the biochemical encoding scheme, m5UPred achieved reasonable prediction performance with the area under the receiver operating characteristic curve (AUC) greater than 0.954 by 5-fold cross-validation and independent testing datasets. To critically test and validate the performance of our newly proposed predictor, the experimentally validated m5U sites were further separated by high-throughput sequencing techniques (miCLIP-Seq and FICC-Seq) and cell types (HEK293 and HAP1). When tested on cross-technique and cross-cell-type validation using independent datasets, m5UPred achieved an average AUC of 0.922 and 0.926 under mature mRNA mode, respectively, showing reasonable accuracy and reliability. The m5UPred web server is freely accessible now and it should make a useful tool for the researchers who are interested in m5U RNA modification.
|