1
|
Qiao Y, Yang R, Liu Y, Chen J, Zhao L, Huo P, Wang Z, Bu D, Wu Y, Zhao Y. DeepFusion: A deep bimodal information fusion network for unraveling protein-RNA interactions using in vivo RNA structures. Comput Struct Biotechnol J 2024; 23:617-625. [PMID: 38274994 PMCID: PMC10808905 DOI: 10.1016/j.csbj.2023.12.040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Revised: 12/04/2023] [Accepted: 12/26/2023] [Indexed: 01/27/2024]
Abstract
RNA-binding proteins (RBPs) are key post-transcriptional regulators, and the malfunctions of RBP-RNA binding lead to diverse human diseases. However, prediction of RBP binding sites is largely based on RNA sequence features, whereas in vivo RNA structural features based on high-throughput sequencing are rarely incorporated. Here, we designed a deep bimodal information fusion network called DeepFusion for unraveling protein-RNA interactions by incorporating structural features derived from DMS-seq data. DeepFusion integrates two sub-models to extract local motif-like information and long-term context information. We show that DeepFusion performs best compared with other cutting-edge methods with only sequence inputs on two datasets. DeepFusion's performance is further improved with bimodal input after adding in vivo DMS-seq structural features. Furthermore, DeepFusion can be used for analyzing RNA degradation, demonstrating significantly different RBP-binding scores in genes with slow degradation rates versus those with rapid degradation rates. DeepFusion thus provides enhanced abilities for further analysis of functional RNAs. DeepFusion's code and data are available at http://bioinfo.org/deepfusion/.
Affiliation(s)
- Yixuan Qiao
  - Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Rui Yang
  - Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Yang Liu
  - Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Jiaxin Chen
  - Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Lianhe Zhao
  - Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Peipei Huo
  - Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Zhihao Wang
  - Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Dechao Bu
  - Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Yang Wu
  - Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Yi Zhao
  - Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
|
2
|
Saha R, Vázquez-Salazar A, Nandy A, Chen IA. Fitness Landscapes and Evolution of Catalytic RNA. Annu Rev Biophys 2024; 53:109-125. [PMID: 39013026 DOI: 10.1146/annurev-biophys-030822-025038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/18/2024]
Abstract
The relationship between genotype and phenotype, or the fitness landscape, is the foundation of genetic engineering and evolution. However, mapping fitness landscapes poses a major technical challenge due to the amount of quantifiable data that is required. Catalytic RNA is a special topic in the study of fitness landscapes due to its relatively small sequence space combined with its importance in synthetic biology. The combination of in vitro selection and high-throughput sequencing has recently provided empirical maps of both complete and local RNA fitness landscapes, but the astronomical size of sequence space limits purely experimental investigations. Next steps are likely to involve data-driven interpolation and extrapolation over sequence space using various machine learning techniques. We discuss recent progress in understanding RNA fitness landscapes, particularly with respect to protocells and machine representations of RNA. The confluence of technical advances may significantly impact synthetic biology in the near future.
Affiliation(s)
- Ranajay Saha
  - Department of Chemical and Biomolecular Engineering, University of California, Los Angeles, California, USA
- Alberto Vázquez-Salazar
  - Department of Chemical and Biomolecular Engineering, University of California, Los Angeles, California, USA
- Aditya Nandy
  - Department of Chemical and Biomolecular Engineering, University of California, Los Angeles, California, USA
  - Department of Chemistry, The University of Chicago, Chicago, Illinois, USA
  - The James Franck Institute, The University of Chicago, Chicago, Illinois, USA
- Irene A Chen
  - Department of Chemical and Biomolecular Engineering, University of California, Los Angeles, California, USA
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, California, USA
|
3
|
Park JH, Prasad V, Newsom S, Najar F, Rajan R. IdMotif: An Interactive Motif Identification in Protein Sequences. IEEE Computer Graphics and Applications 2024; 44:114-125. [PMID: 38127603 DOI: 10.1109/mcg.2023.3345742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
This article presents a visual analytics framework, idMotif, to support domain experts in identifying motifs in protein sequences. A motif is a short sequence of amino acids usually associated with distinct functions of a protein, and identifying similar motifs in protein sequences helps us to predict certain types of disease or infection. idMotif can be used to explore, analyze, and visualize such motifs in protein sequences. We introduce a deep-learning-based method for grouping protein sequences and allow users to discover motif candidates of protein groups based on local explanations of the decision of a deep-learning model. idMotif provides several interactive linked views for analysis between and within protein clusters/groups and for sequence analysis. Through a case study and experts' feedback, we demonstrate how the framework helps domain experts analyze protein sequences and identify motifs.
|
4
|
Li X, Qu W, Yan J, Tan J. RPI-EDLCN: An Ensemble Deep Learning Framework Based on Capsule Network for ncRNA-Protein Interaction Prediction. J Chem Inf Model 2024; 64:2221-2235. [PMID: 37158609 DOI: 10.1021/acs.jcim.3c00377] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/10/2023]
Abstract
Noncoding RNAs (ncRNAs) play crucial roles in many cellular life activities by interacting with proteins. Identification of ncRNA-protein interactions (ncRPIs) is key to understanding the function of ncRNAs. Although a number of computational methods for predicting ncRPIs have been developed, the problem of predicting ncRPIs remains challenging. Selecting suitable feature extraction methods and developing deep learning architectures with better recognition performance have long been the focus of ncRPI research. In this work, we proposed an ensemble deep learning framework, RPI-EDLCN, based on a capsule network (CapsuleNet) to predict ncRPIs. In terms of feature input, we extracted the sequence features, secondary structure sequence features, motif information, and physicochemical properties of ncRNA/protein. The sequence and secondary structure sequence features of ncRNA/protein are encoded by the conjoint k-mer method and then, combined with the motif information and physicochemical properties, input into an ensemble deep learning model based on CapsuleNet. In this model, the encoded features are processed by a convolutional neural network (CNN), a deep neural network (DNN), and a stacked autoencoder (SAE). The advanced features obtained from this processing are then input into the CapsuleNet for further feature learning. Compared with other state-of-the-art methods under 5-fold cross-validation, the performance of RPI-EDLCN is the best, and the accuracy of RPI-EDLCN on the RPI1807, RPI2241, and NPInter v2.0 data sets was 93.8%, 88.2%, and 91.9%, respectively. The results of the independent test indicated that RPI-EDLCN can effectively predict potential ncRPIs in different organisms. In addition, RPI-EDLCN successfully predicted hub ncRNAs and proteins in Mus musculus ncRNA-protein networks. Overall, our model can be used as an effective tool to predict ncRPIs and provides some useful guidance for future biological studies.
Affiliation(s)
- Xiaoyi Li
  - Department of Biomedical Engineering, Faculty of Environment and Life, Beijing University of Technology, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing 100124, China
- Wenyan Qu
  - Department of Biomedical Engineering, Faculty of Environment and Life, Beijing University of Technology, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing 100124, China
- Jing Yan
  - Department of Biomedical Engineering, Faculty of Environment and Life, Beijing University of Technology, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing 100124, China
- Jianjun Tan
  - Department of Biomedical Engineering, Faculty of Environment and Life, Beijing University of Technology, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing 100124, China
|
5
|
Zhu Y, Zhao L, Wen N, Wang J, Wang C. DataDTA: a multi-feature and dual-interaction aggregation framework for drug-target binding affinity prediction. Bioinformatics 2023; 39:btad560. [PMID: 37688568 PMCID: PMC10516524 DOI: 10.1093/bioinformatics/btad560] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/03/2022] [Revised: 05/09/2023] [Accepted: 09/07/2023] [Indexed: 09/11/2023]
Abstract
MOTIVATION: Accurate prediction of drug-target binding affinity (DTA) is crucial for drug discovery. The increase in the publication of large-scale DTA datasets enables the development of various computational methods for DTA prediction. Numerous deep learning-based methods have been proposed to predict affinities, some of which only utilize original sequence information or complex structures, but the effective combination of these information sources with protein-binding pockets has not been fully exploited. Therefore, a new method that integrates available key information is urgently needed to predict DTA and accelerate the drug discovery process.
RESULTS: In this study, we propose a novel deep learning-based predictor termed DataDTA to estimate the affinities of drug-target pairs. DataDTA utilizes descriptors of predicted pockets and sequences of proteins, as well as low-dimensional molecular features and SMILES strings of compounds as inputs. Specifically, the pockets were predicted from the three-dimensional structure of proteins and their descriptors were extracted as the partial input features for DTA prediction. The molecular representation of compounds based on algebraic graph features was collected to supplement the input information of targets. Furthermore, to ensure effective learning of multiscale interaction features, a dual-interaction aggregation neural network strategy was developed. DataDTA was compared with state-of-the-art methods on different datasets, and the results showed that DataDTA is a reliable prediction tool for affinity estimation. Specifically, the concordance index (CI) of DataDTA is 0.806 and the Pearson correlation coefficient (R) value is 0.814 on the test dataset, both higher than those of the other methods.
AVAILABILITY AND IMPLEMENTATION: The codes and datasets of DataDTA are available at https://github.com/YanZhu06/DataDTA.
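The two metrics quoted in this abstract, the concordance index (CI) and the Pearson correlation coefficient (R), can be computed as in the minimal sketch below. This is not DataDTA's evaluation code; the function names and the toy affinity values are hypothetical, and the CI here uses the common pairwise definition in which tied predictions count as half-concordant.

```python
import math


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    Pairs with equal true values are skipped; tied predictions score 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # not a comparable pair
            comparable += 1
            diff_true = y_true[i] - y_true[j]
            diff_pred = y_pred[i] - y_pred[j]
            if diff_pred == 0:
                concordant += 0.5
            elif (diff_true > 0) == (diff_pred > 0):
                concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical measured and predicted affinities (arbitrary units).
measured = [5.0, 6.2, 7.1, 8.4]
predicted = [5.5, 6.0, 7.4, 7.0]
ci = concordance_index(measured, predicted)  # 5 of 6 pairs are ordered correctly
r = pearson_r(measured, predicted)
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration but would likely need a faster implementation on a full benchmark set.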
Affiliation(s)
- Yan Zhu
  - Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
- Lingling Zhao
  - Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
- Naifeng Wen
  - School of Mechanical and Electrical Engineering, Dalian Minzu University, Dalian 116600, China
- Junjie Wang
  - Department of Medical Informatics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Chunyu Wang
  - Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
|
6
|
Sato K, Hamada M. Recent trends in RNA informatics: a review of machine learning and deep learning for RNA secondary structure prediction and RNA drug discovery. Brief Bioinform 2023; 24:bbad186. [PMID: 37232359 PMCID: PMC10359090 DOI: 10.1093/bib/bbad186] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 01/16/2023] [Revised: 04/24/2023] [Accepted: 04/25/2023] [Indexed: 05/27/2023]
Abstract
Computational analysis of RNA sequences constitutes a crucial step in the field of RNA biology. As in other domains of the life sciences, the incorporation of artificial intelligence and machine learning techniques into RNA sequence analysis has gained significant traction in recent years. Historically, thermodynamics-based methods were widely employed for the prediction of RNA secondary structures; however, machine learning-based approaches have demonstrated remarkable advancements in recent years, enabling more accurate predictions. Consequently, the precision of sequence analysis pertaining to RNA secondary structures, such as RNA-protein interactions, has also been enhanced, making a substantial contribution to the field of RNA biology. Additionally, artificial intelligence and machine learning are also introducing technical innovations in the analysis of RNA-small molecule interactions for RNA-targeted drug discovery and in the design of RNA aptamers, where RNA serves as its own ligand. This review will highlight recent trends in the prediction of RNA secondary structure, RNA aptamers and RNA drug discovery using machine learning, deep learning and related technologies, and will also discuss potential future avenues in the field of RNA informatics.
Affiliation(s)
- Kengo Sato
  - School of System Design and Technology, Tokyo Denki University, 5 Senju Asahi-cho, Adachi-ku, Tokyo 120-8551, Japan
- Michiaki Hamada
  - Department of Electrical Engineering and Bioscience, Faculty of Science and Engineering, Waseda University, 55N-06-10, 3-4-1, Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
  - Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology (AIST), 3-4-1, Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
  - Graduate School of Medicine, Nippon Medical School, 1-1-5, Sendagi, Bunkyo-ku, Tokyo 113-8602, Japan
|
7
|
Wang X, Zhang M, Long C, Yao L, Zhu M. Self-Attention Based Neural Network for Predicting RNA-Protein Binding Sites. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:1469-1479. [PMID: 36067103 DOI: 10.1109/tcbb.2022.3204661] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/04/2023]
Abstract
Proteins that bind to ribonucleic acid (RNA) inside cells are called RNA-binding proteins (RBPs), and they play a crucial role in gene regulation. The identification of RNA-protein binding sites helps to better understand the function of RBPs. Although many computational methods have been developed to predict RNA-protein binding sites, their prediction accuracy on small sample datasets needs improvement. To overcome this limitation, we propose a novel model called SA-Net, which utilizes k-mer embedding to encode RNA sequences and a self-attention-based neural network to extract sequence features. K-mer embedding helps the model discover significant subsequence fragments associated with binding sites. The self-attention mechanism captures contextual information from the entire input sequence globally, performing well in small sample sequence learning. Experimental results demonstrate that SA-Net attains state-of-the-art results on the RBP-24 dataset. We find that 4-mer embedding helps the model achieve optimal performance. We also show that the self-attention network outperforms the commonly used CNN and CNN-BLSTM models in sequence feature extraction.
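As an illustration of the k-mer encoding step that precedes an embedding layer, the sketch below splits an RNA sequence into overlapping 4-mers and maps each to an integer index. It is a minimal sketch under stated assumptions, not SA-Net's code: the function names, the stride of 1, the handling of `T` as `U`, and the extra "unknown" index for k-mers containing other characters are all hypothetical choices.

```python
from itertools import product


def build_kmer_vocab(k, alphabet="ACGU"):
    """Map every possible k-mer over the alphabet to an integer id."""
    return {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}


def kmer_tokens(seq, k, vocab):
    """Convert a sequence into ids of its overlapping k-mers (stride 1).

    K-mers containing characters outside the vocabulary (for example N)
    are mapped to a single extra 'unknown' id equal to len(vocab).
    """
    seq = seq.upper().replace("T", "U")
    unknown_id = len(vocab)
    return [
        vocab.get(seq[i:i + k], unknown_id)
        for i in range(len(seq) - k + 1)
    ]


vocab = build_kmer_vocab(4)          # 4**4 = 256 possible 4-mers
tokens = kmer_tokens("ACGUACGU", 4, vocab)
# tokens holds one id per window: ACGU, CGUA, GUAC, UACG, ACGU
```

The resulting id list is what an embedding lookup table would consume, with one learned vector per k-mer id.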
|
8
|
Xia Y, Xia C, Pan X, Shen H. BindWeb: A web server for ligand binding residue and pocket prediction from protein structures. Protein Sci 2022; 31:e4462. [PMID: 36190332 PMCID: PMC9667820 DOI: 10.1002/pro.4462] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/29/2022] [Revised: 09/27/2022] [Accepted: 09/28/2022] [Indexed: 12/13/2022]
Abstract
Knowledge of protein-ligand interactions is beneficial for biological process analysis and drug design. Given the complexity of the interactions and the inadequacy of experimental data, accurate ligand binding residue and pocket prediction remains challenging. In this study, we introduce an easy-to-use web server BindWeb for ligand-specific and ligand-general binding residue and pocket prediction from protein structures. BindWeb integrates a graph neural network GraphBind with a hybrid convolutional neural network and bidirectional long short-term memory network DELIA to identify binding residues. Furthermore, BindWeb clusters the predicted binding residues to binding pockets with mean shift clustering. The experimental results and case study demonstrate that BindWeb benefits from the complementarity of two base methods. BindWeb is freely available for academic use at http://www.csbio.sjtu.edu.cn/bioinf/BindWeb/.
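The pocket-forming step described here, grouping predicted binding residues with mean shift clustering, can be sketched in plain Python as follows. This is an illustrative flat-kernel mean shift, not BindWeb's implementation: the residue coordinates, the bandwidth value, and the rule for merging nearby modes are assumptions made for the example.

```python
import math


def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift: move each point to the mean of its neighbours
    until it stops moving, then merge modes that end up close together."""
    modes = []
    for p in points:
        center = tuple(p)
        for _ in range(max_iter):
            neighbors = [q for q in points if math.dist(center, q) <= bandwidth]
            new_center = tuple(sum(c) / len(neighbors) for c in zip(*neighbors))
            if math.dist(center, new_center) < tol:
                break
            center = new_center
        modes.append(center)

    # Assign a cluster label to each point by merging nearly identical modes.
    centers, labels = [], []
    for m in modes:
        for idx, c in enumerate(centers):
            if math.dist(m, c) < bandwidth / 2:
                labels.append(idx)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return labels, centers


# Hypothetical 3D coordinates (in angstroms) of predicted binding residues.
residues = [
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
    (10.0, 10.0, 10.0), (11.0, 10.0, 10.0), (10.0, 11.0, 10.0),
]
labels, centers = mean_shift(residues, bandwidth=3.0)
# Residues sharing a label would be reported as one candidate pocket.
```

Mean shift does not need the number of pockets in advance, which is one plausible reason to prefer it over k-means for this task; the bandwidth then controls how coarse the pockets are.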
Affiliation(s)
- Ying Xia
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, China
- Chunqiu Xia
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, China
- Xiaoyong Pan
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, China
- Hong-Bin Shen
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, China
|
9
|
Zhai W, Duan Y, Zhang X, Xu G, Li H, Shi J, Xu Z, Zhang X. Sequence and thermodynamic characteristics of terminators revealed by FlowSeq and the discrimination of terminators strength. Synth Syst Biotechnol 2022; 7:1046-1055. [PMID: 35845313 PMCID: PMC9257418 DOI: 10.1016/j.synbio.2022.06.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2022] [Revised: 06/11/2022] [Accepted: 06/11/2022] [Indexed: 11/24/2022]
Abstract
Intrinsic terminators in prokaryotes form RNA secondary structures and terminate transcription. However, leaky transcription is common because terminator strength varies. Apart from the characteristic hairpin and U-tract structure, the detailed sequence and thermodynamic features of terminators are not completely clear, and the effect of terminators on upstream gene expression is also unclear. Thus, it remains challenging to use terminators to control expression with high precision. Here, in E. coli, we first determined the effect of the 3′-end sequences, including spacer sequences and terminator sequences, on the expression of upstream and downstream genes. Second, a terminator mutation library was constructed, and the thermodynamic and sequence features associated with differences in termination efficiency were analyzed using the FlowSeq technique. The results showed that, under the regulation of terminators, the expression of upstream and downstream genes was negatively correlated (r=−0.60), and terminators with lower free energy correlated with higher upstream gene expression. Meanwhile, terminators with a longer stem, a more compact loop, and a perfect U-tract structure favored transcription termination. Finally, a terminator strength classification model was established, and a verification experiment based on 20 synthetic terminators indicated that the model can distinguish strong from weak terminators to a certain extent. The results help to elucidate the role of terminators in gene expression, the key factors identified are crucial for the rational design of terminators, and the model provides a method for terminator strength prediction.
Affiliation(s)
- Weiji Zhai
  - Biotechnology of Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, China
  - National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi, China
- Yanting Duan
  - Biotechnology of Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, China
  - National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi, China
- Xiaomei Zhang
  - School of Life Science and Health Engineering, Jiangnan University, Wuxi, 214122, China
  - Jiangsu Provincial Engineering Research Center for Bioactive Product Processing, Jiangnan University, 1800 Lihu Avenue, Wuxi, 214122, China
- Guoqiang Xu
  - Biotechnology of Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, China
  - National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi, China
- Hui Li
  - School of Life Science and Health Engineering, Jiangnan University, Wuxi, 214122, China
  - Jiangsu Provincial Engineering Research Center for Bioactive Product Processing, Jiangnan University, 1800 Lihu Avenue, Wuxi, 214122, China
- Jinsong Shi
  - School of Life Science and Health Engineering, Jiangnan University, Wuxi, 214122, China
  - Jiangsu Provincial Engineering Research Center for Bioactive Product Processing, Jiangnan University, 1800 Lihu Avenue, Wuxi, 214122, China
- Zhenghong Xu
  - Biotechnology of Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, China
  - National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi, China
- Xiaojuan Zhang
  - Biotechnology of Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, China
  - National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi, China
  - Corresponding author. Biotechnology of Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, China.
|
10
|
Arora V, Sanguinetti G. De novo prediction of RNA-protein interactions with graph neural networks. RNA (New York, N.Y.) 2022; 28:1469-1480. [PMID: 36008134 PMCID: PMC9745830 DOI: 10.1261/rna.079365.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2022] [Accepted: 08/17/2022] [Indexed: 06/15/2023]
Abstract
RNA-binding proteins (RBPs) are key co- and post-transcriptional regulators of gene expression, playing a crucial role in many biological processes. Experimental methods like CLIP-seq have enabled the identification of transcriptome-wide RNA-protein interactions for select proteins; however, the time- and resource-intensive nature of these technologies calls for the development of computational methods to complement their predictions. Here, we leverage recent, large-scale CLIP-seq experiments to construct a de novo predictor of RNA-protein interactions based on graph neural networks (GNN). We show that the GNN method allows us not only to predict missing links in an RNA-protein network, but also to predict the entire complement of targets of previously unassayed proteins, and even to reconstruct the entire network of RNA-protein interactions in different conditions based on minimal information. Our results demonstrate the potential of modern machine learning methods to extract useful information on post-transcriptional regulation from large data sets.
Affiliation(s)
- Viplove Arora
  - Data Science, Department of Physics, SISSA, Trieste 34136, Italy
|
11
|
Sadée C, Hagler LD, Becker WR, Jarmoskaite I, Vaidyanathan PP, Denny SK, Greenleaf WJ, Herschlag D. A comprehensive thermodynamic model for RNA binding by the Saccharomyces cerevisiae Pumilio protein PUF4. Nat Commun 2022; 13:4522. [PMID: 35927243 PMCID: PMC9352680 DOI: 10.1038/s41467-022-31968-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/01/2021] [Accepted: 07/07/2022] [Indexed: 11/12/2022]
Abstract
Genomic methods have been valuable for identifying RNA-binding proteins (RBPs) and the genes, pathways, and processes they regulate. Nevertheless, standard motif descriptions cannot be used to predict all RNA targets or test quantitative models for cellular interactions and regulation. We present a complete thermodynamic model for RNA binding to the S. cerevisiae Pumilio protein PUF4 derived from direct binding data for 6180 RNAs measured using the RNA on a massively parallel array (RNA-MaP) platform. The PUF4 model is highly similar to that of the related RBPs, human PUM2 and PUM1, with one marked exception: a single favorable site of base flipping for PUF4, such that PUF4 preferentially binds to a non-contiguous series of residues. These results are foundational for developing and testing cellular models of RNA-RBP interactions and function, for engineering RBPs, for understanding the biophysical nature of RBP binding and the evolutionary landscape of RNAs and RBPs.
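A thermodynamic binding model of this kind is built on the standard relation between a dissociation constant and a binding free energy, ΔG = RT ln(Kd / 1 M). The sketch below shows only that conversion and the resulting ΔΔG between two RNAs; it does not reproduce the PUF4 model itself, and the function names, the 25 °C temperature, and the example Kd values are assumptions for illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Binding free energy in kcal/mol from a dissociation constant in molar,
    using dG = RT ln(Kd / 1 M). Tighter binding gives a more negative value."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


def delta_delta_g(kd_variant, kd_reference, temperature_k=298.15):
    """Free energy penalty (kcal/mol) of a variant RNA relative to a reference."""
    return R_KCAL * temperature_k * math.log(kd_variant / kd_reference)


# Hypothetical example: a variant that binds 10-fold more weakly than a 1 nM reference.
dg_reference = binding_free_energy(1e-9)
ddg_variant = delta_delta_g(1e-8, 1e-9)  # positive: the variant binds less favorably
```

Working in ΔΔG rather than Kd is convenient because independent contributions add in free energy, whereas they multiply in Kd.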
Affiliation(s)
- Christoph Sadée
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
- Lauren D Hagler
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
- Winston R Becker
  - Biophysics Program, Stanford University School of Medicine, Stanford, CA, USA
- Inga Jarmoskaite
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Pavanapuresan P Vaidyanathan
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
  - Protillion Biosciences, Burlingame, CA, USA
- Sarah K Denny
  - Biophysics Program, Stanford University School of Medicine, Stanford, CA, USA
  - Scribe Therapeutics, Alameda, CA, USA
- William J Greenleaf
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
  - Department of Applied Physics, Stanford University, Stanford, CA, USA
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
- Daniel Herschlag
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
  - Department of Chemical Engineering, Stanford University, Stanford, CA, USA
  - ChEM-H Institute, Stanford University, Stanford, CA, USA
|
12
|
Duan Y, Zhang X, Zhai W, Zhang J, Zhang X, Xu G, Li H, Deng Z, Shi J, Xu Z. Deciphering the Rules of Ribosome Binding Site Differentiation in Context Dependence. ACS Synth Biol 2022; 11:2726-2740. [PMID: 35877551 DOI: 10.1021/acssynbio.2c00139] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
Abstract
The ribosome binding site (RBS) is a crucial element regulating translation. However, the activity of an RBS is poorly predictable, because it is strongly affected by the possible local secondary structure, that is, by context dependence. Using the Flowseq technique, over 20,000 RBS variants were sorted and sequenced, and the translation of multiple genes under the same RBS was quantitatively characterized to evaluate the context dependence of each RBS variant in E. coli. Two regions of the RBS, (-7 to -2) and (-17 to -12), were predicted to have a higher probability of pairing with each other, which slows down translation initiation. Associations between phenotypes and the intrinsic factors suspected to affect translation efficiency and context dependence of the RBS, including nucleotide bias at each position, free energy, and conservation, were disentangled. The results showed that translation efficiency was influenced more significantly by conservation of the SD region (-16 to -8), while an AC-rich spacer region (-7 to -1) was associated with low context dependence. We confirmed these characteristics using a series of synthesized RBSs. The average correlation between multiple reporters was significantly higher for RBSs with an AC-rich spacer (0.714) than for those with a GU-rich spacer (0.286). Overall, we proposed general design criteria to improve programmability and minimize context dependence of the RBS. The characteristics unraveled here can be adapted to other bacteria for fine-tuning target-gene expression.
Affiliation(s)
- Yanting Duan
- Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China.,National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi 214122, China
| | - Xiaojuan Zhang
- Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China.,National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi 214122, China
| | - Weiji Zhai
- Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China.,National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi 214122, China
| | - Jinpeng Zhang
- Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China.,National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi 214122, China
| | - Xiaomei Zhang
- School of Life Science and Health Engineering, Jiangnan University, Wuxi 214122, China.,Jiangsu Engineering Research Center for Bioactive Products Processing Technology, Jiangnan University, 1800 Lihu Avenue, Wuxi 214122, China
| | - Guoqiang Xu
- Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China.,National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi 214122, China
| | - Hui Li
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China
| | - Zhaohong Deng
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China
| | - Jinsong Shi
- School of Life Science and Health Engineering, Jiangnan University, Wuxi 214122, China.,Jiangsu Engineering Research Center for Bioactive Products Processing Technology, Jiangnan University, 1800 Lihu Avenue, Wuxi 214122, China
| | - Zhenghong Xu
- Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China.,National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi 214122, China
| |
|
13
|
Jiang HY, He J. Functional annotation of creeping bentgrass protein sequences based on convolutional neural network. BMC PLANT BIOLOGY 2022; 22:227. [PMID: 35501681 PMCID: PMC9063134 DOI: 10.1186/s12870-022-03607-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2021] [Accepted: 04/19/2022] [Indexed: 06/14/2023]
Abstract
BACKGROUND Creeping bentgrass (Agrostis stolonifera) is a perennial cool-season turfgrass of the Gramineae with poor disease resistance. Little is known so far about its induced systemic resistance (ISR) mechanism, especially the relevant functional proteins, which are important to disease resistance of turfgrass. Obtaining more information on the proteins of infected creeping bentgrass helps to understand the ISR mechanism. RESULTS Creeping bentgrass seedlings were grown with BDO treatment, and the ISR response was induced by infection with Rhizoctonia solani. High-quality protein sequences of creeping bentgrass seedlings were obtained. Some of the protein sequences were functionally annotated by database alignment, while a large part of the obtained protein sequences was left non-annotated. To treat the non-annotated sequences, a prediction model based on a convolutional neural network was established with a dataset from the UniProt database in three domains to achieve good performance, especially a higher false positive control rate. With the established model, the non-annotated protein sequences of creeping bentgrass were analyzed to annotate proteins relevant to disease-resistance response and signal transduction. CONCLUSIONS The prediction model based on a convolutional neural network was successfully applied to select good candidates for proteins with functions relevant to the ISR mechanism from the protein sequences that cannot be annotated by database alignment. The waste of sequence data can be avoided, and research time and labor will be saved in further molecular biology research on creeping bentgrass proteins. It also provides a reference for other sequence analyses in turfgrass disease-resistance research.
Affiliation(s)
- Han-Yu Jiang
- School of Physics and Technology, Nanjing Normal University, Nanjing, 210097, Jiangsu, China
- Sino-U.S. Center for Grazingland Ecosystem Sustainability/Pratacultural Engineering Laboratory of Gansu Province/ Key Laboratory of Grassland Ecosystem, Ministry of Education/College of Pratacultural Science, Gansu Agricultural University, Lanzhou, Gansu, 730070, China
| | - Jun He
- School of Physics and Technology, Nanjing Normal University, Nanjing, 210097, Jiangsu, China.
| |
|
14
|
Yamada K, Hamada M. Prediction of RNA-protein interactions using a nucleotide language model. BIOINFORMATICS ADVANCES 2022; 2:vbac023. [PMID: 36699410 PMCID: PMC9710633 DOI: 10.1093/bioadv/vbac023] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 07/26/2021] [Revised: 02/28/2022] [Accepted: 04/05/2022] [Indexed: 01/28/2023]
Abstract
Motivation The accumulation of sequencing data has enabled researchers to predict the interactions between RNA sequences and RNA-binding proteins (RBPs) using novel machine learning techniques. However, existing models are often difficult to interpret and require additional information to sequences. Bidirectional encoder representations from transformer (BERT) is a language-based deep learning model that is highly interpretable. Therefore, a model based on BERT architecture can potentially overcome such limitations. Results Here, we propose BERT-RBP as a model to predict RNA-RBP interactions by adapting the BERT architecture pretrained on a human reference genome. Our model outperformed state-of-the-art prediction models using the eCLIP-seq data of 154 RBPs. The detailed analysis further revealed that BERT-RBP could recognize both the transcript region type and RNA secondary structure only based on sequence information. Overall, the results provide insights into the fine-tuning mechanism of BERT in biological contexts and provide evidence of the applicability of the model to other RNA-related problems. Availability and implementation Python source codes are freely available at https://github.com/kkyamada/bert-rbp. The datasets underlying this article were derived from sources in the public domain: [RBPsuite (http://www.csbio.sjtu.edu.cn/bioinf/RBPsuite/), Ensembl Biomart (http://asia.ensembl.org/biomart/martview/)]. Supplementary information Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Keisuke Yamada
- Department of Electrical Engineering and Bioscience, School of Advanced Science and Engineering, Waseda University, Tokyo 169-8555, Japan
| | - Michiaki Hamada
- Department of Electrical Engineering and Bioscience, School of Advanced Science and Engineering, Waseda University, Tokyo 169-8555, Japan
- Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology (AIST), Okubo, Shinjuku, Tokyo 169-8555, Japan
- To whom correspondence should be addressed.
| |
|
15
|
Yu B, Wang X, Zhang Y, Gao H, Wang Y, Liu Y, Gao X. RPI-MDLStack: Predicting RNA-protein interactions through deep learning with stacking strategy and LASSO. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.108676] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 01/05/2023]
|
16
|
PRIP: A Protein-RNA Interface Predictor Based on Semantics of Sequences. Life (Basel) 2022; 12:life12020307. [PMID: 35207594 PMCID: PMC8879494 DOI: 10.3390/life12020307] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/05/2022] [Revised: 01/28/2022] [Accepted: 02/04/2022] [Indexed: 01/08/2023] Open
Abstract
RNA–protein interactions play an indispensable role in many biological processes. Growing evidence has indicated that aberration of the RNA–protein interaction is associated with many serious human diseases. The precise and quick detection of RNA–protein interactions is crucial to finding new functions and to uncovering the mechanism of interactions. Although many methods have been presented to recognize RNA-binding sites, there is much room left for the improvement of predictive accuracy. We present a sequence semantics-based method (called PRIP) for predicting RNA-binding interfaces. The PRIP extracted semantic embedding by pre-training the Word2vec with the corpus. Extreme gradient boosting was employed to train a classifier. The PRIP obtained a SN of 0.73 over the five-fold cross validation and a SN of 0.67 over the independent test, outperforming the state-of-the-art methods. Compared with other methods, this PRIP learned the hidden relations between words in the context. The analysis of the semantics relationship implied that the semantics of some words were specific to RNA-binding interfaces. This method is helpful to explore the mechanism of RNA–protein interactions from a semantics point of view.
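The abstract above treats sequences as text made of "words" before pre-training Word2vec on the resulting corpus. A minimal sketch of one common way to build such a corpus is shown here, assuming overlapping k-mers with stride 1 as the words; the abstract does not specify the word definition, and the function names, the value of `k`, and the toy sequences are hypothetical.

```python
from collections import Counter


def sequence_to_words(sequence, k=3):
    """Split a sequence into overlapping k-mer 'words' (stride 1)."""
    if k < 1:
        raise ValueError("k must be positive")
    sequence = sequence.upper()
    # A sequence shorter than k yields no words.
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def build_corpus(sequences, k=3):
    """Turn each sequence into one 'sentence' (a list of k-mer words)."""
    return [sequence_to_words(s, k) for s in sequences]


# Toy protein fragments, invented for illustration only.
toy_sequences = ["MKVLAA", "KVLA"]
corpus = build_corpus(toy_sequences, k=3)

# Word frequencies across the corpus, i.e. the vocabulary an embedding
# model would be trained on.
vocabulary = Counter(word for sentence in corpus for word in sentence)
```

A corpus in this list-of-token-lists shape is the usual input for word-embedding training, after which each word's vector could feed a downstream classifier such as gradient boosting.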
|
17
|
Xia C, Li Q, Cheng X, Wu T, Gao P, Gu Y. Insulin-like growth factor 2 mRNA-binding protein 2-stabilized long non-coding RNA Taurine up-regulated gene 1 (TUG1) promotes cisplatin-resistance of colorectal cancer via modulating autophagy. Bioengineered 2022; 13:2450-2469. [PMID: 35014946 PMCID: PMC8973703 DOI: 10.1080/21655979.2021.2012918] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Received: 10/12/2021] [Revised: 11/26/2021] [Accepted: 11/27/2021] [Indexed: 12/13/2022] Open
Abstract
Long non-coding RNAs (lncRNAs) have been demonstrated to influence the chemoresistance of colorectal cancer (CRC). This study was therefore designed to investigate the regulatory function and mechanism of Taurine up-regulated gene 1 (TUG1) in the cisplatin (DDP) resistance of CRC. qRT-PCR measured the expression of TUG1, Insulin-like growth factor 2 mRNA-binding protein 2 (IGF2BP2), and miR-195-5p in CRC tissues and cells. A TUG1 or miR-195-5p overexpression model was engineered in CRC cells, followed by treatment with DDP or the autophagy inhibitor chloroquine (CQ). CCK8 (Cell Counting Kit-8) and colony formation assays monitored cell proliferation. Flow cytometry examined apoptosis, Transwell assays tracked migration and invasion, and Western blot determined the levels of autophagy proteins (LC3I/LC3II and Beclin1) and the HDGF/DDX5/β-catenin pathway. Dual-luciferase reporter assays and RNA immunoprecipitation confirmed the binding between TUG1 and miR-195-5p and between miR-195-5p and HDGF. Furthermore, in vivo experiments in nude mice probed the function and mechanism of IGF2BP2 in CRC cell growth. The levels of TUG1 and IGF2BP2 were elevated in CRC tissues, and IGF2BP2 enhanced TUG1 expression in CRC cells. TUG1 activated autophagy to facilitate CRC cells' resistance to DDP. TUG1 targets miR-195-5p, and miR-195-5p targets HDGF. Overexpression of miR-195-5p abated the cancer-promoting function of TUG1 and curbed the HDGF/DDX5/β-catenin axis. TUG1 stabilized by IGF2BP2 boosted CRC cell proliferation, migration, and autophagy via the miR-195-5p/HDGF/DDX5/β-catenin axis, hence enhancing CRC cells' resistance to DDP.
Affiliation(s)
- Cuifeng Xia
- Department of Colorectal Surgery, The Third Affiliated Hospital of Kunming Medical University (Yunnan Cancer Hospital), Kunming, Yunnan, China
| | - Qiang Li
- Department of Colorectal Surgery, The Third Affiliated Hospital of Kunming Medical University (Yunnan Cancer Hospital), Kunming, Yunnan, China
| | - Xianshuo Cheng
- Department of Colorectal Surgery, The Third Affiliated Hospital of Kunming Medical University (Yunnan Cancer Hospital), Kunming, Yunnan, China
| | - Tao Wu
- Department of Colorectal Surgery, The Third Affiliated Hospital of Kunming Medical University (Yunnan Cancer Hospital), Kunming, Yunnan, China
| | - Pin Gao
- Department of Colorectal Surgery, The Third Affiliated Hospital of Kunming Medical University (Yunnan Cancer Hospital), Kunming, Yunnan, China
| | - Yongfang Gu
- Department of Hepatobiliary Surgery, The Second People’s Hospital of Qujing, Qujing, Yunnan, China
| |
|
18
|
Arora V, Sanguinetti G. Challenges for machine learning in RNA-protein interaction prediction. Stat Appl Genet Mol Biol 2022; 21:sagmb-2021-0087. [PMID: 35073469 DOI: 10.1515/sagmb-2021-0087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2021] [Accepted: 01/02/2022] [Indexed: 11/15/2022]
Abstract
RNA-protein interactions have long been recognised as crucial regulators of gene expression. Recently, the development of scalable experimental techniques to measure these interactions has revolutionised the field, leading to the production of large-scale datasets which offer both opportunities and challenges for machine learning techniques. In this brief note, we will discuss some of the major stumbling blocks towards the use of machine learning in computational RNA biology, focusing specifically on the problem of predicting RNA-protein interactions from next-generation sequencing data.
Affiliation(s)
- Viplove Arora
- Data Science, Department of Physics, International School for Advanced Studies (SISSA), Trieste 34136, Italy
| | - Guido Sanguinetti
- Data Science, Department of Physics, International School for Advanced Studies (SISSA), Trieste 34136, Italy
| |
|
19
|
Wei J, Chen S, Zong L, Gao X, Li Y. Protein-RNA interaction prediction with deep learning: structure matters. Brief Bioinform 2022; 23:bbab540. [PMID: 34929730 PMCID: PMC8790951 DOI: 10.1093/bib/bbab540] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 09/29/2021] [Revised: 11/14/2021] [Accepted: 11/22/2021] [Indexed: 12/11/2022] Open
Abstract
Protein-RNA interactions are of vital importance to a variety of cellular activities. Both experimental and computational techniques have been developed to study the interactions. Because of the limitation of the previous database, especially the lack of protein structure data, most of the existing computational methods rely heavily on the sequence data, with only a small portion of the methods utilizing the structural information. Recently, AlphaFold has revolutionized the entire protein and biology field. Foreseeably, the protein-RNA interaction prediction will also be promoted significantly in the upcoming years. In this work, we give a thorough review of this field, surveying both the binding site and binding preference prediction problems and covering the commonly used datasets, features and models. We also point out the potential challenges and opportunities in this field. This survey summarizes the development of the RNA-binding protein-RNA interaction field in the past and foresees its future development in the post-AlphaFold era.
Affiliation(s)
- Junkang Wei
- Department of Computer Science and Engineering (CSE), The Chinese University of Hong Kong (CUHK), 999077, Hong Kong SAR, China
| | - Siyuan Chen
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), 23955-6900, Thuwal, Saudi Arabia
| | - Licheng Zong
- Department of Computer Science and Engineering (CSE), The Chinese University of Hong Kong (CUHK), 999077, Hong Kong SAR, China
| | - Xin Gao
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), 23955-6900, Thuwal, Saudi Arabia
| | - Yu Li
- Department of Computer Science and Engineering (CSE), The Chinese University of Hong Kong (CUHK), 999077, Hong Kong SAR, China
- The CUHK Shenzhen Research Institute, Hi-Tech Park, 518057, Shenzhen, China
| |
|
20
|
Zhu W, Chen X, Guo X, Liu H, Ma R, Wang Y, Liang Y, Sun Y, Wang M, Zhao R, Gao P. Low glucose-induced overexpression of HOXC-AS3 promotes metabolic reprogramming of breast cancer. Cancer Res 2022; 82:805-818. [PMID: 35031573 DOI: 10.1158/0008-5472.can-21-1179] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 04/15/2021] [Revised: 08/08/2021] [Accepted: 01/03/2022] [Indexed: 11/16/2022]
Affiliation(s)
- Wenjie Zhu
- Key Laboratory for Experimental Teratology of Ministry of Education, Department of Pathology, School of Basic Medical Sciences, Shandong University, Jinan, Shandong, China
- Department of Pathology, Qilu Hospital, Shandong University, Jinan, Shandong, China
| | - Xu Chen
- Key Laboratory for Experimental Teratology of Ministry of Education, Department of Pathology, School of Basic Medical Sciences, Shandong University, Jinan, Shandong, China
- Department of Pathology, Qilu Hospital, Shandong University, Jinan, Shandong, China
| | - Xiangyu Guo
- Key Laboratory for Experimental Teratology of Ministry of Education, Department of Pathology, School of Basic Medical Sciences, Shandong University, Jinan, Shandong, China
- Department of Pathology, Qilu Hospital, Shandong University, Jinan, Shandong, China
| | - Haiting Liu
- Key Laboratory for Experimental Teratology of Ministry of Education, Department of Pathology, School of Basic Medical Sciences, Shandong University, Jinan, Shandong, China
- Department of Pathology, Qilu Hospital, Shandong University, Jinan, Shandong, China
| | - Ranran Ma
- Key Laboratory for Experimental Teratology of Ministry of Education, Department of Pathology, School of Basic Medical Sciences, Shandong University, Jinan, Shandong, China
- Department of Pathology, Qilu Hospital, Shandong University, Jinan, Shandong, China
| | - Yawen Wang
- Key Laboratory for Experimental Teratology of Ministry of Education, Department of Pathology, School of Basic Medical Sciences, Shandong University, Jinan, Shandong, China
| | - Yahang Liang
- Key Laboratory for Experimental Teratology of Ministry of Education, Department of Pathology, School of Basic Medical Sciences, Shandong University, Jinan, Shandong, China
| | - Ying Sun
- Key Laboratory for Experimental Teratology of Ministry of Education, Department of Pathology, School of Basic Medical Sciences, Shandong University, Jinan, Shandong, China
| | - Mengqi Wang
- Key Laboratory for Experimental Teratology of Ministry of Education, Department of Pathology, School of Basic Medical Sciences, Shandong University, Jinan, Shandong, China
- Department of Pathology, Qilu Hospital, Shandong University, Jinan, Shandong, China
| | - Ruinan Zhao
- Key Laboratory for Experimental Teratology of Ministry of Education, Department of Pathology, School of Basic Medical Sciences, Shandong University, Jinan, Shandong, China
- Department of Pathology, Qilu Hospital, Shandong University, Jinan, Shandong, China
| | - Peng Gao
- Key Laboratory for Experimental Teratology of Ministry of Education, Department of Pathology, School of Basic Medical Sciences, Shandong University, Jinan, Shandong, China
- Department of Pathology, Qilu Hospital, Shandong University, Jinan, Shandong, China
| |
|
21
|
Niu M, Zou Q, Lin C. CRBPDL: Identification of circRNA-RBP interaction sites using an ensemble neural network approach. PLoS Comput Biol 2022; 18:e1009798. [PMID: 35051187 PMCID: PMC8806072 DOI: 10.1371/journal.pcbi.1009798] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 09/23/2021] [Revised: 02/01/2022] [Accepted: 01/02/2022] [Indexed: 02/06/2023] Open
Abstract
Circular RNAs (circRNAs) are non-coding RNAs with a special circular structure formed by the reverse splicing mechanism. Increasing evidence shows that circular RNAs can directly bind to RNA-binding proteins (RBPs) and play an important role in a variety of biological activities. The interactions between circRNAs and RBPs are key to comprehending the mechanism of posttranscriptional regulation. Accurately identifying binding sites is very useful for analyzing interactions. In past research, some predictors based on machine learning (ML) have been presented, but prediction accuracy still needs to be improved. Therefore, we present a novel computational model, CRBPDL, which uses an Adaboost integrated deep hierarchical network to identify the binding sites of circular RNA-RBP. CRBPDL combines five different feature encoding schemes to encode the original RNA sequence and uses deep multiscale residual networks (MSRN) and bidirectional gated recurrent units (BiGRUs) to effectively learn high-level feature representations, extracting local and global context information at the same time. Additionally, a self-attention mechanism is employed to improve the robustness of CRBPDL. Finally, the Adaboost algorithm is applied to integrate the deep learning (DL) models to improve the prediction performance and reliability of the model. To verify the usefulness of CRBPDL, we compared its performance with state-of-the-art methods on 37 circular RNA data sets and 31 linear RNA data sets. The results show that CRBPDL performs universally, reliably, and robustly. The code and data sets are obtainable at https://github.com/nmt315320/CRBPDL.git. More and more evidence shows that circular RNA can directly bind to proteins and participate in countless different biological processes. The computational method can quickly and accurately predict the binding sites of circular RNA and RBPs.
In order to identify the interaction of circRNA with 37 different types of circRNA binding proteins, we developed an integrated deep learning network based on hierarchical network, called CRBPDL. It can effectively learn high-level feature representations. The performance of the model was verified through comparative experiments of different feature extraction algorithms, different deep learning models and classifier models. Moreover, the CRBPDL model was applied to 31 linear RNAs, and the effectiveness of our method was proved by comparison with the results of current excellent algorithms. It is expected that the CRBPDL model can effectively predict the binding site of circular RNA-RBP and provide reliable candidates for further biological experiments.
Affiliation(s)
- Mengting Niu
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang, China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang, China
| | - Chen Lin
- School of Informatics, Xiamen University, Xiamen, China
| |
|
22
|
Marques-Pereira C, Pires M, Moreira IS. Discovery of Virus-Host interactions using bioinformatic tools. Methods Cell Biol 2022; 169:169-198. [DOI: 10.1016/bs.mcb.2022.02.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
23
|
LPI-HyADBS: a hybrid framework for lncRNA-protein interaction prediction integrating feature selection and classification. BMC Bioinformatics 2021; 22:568. [PMID: 34836494 PMCID: PMC8620196 DOI: 10.1186/s12859-021-04485-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 08/24/2021] [Accepted: 11/09/2021] [Indexed: 12/03/2022] Open
Abstract
Background Long noncoding RNAs (lncRNAs) have dense linkages with a plethora of important cellular activities. lncRNAs exert functions by linking with corresponding RNA-binding proteins. Since experimental techniques to detect lncRNA-protein interactions (LPIs) are laborious and time-consuming, a few computational methods have been reported for LPI prediction. However, computation-based LPI identification methods have the following limitations: (1) Most methods were evaluated on a single dataset, and researchers may thus fail to measure their generalization ability. (2) The majority of methods were validated under cross validation on lncRNA-protein pairs and did not investigate performance under other cross validations, especially cross validation on independent lncRNAs and independent proteins. (3) lncRNAs and proteins carry abundant biological information, and how to select informative features needs further investigation. Results Under a hybrid framework (LPI-HyADBS) integrating feature selection based on AdaBoost and classification models including a deep neural network (DNN), extreme gradient boosting (XGBoost), and an SVM with a penalty coefficient of misclassification (C-SVM), this work focuses on finding new LPIs. First, five datasets are arranged. Each dataset contains lncRNA sequences, protein sequences, and an LPI network. Second, biological features of lncRNAs and proteins are acquired based on Pyfeat. Third, the obtained features of lncRNAs and proteins are selected based on AdaBoost and concatenated to depict each LPI sample. Fourth, DNN, XGBoost, and C-SVM are used to classify lncRNA-protein pairs based on the concatenated features. Finally, a hybrid framework is developed to integrate the classification results from the above three classifiers.
LPI-HyADBS is compared to six classical LPI prediction approaches (LPI-SKF, LPI-NRLMF, Capsule-LPI, LPI-CNNCP, LPLNP, and LPBNI) on five datasets under 5-fold cross validations on lncRNAs, proteins, lncRNA-protein pairs, and independent lncRNAs and independent proteins. The results show LPI-HyADBS has the best LPI prediction performance under four different cross validations. In particular, LPI-HyADBS obtains better classification ability than other six approaches under the constructed independent dataset. Case analyses suggest that there is relevance between ZNF667-AS1 and Q15717. Conclusions Integrating feature selection approach based on AdaBoost, three classification techniques including DNN, XGBoost, and C-SVM, this work develops a hybrid framework to identify new linkages between lncRNAs and proteins. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-021-04485-x.
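The abstract above distinguishes cross validation on lncRNA-protein pairs from cross validation on independent lncRNAs or independent proteins. A minimal sketch of the latter idea follows, assuming it means that every lncRNA (or protein) in a test fold is absent from the matching training set; this is one reading of the abstract and not the authors' code, and the function names and toy identifiers are hypothetical.

```python
def grouped_folds(pairs, key_index, n_folds):
    """Assign pairs to folds so each entity falls in exactly one fold.

    key_index=0 groups by lncRNA, key_index=1 groups by protein.
    """
    if n_folds < 2:
        raise ValueError("n_folds must be at least 2")
    entities = sorted({pair[key_index] for pair in pairs})
    # Deterministic round-robin assignment of entities to folds.
    fold_of = {entity: i % n_folds for i, entity in enumerate(entities)}
    folds = [[] for _ in range(n_folds)]
    for pair in pairs:
        folds[fold_of[pair[key_index]]].append(pair)
    return folds


def independent_splits(pairs, key_index, n_folds):
    """Yield (train, test) lists with no shared entity at key_index."""
    folds = grouped_folds(pairs, key_index, n_folds)
    for i, test in enumerate(folds):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        yield train, test


# Toy (lncRNA, protein) pairs, invented for illustration only.
toy_pairs = [
    ("lnc1", "P1"), ("lnc1", "P2"), ("lnc2", "P1"),
    ("lnc3", "P3"), ("lnc4", "P2"), ("lnc5", "P1"),
]
splits = list(independent_splits(toy_pairs, key_index=0, n_folds=3))
```

Splitting on pairs alone can leave the same lncRNA on both sides of a split, so a grouped split of this kind is a stricter test of generalization to unseen molecules.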
|
24
|
Ma Y, Wang X, Luo W, Xiao J, Song X, Wang Y, Shuai H, Ren Z, Wang Y. Roles of Emerging RNA-Binding Activity of cGAS in Innate Antiviral Response. Front Immunol 2021; 12:741599. [PMID: 34899698 PMCID: PMC8660693 DOI: 10.3389/fimmu.2021.741599] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 07/15/2021] [Accepted: 10/25/2021] [Indexed: 12/12/2022] Open
Abstract
cGAS, a DNA sensor in mammalian cells, catalyzes the generation of 2'-3'-cyclic AMP-GMP (cGAMP) once activated by the binding of free DNA. cGAMP can bind to STING, activating downstream TBK1-IRF-3 signaling to initiate the expression of type I interferons. Although cGAS has been considered a traditional DNA-binding protein, several lines of evidence suggest that cGAS is a potential RNA-binding protein (RBP), which is mainly supported by its interactions with RNAs, RBP partners, RNA/cGAS-phase-separations as well as its structural similarity with the dsRNA recognition receptor 2'-5' oligoadenylate synthase. Moreover, two influential studies reported that the cGAS-like receptors (cGLRs) of the fly Drosophila melanogaster sense RNA and control 3'-2'-cGAMP signaling. In this review, we summarize and discuss in depth recent studies that identified or implied cGAS as an RBP. We also comprehensively summarize current experimental methods and computational tools that can identify or predict RNAs that bind to cGAS. Based on these discussions, we argue that the RNA-binding activity of cGAS cannot be ignored in the cGAS-mediated innate antiviral response. It will be important to identify RNAs that can bind and regulate the activity of cGAS in cells with or without virus infection. Our review provides novel insight into the regulation of cGAS by its RNA-binding activity and extends beyond its DNA-binding activity. Our review would be significant for understanding the precise modulation of cGAS activity, providing the foundation for the future development of drugs against cGAS-triggering autoimmune diseases such as Aicardi-Goutières syndrome.
Affiliation(s)
- Yuying Ma
- Guangzhou Jinan Biomedicine Research and Development Center, National Engineering Research Center of Genetic Medicine, Institute of Biomedicine, College of Life Science and Technology, Jinan University, Guangzhou, China
- Key Laboratory of Virology of Guangdong Province, Jinan University, Guangzhou, China
- Guangdong Province Key Laboratory of Bioengineering Medicine, Jinan University, Guangzhou, China
| | - Xiaohui Wang
- Guangzhou Jinan Biomedicine Research and Development Center, National Engineering Research Center of Genetic Medicine, Institute of Biomedicine, College of Life Science and Technology, Jinan University, Guangzhou, China
- Key Laboratory of Virology of Guangdong Province, Jinan University, Guangzhou, China
- Guangdong Province Key Laboratory of Bioengineering Medicine, Jinan University, Guangzhou, China
| | - Weisheng Luo
- Guangzhou Jinan Biomedicine Research and Development Center, National Engineering Research Center of Genetic Medicine, Institute of Biomedicine, College of Life Science and Technology, Jinan University, Guangzhou, China
- Key Laboratory of Virology of Guangdong Province, Jinan University, Guangzhou, China
- Guangdong Province Key Laboratory of Bioengineering Medicine, Jinan University, Guangzhou, China
| | - Ji Xiao
- Guangzhou Jinan Biomedicine Research and Development Center, National Engineering Research Center of Genetic Medicine, Institute of Biomedicine, College of Life Science and Technology, Jinan University, Guangzhou, China
- Key Laboratory of Virology of Guangdong Province, Jinan University, Guangzhou, China
- Guangdong Province Key Laboratory of Bioengineering Medicine, Jinan University, Guangzhou, China
| | - Xiaowei Song
- Guangzhou Jinan Biomedicine Research and Development Center, National Engineering Research Center of Genetic Medicine, Institute of Biomedicine, College of Life Science and Technology, Jinan University, Guangzhou, China
- Key Laboratory of Virology of Guangdong Province, Jinan University, Guangzhou, China
- Guangdong Province Key Laboratory of Bioengineering Medicine, Jinan University, Guangzhou, China
| | - Yifei Wang
- Guangzhou Jinan Biomedicine Research and Development Center, National Engineering Research Center of Genetic Medicine, Institute of Biomedicine, College of Life Science and Technology, Jinan University, Guangzhou, China
- Key Laboratory of Virology of Guangdong Province, Jinan University, Guangzhou, China
- Guangdong Province Key Laboratory of Bioengineering Medicine, Jinan University, Guangzhou, China
| | - Hanlin Shuai
- Department of Obstetrics and Gynecology, The Fifth Affiliated Hospital of Jinan University, Heyuan, China
| | - Zhe Ren
- Guangzhou Jinan Biomedicine Research and Development Center, National Engineering Research Center of Genetic Medicine, Institute of Biomedicine, College of Life Science and Technology, Jinan University, Guangzhou, China
- Key Laboratory of Virology of Guangdong Province, Jinan University, Guangzhou, China
- Guangdong Province Key Laboratory of Bioengineering Medicine, Jinan University, Guangzhou, China
| | - Yiliang Wang
- Guangzhou Jinan Biomedicine Research and Development Center, National Engineering Research Center of Genetic Medicine, Institute of Biomedicine, College of Life Science and Technology, Jinan University, Guangzhou, China
- Key Laboratory of Virology of Guangdong Province, Jinan University, Guangzhou, China
- Guangdong Province Key Laboratory of Bioengineering Medicine, Jinan University, Guangzhou, China
- State Key Laboratory of Respiratory Disease, National Clinical Research Center for Respiratory Disease, Guangzhou Institute of Respiratory Health, the First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
|
25
|
Uhl M, Tran VD, Heyl F, Backofen R. RNAProt: an efficient and feature-rich RNA binding protein binding site predictor. Gigascience 2021; 10:giab054. [PMID: 34406415 PMCID: PMC8372218 DOI: 10.1093/gigascience/giab054] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/02/2021] [Revised: 05/18/2021] [Accepted: 07/27/2021] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND Cross-linking and immunoprecipitation followed by next-generation sequencing (CLIP-seq) is the state-of-the-art technique used to experimentally determine transcriptome-wide binding sites of RNA-binding proteins (RBPs). However, it relies on gene expression, which can be highly variable between conditions, and thus cannot provide a complete picture of the RBP binding landscape. This creates a demand for computational methods to predict missing binding sites. Although various methods exist that use traditional machine learning and, more recently, deep learning, we encountered several problems: many of these are not well documented or maintained, making them difficult to install and use, or are not available at all. In addition, there can be efficiency issues, as well as little flexibility regarding options or supported features. RESULTS Here, we present RNAProt, an efficient and feature-rich computational RBP binding site prediction framework based on recurrent neural networks. We compare RNAProt with 1 traditional machine learning approach and 2 deep-learning methods, demonstrating its state-of-the-art predictive performance and better run time efficiency. We further show that its implemented visualizations capture known binding preferences and thus can help to understand what is learned. Since RNAProt supports various additional features (including user-defined features, which no other tool offers), we also present their influence on benchmark set performance. Finally, we show the benefits of incorporating additional features, specifically structure information, when learning the binding sites of a hairpin loop-binding RBP. CONCLUSIONS RNAProt provides a complete framework for RBP binding site predictions, from data set generation through model training to the evaluation of binding preferences and prediction. It offers state-of-the-art predictive performance, as well as superior run time efficiency, while at the same time supporting more features and input types than any other tool available so far. RNAProt is easy to install and use, comes with comprehensive documentation, and is accompanied by informative statistics and visualizations. All this makes RNAProt a valuable tool to apply in future RBP binding site research.
Affiliation(s)
- Michael Uhl
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Van Dinh Tran
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Florian Heyl
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Rolf Backofen
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
- Signalling Research Centres BIOSS and CIBSS, University of Freiburg, Schaenzlestr. 18, 79104 Freiburg, Germany
|
26
|
Wu H, Pan X, Yang Y, Shen HB. Recognizing binding sites of poorly characterized RNA-binding proteins on circular RNAs using attention Siamese network. Brief Bioinform 2021; 22:6326526. [PMID: 34297803 DOI: 10.1093/bib/bbab279] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 05/03/2021] [Revised: 06/04/2021] [Accepted: 07/01/2021] [Indexed: 12/24/2022] Open
Abstract
Circular RNAs (circRNAs) interact with RNA-binding proteins (RBPs) to play crucial roles in gene regulation and disease development. Computational approaches that quickly predict high-potential RBP binding sites on circRNAs from statistical knowledge of sequence or structure binding have attracted much attention. Deep learning is one of the popular learning models in this area but usually requires a lot of labeled training data, so it performs unsatisfactorily for less characterized RBPs with a limited number of known target circRNAs. Improving prediction performance for such poorly characterized RBPs with few labeled samples is a challenging task for deep learning-based models. In this study, we propose an RBP-specific method, iDeepC, for predicting RBP binding sites on circRNAs from sequences. It adopts a Siamese neural network consisting of a lightweight attention module and a metric module. We have found that the Siamese neural network effectively enhances the network's capability of capturing mutual information between circRNAs with pairwise metric learning. To further deal with the small-sample-size problem, we pretrained the model using available labeled data from other RBPs and demonstrate the efficacy of this transfer-learning pipeline. We comprehensively evaluated iDeepC on the benchmark datasets of RBP-binding circRNAs, and the results suggest that iDeepC achieves promising results on the poorly characterized RBPs. The source code is available at https://github.com/hehew321/iDeepC.
Affiliation(s)
- Hehe Wu
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
| | - Xiaoyong Pan
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
| | - Yang Yang
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
| | - Hong-Bin Shen
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
|
27
|
Zooming in on protein-RNA interactions: a multi-level workflow to identify interaction partners. Biochem Soc Trans 2021; 48:1529-1543. [PMID: 32820806 PMCID: PMC7458403 DOI: 10.1042/bst20191059] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 06/12/2020] [Revised: 07/17/2020] [Accepted: 07/20/2020] [Indexed: 02/01/2023]
Abstract
Interactions between proteins and RNA are at the base of numerous cellular regulatory and functional phenomena. The investigation of the biological relevance of non-coding RNAs has led to the identification of numerous novel RNA-binding proteins (RBPs). However, defining the RNA sequences and structures that are selectively recognised by an RBP remains challenging, since these interactions can be transient and highly dynamic, and may be mediated by unstructured regions in the protein, as in the case of many non-canonical RBPs. Numerous experimental and computational methodologies have been developed to predict, identify and verify the binding between a given RBP and potential RNA partners, but navigating across the vast ocean of data can be frustrating and misleading. In this mini-review, we propose a workflow for the identification of the RNA binding partners of putative, newly identified RBPs. The large pool of potential binders selected by in-cell experiments can be enriched by in silico tools such as catRAPID, which is able to predict the RNA sequences more likely to interact with specific RBP regions with high accuracy. The RNA candidates with the highest potential can then be analysed in vitro to determine the binding strength and to precisely identify the binding sites. The results thus obtained can furthermore validate the computational predictions, offering an all-round solution to the issue of finding the most likely RNA binding partners for a newly identified potential RBP.
|
28
|
Yang S, Liu X, Ng RT. ProbeRating: a recommender system to infer binding profiles for nucleic acid-binding proteins. Bioinformatics 2021; 36:4797-4804. [PMID: 32573679 PMCID: PMC7750938 DOI: 10.1093/bioinformatics/btaa580] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/12/2019] [Revised: 05/18/2020] [Accepted: 06/18/2020] [Indexed: 12/15/2022] Open
Abstract
MOTIVATION The interaction between proteins and nucleic acids plays a crucial role in gene regulation and cell function. Determining the binding preferences of nucleic acid-binding proteins (NBPs), namely RNA-binding proteins (RBPs) and transcription factors (TFs), is the key to deciphering the protein-nucleic acid interaction code. Today, available NBP binding data from in vivo or in vitro experiments are still limited, which leaves a large portion of NBPs uncovered. Unfortunately, existing computational methods that model NBP binding preferences are mostly protein specific: they need experimental data for the specific protein of interest, and thus only focus on experimentally characterized NBPs. The binding preferences of experimentally unexplored NBPs remain largely unknown. RESULTS Here, we introduce ProbeRating, a nucleic acid recommender system that utilizes techniques from deep learning and word embeddings of natural language processing. ProbeRating is developed to predict binding profiles for unexplored or poorly studied NBPs by exploiting their homologous NBPs that currently have available binding data. Requiring only sequence information as input, ProbeRating adapts FastText from Facebook AI Research to extract biological features. It then builds a neural network-based recommender system. We evaluate the performance of ProbeRating on two different tasks: one for RBPs and one for TFs. ProbeRating outperforms previous methods on both tasks. The results show that ProbeRating can be a useful tool to study the binding mechanism for the many NBPs that lack direct experimental evidence. AVAILABILITY AND IMPLEMENTATION The source code is freely available at <https://github.com/syang11/ProbeRating>. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Shu Yang
- Department of Computer Science, University of British Columbia, Vancouver, BC V6T1Z4, Canada
| | - Xiaoxi Liu
- RIKEN Center for Integrative Medical Sciences (IMS), Yokohama 230-0045, Japan
| | - Raymond T Ng
- Department of Computer Science, University of British Columbia, Vancouver, BC V6T1Z4, Canada
|
29
|
Lucero L, Ferrero L, Fonouni-Farde C, Ariel F. Functional classification of plant long noncoding RNAs: a transcript is known by the company it keeps. THE NEW PHYTOLOGIST 2021; 229:1251-1260. [PMID: 32880949 DOI: 10.1111/nph.16903] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 04/01/2020] [Accepted: 08/05/2020] [Indexed: 05/27/2023]
Abstract
The extraordinary maturation of high-throughput sequencing technologies has revealed the existence of a complex network of transcripts in eukaryotic organisms, including thousands of long noncoding (lnc) RNAs with little or no protein-coding capacity. Subsequent discoveries have shown that lncRNAs participate in a wide range of molecular processes, controlling gene expression and protein activity through direct interactions with proteins, DNA or other RNA molecules. Although significant advances have been achieved in the understanding of lncRNA biology in the animal kingdom, the functional characterization of plant lncRNAs is still in its infancy and remains a major challenge. In this review, we report emerging functional and mechanistic paradigms of plant lncRNAs and partner molecules, and discuss how cutting-edge technologies may help to identify and classify as yet uncharacterized transcripts into functional groups.
Affiliation(s)
- Leandro Lucero
- Instituto de Agrobiotecnología del Litoral, CONICET, Universidad Nacional del Litoral, Colectora Ruta Nacional 168 km 0, Santa Fe, 3000, Argentina
| | - Lucía Ferrero
- Instituto de Agrobiotecnología del Litoral, CONICET, Universidad Nacional del Litoral, Colectora Ruta Nacional 168 km 0, Santa Fe, 3000, Argentina
| | - Camille Fonouni-Farde
- Instituto de Agrobiotecnología del Litoral, CONICET, Universidad Nacional del Litoral, Colectora Ruta Nacional 168 km 0, Santa Fe, 3000, Argentina
| | - Federico Ariel
- Instituto de Agrobiotecnología del Litoral, CONICET, Universidad Nacional del Litoral, Colectora Ruta Nacional 168 km 0, Santa Fe, 3000, Argentina
|
30
|
Yuan L, Yang Y. DeCban: Prediction of circRNA-RBP Interaction Sites by Using Double Embeddings and Cross-Branch Attention Networks. Front Genet 2021; 11:632861. [PMID: 33552144 PMCID: PMC7862712 DOI: 10.3389/fgene.2020.632861] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/24/2020] [Accepted: 12/23/2020] [Indexed: 12/17/2022] Open
Abstract
Circular RNAs (circRNAs), as a rising star in the RNA world, play important roles in various biological processes. Understanding the interactions between circRNAs and RNA binding proteins (RBPs) can help reveal the functions of circRNAs. For the past decade, the emergence of high-throughput experimental data, like CLIP-Seq, has made the computational identification of RNA-protein interactions (RPIs) possible based on machine learning methods. However, as the underlying mechanisms of RPIs have not been fully understood yet and the information sources of circRNAs are limited, the computational tools for predicting circRNA-RBP interactions have been very few. In this study, we propose a deep learning method to identify circRNA-RBP interactions, called DeCban, which is featured by hybrid double embeddings for representing RNA sequences and a cross-branch attention neural network for classification. To capture more information from RNA sequences, the double embeddings include pre-trained embedding vectors for both RNA segments and their converted amino acids. Meanwhile, the cross-branch attention network aims to address the learning of very long sequences by integrating features of different scales and focusing on important information. The experimental results on 37 benchmark datasets show that both double embeddings and the cross-branch attention model contribute to the improvement of performance. DeCban outperforms the mainstream deep learning-based methods on not only prediction accuracy but also computational efficiency. The data sets and source code of this study are freely available at: https://github.com/AaronYll/DECban.
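The "double embeddings" idea above pairs two views of the same RNA: nucleotide segments and the amino acids they convert to. The abstract does not say how the conversion or segmentation is done, so the sketch below assumes overlapping k-mers for the RNA view and standard-genetic-code codon translation in reading frame 0 for the amino acid view. The function names `rna_segments` and `to_amino_acids` are hypothetical and are not taken from the DeCban code base.

```python
from typing import List

# Standard genetic code, enumerated in U, C, A, G order for each codon position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def normalize(rna: str) -> str:
    """Upper-case the sequence and map DNA-style T to U."""
    return rna.upper().replace("T", "U")


def rna_segments(rna: str, k: int = 3, stride: int = 1) -> List[str]:
    """RNA view: overlapping k-mers (k and stride are assumed values)."""
    rna = normalize(rna)
    return [rna[i:i + k] for i in range(0, len(rna) - k + 1, stride)]


def to_amino_acids(rna: str) -> str:
    """Amino acid view: translate frame 0 (assumption), dropping a trailing partial codon.

    Codons containing non-UCAG symbols map to 'X'; stop codons map to '*'.
    """
    rna = normalize(rna)
    usable = len(rna) - len(rna) % 3
    return "".join(CODON_TABLE.get(rna[i:i + 3], "X") for i in range(0, usable, 3))
```

In a model of this kind, each view would then be mapped to pre-trained embedding vectors before being passed to the network; that step is omitted here because the abstract gives no details about it.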
Affiliation(s)
- Liangliang Yuan
- Department of Computer Science and Engineering, Center for Brain-Like Computing and Machine Intelligence, Shanghai Jiao Tong University, Shanghai, China
| | - Yang Yang
- Department of Computer Science and Engineering, Center for Brain-Like Computing and Machine Intelligence, Shanghai Jiao Tong University, Shanghai, China.,Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai, China
|
31
|
Zhang J, Yu J, Lin D, Guo X, He H, Shi S. DeepCLA: A Hybrid Deep Learning Approach for the Identification of Clathrin. J Chem Inf Model 2020; 61:516-524. [PMID: 33347303 DOI: 10.1021/acs.jcim.0c00979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
Clathrin is a highly evolutionarily conserved protein, which can affect membrane cleavage and membrane release of vesicles. The absence of clathrin in the cellular system affects a variety of human diseases. Effective recognition of clathrin plays an important role in the development of drugs to treat related diseases. In recent years, deep learning has been widely applied in the field of bioinformatics because of its high efficiency and accuracy. In this study, we propose a deep learning framework, DeepCLA, which combines two different network structures, a convolutional neural network and a bidirectional long short-term memory network, to identify clathrin. The investigation of different deep network architectures demonstrates that the prediction performance of a hybrid deep network model is better than that of a single deep network. On the independent test dataset, DeepCLA outperforms the state-of-the-art methods. This suggests that DeepCLA is an effective approach for clathrin prediction and can provide more instructive guidance for further experimental investigation of clathrin. Moreover, the source code and training data of DeepCLA are provided at https://github.com/ZhangZhang89/DeepCLA.
Affiliation(s)
- Ju Zhang
- Department of Mathematics and Numerical Simulation and High-Performance Computing Laboratory, School of Sciences, Nanchang University, Nanchang 330031, China
| | - Jialin Yu
- Department of Mathematics and Numerical Simulation and High-Performance Computing Laboratory, School of Sciences, Nanchang University, Nanchang 330031, China
| | - Dan Lin
- Department of Mathematics and Numerical Simulation and High-Performance Computing Laboratory, School of Sciences, Nanchang University, Nanchang 330031, China
| | - Xinyun Guo
- Department of Mathematics and Numerical Simulation and High-Performance Computing Laboratory, School of Sciences, Nanchang University, Nanchang 330031, China
| | - Huan He
- Department of Mathematics and Numerical Simulation and High-Performance Computing Laboratory, School of Sciences, Nanchang University, Nanchang 330031, China
| | - Shaoping Shi
- Department of Mathematics and Numerical Simulation and High-Performance Computing Laboratory, School of Sciences, Nanchang University, Nanchang 330031, China
|
32
|
Pan X, Fang Y, Li X, Yang Y, Shen HB. RBPsuite: RNA-protein binding sites prediction suite based on deep learning. BMC Genomics 2020; 21:884. [PMID: 33297946 PMCID: PMC7724624 DOI: 10.1186/s12864-020-07291-6] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 07/12/2020] [Accepted: 11/28/2020] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND RNA-binding proteins (RBPs) play crucial roles in various biological processes. Deep learning-based methods have proven powerful for predicting RBP sites on RNAs. However, training deep learning models is very time-consuming and computationally intensive. RESULTS Here we present RBPsuite, an easy-to-use deep learning-based webserver for predicting RBP binding sites on linear and circular RNAs. For linear RNAs, RBPsuite predicts RBP binding scores using our updated iDeepS. For circular RNAs (circRNAs), RBPsuite predicts RBP binding scores using our developed CRIP. RBPsuite first breaks the input RNA sequence into segments of 101 nucleotides and scores the interaction between the segments and the RBPs. RBPsuite further detects the verified motifs on the binding segments and gives the distribution of binding scores along the full-length sequence. CONCLUSIONS RBPsuite is an easy-to-use online webserver for predicting RBP binding sites and is freely available at http://www.csbio.sjtu.edu.cn/bioinf/RBPsuite/.
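The segment-then-score step described above can be sketched as follows. Only the 101-nucleotide segment length comes from the abstract. The non-overlapping windows, the `N` padding of the final segment, and the `gc_fraction` stand-in scorer are all assumptions made for illustration; the real server scores segments with iDeepS or CRIP models, which are not reproduced here.

```python
from typing import Callable, List, Tuple

SEGMENT_LENGTH = 101  # segment size stated in the abstract


def split_into_segments(
    rna: str, length: int = SEGMENT_LENGTH, pad: str = "N"
) -> List[Tuple[int, str]]:
    """Split an RNA sequence into consecutive fixed-length segments.

    Assumption: windows do not overlap and the last one is right-padded
    with `pad`. Returns (start_offset, segment) pairs.
    """
    rna = rna.upper().replace("T", "U")
    segments = []
    for start in range(0, len(rna), length):
        chunk = rna[start:start + length]
        segments.append((start, chunk.ljust(length, pad)))
    return segments


def score_segments(
    rna: str, scorer: Callable[[str], float]
) -> List[Tuple[int, float]]:
    """Apply a per-segment scorer, giving a score profile along the sequence."""
    return [(start, scorer(segment)) for start, segment in split_into_segments(rna)]


def gc_fraction(segment: str) -> float:
    """Hypothetical stand-in scorer; a trained model would go here."""
    return sum(base in "GC" for base in segment) / len(segment)
```

Calling `score_segments(sequence, gc_fraction)` yields one `(offset, score)` pair per segment, which is the shape needed to plot a score distribution along the full-length sequence.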
Affiliation(s)
- Xiaoyong Pan
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China.
| | - Yi Fang
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China
| | - Xianfeng Li
- Key laboratory of Carcinogenesis and Translational Research, Peking University Cancer Hospital, Beijing, 100142, China
| | - Yang Yang
- Department of Computer Science and Engineering, Shanghai Jiao Tong University, Center for Brain-Like Computing and Machine Intelligence, Shanghai, 200240, China
| | - Hong-Bin Shen
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China.
|
33
|
Wang X, Wu Z, Qin W, Sun T, Lu S, Li Y, Wang Y, Hu X, Xu D, Wu Y, Chen Q, Yao W, Liu M, Wei M, Wu H. Long non-coding RNA ZFAS1 promotes colorectal cancer tumorigenesis and development through DDX21-POLR1B regulatory axis. Aging (Albany NY) 2020; 12:22656-22687. [PMID: 33202381 PMCID: PMC7746388 DOI: 10.18632/aging.103875] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 02/11/2020] [Accepted: 05/25/2020] [Indexed: 12/19/2022]
Abstract
Increasing evidence supports long non-coding RNA ZFAS1 as a master protein regulator involved in a variety of human cancers. However, its molecular mechanism in colorectal cancer (CRC) is not fully understood and remains to be elucidated. Here, we uncovered a previously unreported mechanism in which RNA helicase DDX21, regulated by lncRNA ZFAS1, controls POLR1B expression in CRC initiation and progression. Specifically, ZFAS1 exerted oncogenic functions and was significantly up-regulated, accompanied by elevated DDX21 and POLR1B expression, in CRC cells and tissues, which was further closely associated with poor clinical outcomes. Notably, ZFAS1 knockdown dramatically suppressed CRC cell proliferation, invasion, and migration and increased cell apoptosis, effects opposite to those caused by ZFAS1 up-regulation. We further revealed that the inhibitory effect caused by ZFAS1 knockdown could be reversed by DDX21 overexpression in vitro and in vivo. Mechanistically, our research found that ZFAS1 could directly recruit DDX21 protein by harboring the specific motif (AAGA or CAGA). Finally, POLR1B was identified as the downstream target of DDX21 regulated by ZFAS1; it was also up-regulated in CRC cells and tissues and closely related to poor prognosis. The previously unrecognized ZFAS1/DDX21/POLR1B signaling regulation axis may provide new biomarkers and targets for CRC treatment and prognostic evaluation.
Affiliation(s)
- Xiufang Wang
- Department of Pharmacology, School of Pharmacy, China Medical University, Shenyang 110122, P. R. China.,Liaoning Key Laboratory of Molecular Targeted Anti-Tumor Drug Development and Evaluation, Liaoning Cancer Immune Peptide Drug Engineering Technology Research Center, Key Laboratory of Precision Diagnosis and Treatment of Gastrointestinal Tumors, Ministry of Education, China Medical University, Shenyang 110122, P. R. China
| | - Zhikun Wu
- Department of Pharmacology, School of Pharmacy, China Medical University, Shenyang 110122, P. R. China.,Liaoning Key Laboratory of Molecular Targeted Anti-Tumor Drug Development and Evaluation, Liaoning Cancer Immune Peptide Drug Engineering Technology Research Center, Key Laboratory of Precision Diagnosis and Treatment of Gastrointestinal Tumors, Ministry of Education, China Medical University, Shenyang 110122, P. R. China
| | - Wenyan Qin
- Department of Pharmacology, School of Pharmacy, China Medical University, Shenyang 110122, P. R. China.,Liaoning Key Laboratory of Molecular Targeted Anti-Tumor Drug Development and Evaluation, Liaoning Cancer Immune Peptide Drug Engineering Technology Research Center, Key Laboratory of Precision Diagnosis and Treatment of Gastrointestinal Tumors, Ministry of Education, China Medical University, Shenyang 110122, P. R. China
| | - Tong Sun
- Department of Pharmacology, School of Pharmacy, China Medical University, Shenyang 110122, P. R. China.,Liaoning Key Laboratory of Molecular Targeted Anti-Tumor Drug Development and Evaluation, Liaoning Cancer Immune Peptide Drug Engineering Technology Research Center, Key Laboratory of Precision Diagnosis and Treatment of Gastrointestinal Tumors, Ministry of Education, China Medical University, Shenyang 110122, P. R. China
| | - Senxu Lu
- Department of Pharmacology, School of Pharmacy, China Medical University, Shenyang 110122, P. R. China.,Liaoning Key Laboratory of Molecular Targeted Anti-Tumor Drug Development and Evaluation, Liaoning Cancer Immune Peptide Drug Engineering Technology Research Center, Key Laboratory of Precision Diagnosis and Treatment of Gastrointestinal Tumors, Ministry of Education, China Medical University, Shenyang 110122, P. R. China
| | - Yalun Li
- Department of Anorectal Surgery, First Hospital of China Medical University, Shenyang 110001, P. R. China
| | - Yuanhe Wang
- Department of Medical Oncology, Cancer Hospital of China Medical University, Department of Medical Oncology, Liaoning Cancer Hospital and Institute, Shenyang 110042, P. R. China
| | - Xiaoyun Hu
- Department of Pharmacology, School of Pharmacy, China Medical University, Shenyang 110122, P. R. China.,Liaoning Key Laboratory of Molecular Targeted Anti-Tumor Drug Development and Evaluation, Liaoning Cancer Immune Peptide Drug Engineering Technology Research Center, Key Laboratory of Precision Diagnosis and Treatment of Gastrointestinal Tumors, Ministry of Education, China Medical University, Shenyang 110122, P. R. China
| | - Dongping Xu
- Department of Pharmacology, School of Pharmacy, China Medical University, Shenyang 110122, P. R. China.,Liaoning Key Laboratory of Molecular Targeted Anti-Tumor Drug Development and Evaluation, Liaoning Cancer Immune Peptide Drug Engineering Technology Research Center, Key Laboratory of Precision Diagnosis and Treatment of Gastrointestinal Tumors, Ministry of Education, China Medical University, Shenyang 110122, P. R. China
| | - Yutong Wu
- Department of Pharmacology, School of Pharmacy, China Medical University, Shenyang 110122, P. R. China.,Liaoning Key Laboratory of Molecular Targeted Anti-Tumor Drug Development and Evaluation, Liaoning Cancer Immune Peptide Drug Engineering Technology Research Center, Key Laboratory of Precision Diagnosis and Treatment of Gastrointestinal Tumors, Ministry of Education, China Medical University, Shenyang 110122, P. R. China
| | - Qiuchen Chen
- Department of Pharmacology, School of Pharmacy, China Medical University, Shenyang 110122, P. R. China.,Liaoning Key Laboratory of Molecular Targeted Anti-Tumor Drug Development and Evaluation, Liaoning Cancer Immune Peptide Drug Engineering Technology Research Center, Key Laboratory of Precision Diagnosis and Treatment of Gastrointestinal Tumors, Ministry of Education, China Medical University, Shenyang 110122, P. R. China
| | - Weifan Yao
- Department of Pharmacology, School of Pharmacy, China Medical University, Shenyang 110122, P. R. China.,Liaoning Key Laboratory of Molecular Targeted Anti-Tumor Drug Development and Evaluation, Liaoning Cancer Immune Peptide Drug Engineering Technology Research Center, Key Laboratory of Precision Diagnosis and Treatment of Gastrointestinal Tumors, Ministry of Education, China Medical University, Shenyang 110122, P. R. China
| | - Mingyan Liu
- Department of Pharmacology, School of Pharmacy, China Medical University, Shenyang 110122, P. R. China.,Liaoning Key Laboratory of Molecular Targeted Anti-Tumor Drug Development and Evaluation, Liaoning Cancer Immune Peptide Drug Engineering Technology Research Center, Key Laboratory of Precision Diagnosis and Treatment of Gastrointestinal Tumors, Ministry of Education, China Medical University, Shenyang 110122, P. R. China
| | - Minjie Wei
- Department of Pharmacology, School of Pharmacy, China Medical University, Shenyang 110122, P. R. China.,Liaoning Key Laboratory of Molecular Targeted Anti-Tumor Drug Development and Evaluation, Liaoning Cancer Immune Peptide Drug Engineering Technology Research Center, Key Laboratory of Precision Diagnosis and Treatment of Gastrointestinal Tumors, Ministry of Education, China Medical University, Shenyang 110122, P. R. China
| | - Huizhe Wu
- Department of Pharmacology, School of Pharmacy, China Medical University, Shenyang 110122, P. R. China.,Liaoning Key Laboratory of Molecular Targeted Anti-Tumor Drug Development and Evaluation, Liaoning Cancer Immune Peptide Drug Engineering Technology Research Center, Key Laboratory of Precision Diagnosis and Treatment of Gastrointestinal Tumors, Ministry of Education, China Medical University, Shenyang 110122, P. R. China
|
34
|
Bauer M, Vaxevanis C, Heimer N, Al-Ali HK, Jaekel N, Bachmann M, Wickenhauser C, Seliger B. Expression, Regulation and Function of microRNA as Important Players in the Transition of MDS to Secondary AML and Their Cross Talk to RNA-Binding Proteins. Int J Mol Sci 2020; 21:ijms21197140. [PMID: 32992663 PMCID: PMC7582632 DOI: 10.3390/ijms21197140] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/24/2020] [Revised: 09/14/2020] [Accepted: 09/22/2020] [Indexed: 12/12/2022] Open
Abstract
Myelodysplastic syndromes (MDS), heterogeneous diseases of hematopoietic stem cells, exhibit a significant risk of progression to secondary acute myeloid leukemia (sAML) that are typically accompanied by MDS-related changes and therefore significantly differ to de novo acute myeloid leukemia (AML). Within these disorders, the spectrum of cytogenetic alterations and oncogenic mutations, the extent of a predisposing defective osteohematopoietic niche, and the irregularity of the tumor microenvironment is highly diverse. However, the exact underlying pathophysiological mechanisms resulting in hematopoietic failure in patients with MDS and sAML remain elusive. There is recent evidence that the post-transcriptional control of gene expression mediated by microRNAs (miRNAs), long noncoding RNAs, and/or RNA-binding proteins (RBPs) are key components in the pathogenic events of both diseases. In addition, an interplay between RBPs and miRNAs has been postulated in MDS and sAML. Although a plethora of miRNAs is aberrantly expressed in MDS and sAML, their expression pattern significantly depends on the cell type and on the molecular make-up of the sample, including chromosomal alterations and single nucleotide polymorphisms, which also reflects their role in disease progression and prediction. Decreased expression levels of miRNAs or RBPs preventing the maturation or inhibiting translation of genes involved in pathogenesis of both diseases were found. Therefore, this review will summarize the current knowledge regarding the heterogeneity of expression, function, and clinical relevance of miRNAs, its link to molecular abnormalities in MDS and sAML with specific focus on the interplay with RBPs, and the current treatment options. This information might improve the use of miRNAs and/or RBPs as prognostic markers and therapeutic targets for both malignancies.
Affiliation(s)
- Marcus Bauer
- Institute of Pathology, Martin Luther University Halle-Wittenberg, 06112 Halle, Germany; (M.B.); (C.W.)
| | - Christoforos Vaxevanis
- Institute of Medical Immunology, Martin Luther University Halle-Wittenberg, Halle 06112, Germany; (C.V.); (N.H.)
| | - Nadine Heimer
- Institute of Medical Immunology, Martin Luther University Halle-Wittenberg, Halle 06112, Germany; (C.V.); (N.H.)
| | - Haifa Kathrin Al-Ali
- Department of Hematology/Oncology, University Hospital Halle, 06112 Halle, Germany; (H.K.A.-A.); (N.J.)
| | - Nadja Jaekel
- Department of Hematology/Oncology, University Hospital Halle, 06112 Halle, Germany; (H.K.A.-A.); (N.J.)
| | - Michael Bachmann
- Helmholtz-Zentrum Dresden Rossendorf, Institute of Radiopharmaceutical Cancer Research, 01328 Dresden, Germany
| | - Claudia Wickenhauser
- Institute of Pathology, Martin Luther University Halle-Wittenberg, 06112 Halle, Germany; (M.B.); (C.W.)
| | - Barbara Seliger
- Institute of Medical Immunology, Martin Luther University Halle-Wittenberg, Halle 06112, Germany; (C.V.); (N.H.)
- Fraunhofer Institute for Cell Therapy and Immunology, 04103 Leipzig, Germany
- Correspondence: Tel.: +49-345-557-4054
| |
|
35
|
Yang H, Deng Z, Pan X, Shen HB, Choi KS, Wang L, Wang S, Wu J. RNA-binding protein recognition based on multi-view deep feature and multi-label learning. Brief Bioinform 2020; 22:5893431. [PMID: 32808039 DOI: 10.1093/bib/bbaa174] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 05/06/2020] [Revised: 06/17/2020] [Accepted: 07/09/2020] [Indexed: 12/28/2022]
Abstract
RNA-binding proteins (RBPs) are a class of proteins that bind to and accompany RNAs in regulating biological processes. An RBP may have multiple target RNAs, and its aberrant expression can cause multiple diseases. Methods have been designed to predict whether a specific RBP can bind to an RNA and the position of the binding site using a binary classification model. However, most of the existing methods do not take into account the binding similarity and correlation between different RBPs. While methods employing multiple labels and Long Short Term Memory Network (LSTM) have been proposed to consider binding similarity between different RBPs, the accuracy remains low due to insufficient feature learning and multi-label learning on RNA sequences. In response to this challenge, the concept of RNA-RBP Binding Network (RRBN) is proposed in this paper to provide theoretical support for multi-label learning to identify RBPs that can bind to RNAs. It is experimentally shown that the RRBN information can significantly improve the prediction of unknown RNA-RBP interactions. To further improve the prediction accuracy, we present the novel computational method iDeepMV, which integrates multi-view deep learning technology under the multi-label learning framework. iDeepMV first extracts data from the views of amino acid sequence and dipeptide component based on the RNA sequences as the original view. Deep neural network models are then designed for the respective views to perform deep feature learning. The extracted deep features are fed into multi-label classifiers which are trained with the RNA-RBP interaction information for the three views. Finally, a voting mechanism is designed to make a comprehensive decision on the results of the multi-label classifiers. Our experimental results show that the prediction performance of iDeepMV, which combines multi-view deep feature learning models with RNA-RBP interaction information, is significantly better than that of the state-of-the-art methods.
iDeepMV is freely available at http://www.csbio.sjtu.edu.cn/bioinf/iDeepMV for academic use. The code is freely available at http://github.com/uchihayht/iDeepMV.
Affiliation(s)
| | - Zhaohong Deng
- School of Artificial Intelligence and Computer Science of Jiangnan University, Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (LCNBI) and ZJLab
| | - Xiaoyong Pan
- Department of Automation of Shanghai Jiao Tong University
| | | | | | - Lei Wang
- School of Biotechnology and Key Laboratory of Industrial Biotechnology Ministry in Jiangnan University
| | - Shitong Wang
- School of Artificial Intelligence and Computer Science of Jiangnan University
| | - Jing Wu
- School of Biotechnology and Key Laboratory of Industrial Biotechnology Ministry in Jiangnan University
| |
|
36
|
Abstract
Deep neural networks have been revolutionizing the field of machine learning for the past several years. They have been applied with great success in many domains of the biomedical data sciences and are outperforming extant methods by a large margin. The ability of deep neural networks to pick up local image features and model the interactions between them makes them highly applicable to regulatory genomics. Instead of an image, the networks analyze DNA and RNA sequences and additional epigenomic data. In this review, we survey the successes of deep learning in the field of regulatory genomics. We first describe the fundamental building blocks of deep neural networks, popular architectures used in regulatory genomics, and their training process on molecular sequence data. We then review several key methods in different gene regulation domains. We start with the pioneering method DeepBind and its successors, which were developed to predict protein–DNA binding. We then review methods developed to predict and model epigenetic information, such as histone marks and nucleosome occupancy. Following epigenomics, we review methods to predict protein–RNA binding with its unique challenge of incorporating RNA structure information. Finally, we provide our overall view of the strengths and weaknesses of deep neural networks and prospects for future developments.
Affiliation(s)
- Mira Barshai
- School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer-Sheva 8410501, Israel
| | - Eitamar Tripto
- Department of Biomedical Engineering, Ben-Gurion University of the Negev, Beer-Sheva 8410501, Israel
| | - Yaron Orenstein
- School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer-Sheva 8410501, Israel
| |
|
37
|
Palomino‐Hernandez O, Margreiter MA, Rossetti G. Challenges in RNA Regulation in Huntington's Disease: Insights from Computational Studies. Isr J Chem 2020. [DOI: 10.1002/ijch.202000021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Affiliation(s)
- Oscar Palomino‐Hernandez
- Computational Biomedicine, Institute of Neuroscience and Medicine (INM-9)/Institute for Advanced Simulations (IAS-5), Forschungszentrum Jülich, 52425 Jülich, Germany
- Faculty 1, RWTH Aachen, 52425 Aachen, Germany
- Computation-based Science and Technology Research Center, The Cyprus Institute, Nicosia 2121, Cyprus
- Institute of Life Science, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
| | - Michael A. Margreiter
- Computational Biomedicine, Institute of Neuroscience and Medicine (INM-9)/Institute for Advanced Simulations (IAS-5), Forschungszentrum Jülich, 52425 Jülich, Germany
- Faculty 1, RWTH Aachen, 52425 Aachen, Germany
| | - Giulia Rossetti
- Computational Biomedicine, Institute of Neuroscience and Medicine (INM-9)/Institute for Advanced Simulations (IAS-5), Forschungszentrum Jülich, 52425 Jülich, Germany
- Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich, 52425 Jülich, Germany
- Department of Hematology, Oncology, Hemostaseology and Stem Cell Transplantation, University Hospital Aachen, RWTH Aachen University, Pauwelsstraße 30, 52074 Aachen, Germany
| |
|
38
|
Liu S, Li B, Liang Q, Liu A, Qu L, Yang J. Classification and function of RNA-protein interactions. WILEY INTERDISCIPLINARY REVIEWS-RNA 2020; 11:e1601. [PMID: 32488992 DOI: 10.1002/wrna.1601] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 11/13/2019] [Revised: 04/15/2020] [Accepted: 04/29/2020] [Indexed: 12/11/2022]
Abstract
Almost all RNAs need to interact with proteins to fully exert their functions, and proteins also bind to RNAs to act as regulators. It has now become clear that RNA-protein interactions play important roles in many biological processes among organisms. Despite the great progress that has been made in the field, there is still no precise classification system for RNA-protein interactions, which makes it challenging to further decipher the functions and mechanisms of these interactions. In this review, we propose four different categories of RNA-protein interactions according to their basic characteristics: RNA motif-dependent RNA-protein interactions, RNA structure-dependent RNA-protein interactions, RNA modification-dependent RNA-protein interactions, and RNA guide-based RNA-protein interactions. Moreover, the integration of different types of RNA-protein interactions and the regulatory factors implicated in these interactions are discussed. Furthermore, we emphasize the functional diversity of these four types of interactions in biological processes and disease development and assess emerging trends in this exciting research field. This article is categorized under: RNA Interactions with Proteins and Other Molecules > Protein-RNA Interactions: Functional Implications; RNA Interactions with Proteins and Other Molecules > Protein-RNA Recognition; RNA Processing > RNA Editing and Modification.
Affiliation(s)
- Shurong Liu
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory for Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
| | - Bin Li
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory for Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
| | - Qiaoxia Liang
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory for Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
| | - Anrui Liu
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory for Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
| | - Lianghu Qu
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory for Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
| | - Jianhua Yang
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory for Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, China; Department of Interventional Medicine, The Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, China
| |
|
39
|
Interpreting and integrating big data in non-coding RNA research. Emerg Top Life Sci 2019; 3:343-355. [PMID: 33523206 DOI: 10.1042/etls20190004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/29/2019] [Revised: 07/10/2019] [Accepted: 07/15/2019] [Indexed: 12/17/2022]
Abstract
In the last two decades, we have witnessed an impressive crescendo of non-coding RNA studies, due to both the development of high-throughput RNA-sequencing strategies and an ever-increasing awareness of the involvement of newly discovered ncRNA classes in complex regulatory networks. Together with excitement for the possibility to explore previously unknown layers of gene regulation, these advancements led to the realization of the need for shared criteria of data collection and analysis and for novel integrative perspectives and tools aimed at making biological sense of very large bodies of molecular information. In the last few years, efforts to respond to this need have been devoted mainly to the regulatory interactions involving ncRNAs as direct or indirect regulators of protein-coding mRNAs. Such efforts resulted in the development of new computational tools, allowing the exploitation of the information spread in numerous different ncRNA data sets to interpret transcriptome changes under physiological and pathological cell responses. While experimental validation remains essential to identify key RNA regulatory interactions, the integration of ncRNA big data, in combination with systematic literature mining, is proving to be invaluable in identifying potential new players, biomarkers and therapeutic targets in cancer and other diseases.
|