Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Si J, Cui J, Cheng J, Wu R. Computational Prediction of RNA-Binding Proteins and Binding Sites. Int J Mol Sci 2015;16:26303-17. [PMID: 26540053 DOI: 10.3390/ijms161125952] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.0] [Received: 10/01/2015] [Revised: 10/20/2015] [Accepted: 10/23/2015] [Indexed: 11/19/2022] Open

Cited by Other Article(s)

Manea I, Casian M, Hosu-Stancioiu O, de-Los-Santos-Álvarez N, Lobo-Castañón MJ, Cristea C. A review on magnetic beads-based SELEX technologies: Applications from small to large target molecules. Anal Chim Acta 2024;1297:342325. [PMID: 38438246 DOI: 10.1016/j.aca.2024.342325] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Revised: 01/18/2024] [Accepted: 02/01/2024] [Indexed: 03/06/2024]

Basu S, Zhao B, Biró B, Faraggi E, Gsponer J, Hu G, Kloczkowski A, Malhis N, Mirdita M, Söding J, Steinegger M, Wang D, Wang K, Xu D, Zhang J, Kurgan L. DescribePROT in 2023: more, higher-quality and experimental annotations and improved data download options. Nucleic Acids Res 2024;52:D426-D433. [PMID: 37933852 PMCID: PMC10767971 DOI: 10.1093/nar/gkad985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Revised: 10/12/2023] [Accepted: 10/16/2023] [Indexed: 11/08/2023] Open

Affiliation(s)

Sushmita Basu Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
Bi Zhao Genomics Program, College of Public Health, University of South Florida, Tampa, FL, USA
Bálint Biró Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA Department of Animal Biotechnology, Hungarian University of Agriculture and Life Sciences, Gödöllő, Hungary
Eshel Faraggi Physics Department, Indiana University, Indianapolis, IN, USA
Jörg Gsponer Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
Gang Hu School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin, P.R. China
Andrzej Kloczkowski The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, USA
Nawar Malhis Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
Milot Mirdita School of Biological Sciences, Seoul National University, Seoul, Republic of Korea
Johannes Söding Quantitative and Computational Biology, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
Martin Steinegger School of Biological Sciences, Seoul National University, Seoul, Republic of Korea Institute of Molecular Biology & Genetics, Seoul National University, Seoul, Republic of Korea Artificial Intelligence Institute, Seoul National University, Seoul, South Korea
Duolin Wang Department of Electrical Engineer and Computer Science, Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, USA
Kui Wang School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin, P.R. China
Dong Xu Department of Electrical Engineer and Computer Science, Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, USA
Jian Zhang School of Computer and Information Technology, Xinyang Normal University, Xinyang, P.R. China
Lukasz Kurgan Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA

Song J, Kurgan L. Availability of web servers significantly boosts citations rates of bioinformatics methods for protein function and disorder prediction. Bioinformatics Advances 2023;3:vbad184. [PMID: 38146538 PMCID: PMC10749743 DOI: 10.1093/bioadv/vbad184] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Revised: 12/08/2023] [Accepted: 12/15/2023] [Indexed: 12/27/2023]

Chen J, Gu Z, Lai L, Pei J. In silico protein function prediction: the rise of machine learning-based approaches. Medical Review (2021) 2023;3:487-510. [PMID: 38282798 PMCID: PMC10808870 DOI: 10.1515/mr-2023-0038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Accepted: 10/11/2023] [Indexed: 01/30/2024]

Zhu H, Yang Y, Wang Y, Wang F, Huang Y, Chang Y, Wong KC, Li X. Dynamic characterization and interpretation for protein-RNA interactions across diverse cellular conditions using HDRNet. Nat Commun 2023;14:6824. [PMID: 37884495 PMCID: PMC10603054 DOI: 10.1038/s41467-023-42547-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/23/2023] [Accepted: 10/13/2023] [Indexed: 10/28/2023] Open

Agarwal A, Kant S, Bahadur RP. Efficient mapping of RNA-binding residues in RNA-binding proteins using local sequence features of binding site residues in protein-RNA complexes. Proteins 2023;91:1361-1379. [PMID: 37254800 DOI: 10.1002/prot.26528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2022] [Revised: 04/13/2023] [Accepted: 05/02/2023] [Indexed: 06/01/2023]

Abstract

Protein-RNA interactions play vital roles in a plethora of biological processes such as regulation of gene expression, protein synthesis, mRNA processing and biogenesis. Identification of RNA-binding residues (RBRs) in proteins is essential to understand RNA-mediated protein functioning, to perform site-directed mutagenesis and to develop novel targeted drug therapies. Moreover, the extensive gap between sequence and structural data restricts the identification of binding sites in unsolved structures. However, efficient use of computational methods demanding only sequence to identify binding residues can bridge this huge sequence-structure gap. In this study, we have extensively studied the protein-RNA interface in known RNA-binding proteins (RBPs). We find that the interface is highly enriched in basic and polar residues, with Gly being the most common interface neighbor. We investigated several amino acid features and developed a method to predict putative RBRs from amino acid sequence. We have implemented a balanced random forest (BRF) classifier with local residue features of protein sequences for prediction. With 5-fold cross-validation, the sequence pattern-derived, dipeptide composition-based BRF model (DCP-BRF) resulted in an accuracy of 87.9%, specificity of 88.8%, sensitivity of 82.2%, Matthews correlation coefficient of 0.60 and AUC of 0.93, performing better than a few existing methods. We further validated our prediction model on known human RBPs through RBR prediction and could map ~54% of them. Further, knowledge of binding site preferences obtained from computational predictions combined with experimental validations of potential RNA binding sites can enhance our understanding of protein-RNA interactions. This may serve to accelerate investigations on functional roles of many novel RBPs.
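
The abstract above reports accuracy, specificity, sensitivity and the Matthews correlation coefficient for a binary residue classifier. As a reading aid, the following minimal Python sketch shows how these four quantities follow from confusion-matrix counts using their standard definitions. The function name `binary_metrics` and the example counts are illustrative assumptions and are not taken from the cited study.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: binding residues predicted as binding / non-binding
    tn/fp: non-binding residues predicted as non-binding / binding
    (Hypothetical helper; the counts passed in are up to the caller.)
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Made-up counts, for illustration only.
metrics = binary_metrics(tp=40, fp=10, tn=90, fn=10)
print(metrics)
```

Because binding residues are typically much rarer than non-binding ones, the Matthews correlation coefficient is often more informative here than accuracy alone.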

Zhang F, Li M, Zhang J, Kurgan L. HybridRNAbind: prediction of RNA interacting residues across structure-annotated and disorder-annotated proteins. Nucleic Acids Res 2023;51:e25. [PMID: 36629262 PMCID: PMC10018345 DOI: 10.1093/nar/gkac1253] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/14/2022] [Revised: 11/22/2022] [Accepted: 12/15/2022] [Indexed: 01/12/2023] Open

Agarwal A, Bahadur RP. Modular architecture and functional annotation of human RNA-binding proteins containing RNA recognition motif. Biochimie 2023;209:116-130. [PMID: 36716848 DOI: 10.1016/j.biochi.2023.01.017] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/30/2022] [Revised: 01/09/2023] [Accepted: 01/23/2023] [Indexed: 01/28/2023]

Wu Z, Basu S, Wu X, Kurgan L. qNABpredict: Quick, accurate, and taxonomy-aware sequence-based prediction of content of nucleic acid binding amino acids. Protein Sci 2023;32:e4544. [PMID: 36519304 PMCID: PMC9798252 DOI: 10.1002/pro.4544] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2022] [Revised: 12/07/2022] [Accepted: 12/08/2022] [Indexed: 12/23/2022]

Wang Z, Dai Q, Song J, Duan X, Yang H, Yang Z. Predicting RBP Binding Sites of RNA With High-Order Encoding Features and CNN-BLSTM Hybrid Model. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022;19:2409-2419. [PMID: 34038367 DOI: 10.1109/tcbb.2021.3083930] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/12/2023]

Peng X, Wang X, Guo Y, Ge Z, Li F, Gao X, Song J. RBP-TSTL is a two-stage transfer learning framework for genome-scale prediction of RNA-binding proteins. Brief Bioinform 2022;23:6596984. [PMID: 35649392 PMCID: PMC9294422 DOI: 10.1093/bib/bbac215] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 03/17/2022] [Revised: 04/25/2022] [Accepted: 05/06/2022] [Indexed: 11/27/2022] Open

Ribonomics Approaches to Identify RBPome in Plants and Other Eukaryotes: Current Progress and Future Prospects. Int J Mol Sci 2022;23:ijms23115923. [PMID: 35682602 PMCID: PMC9180120 DOI: 10.3390/ijms23115923] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2022] [Revised: 05/16/2022] [Accepted: 05/20/2022] [Indexed: 02/01/2023] Open

Biró B, Zhao B, Kurgan L. Complementarity of the residue-level protein function and structure predictions in human proteins. Comput Struct Biotechnol J 2022;20:2223-2234. [PMID: 35615015 PMCID: PMC9118482 DOI: 10.1016/j.csbj.2022.05.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2022] [Revised: 05/02/2022] [Accepted: 05/02/2022] [Indexed: 11/24/2022] Open

Liu J, Gu Q, Du W, Feng Z, Zhang Q, Tian Y, Luo K, Gong Q, Tian X. Nucleolar RNA in action: Ultrastructure revealed during protein translation through a terpyridyl manganese(II) complex. Biosens Bioelectron 2022;203:114058. [DOI: 10.1016/j.bios.2022.114058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2021] [Revised: 12/21/2021] [Accepted: 01/28/2022] [Indexed: 11/02/2022]

Chalupová E, Vaculík O, Poláček J, Jozefov F, Majtner T, Alexiou P. ENNGene: an Easy Neural Network model building tool for Genomics. BMC Genomics 2022;23:248. [PMID: 35361122 PMCID: PMC8973509 DOI: 10.1186/s12864-022-08414-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/05/2021] [Accepted: 02/23/2022] [Indexed: 11/17/2022] Open

Abstract

Background

The recent big data revolution in Genomics, coupled with the emergence of Deep Learning as a set of powerful machine learning methods, has shifted the standard practices of machine learning for Genomics. Even though Deep Learning methods such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) are becoming widespread in Genomics, developing and training such models is outside the ability of most researchers in the field.

Results

Here we present ENNGene—Easy Neural Network model building tool for Genomics. This tool simplifies training of custom CNN or hybrid CNN-RNN models on genomic data via an easy-to-use Graphical User Interface. ENNGene allows multiple input branches, including sequence, evolutionary conservation, and secondary structure, and performs all the necessary preprocessing steps, allowing simple input such as genomic coordinates. The network architecture is selected and fully customized by the user, from the number and types of the layers to each layer's precise set-up. ENNGene then deals with all steps of training and evaluation of the model, exporting valuable metrics such as multi-class ROC and precision-recall curve plots or TensorBoard log files. To facilitate interpretation of the predicted results, we deploy Integrated Gradients, providing the user with a graphical representation of an attribution level of each input position. To showcase the usage of ENNGene, we train multiple models on the RBP24 dataset, quickly reaching the state of the art while improving the performance on more than half of the proteins by including the evolutionary conservation score and tuning the network per protein.

Conclusions

As the role of DL in big data analysis in the near future is indisputable, it is important to make it available for a broader range of researchers. We believe that an easy-to-use tool such as ENNGene can allow Genomics researchers without a background in Computational Sciences to harness the power of DL to gain better insights into and extract important information from the large amounts of data available in the field.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12864-022-08414-x.

Wei J, Chen S, Zong L, Gao X, Li Y. Protein-RNA interaction prediction with deep learning: structure matters. Brief Bioinform 2022;23:bbab540. [PMID: 34929730 PMCID: PMC8790951 DOI: 10.1093/bib/bbab540] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 09/29/2021] [Revised: 11/14/2021] [Accepted: 11/22/2021] [Indexed: 12/11/2022] Open

Niu M, Wu J, Zou Q, Liu Z, Xu L. rBPDL: Predicting RNA-Binding Proteins Using Deep Learning. IEEE J Biomed Health Inform 2021;25:3668-3676. [PMID: 33780344 DOI: 10.1109/jbhi.2021.3069259] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 11/10/2022]

Mishra A, Khanal R, Kabir WU, Hoque T. AIRBP: Accurate identification of RNA-binding proteins using machine learning techniques. Artif Intell Med 2021;113:102034. [PMID: 33685590 DOI: 10.1016/j.artmed.2021.102034] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 04/11/2020] [Revised: 01/19/2021] [Accepted: 02/09/2021] [Indexed: 12/25/2022]

Abstract

Identification of RNA-binding proteins (RBPs) that bind to ribonucleic acid molecules is an important problem in Computational Biology and Bioinformatics. Identifying RBPs is indispensable because they play crucial roles in post-transcriptional control of RNAs and RNA metabolism and have diverse roles in biological processes such as splicing, mRNA stabilization, mRNA localization, translation, RNA synthesis, folding-unfolding, modification, processing, and degradation. The existing experimental techniques for identifying RBPs are time-consuming and expensive. Therefore, identifying RBPs directly from the sequence using computational methods can be useful to annotate RBPs and assist the experimental design efficiently. In this work, we present a method called AIRBP, which is designed using an advanced machine learning technique, called stacking, to effectively predict RBPs by utilizing features extracted from evolutionary information, physicochemical properties, and disordered properties. Moreover, our method, AIRBP, uses the majority vote from RBPPred, DeepRBPPred, and the stacking model for the prediction of RBPs. The results show that AIRBP attains Accuracy (ACC), Balanced Accuracy (BACC), F1-score, and Matthews Correlation Coefficient (MCC) of 95.84%, 94.71%, 0.928, and 0.899, respectively, based on the training dataset, using 10-fold cross-validation (CV). Further evaluation of AIRBP on independent test sets reveals that it achieves ACC, BACC, F1-score, and MCC of 94.36%, 94.28%, 0.897, and 0.860 for the Human test set; 91.25%, 93.00%, 0.896, and 0.835 for the S. cerevisiae test set; and 90.60%, 90.41%, 0.934, and 0.775 for the A. thaliana test set, respectively. These results indicate that AIRBP outperforms the existing Deep- and TriPepSVM methods.
Therefore, the proposed better-performing AIRBP can be useful for accurate identification and annotation of RBPs directly from the sequence and help gain valuable insight to treat critical diseases. Availability: Code and data are available at http://cs.uno.edu/~tamjid/Software/AIRBP/code_data.zip.
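
The abstract above says the final AIRBP call is a majority vote over three predictors. The sketch below illustrates only generic majority voting over binary labels. It is not the authors' implementation: the function name `majority_vote` and the example labels are assumptions, and the three input labels merely stand in for the outputs of RBPPred, DeepRBPPred and the stacking model.

```python
def majority_vote(labels):
    """Return 1 if more than half of the binary labels are 1, else 0.

    `labels` is a sequence of 0/1 predictions, one per classifier.
    An odd number of voters is required so that ties cannot occur.
    """
    if len(labels) % 2 == 0:
        raise ValueError("majority voting here expects an odd number of voters")
    if any(label not in (0, 1) for label in labels):
        raise ValueError("labels must be 0 (non-RBP) or 1 (RBP)")
    return 1 if sum(labels) * 2 > len(labels) else 0


# Hypothetical per-protein outputs of three predictors (placeholders).
votes_for_protein = [1, 0, 1]
print(majority_vote(votes_for_protein))
```

With three voters, a protein is labeled an RBP whenever at least two of the predictors agree on the positive class.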

Zhao B, Katuwawala A, Oldfield CJ, Dunker AK, Faraggi E, Gsponer J, Kloczkowski A, Malhis N, Mirdita M, Obradovic Z, Söding J, Steinegger M, Zhou Y, Kurgan L. DescribePROT: database of amino acid-level protein structure and function predictions. Nucleic Acids Res 2021;49:D298-D308. [PMID: 33119734 PMCID: PMC7778963 DOI: 10.1093/nar/gkaa931] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 08/14/2020] [Revised: 09/11/2020] [Accepted: 10/05/2020] [Indexed: 12/30/2022] Open

Li K, Guo ZW, Zhai XM, Yang XX, Wu YS, Liu TC. RBPTD: a database of cancer-related RNA-binding proteins in humans. Database: The Journal of Biological Databases and Curation 2020;2020:5734253. [PMID: 32047888 PMCID: PMC7012770 DOI: 10.1093/database/baz156] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 10/11/2019] [Revised: 12/05/2019] [Accepted: 12/23/2019] [Indexed: 12/12/2022]

Navien TN, Thevendran R, Hamdani HY, Tang TH, Citartan M. In silico molecular docking in DNA aptamer development. Biochimie 2020;180:54-67. [PMID: 33086095 DOI: 10.1016/j.biochi.2020.10.005] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 07/24/2020] [Revised: 09/23/2020] [Accepted: 10/14/2020] [Indexed: 12/21/2022]

PRIME-3D2D is a 3D2D model to predict binding sites of protein-RNA interaction. Commun Biol 2020;3:384. [PMID: 32678300 PMCID: PMC7366699 DOI: 10.1038/s42003-020-1114-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/19/2019] [Accepted: 06/29/2020] [Indexed: 11/08/2022] Open

Song J, Tian S, Yu L, Xing Y, Yang Q, Duan X, Dai Q. AC-Caps: Attention Based Capsule Network for Predicting RBP Binding Sites of LncRNA. Interdiscip Sci 2020;12:414-423. [PMID: 32572768 DOI: 10.1007/s12539-020-00379-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/11/2020] [Revised: 05/18/2020] [Accepted: 05/30/2020] [Indexed: 01/03/2023]

Qiu J, Bernhofer M, Heinzinger M, Kemper S, Norambuena T, Melo F, Rost B. ProNA2020 predicts protein-DNA, protein-RNA, and protein-protein binding proteins and residues from sequence. J Mol Biol 2020;432:2428-2443. [PMID: 32142788 DOI: 10.1016/j.jmb.2020.02.026] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 11/28/2019] [Revised: 02/17/2020] [Accepted: 02/23/2020] [Indexed: 11/29/2022]

Affiliation(s)

Jiajun Qiu Department of Informatics, I12-Chair of Bioinformatics and Computational Biology, Technical University of Munich (TUM), Boltzmannstrasse 3, 85748, Garching, Munich, Germany; TUM Graduate School, Center of Doctoral Studies in Informatics and Its Applications (CeDoSIA), Garching, 85748, Germany.
Michael Bernhofer Department of Informatics, I12-Chair of Bioinformatics and Computational Biology, Technical University of Munich (TUM), Boltzmannstrasse 3, 85748, Garching, Munich, Germany; TUM Graduate School, Center of Doctoral Studies in Informatics and Its Applications (CeDoSIA), Garching, 85748, Germany
Michael Heinzinger Department of Informatics, I12-Chair of Bioinformatics and Computational Biology, Technical University of Munich (TUM), Boltzmannstrasse 3, 85748, Garching, Munich, Germany; TUM Graduate School, Center of Doctoral Studies in Informatics and Its Applications (CeDoSIA), Garching, 85748, Germany
Sofie Kemper Department of Informatics, I12-Chair of Bioinformatics and Computational Biology, Technical University of Munich (TUM), Boltzmannstrasse 3, 85748, Garching, Munich, Germany
Tomas Norambuena Molecular Bioinformatics Laboratory, Facultad de Ciencias Biológicas, Pontificia Universidad Católica de Chile, Santiago, Chile
Francisco Melo Molecular Bioinformatics Laboratory, Facultad de Ciencias Biológicas, Pontificia Universidad Católica de Chile, Santiago, Chile; Institute of Biological and Medical Engineering, Pontificia Universidad Católica de Chile, Santiago, Chile
Burkhard Rost Department of Informatics, I12-Chair of Bioinformatics and Computational Biology, Technical University of Munich (TUM), Boltzmannstrasse 3, 85748, Garching, Munich, Germany; Columbia University, Department of Biochemistry and Molecular Biophysics, 701 West, 168th Street, New York, NY, 10032, USA; Institute of Advanced Study (TUM-IAS), Lichtenbergstr. 2a, 85748, Garching/Munich, Germany; Germany & Institute for Food and Plant Sciences (WZW) Weihenstephan, Alte Akademie 8, 85354 Freising, Germany

Protein-assisted RNA fragment docking (RnaX) for modeling RNA-protein interactions using ModelX. Proc Natl Acad Sci U S A 2019;116:24568-24573. [PMID: 31732673 PMCID: PMC6900601 DOI: 10.1073/pnas.1910999116] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 12/12/2022] Open

Abstract

Protein–RNA interactions, key in biological processes, remained refractory to prediction algorithms. Here we present a new extension of the ModelX tool suite designed for this purpose. RNA–protein complexes in the Protein Data Bank were decomposed into small peptide–oligonucleotide interacting fragment pairs and used as building blocks to assemble big scaffolds representing complex RNA–protein interactions. This method has already been successful for designing DNA–protein and protein–protein interfaces. Areas under the curve up to 0.86 were achieved on binding site prediction showing the accuracy and coverage of our approach over established and in-house benchmarking sets. Together with FoldX protein design tool suite we were able to engineer backbone- and side chain-compatible interfaces using naked protein structures as input.

RNA–protein interactions are crucial for such key biological processes as regulation of transcription, splicing, translation, and gene silencing, among many others. Knowing where an RNA molecule interacts with a target protein and/or engineering an RNA molecule to specifically bind to a protein could allow for rational interference with these cellular processes and the design of novel therapies. Here we present a robust RNA–protein fragment pair-based method, termed RnaX, to predict RNA-binding sites. This methodology, which is integrated into the ModelX tool suite (http://modelx.crg.es), takes advantage of the structural information present in all released RNA–protein complexes. This information is used to create an exhaustive database for docking and a statistical forcefield for fast discrimination of true backbone-compatible interactions. RnaX, together with the protein design forcefield FoldX, enables us to predict RNA–protein interfaces and, when sufficient crystallographic information is available, to reengineer the interface at the sequence-specificity level by mimicking those conformational changes that occur on protein and RNA mutagenesis. These results, obtained at just a fraction of the computational cost of methods that simulate conformational dynamics, open up perspectives for the engineering of RNA–protein interfaces.

Sagar A, Xue B. Recent Advances in Machine Learning Based Prediction of RNA-protein Interactions. Protein Pept Lett 2019;26:601-619. [PMID: 31215361 DOI: 10.2174/0929866526666190619103853] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 10/16/2018] [Revised: 04/04/2019] [Accepted: 06/01/2019] [Indexed: 12/18/2022]

Vanmeert M, Razzokov J, Mirza MU, Weeks SD, Schepers G, Bogaerts A, Rozenski J, Froeyen M, Herdewijn P, Pinheiro VB, Lescrinier E. Rational design of an XNA ligase through docking of unbound nucleic acids to toroidal proteins. Nucleic Acids Res 2019;47:7130-7142. [PMID: 31334814 PMCID: PMC6649754 DOI: 10.1093/nar/gkz551] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 03/13/2019] [Revised: 05/24/2019] [Accepted: 06/12/2019] [Indexed: 02/06/2023] Open

Katuwawala A, Ghadermarzi S, Kurgan L. Computational prediction of functions of intrinsically disordered regions. Progress in Molecular Biology and Translational Science 2019;166:341-369. [PMID: 31521235 DOI: 10.1016/bs.pmbts.2019.04.006] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Indexed: 12/20/2022]

Insights into Telomerase/hTERT Alternative Splicing Regulation Using Bioinformatics and Network Analysis in Cancer. Cancers (Basel) 2019;11:cancers11050666. [PMID: 31091669 PMCID: PMC6562651 DOI: 10.3390/cancers11050666] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 04/27/2019] [Revised: 05/10/2019] [Accepted: 05/13/2019] [Indexed: 01/08/2023] Open

Poursheikhali Asghari M, Abdolmaleki P. Prediction of RNA- and DNA-Binding Proteins Using Various Machine Learning Classifiers. Avicenna J Med Biotechnol 2019;11:104-111. [PMID: 30800250 PMCID: PMC6359699] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open

Song J, Liu G, Wang R, Sun L, Zhang P. A novel method for predicting RNA-interacting residues in proteins using a combination of feature-based and sequence template-based methods. Biotechnol Biotec Eq 2019. [DOI: 10.1080/13102818.2019.1612275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022] Open

Jung Y, El-Manzalawy Y, Dobbs D, Honavar VG. Partner-specific prediction of RNA-binding residues in proteins: A critical assessment. Proteins 2018;87:198-211. [PMID: 30536635 PMCID: PMC6389706 DOI: 10.1002/prot.25639] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 05/01/2018] [Revised: 10/10/2018] [Accepted: 11/29/2018] [Indexed: 01/06/2023]

Abstract

RNA-protein interactions play essential roles in regulating gene expression. While some RNA-protein interactions are "specific", that is, the RNA-binding proteins preferentially bind to particular RNA sequence or structural motifs, others are "non-RNA specific." Deciphering the protein-RNA recognition code is essential for comprehending the functional implications of these interactions and for developing new therapies for many diseases. Because of the high cost of experimental determination of protein-RNA interfaces, there is a need for computational methods to identify RNA-binding residues in proteins. While most of the existing computational methods for predicting RNA-binding residues in RNA-binding proteins are oblivious to the characteristics of the partner RNA, there is growing interest in methods for partner-specific prediction of RNA binding sites in proteins. In this work, we assess the performance of two recently published partner-specific protein-RNA interface prediction tools, PS-PRIP, and PRIdictor, along with our own new tools. Specifically, we introduce a novel metric, RNA-specificity metric (RSM), for quantifying the RNA-specificity of the RNA binding residues predicted by such tools. Our results show that the RNA-binding residues predicted by previously published methods are oblivious to the characteristics of the putative RNA binding partner. Moreover, when evaluated using partner-agnostic metrics, RNA partner-specific methods are outperformed by the state-of-the-art partner-agnostic methods. We conjecture that either (a) the protein-RNA complexes in PDB are not representative of the protein-RNA interactions in nature, or (b) the current methods for partner-specific prediction of RNA-binding residues in proteins fail to account for the differences in RNA partner-specific versus partner-agnostic protein-RNA interactions, or both.

Hu W, Qin L, Li M, Pu X, Guo Y. Individually double minimum-distance definition of protein-RNA binding residues and application to structure-based prediction. J Comput Aided Mol Des 2018;32:1363-1373. [PMID: 30478757 DOI: 10.1007/s10822-018-0177-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 05/29/2018] [Accepted: 11/14/2018] [Indexed: 01/01/2023]

Abstract

Identifying protein-RNA binding residues is essential for understanding the mechanism of protein-RNA interactions. So far, rigid distance thresholds have commonly been used to define protein-RNA binding residues. However, after investigating 182 non-redundant protein-RNA complexes, we find that such thresholds are unsuitable for a number of complexes, since the distances between proteins and RNAs vary widely. In this work, we propose a novel definition method based on a flexible distance cutoff. The method accounts for the individual differences among complexes by setting a variable tolerance limit for protein-RNA interactions, i.e., the double minimum-distance, which yields different distance thresholds for different complexes. To validate the method, we comprehensively compared our flexible method with traditional rigid methods in terms of interface structure, amino acid composition, interface area, interaction force, etc. The results indicate that this method is more reasonable because it incorporates the specificity of different complexes, recovering important residues missed by rigid distance methods and discarding some redundant residues. Finally, to further test our double minimum-distance definition strategy, we developed a classifier that predicts the binding sites derived from our new method using structural features and a random forest machine learning algorithm. The model achieved satisfactory prediction performance, with an accuracy of 85.0% on independent data sets. To the best of our knowledge, it is the first prediction model to define positive and negative samples using a flexible cutoff. The comparative analysis and modeling results demonstrate that our method is a very promising strategy for defining protein-RNA binding sites more precisely.
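
For orientation, the conventional rigid-threshold definition that this abstract contrasts with can be sketched as follows: a residue counts as RNA-binding if any of its atoms lies within a fixed distance of any RNA atom. This is a minimal illustration, not the authors' double minimum-distance method, whose details the abstract does not give. The function name `binding_residues`, the toy coordinates, and the cutoff values of 3.5 and 5.0 (in ångströms) are assumptions chosen for the example.

```python
import math

def binding_residues(protein_atoms, rna_atoms, cutoff):
    """Rigid-cutoff definition (illustrative sketch, not the paper's method).

    protein_atoms: list of (residue_id, (x, y, z)) tuples, one per atom.
    rna_atoms:     list of (x, y, z) tuples, one per RNA atom.
    cutoff:        fixed distance threshold, in the same unit as the coordinates.
    Returns the sorted ids of residues with at least one atom within `cutoff`
    of any RNA atom.
    """
    binding = set()
    for residue_id, coord in protein_atoms:
        if residue_id in binding:
            continue  # residue already labeled as binding
        if any(math.dist(coord, rna) <= cutoff for rna in rna_atoms):
            binding.add(residue_id)
    return sorted(binding)

# Toy coordinates (hypothetical, not taken from any real complex).
protein = [
    (1, (0.0, 0.0, 0.0)),
    (1, (1.5, 0.0, 0.0)),
    (2, (10.0, 0.0, 0.0)),
    (3, (0.0, 4.0, 0.0)),
]
rna = [(0.0, 0.0, 3.0), (0.0, 8.0, 0.0)]

rigid_3_5 = binding_residues(protein, rna, cutoff=3.5)
rigid_5_0 = binding_residues(protein, rna, cutoff=5.0)
print(rigid_3_5, rigid_5_0)
```

In this toy case residue 3 is labeled as binding at the larger cutoff but not at the smaller one, which is the sensitivity to a single global threshold that motivates a per-complex cutoff.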

Moore KS, 't Hoen PAC. Computational approaches for the analysis of RNA-protein interactions: A primer for biologists. J Biol Chem 2018;294:1-9. [PMID: 30455357 DOI: 10.1074/jbc.rev118.004842] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 12/31/2022] Open

Abbasi WA, Asif A, Ben-Hur A, Minhas FUAA. Learning protein binding affinity using privileged information. BMC Bioinformatics 2018;19:425. [PMID: 30442086 PMCID: PMC6238365 DOI: 10.1186/s12859-018-2448-z] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 07/17/2018] [Accepted: 10/25/2018] [Indexed: 01/04/2023] Open

Abstract

BACKGROUND

Determining protein-protein interactions and their binding affinity is important in understanding cellular biological processes, discovery and design of novel therapeutics, protein engineering, and mutagenesis studies. Due to the time and effort required in wet lab experiments, computational prediction of binding affinity from sequence or structure is an important area of research. Structure-based methods, though more accurate than sequence-based techniques, are limited in their applicability due to limited availability of protein structure data.

RESULTS

In this study, we propose a novel machine learning method for predicting binding affinity that uses protein 3D structure as privileged information at training time while expecting only protein sequence information during testing. Using the method, which is based on the framework of learning using privileged information (LUPI), we have achieved improved performance over corresponding sequence-based binding affinity prediction methods that do not have access to privileged information during training. Our experiments show that with the proposed framework, which uses structure only during training, it is possible to achieve classification performance comparable to that obtained using structure-based features. Evaluation on an independent test set shows improved performance over the PPA-Pred2 method as well.

CONCLUSIONS

The proposed method outperforms several baseline learners and a state-of-the-art binding affinity predictor not only in cross-validation, but also on an additional validation dataset, demonstrating the utility of the LUPI framework for problems that would benefit from classification using structure-based features. The implementation of LUPI developed for this work is expected to be useful in other areas of bioinformatics as well.

Zagrovic B, Bartonek L, Polyansky AA. RNA-protein interactions in an unstructured context. FEBS Lett 2018;592:2901-2916. [PMID: 29851074 PMCID: PMC6175095 DOI: 10.1002/1873-3468.13116] [Citation(s) in RCA: 46] [Impact Index Per Article: 7.7] [Received: 04/23/2018] [Revised: 05/12/2018] [Accepted: 05/13/2018] [Indexed: 02/02/2023]

Chen F, Sun H, Wang J, Zhu F, Liu H, Wang Z, Lei T, Li Y, Hou T. Assessing the performance of MM/PBSA and MM/GBSA methods. 8. Predicting binding free energies and poses of protein-RNA complexes. RNA (NEW YORK, N.Y.) 2018;24:1183-1194. [PMID: 29930024 PMCID: PMC6097651 DOI: 10.1261/rna.065896.118] [Citation(s) in RCA: 67] [Impact Index Per Article: 11.2] [Received: 01/29/2018] [Accepted: 06/13/2018] [Indexed: 05/10/2023]

Krüger A, Zimbres FM, Kronenberger T, Wrenger C. Molecular Modeling Applied to Nucleic Acid-Based Molecule Development. Biomolecules 2018;8:E83. [PMID: 30150587 PMCID: PMC6163985 DOI: 10.3390/biom8030083] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 07/31/2018] [Revised: 08/12/2018] [Accepted: 08/16/2018] [Indexed: 12/15/2022] Open

Chowdhury S, Zhang J, Kurgan L. In Silico Prediction and Validation of Novel RNA Binding Proteins and Residues in the Human Proteome. Proteomics 2018;18:e1800064. [PMID: 29806170 DOI: 10.1002/pmic.201800064] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 02/27/2018] [Revised: 05/05/2018] [Indexed: 12/22/2022]

Transcriptome-wide discovery of coding and noncoding RNA-binding proteins. Proc Natl Acad Sci U S A 2018;115:E3879-E3887. [PMID: 29636419 DOI: 10.1073/pnas.1718406115] [Citation(s) in RCA: 110] [Impact Index Per Article: 18.3] [Indexed: 01/06/2023] Open

Gomes CP, Salgado-Somoza A, Creemers EE, Dieterich C, Lustrek M, Devaux Y. Circular RNAs in the cardiovascular system. Noncoding RNA Res 2018;3:1-11. [PMID: 30159434 PMCID: PMC6084836 DOI: 10.1016/j.ncrna.2018.02.002] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 10/31/2017] [Revised: 01/16/2018] [Accepted: 02/22/2018] [Indexed: 02/06/2023] Open

Hu W, Qin L, Li M, Pu X, Guo Y. A structural dissection of protein–RNA interactions based on different RNA base areas of interfaces. RSC Adv 2018;8:10582-10592. [PMID: 35540439 PMCID: PMC9078961 DOI: 10.1039/c8ra00598b] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 01/20/2018] [Accepted: 03/05/2018] [Indexed: 11/21/2022] Open

Abstract

Protein–RNA interactions are very common cellular processes, but their mechanisms are not fully understood, mainly because of the complexity of RNA structures. Through a detailed investigation of the RNA structures in protein–RNA complexes, we found for the first time that the RNAs in these complexes can be clearly divided into three classes (high, medium and low) based on their level of P_base (the percentage of base area buried in the RNA interface). Based on these three RNA classes, we analyzed protein–RNA interactions in more detail from several aspects, including interface area, structure, composition and interaction force, to better understand the recognition specificity of each class. Under our classification strategy, the three complex classes differ significantly in almost all properties. Complexes in the high class have short and extended RNA structures and behave like protein–ssDNA interactions; their hydrogen bonds and hydrophobic interactions are strong. In complexes in the low class, the RNA structures are mainly double-stranded, as in protein–dsDNA interactions, and electrostatic interactions occur frequently. Complexes in the medium class have the longest RNA chains and the largest average interface area, and they show no preference for any interaction force. On average, in terms of composition, secondary structure and intermolecular physicochemical properties, significant feature preferences can be observed in high and low complexes, but no highly specific features are found for medium complexes. We found that our proposed P_base is an important parameter that can serve as a new determinant for distinguishing protein–RNA complexes. For high and low complexes, the specificity of the recognition process is easier to understand from the interface features than it is for medium complexes. In the future, medium complexes should be the focus of further structural analysis using additional features. Overall, this study may contribute to a more detailed understanding of the mechanism of protein–RNA interactions.
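
As a rough illustration of the classification idea in this abstract, the sketch below computes P_base as a percentage and bins complexes into low, medium and high classes. The abstract does not state the class boundaries or exactly how the areas are measured, so the cut-offs `LOW_CUT` and `HIGH_CUT`, the function names, the reading of P_base as buried base area divided by total RNA interface area, and the toy area values are all hypothetical placeholders.

```python
def p_base(base_buried_area, rna_interface_area):
    """Percentage of the RNA interface area contributed by bases.

    Assumed reading of the abstract's definition; both areas must be in
    the same unit (for example square angstroms).
    """
    if rna_interface_area <= 0:
        raise ValueError("rna_interface_area must be positive")
    return 100.0 * base_buried_area / rna_interface_area

def classify(p, low_cut, high_cut):
    """Bin a P_base value; the boundary handling here is an assumption."""
    if p < low_cut:
        return "low"
    if p < high_cut:
        return "medium"
    return "high"

# Hypothetical thresholds and toy areas, NOT values from the paper.
LOW_CUT, HIGH_CUT = 30.0, 60.0
complexes = {
    "A": (120.0, 800.0),  # (buried base area, total RNA interface area)
    "B": (450.0, 900.0),
    "C": (700.0, 875.0),
}

classes = {
    name: classify(p_base(base, total), LOW_CUT, HIGH_CUT)
    for name, (base, total) in complexes.items()
}
print(classes)
```

Passing the thresholds as explicit arguments keeps the binning separate from the area calculation, so the cut-offs actually used in the paper could be substituted without changing the rest of the code.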

Qualitative and quantitative measurements of the influence of structure and composition of RNA interfaces on protein–RNA interactions.

Sharan M, Förstner KU, Eulalio A, Vogel J. APRICOT: an integrated computational pipeline for the sequence-based identification and characterization of RNA-binding proteins. Nucleic Acids Res 2017;45:e96. [PMID: 28334975 PMCID: PMC5499795 DOI: 10.1093/nar/gkx137] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 05/24/2016] [Accepted: 02/27/2017] [Indexed: 11/14/2022] Open

Tawk C, Sharan M, Eulalio A, Vogel J. A systematic analysis of the RNA-targeting potential of secreted bacterial effector proteins. Sci Rep 2017;7:9328. [PMID: 28839189 PMCID: PMC5570926 DOI: 10.1038/s41598-017-09527-0] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 02/14/2017] [Accepted: 06/27/2017] [Indexed: 12/15/2022] Open

Review: Regulation of the cancer epigenome by long non-coding RNAs. Cancer Lett 2017;407:106-112. [PMID: 28400335 DOI: 10.1016/j.canlet.2017.03.040] [Citation(s) in RCA: 69] [Impact Index Per Article: 9.9] [Received: 01/25/2017] [Revised: 03/22/2017] [Accepted: 03/29/2017] [Indexed: 12/31/2022]

Sequence-based discrimination of protein-RNA interacting residues using a probabilistic approach. J Theor Biol 2017;418:77-83. [DOI: 10.1016/j.jtbi.2017.01.040] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 11/09/2016] [Revised: 01/06/2017] [Accepted: 01/27/2017] [Indexed: 11/22/2022]

Mihailovic MK, Chen A, Gonzalez-Rivera JC, Contreras LM. Defective Ribonucleoproteins, Mistakes in RNA Processing, and Diseases. Biochemistry 2017;56:1367-1382. [PMID: 28206738 DOI: 10.1021/acs.biochem.6b01134] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 12/30/2022]

Walia RR, El-Manzalawy Y, Honavar VG, Dobbs D. Sequence-Based Prediction of RNA-Binding Residues in Proteins. Methods Mol Biol 2017;1484:205-235. [PMID: 27787829 PMCID: PMC5796408 DOI: 10.1007/978-1-4939-6406-2_15] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 05/28/2023]

De-novo protein function prediction using DNA binding and RNA binding proteins as a test case. Nat Commun 2016;7:13424. [PMID: 27869118 PMCID: PMC5121330 DOI: 10.1038/ncomms13424] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 03/14/2016] [Accepted: 10/03/2016] [Indexed: 12/14/2022] Open

Ryder SP. Protein-mRNA interactome capture: cartography of the mRNP landscape. F1000Res 2016;5:2627. [PMID: 29098073 PMCID: PMC5642310 DOI: 10.12688/f1000research.9404.1] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Accepted: 10/27/2016] [Indexed: 12/21/2022] Open